fix: clean up table_version entries on drop_table to fix cascade drop_namespace#6290
fix: clean up table_version entries on drop_table to fix cascade drop_namespace#6290XuQianJin-Stars wants to merge 1 commit intolance-format:mainfrom
Conversation
PR Review: fix: clean up table_version entries on drop_table to fix cascade drop_namespaceThe fix correctly identifies the root cause: orphaned table_version entries preventing drop_namespace from succeeding. A few issues to address: P0: No tests Per project policy: All bugfixes and features must have corresponding tests. We do not merge code without tests. This PR needs at minimum:
P1: Unnecessary pre-scan doubles I/O The new method does a full scan just to count rows before deleting. delete_from_manifest (line 991) does not do this -- it just runs the DeleteBuilder directly. The count is only used for a log::info message. This doubles the I/O cost for no functional benefit. Suggest removing the pre-scan and just executing the delete unconditionally (matching the existing delete_from_manifest pattern). P1: Code duplication -- extract a shared helper The delete pattern (get dataset -> DeleteBuilder -> set_latest -> run_inline_optimization) is now copy-pasted in three places: delete_from_manifest, delete_all_table_versions_for_table, and the batch delete method above. Consider extracting a private delete_by_filter helper and having all three call it. This would also eliminate the pre-scan issue above for free. P1: deregister_table not updated deregister_table (line ~2437) also removes a table from the manifest but does not call delete_all_table_versions_for_table. If a table is deregistered and the namespace is subsequently dropped, the same orphan problem will occur. Is this intentional? If so, please add a comment explaining why. |
…_namespace When table_version_storage_enabled is true, version records are stored in the __manifest table with object_type='table_version'. Previously, drop_table only deleted the table entry itself but left these version entries behind. This caused drop_namespace with Cascade behavior to fail because it counted these orphaned version entries as child objects of the namespace. This commit adds delete_all_table_versions_for_table() which removes all table_version entries matching the table's object_id prefix during drop_table, ensuring the namespace can be cleanly deleted afterward.
68d8d58 to
019175b
Compare
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
What
Clean up
table_versionentries from the__manifesttable duringdrop_tableto fix cascadedrop_namespacefailures.Why
When
table_version_storage_enabledistrue, version records are stored in the__manifesttable withobject_type='table_version'andobject_id='{table_object_id}${version}'. Previously,drop_tableonly deleted the table entry itself (matchingobject_idexactly) but left all associated version entries behind.This caused
drop_namespacewithCascadebehavior to fail because it counted these orphanedtable_versionentries as child objects of the namespace — the namespace appeared non-empty even after all tables had been dropped.How
1. Refactor: Extract
delete_by_filterhelperThe existing
delete_from_manifestmethod contained inline logic for executing a delete against the manifest dataset and running subsequent inline optimization. This logic was extracted into a shared private methoddelete_by_filter(&self, filter: &str, error_context: &str)so it can be reused.2. New method:
delete_all_table_versions_for_tableAdded
delete_all_table_versions_for_table(&self, table_object_id: &str)which constructs a filter:This matches all version entries for the given table using the
starts_withpredicate on theobject_idcolumn (since version object IDs are formatted as{table_object_id}${zero_padded_version}). The method delegates todelete_by_filterand is a no-op when no matching rows exist.3. Call cleanup in both
drop_tableandderegister_tabledrop_table: after deleting the table entry from the manifest,delete_all_table_versions_for_tableis called to remove all version records before deleting the physical data directory.deregister_table: same cleanup is performed to prevent orphaned version entries from blocking futuredrop_namespacecalls.4. Tests
Added two new parameterized tests (
#[case::with_optimization(true)]/#[case::without_optimization(false)]):test_drop_table_with_versions_then_drop_namespace: End-to-end scenario — creates a namespace, creates a table, inserts atable_versionentry, drops the table, then verifiesdrop_namespacesucceeds (previously it would fail).test_delete_all_table_versions_for_table: Unit-level verification thatdelete_all_table_versions_for_tableremoves all version entries without affecting the table entry itself.Testing
#[rstest]parameterization