fix: clean up table_version entries on drop_table to fix cascade drop_namespace #6290

Open
XuQianJin-Stars wants to merge 1 commit into lance-format:main from XuQianJin-Stars:bugfix/drop-namespace-cascade


@XuQianJin-Stars XuQianJin-Stars commented Mar 25, 2026

What

Clean up table_version entries from the __manifest table during drop_table to fix cascade drop_namespace failures.

Why

When table_version_storage_enabled is true, version records are stored in the __manifest table with object_type='table_version' and object_id='{table_object_id}${version}'. Previously, drop_table only deleted the table entry itself (matching object_id exactly) but left all associated version entries behind.

This caused drop_namespace with Cascade behavior to fail because it counted these orphaned table_version entries as child objects of the namespace — the namespace appeared non-empty even after all tables had been dropped.
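
The failure mode can be reproduced with a small in-memory model. This is only a sketch: the `ManifestRow` struct, the `ns1$t1` IDs, the zero-padding width, and the prefix-based `count_children` rule are assumptions made for illustration, not the actual `__manifest` schema or the real child-detection logic.

```rust
/// One row of a simplified, in-memory stand-in for the `__manifest` table.
#[derive(Debug, Clone, PartialEq)]
pub struct ManifestRow {
    pub object_type: String,
    pub object_id: String,
}

fn row(object_type: &str, object_id: &str) -> ManifestRow {
    ManifestRow {
        object_type: object_type.to_string(),
        object_id: object_id.to_string(),
    }
}

/// Builds a manifest holding one namespace, one table, and two version rows.
/// ASSUMPTION: IDs and the zero-padding width are hypothetical.
pub fn sample_manifest() -> Vec<ManifestRow> {
    vec![
        row("namespace", "ns1"),
        row("table", "ns1$t1"),
        row("table_version", "ns1$t1$00000000000000000001"),
        row("table_version", "ns1$t1$00000000000000000002"),
    ]
}

/// Pre-fix behavior: remove only the row whose object_id matches exactly.
pub fn drop_table_exact_match_only(rows: &mut Vec<ManifestRow>, table_object_id: &str) {
    rows.retain(|r| r.object_id != table_object_id);
}

/// ASSUMPTION: a namespace's children are the rows whose object_id
/// starts with "<namespace_id>$".
pub fn count_children(rows: &[ManifestRow], namespace_id: &str) -> usize {
    let prefix = format!("{}$", namespace_id);
    rows.iter()
        .filter(|r| r.object_id.starts_with(&prefix))
        .count()
}
```

In this model, calling `drop_table_exact_match_only(&mut rows, "ns1$t1")` removes the table row but leaves both `table_version` rows, so `count_children(&rows, "ns1")` stays above zero and the namespace still looks non-empty.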

How

1. Refactor: Extract delete_by_filter helper

The existing delete_from_manifest method contained inline logic for executing a delete against the manifest dataset and running subsequent inline optimization. This logic was extracted into a shared private method delete_by_filter(&self, filter: &str, error_context: &str) so it can be reused.
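
The shape of this refactor can be sketched against an in-memory stand-in. Everything here is illustrative: the `Manifest` struct, its `read_only` flag, the closure-based filter (the real helper takes a filter string), and the error messages are assumptions, and the optimization step is reduced to a counter.

```rust
/// Simplified in-memory stand-in for the manifest dataset (hypothetical).
#[derive(Debug, Default)]
pub struct Manifest {
    /// Rows as (object_type, object_id) pairs.
    pub rows: Vec<(String, String)>,
    pub inline_optimization_enabled: bool,
    /// Counts how often the follow-up optimization step ran.
    pub optimization_runs: usize,
    /// Hypothetical flag used only to demonstrate error context.
    pub read_only: bool,
}

impl Manifest {
    /// Shared delete path: remove every row matching `filter`, then run the
    /// optional optimization step. `error_context` prefixes any error.
    fn delete_by_filter<F>(&mut self, filter: F, error_context: &str) -> Result<(), String>
    where
        F: Fn(&str, &str) -> bool,
    {
        if self.read_only {
            return Err(format!("{}: manifest is read-only", error_context));
        }
        self.rows.retain(|(ty, id)| !filter(ty, id));
        if self.inline_optimization_enabled {
            self.optimization_runs += 1;
        }
        Ok(())
    }

    /// Exact-match delete of a single manifest entry.
    pub fn delete_from_manifest(&mut self, object_id: &str) -> Result<(), String> {
        self.delete_by_filter(|_, id| id == object_id, "Failed to delete manifest entry")
    }

    /// Prefix delete of all version rows that belong to one table.
    pub fn delete_all_table_versions_for_table(
        &mut self,
        table_object_id: &str,
    ) -> Result<(), String> {
        let prefix = format!("{}$", table_object_id);
        self.delete_by_filter(
            |ty, id| ty == "table_version" && id.starts_with(&prefix),
            "Failed to delete table versions",
        )
    }
}
```

Both public methods differ only in the predicate and the error context they pass, which is the property the extraction is meant to achieve.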

2. New method: delete_all_table_versions_for_table

Added delete_all_table_versions_for_table(&self, table_object_id: &str), which constructs a filter that matches all version entries for the given table using the starts_with predicate on the object_id column (version object IDs are formatted as {table_object_id}${zero_padded_version}). The method delegates to delete_by_filter and is a no-op when no matching rows exist.
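
The exact filter text is not shown in this description, so the following is a hypothetical reconstruction rather than the code in the PR. It assumes the filter combines object_type = 'table_version' with a starts_with check, that the prefix includes the trailing `$` separator, and that single quotes in the ID need escaping; the function names are invented for the sketch.

```rust
/// Escapes a value for use inside a single-quoted SQL string literal.
fn escape_sql_literal(value: &str) -> String {
    value.replace('\'', "''")
}

/// Builds a HYPOTHETICAL filter string for all version rows of one table.
/// The prefix ends with '$' so that table "t1" does not match "t10".
pub fn table_versions_filter(table_object_id: &str) -> String {
    let prefix = format!("{}$", table_object_id);
    format!(
        "object_type = 'table_version' AND starts_with(object_id, '{}')",
        escape_sql_literal(&prefix)
    )
}

/// Rust-side equivalent of the prefix check, useful for reasoning about
/// which object IDs the filter would select.
pub fn is_version_of(object_id: &str, table_object_id: &str) -> bool {
    object_id
        .strip_prefix(table_object_id)
        .map_or(false, |rest| rest.starts_with('$'))
}
```

Including the separator in the prefix matters: without it, dropping a table with ID `ns1$t1` would also select version rows of a sibling table `ns1$t10`.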

3. Call cleanup in both drop_table and deregister_table

  • In drop_table: after deleting the table entry from the manifest, delete_all_table_versions_for_table is called to remove all version records before deleting the physical data directory.
  • In deregister_table: same cleanup is performed to prevent orphaned version entries from blocking future drop_namespace calls.

4. Tests

Added two new parameterized tests (#[case::with_optimization(true)] / #[case::without_optimization(false)]):

  • test_drop_table_with_versions_then_drop_namespace: End-to-end scenario — creates a namespace, creates a table, inserts a table_version entry, drops the table, then verifies drop_namespace succeeds (previously it would fail).
  • test_delete_all_table_versions_for_table: Unit-level verification that delete_all_table_versions_for_table removes all version entries without affecting the table entry itself.

Testing

  • New unit tests cover the fix scenario
  • Both inline-optimization-enabled and disabled paths are tested via #[rstest] parameterization

@github-actions github-actions bot added the bug Something isn't working label Mar 25, 2026
@github-actions

PR Review: fix: clean up table_version entries on drop_table to fix cascade drop_namespace

The fix correctly identifies the root cause: orphaned table_version entries preventing drop_namespace from succeeding. A few issues to address:

P0: No tests

Per project policy: All bugfixes and features must have corresponding tests. We do not merge code without tests.

This PR needs at minimum:

  1. A test that creates a table with table_version_storage_enabled, drops the table, then successfully drops the namespace (the scenario described in the PR body).
  2. A unit test for delete_all_table_versions_for_table verifying version entries are removed and the table entry itself is not affected.

P1: Unnecessary pre-scan doubles I/O

The new method does a full scan just to count rows before deleting. delete_from_manifest (line 991) does not do this -- it just runs the DeleteBuilder directly. The count is only used for a log::info message. This doubles the I/O cost for no functional benefit. Suggest removing the pre-scan and just executing the delete unconditionally (matching the existing delete_from_manifest pattern).

P1: Code duplication -- extract a shared helper

The delete pattern (get dataset -> DeleteBuilder -> set_latest -> run_inline_optimization) is now copy-pasted in three places: delete_from_manifest, delete_all_table_versions_for_table, and the batch delete method above. Consider extracting a private delete_by_filter helper and having all three call it. This would also eliminate the pre-scan issue above for free.

P1: deregister_table not updated

deregister_table (line ~2437) also removes a table from the manifest but does not call delete_all_table_versions_for_table. If a table is deregistered and the namespace is subsequently dropped, the same orphan problem will occur. Is this intentional? If so, please add a comment explaining why.

…_namespace

When table_version_storage_enabled is true, version records are stored in the
__manifest table with object_type='table_version'. Previously, drop_table only
deleted the table entry itself but left these version entries behind. This caused
drop_namespace with Cascade behavior to fail because it counted these orphaned
version entries as child objects of the namespace.

This commit adds delete_all_table_versions_for_table() which removes all
table_version entries matching the table's object_id prefix during drop_table,
ensuring the namespace can be cleanly deleted afterward.
@XuQianJin-Stars XuQianJin-Stars force-pushed the bugfix/drop-namespace-cascade branch from 68d8d58 to 019175b Compare March 25, 2026 07:07
codecov bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 93.33333% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance-namespace-impls/src/dir/manifest.rs | 93.33% | 0 Missing and 1 partial ⚠️ |
