Time-Travel Queries
Every AVU change made through MESA is recorded as a versioned snapshot in the DuckLake catalog. This means you can ask "what did this look like on date X?" about your metadata, not just your data.
Why time-travel?
- Reproducibility. A paper submitted in 2027 may reference a dataset with annotations that evolve afterwards. Time-travel lets readers see the dataset exactly as it stood at submission.
- Audit. Who tagged this dataset with sensitive metadata, when, and why? Time-travel answers this in one query.
- Debugging. When an analysis result changes unexpectedly, time-travel shows whether the underlying metadata changed too.
Example prompts
What was the biome AVU on /iplant/home/alice/study1/site42.csv on 2026-09-01?
Who removed the GO:0008152 tag from this collection, and when?
Show me every metadata change actor "bob" made in October 2026 across the entire study1 project.
Direct Python access
For SQL-level access, use the DuckLakeClient API:
from datetime import datetime, timezone
from mesa_ducklake import DuckLakeClient

with DuckLakeClient(postgres_dsn=..., irods_session=...) as client:
    snapshot = client.get_avus_as_of(
        project_id=project.project_id,
        irods_path="/iplant/home/alice/study1/site42.csv",
        timestamp=datetime(2026, 9, 1, tzinfo=timezone.utc),
    )
See the upstream time-travel guide for more patterns: range queries, diff between two timestamps, attribution queries, and DuckDB SQL examples that join AVU history with the iRODS path hierarchy.
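One of the patterns mentioned above, a diff between two timestamps, can be sketched in plain Python. This is a minimal illustration, not the upstream implementation: it assumes you have already converted the result of two get_avus_as_of calls into (attribute, value, unit) triples. The AVU container, the diff_avus helper, and the sample values are all hypothetical, and the actual return type of get_avus_as_of may differ.

```python
from typing import Iterable, List, NamedTuple, Tuple


class AVU(NamedTuple):
    """One attribute-value-unit triple (hypothetical container, not a MESA type)."""
    attribute: str
    value: str
    unit: str = ""


def diff_avus(before: Iterable[AVU], after: Iterable[AVU]) -> Tuple[List[AVU], List[AVU]]:
    """Return (added, removed) AVUs between two snapshots of the same path."""
    before_set, after_set = set(before), set(after)
    added = sorted(after_set - before_set)      # present only in the later snapshot
    removed = sorted(before_set - after_set)    # present only in the earlier snapshot
    return added, removed


# Illustrative values only. In practice each list would be built from one
# get_avus_as_of(...) call at a different timestamp (conversion is assumed).
earlier = [AVU("biome", "temperate forest"), AVU("tag", "GO:0008152")]
later = [AVU("biome", "boreal forest")]

added, removed = diff_avus(earlier, later)
print("added:", added)
print("removed:", removed)
```

A changed value shows up as one removal plus one addition, because AVUs are compared as whole triples. Attribution (who made the change) is not covered by this sketch; for that, see the upstream guide.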
Limitations
- Time-travel only sees AVUs that were written through MESA. Direct iRODS imeta add calls bypass the catalog (unless you set up an iRODS rule that delegates to mesa-ducklake record; see the operator guide).
- The catalog grows over time. Project administrators can prune very old snapshots when they're no longer needed for reproducibility (configurable retention).