Time-Travel Queries

Every AVU change made through MESA is recorded as a versioned snapshot in the DuckLake catalog. This means you can ask what your metadata, not just your data, looked like on a given date.
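To make the "as of" idea concrete, here is a toy model in plain Python. It is only a sketch of the lookup semantics and does not reflect how the DuckLake catalog stores or queries snapshots; the `history` list and the `value_as_of` helper are hypothetical names invented for illustration.

```python
from bisect import bisect_right
from datetime import datetime, timezone

# Hypothetical history of one AVU attribute on one path, sorted by time:
# (time the snapshot was committed, value at that snapshot).
# None marks a snapshot in which the attribute was removed.
history = [
    (datetime(2026, 8, 1, tzinfo=timezone.utc), "tundra"),
    (datetime(2026, 9, 15, tzinfo=timezone.utc), "taiga"),
    (datetime(2026, 10, 3, tzinfo=timezone.utc), None),
]


def value_as_of(history, when):
    """Return the value from the latest snapshot committed at or before `when`."""
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, when)
    if idx == 0:
        return None  # nothing had been recorded yet
    return history[idx - 1][1]


print(value_as_of(history, datetime(2026, 9, 1, tzinfo=timezone.utc)))
```

Asking for 2026-09-01 returns the value from the August snapshot, because the September change had not yet been committed at that point.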

Why time-travel?

  • Reproducibility. A paper submitted in 2027 may reference a dataset with annotations that evolve afterwards. Time-travel lets readers see the dataset exactly as it stood at submission.
  • Audit. Who tagged this dataset with sensitive metadata, when, and why? One query answers all three.
  • Debugging. When an analysis result changes unexpectedly, time-travel shows whether the underlying metadata changed too.

Example prompts

What was the biome AVU on /iplant/home/alice/study1/site42.csv on 2026-09-01?

Who removed the GO:0008152 tag from this collection, and when?

Show me every metadata change actor "bob" made in October 2026 across the entire study1 project.

Direct Python access

For SQL-level access, use the DuckLakeClient API:

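from datetime import datetime, timezone
from mesa_ducklake import DuckLakeClient

# `project` is assumed to be a MESA project object obtained elsewhere.
# The context manager closes the client's connections on exit.
with DuckLakeClient(postgres_dsn=..., irods_session=...) as client:
    # Fetch the AVUs attached to the path as they stood at the given instant.
    snapshot = client.get_avus_as_of(
        project_id=project.project_id,
        irods_path="/iplant/home/alice/study1/site42.csv",
        # Use a timezone-aware timestamp so the cutoff is unambiguous.
        timestamp=datetime(2026, 9, 1, tzinfo=timezone.utc),
    )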

See the upstream time-travel guide for more patterns: range queries, diff between two timestamps, attribution queries, and DuckDB SQL examples that join AVU history with the iRODS path hierarchy.
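As a rough illustration of the "diff between two timestamps" pattern, the sketch below compares two AVU snapshots in plain Python. The return shape of `get_avus_as_of` is not documented here, so the `AVU` tuple and the `diff_avus` helper are hypothetical; the sketch assumes each snapshot can be converted to (attribute, value, unit) triples.

```python
from typing import NamedTuple


class AVU(NamedTuple):
    """Hypothetical stand-in for one attribute-value-unit triple."""
    attribute: str
    value: str
    unit: str = ""


def diff_avus(before, after):
    """Return (added, removed) AVUs between two snapshots, each sorted."""
    before_set, after_set = set(before), set(after)
    added = sorted(after_set - before_set)
    removed = sorted(before_set - after_set)
    return added, removed


# Hypothetical snapshots of the same path at two points in time.
september = [AVU("biome", "tundra"), AVU("GO", "GO:0008152")]
october = [AVU("biome", "taiga")]

added, removed = diff_avus(september, october)
print("added:", added)
print("removed:", removed)
```

A changed value shows up as one removal plus one addition, since an AVU is identified by its full triple in this model.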

Limitations

  • Time-travel only sees AVUs that were written through MESA. Direct iRODS imeta add calls bypass the catalog unless you set up an iRODS rule that delegates to mesa-ducklake record (see the operator guide).
  • The catalog grows over time. Project administrators can prune snapshots older than a configurable retention period once they are no longer needed for reproducibility.
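The retention idea can be sketched as a simple cutoff calculation. This is not the MESA pruning mechanism or its configuration format; `snapshots_to_prune` is a hypothetical helper, and always keeping the newest snapshot is an assumption made for this example.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_prune(snapshot_times, retention, now):
    """Return snapshot timestamps older than the retention window.

    The newest snapshot is never selected, so the current state survives
    even if nothing has changed for longer than the retention period.
    """
    cutoff = now - retention
    ordered = sorted(snapshot_times)
    return [ts for ts in ordered[:-1] if ts < cutoff]


# Hypothetical snapshot commit times and a one-year retention window.
now = datetime(2027, 1, 1, tzinfo=timezone.utc)
times = [
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    datetime(2026, 6, 1, tzinfo=timezone.utc),
    datetime(2026, 12, 15, tzinfo=timezone.utc),
]

for ts in snapshots_to_prune(times, timedelta(days=365), now):
    print("would prune:", ts.isoformat())
```

Any timestamp earlier than the cutoff can no longer be queried after pruning, so the retention window should cover every date a publication or audit might need to reference.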