
Metadata with Ontologies

MESA's metadata workflow is ontology-aware. Instead of free-text tags like "soil", MESA encourages you to use ontology terms like ENVO:00001998 ("soil"), which are stable, machine-readable, and interoperable across domains.

Why ontologies?

  • Disambiguation. "Bear" can mean a mammal, a stuffed toy, or "to carry." NCBITaxon:9632 (a bear taxon in the NCBI Taxonomy) is unambiguous.
  • Cross-domain integration. A genomicist's GO:0008152 ("metabolic process") and an environmental scientist's ENVO:00002259 ("soil environment") can be linked through shared ontology relationships.
  • Search and inference. Asking "which datasets are sampled from any type of forest?" works only if your tags are linked to a hierarchy (ENVO:00000111 and its descendants).
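
The "any type of forest" query in the last bullet amounts to a transitive closure over parent links. The following is a minimal sketch of that idea, assuming a toy in-memory hierarchy; the `EX:` identifiers and dataset names are hypothetical placeholders, not real ENVO terms, and this is not how MESA or OLS implements the lookup.

```python
from collections import deque

# Hypothetical child -> parent links (placeholders, not real ENVO IDs).
PARENTS = {
    "EX:forest": "EX:biome",
    "EX:temperate_forest": "EX:forest",
    "EX:tropical_forest": "EX:forest",
    "EX:cloud_forest": "EX:tropical_forest",
    "EX:desert": "EX:biome",
}

# Hypothetical dataset -> term tags.
DATASET_TAGS = {
    "plot_a": "EX:cloud_forest",
    "plot_b": "EX:desert",
    "plot_c": "EX:temperate_forest",
}


def descendants(root, parents):
    """Return every term that has `root` as a direct or indirect ancestor."""
    # Invert the child -> parent map so we can walk downwards.
    children = {}
    for child, parent in parents.items():
        children.setdefault(parent, []).append(child)

    found, queue = set(), deque([root])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


def datasets_under(root, parents, tags):
    """Datasets tagged with `root` itself or with any of its descendants."""
    wanted = descendants(root, parents) | {root}
    return sorted(name for name, term in tags.items() if term in wanted)
```

With free-text tags, `plot_a` would only match if someone had also typed "forest"; with a hierarchy, `datasets_under("EX:forest", PARENTS, DATASET_TAGS)` finds it through its ancestors.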

How mesa-mcp helps

The mesa_ols_* tool family is backed by the EMBL-EBI Ontology Lookup Service (OLS) and exposes:

| Tool | What it does |
|------|--------------|
| mesa_ols_search | Free-text search over OLS; returns candidate terms |
| mesa_ols_get_term | Fetch a term's metadata (label, definition, parents) |
| mesa_ols_get_descendants | Get all descendants; useful for "any kind of X" |
| mesa_ols_suggest_avu | Heuristic mapping from a free-text suggestion to a canonical term |

A typical agent prompt:

I want to tag these soil-microbiome files with the right ENVO biome. Help me pick.

The agent calls mesa_ols_search("biome") and presents the candidate terms. After you pick one (e.g. ENVO:00000446, "sandy desert"), it calls ds_add_avu to write the tag as {attribute: "ENVO:biome", value: "sandy desert", unit: "ENVO:00000446"}. The change is mirrored into the Lakehouse catalog automatically.
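
The search, pick, and write steps can be sketched as follows. This is an illustration only: `fake_ols_search` and the in-memory `store` are hypothetical stand-ins for mesa_ols_search and ds_add_avu, whose real signatures and return shapes are not described here, and the `EX:0000001` candidate is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    curie: str  # ontology identifier, e.g. "ENVO:00000446"
    label: str  # human-readable label


def fake_ols_search(query):
    """Hypothetical stand-in for mesa_ols_search: returns canned candidates."""
    # A real search would filter and rank by `query`; this stub ignores it.
    return [
        Candidate("ENVO:00000446", "sandy desert"),    # pair taken from the text
        Candidate("EX:0000001", "placeholder biome"),  # hypothetical placeholder
    ]


def to_avu(candidate, predicate="biome"):
    """Turn the chosen candidate into an attribute/value/unit triple."""
    prefix = candidate.curie.split(":", 1)[0]
    return {
        "attribute": f"{prefix}:{predicate}",
        "value": candidate.label,
        "unit": candidate.curie,
    }


def tag_files(paths, choice_index, store):
    """Search, take the user's pick, and record the AVU on every path."""
    candidates = fake_ols_search("biome")
    avu = to_avu(candidates[choice_index])
    for path in paths:
        # Stand-in for ds_add_avu: append to an in-memory dict.
        store.setdefault(path, []).append(avu)
    return avu
```

Keeping the choice as an explicit `choice_index` mirrors the prompt above: the agent proposes candidates, but the human makes the selection.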

Common ontologies

| Ontology | Use for |
|----------|---------|
| ENVO (Environment Ontology) | Biomes, habitats, environmental contexts |
| GO (Gene Ontology) | Biological processes, molecular functions, cellular components |
| ChEBI | Chemical compounds |
| NCBITaxon | Organism taxonomy |
| EDAM | Bioinformatics data formats, operations |
| PATO | Phenotypic qualities |
| DOID | Human diseases |
| UBERON | Cross-species anatomy |

OLS hosts hundreds more. Use mesa_ols_search to discover them.

AVU conventions

MESA standardizes on a simple AVU schema for ontology terms:

| Field | Convention | Example |
|-------|------------|---------|
| attribute | <ONTOLOGY>:<predicate> | ENVO:biome |
| value | Human-readable label | sandy desert |
| unit | Ontology IRI or CURIE | ENVO:00000446 |

This keeps the human-readable label searchable as free text while preserving the canonical identifier (IRI or CURIE) for machine reasoning.
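
A small checker can make the convention concrete. The sketch below is an assumption about how one might validate a triple against the three fields above; `check_avu` and the regular expressions are hypothetical and not part of MESA, and the patterns are deliberately loose.

```python
import re

# Loose patterns, chosen for illustration only.
ATTRIBUTE_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:[A-Za-z_][A-Za-z0-9_]*$")
CURIE_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:[A-Za-z0-9_]+$")


def check_avu(avu):
    """Return a list of convention problems; an empty list means the AVU passes."""
    problems = []

    # attribute: <ONTOLOGY>:<predicate>, e.g. ENVO:biome
    if not ATTRIBUTE_RE.match(avu.get("attribute", "")):
        problems.append("attribute should look like <ONTOLOGY>:<predicate>")

    # value: non-empty human-readable label
    if not avu.get("value", "").strip():
        problems.append("value should be a non-empty human-readable label")

    # unit: a CURIE such as ENVO:00000446, or an http(s) IRI
    unit = avu.get("unit", "")
    if not (CURIE_RE.match(unit) or unit.startswith(("http://", "https://"))):
        problems.append("unit should be an ontology IRI or CURIE")

    return problems
```

For the example row in the table, `check_avu` returns an empty list; a triple with the label mistakenly placed in `unit` is flagged.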

Browsing AVU history

After tagging, you can ask:

Show me how the biome tags on this dataset have changed over the last three months.

The agent calls mesa_ducklake_get_avu_history and renders a timeline. See Time-Travel Queries for more patterns.
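
Rendering a timeline comes down to collapsing a series of snapshots into change events. The sketch below assumes the history arrives as `(day, value)` pairs; that shape, the function name `change_timeline`, and the sample labels are hypothetical, since the actual output of mesa_ducklake_get_avu_history is not described here.

```python
from datetime import date


def change_timeline(history):
    """Collapse (day, value) snapshots into (day, old_value, new_value) changes."""
    events = []
    previous = None  # None means "no tag recorded yet"
    for day, value in sorted(history):
        if value != previous:
            events.append((day, previous, value))
            previous = value
    return events


# Hypothetical snapshots, deliberately out of order.
snapshots = [
    (date(2024, 3, 1), "label B"),
    (date(2024, 1, 1), "label A"),
    (date(2024, 2, 1), "label A"),
]

for day, old, new in change_timeline(snapshots):
    print(f"{day.isoformat()}: {old!r} -> {new!r}")
```

Unchanged snapshots (the February entry here) are dropped, so the timeline shows only the points where the tag actually changed.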