MESA — Metadata-Enabled Scientific Agents

An open-source Data Lakehouse, federated Data Mesh, and Agentic AI orchestration layer on leveraged NSF cyberinfrastructure.

MESA is a two-year NSF IDSS prototype led by the University of Arizona, with a subaward to UNC Chapel Hill / RENCI. The platform helps researchers across NSF directorates discover, integrate, and reason over heterogeneous scientific data using AI agents grounded in metadata.

Get started

  •   Early adopters

    ESIIL, NCEMS, AIIRA and other NSF science teams onboarding to the prototype.

    Onboarding guide →

  •   Get access

    Authentication via CyVerse / CILogon / ORCID, plus ACCESS-CI for Jetstream-2 GPUs.

    Access & allocations →

  •   Run your first analysis

    Connect an MCP-aware AI client to MESA and run an end-to-end discovery workflow.

    Quick start →

  •   Get help

    Slack, GitHub issues, and the project mailing list for early-adopter support.

    Contact →

What MESA delivers

MESA validates three innovations at prototype scale:

  •   Goal 1 — Data Lakehouse

    DuckLake + Apache Iceberg + Parquet with time-travel versioning and sub-second analytical queries on Jetstream-2 / CyVerse / UA HPC.

    Lakehouse component →

  •   Goal 2 — Federated Data Mesh

    Policy-driven iRODS Data Mesh with CILogon / Globus / ORCID federation, led by RENCI under Co-PI Russell.

    Data Mesh component →

  •   Goal 3 — Agentic AI Orchestration

    vLLM serving on Jetstream-2 GPUs, retrieval-augmented generation (RAG), Model Context Protocol (MCP) servers, and sandbox security.

    Agentic AI component →
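
For readers unfamiliar with the time-travel versioning named under Goal 1, the idea can be sketched in a few lines. This is a conceptual toy in plain Python, not MESA code and not the DuckLake or Apache Iceberg API: the `SnapshotTable` class, its methods, and the `(sample_id, value)` row schema are all hypothetical names chosen for illustration. It assumes only that each commit produces an immutable snapshot that later reads can address by id.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical row schema for illustration: (sample_id, value).
Row = Tuple[str, float]


@dataclass
class SnapshotTable:
    """Toy append-only table in which every commit records an immutable snapshot."""

    _snapshots: List[Tuple[Row, ...]] = field(default_factory=list)

    def commit(self, rows: List[Row]) -> int:
        """Append rows to the latest state and return the new snapshot id."""
        previous = self._snapshots[-1] if self._snapshots else ()
        self._snapshots.append(previous + tuple(rows))
        return len(self._snapshots) - 1

    def read(self, as_of: Optional[int] = None) -> List[Row]:
        """Return the rows at snapshot `as_of`, or at the latest snapshot if None."""
        if not self._snapshots:
            return []
        index = len(self._snapshots) - 1 if as_of is None else as_of
        if not 0 <= index < len(self._snapshots):
            raise IndexError(f"unknown snapshot id: {index}")
        return list(self._snapshots[index])


table = SnapshotTable()
v0 = table.commit([("s1", 0.5)])   # first snapshot
v1 = table.commit([("s2", 0.9)])   # second snapshot builds on the first

current_rows = table.read()            # latest state: both rows
earlier_rows = table.read(as_of=v0)    # "time travel" back to the first commit
```

Because earlier snapshots are never mutated, an analysis pinned to `v0` keeps seeing the same rows after later commits, which is the reproducibility property time travel is meant to provide. Production table formats layer metadata files, timestamps, and snapshot expiry on top of this basic idea.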

Science use cases

  • UC1 — Agentic Discovery + Multi-Omics

    AI-driven discovery across genomic, transcriptomic, proteomic, and metabolomic data.

    Read more →

  • UC2 — Geospatial + EHT

    Geospatial analytics plus a sub-PB Event Horizon Telescope pipeline demonstration.

    Read more →

  • UC3 — IoT / Edge

    Edge-computing pipeline for precision agriculture sensor networks.

    Read more →

Project at a glance

Program: NSF IDSS (Innovations in Data-Intensive Science and Systems)
Period: June 1, 2026 – May 31, 2028 (24 months)
Lead: University of Arizona (PI: Tyson L. Swetnam)
Subaward: UNC Chapel Hill / RENCI (Co-PI: T. Russell)
Partners: ESIIL · NCEMS · AIIRA · ACCESS-CI / Jetstream-2 · CyVerse · TACC · OSN

Documentation status

This site is under active development through Phase 1 (Months 1–12). Sections marked Draft are placeholders being filled in by the team — expect rapid iteration.