Mission and Scope
MESA delivers an open-source data-systems prototype that lets researchers across NSF directorates discover, integrate, and reason over heterogeneous scientific data using AI agents grounded in metadata.
The problem
Scientists currently spend up to 80% of their time managing data rather than doing science. Despite massive investments in research computing, most researchers cannot easily apply modern AI tools to their data, and different research communities cannot share data or methods effectively.
The MESA prototype
MESA validates three innovations at prototype scale on leveraged NSF cyberinfrastructure:
- AI-powered metadata generation over a DuckLake / Apache Iceberg Lakehouse (Goal 1).
- Cross-domain data integration via a policy-driven iRODS Data Mesh (Goal 2, led by RENCI).
- Agentic AI orchestration with self-hosted vLLM models and Model Context Protocol (MCP) servers (Goal 3).
The two-year project produces the artifacts, science cases, user community, and empirical hardware specifications required for a follow-on Category I or II NSF operations proposal.
In scope
- Prototype implementation of all three innovation goals on leveraged Jetstream-2, CyVerse, and UA HPC cost-recovery infrastructure.
- Three science use cases: UC1 (Agentic Discovery + Multi-Omics), UC2 (Geospatial + sub-PB EHT pipeline demonstration), UC3 (IoT / Edge prototype).
- Early-adopter onboarding for ESIIL, NCEMS, and AIIRA, with targets of 100–200 users in Year 1 and 300–500 in Year 2.
- Eight community workshops, an Educator Fellows program, and reproducible tutorials.
- Empirical hardware-spec derivation, sustainability model, and a submission-ready follow-on operations-proposal package by Month 24.
Out of scope / deferred to follow-on
- All capital equipment: NVIDIA H200 GPU servers, database servers, OSN Gen-2 storage pods, and network switching.
- Network and rack infrastructure.
- 24×7 production operations: continuous monitoring, NOC, security operations, incident response on-call, disaster-recovery testing.
- Production-scale multi-zone iRODS deployment and TACC Corral mirroring at full scale.
- Production-scale 100 PB EHT pipeline (replaced by a sub-PB prototype demonstration of the same architecture).
- Workforce expansion to 1,500–2,500 users.
Acceptance criteria
The project is complete when:
- The Phase 1 prototype (Goals 1–3) is operational on leveraged infrastructure, with reproducible IaC templates published under a permissive OSS license (Month 12).
- UC1, UC2, and UC3 deliver published, reproducible workflows with DOIs (Month 18).
- All Year-2 quantitative performance objectives are met or exceeded.
- The Phase 2 (continued, Months 18–24) deliverables are complete: an empirical hardware spec, a sustainability plan, ≥8 letters of support, and a submission-ready follow-on Category I/II operations-proposal package (Month 24).
- All open-source software repositories, documentation, training materials, and DOI-registered artifacts are hosted on GitHub / CyVerse and remain accessible after award close.
Assumptions and constraints
- ACCESS-CI Jetstream-2 GPU allocation remains available for 24 months (secured in writing).
- CyVerse production iRODS, Data Store (5.2 PB), and cost-recovery infrastructure remain under existing institutional support.
- No capital equipment may be procured under this award; any hardware need is filled by leveraged resources or a modest commercial-cloud overflow buffer.
- Submission of the follow-on operations proposal is contingent on NSF program receptivity in late Year 2 and is not a deliverable of this award; the prototype, the science cases, and the submission-ready proposal package stand alone as deliverables.