About WOBD

Research infrastructure for queryable, AI-ready biomedical data.

The Web of Biological Data federates harmonized biomedical dataset metadata with Proto-OKN knowledge graphs so researchers can move from a question to mechanisms, diseases, exposures, genes, and supporting datasets in one reproducible workflow.

WOBD is supported by the U.S. National Science Foundation under award #2535091 and is part of the Proto-OKN federation.

The value proposition

Biomedical data resources are often funded, curated, and queried separately. WOBD makes those investments work together by giving datasets and knowledge graphs a shared query plane that both humans and AI assistants can use.

Why continued support compounds

WOBD is not a single-purpose application. Each added graph, repository, identifier bridge, and metadata extension expands the set of questions that can be asked across the entire federation, without anyone having to build a bespoke integration for it.

Growth plan

The next stage of WOBD should focus on durable infrastructure: more sources, richer metadata, stronger cross-graph identity, and evaluation that keeps AI-assisted discovery inspectable.

Broader metadata ingestion

Extend beyond the current NDE-centered layer into additional domain-specific and generalist repositories, especially clinical, environmental, and multi-omic resources that users otherwise search separately.

Richer dataset descriptions

Capture sample-level annotation, study provenance, assay context, contrasts, and analysis-ready relationships so WOBD can return more actionable answers without sending users back to each source portal.

Cross-graph identifiers and evaluation

Strengthen mappings across genes, diseases, chemicals, organisms, datasets, and publications, then benchmark recurring workflows so AI-mediated answers remain auditable and reproducible as the federation grows.

What support unlocks

  • More repositories become discoverable through one query plane.
  • More scientific questions become reusable workflows instead of one-off integrations.
  • More AI-assistant answers carry graph provenance, query traces, and source records.
  • More program investments become interoperable with the broader Proto-OKN ecosystem.

Current user-facing surfaces

Guided query templates

Researchers fill in terms for dataset discovery, drug-related datasets, and gene-expression questions. WOBD generates validated graph queries and returns table or dataset-card results with query traces.
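
As a rough illustration of the template idea described above, the sketch below fills a SPARQL template from a user-supplied term and rejects input that could alter the query structure. It is not WOBD's actual implementation: the template text, the schema.org predicates, the function name build_dataset_query, and the validation rules are all assumptions made for the example.

```python
import re
from string import Template

# Hypothetical template: the prefix, predicates, and variable names are
# illustrative assumptions, not WOBD's actual graph schema.
DATASET_TEMPLATE = Template("""\
PREFIX schema: <http://schema.org/>
SELECT ?dataset ?name WHERE {
  ?dataset a schema:Dataset ;
           schema:name ?name .
  FILTER(CONTAINS(LCASE(?name), "$term"))
}
LIMIT $limit
""")

# Allow only letters, digits, spaces, and hyphens so a term cannot close the
# string literal or inject extra SPARQL syntax.
_SAFE_TERM = re.compile(r"[a-z0-9 \-]{1,64}")


def build_dataset_query(term: str, limit: int = 25) -> str:
    """Return a SPARQL query string for a validated search term."""
    term = term.strip().lower()
    if not _SAFE_TERM.fullmatch(term):
        raise ValueError(f"unsupported search term: {term!r}")
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    return DATASET_TEMPLATE.substitute(term=term, limit=limit)
```

Calling build_dataset_query("Influenza") yields a query whose filter matches dataset names containing "influenza". Validating the term before substitution is one simple way to keep generated queries predictable; a production system might instead bind parameters through its SPARQL client.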

Unified MCP server

AI assistants can discover relevant graphs, inspect schemas, bridge identifiers, run SPARQL, and synthesize answers across the wider Proto-OKN federation from one conversation.

Team

Scripps Research

  • Trish Whetzel
  • Ben Good
  • Andrew Su
  • Ginger Tsueng

RENCI

  • Chris Bizon
  • Jim Balhoff
  • Yaphet Kebede

UCSD / UCSF

  • Peter Rose

Technical foundation

The primary structured dataset metadata layer comes from the NIAID Data Ecosystem Discovery Portal (NDE), which harmonizes dataset records from domain-specific and generalist repositories. Metadata harvested from that pipeline is published as the NDE graph and loaded alongside other graphs in the federation.

WOBD also uses data from the EMBL-EBI Gene Expression Atlas (GXA), emitting study metadata, contrasts, genes, and pathway enrichment as linked data so expression evidence can be queried with dataset metadata and other knowledge graphs.

Knowledge graphs are listed in the OKN Registry, and the OKN Fabric exposes those graphs through a SPARQL federation so they can be queried individually or together.
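
To make the federation idea concrete, the sketch below assembles a SPARQL 1.1 query that joins two graphs through SERVICE clauses. It only builds the query text. The endpoint URLs, the example.org predicate, and the helper names service_block and federated_select are placeholders invented for the example; real endpoints would come from the OKN Registry, and the real graph schemas may differ.

```python
from textwrap import indent


def service_block(endpoint: str, pattern: str) -> str:
    """Wrap a graph pattern in a SPARQL 1.1 SERVICE clause for one endpoint."""
    return f"SERVICE <{endpoint}> {{\n{indent(pattern.strip(), '  ')}\n}}"


def federated_select(variables: list[str], blocks: list[str], limit: int = 10) -> str:
    """Combine SERVICE blocks into a single SELECT query."""
    head = "SELECT " + " ".join(f"?{v}" for v in variables)
    body = "\n".join(indent(block, "  ") for block in blocks)
    return f"{head} WHERE {{\n{body}\n}}\nLIMIT {limit}"


# Placeholder endpoints and predicates (assumptions, not real WOBD URLs).
NDE_ENDPOINT = "https://example.org/nde/sparql"
GXA_ENDPOINT = "https://example.org/gxa/sparql"

EXAMPLE_QUERY = federated_select(
    ["dataset", "name", "study"],
    [
        service_block(NDE_ENDPOINT, "?dataset <http://schema.org/name> ?name ."),
        service_block(GXA_ENDPOINT, "?study <http://example.org/vocab/aboutDataset> ?dataset ."),
    ],
)
```

Because both SERVICE blocks share the ?dataset variable, a federation engine would join dataset metadata from one graph with expression studies from the other. Dropping either block gives a query against a single graph.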

Discuss WOBD growth or collaboration

For programmatic questions, collaboration opportunities, or support discussions, contact the WOBD team.
