Accelerating discovery through every stage of the data lifecycle

The Discovery Environment for Relational Information and Versioned Assets (DERIVA) is a data-centric platform that treats every file, table row, and analysis result as a first-class, version-controlled asset from the moment it is generated in the lab to its final citation in a paper. By combining a model-driven relational catalog with a versioned object store and open APIs, DERIVA lets scientists curate evolving datasets, automate quality checks, and publish collections that stay FAIR throughout their lifecycle.

Continuous FAIR Data Lifecycle

Every asset — raw, intermediate, or published — is instantly Findable, Accessible, Interoperable, and Reusable, with schema evolution tracked from experiment design to publication.

Reproducible ML and Informatics Workflows

A Python library and a GPU-enabled JupyterHub pull curated datasets, track executions, capture configurations and results, and push them back with provenance, which makes them well suited to reproducible AI pipelines.
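As a rough illustration of what capturing a run's configuration and results with provenance can involve, here is a minimal sketch using only the Python standard library. It does not use the DERIVA Python library, and every name in it (`ExecutionRecord`, `run_with_provenance`, `config_digest`, the record fields) is hypothetical.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


def config_digest(config: dict[str, Any]) -> str:
    """Hash a config in canonical JSON form so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class ExecutionRecord:
    # Hypothetical record layout for illustration; not DERIVA's schema.
    workflow: str
    config: dict[str, Any]
    config_sha256: str
    input_ids: list[str]
    started: str
    finished: str = ""
    results: dict[str, Any] = field(default_factory=dict)


def run_with_provenance(
    workflow: str,
    config: dict[str, Any],
    input_ids: list[str],
    step: Callable[[dict[str, Any]], dict[str, Any]],
) -> ExecutionRecord:
    """Run one step and return a record of what ran, on what, and with which config."""
    record = ExecutionRecord(
        workflow=workflow,
        config=config,
        config_sha256=config_digest(config),
        input_ids=list(input_ids),
        started=datetime.now(timezone.utc).isoformat(),
    )
    record.results = step(config)
    record.finished = datetime.now(timezone.utc).isoformat()
    return record


if __name__ == "__main__":
    rec = run_with_provenance(
        "train-classifier",                      # assumed workflow name
        {"learning_rate": 0.01, "epochs": 3},    # assumed config
        ["dataset-A"],                           # assumed input identifier
        lambda cfg: {"epochs_run": cfg["epochs"]},
    )
    print(json.dumps(asdict(rec), indent=2))
```

Hashing the canonical form of the config means two runs with the same settings can be matched later even if the keys were written in a different order.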

Sophisticated Search

Faceted, full-text, and model-aware search lets scientists pinpoint records across billions of rows in milliseconds.

Data Visualizations

Built-in dashboards render interactive plots, heat maps, and dimension-reduced embeddings directly from live project data—no exporting required.

Self-Service Curation at Scale

Scientists can load and publish their own data while hub curators review; FaceBase used this model to surpass 1,000 datasets and 30 projects in two years.

Flexible Ingest & Metadata QC

Command-line and Python tools bulk-load files of any type, attach controlled-vocabulary metadata, and trigger automated QC dashboards for “self-curation” at scale.

Versioned, Provenance-Aware Storage

Hatrac object store + BagIt/BDBag packages + Minid persistent IDs guarantee fixity, trace every revision, and make dataset exchange reproducible and cache-friendly.
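To make the idea of fixity concrete, the sketch below builds a checksum manifest for a directory and later detects files that have changed or gone missing. It is a simplified illustration in the spirit of a BagIt SHA-256 manifest, written with the standard library only; it does not call Hatrac, BDBag, or Minid, and the helper names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root, POSIX style) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifest paths that are missing or whose checksum no longer matches."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "data").mkdir()
        sample = root / "data" / "sample.csv"   # assumed example file
        sample.write_text("id,value\n1,42\n")

        manifest = build_manifest(root)
        for rel, checksum in manifest.items():
            print(f"{checksum}  {rel}")

        print("changed before edit:", changed_files(root, manifest))
        sample.write_text("id,value\n1,43\n")
        print("changed after edit:", changed_files(root, manifest))
```

Because the manifest travels with the data, a recipient can rerun the check and confirm they hold exactly the bytes that were published.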

Fine-Grained Federated Access Control

Globus Auth and Groups let projects enforce reader/writer/curator roles, embargoes, and single sign-on with ORCID, Google, or campus IDs, which is critical for cross-institution work.
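How group-based roles and embargoes might combine can be sketched as follows. This is a toy model under stated assumptions, not DERIVA's policy engine, and it makes no Globus API calls: the group names, the `Role` ordering, and the rule that only writers and curators can read embargoed data are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from enum import IntEnum


class Role(IntEnum):
    # Ordered so that a higher value implies every lower permission.
    READER = 1
    WRITER = 2
    CURATOR = 3


@dataclass(frozen=True)
class Dataset:
    name: str
    embargo_until: date | None = None  # None means no embargo


def highest_role(user_groups: set[str], group_roles: dict[str, Role]) -> Role | None:
    """Resolve a user's group memberships to the strongest role they grant."""
    roles = [group_roles[g] for g in user_groups if g in group_roles]
    return max(roles) if roles else None


def can_read(dataset: Dataset, user_groups: set[str],
             group_roles: dict[str, Role], today: date) -> bool:
    role = highest_role(user_groups, group_roles)
    if role is None:
        return False
    embargoed = dataset.embargo_until is not None and today < dataset.embargo_until
    # Assumed policy: during an embargo, plain readers are locked out.
    return role >= Role.WRITER if embargoed else True


def can_write(user_groups: set[str], group_roles: dict[str, Role]) -> bool:
    role = highest_role(user_groups, group_roles)
    return role is not None and role >= Role.WRITER


if __name__ == "__main__":
    group_roles = {"proj-readers": Role.READER, "proj-curators": Role.CURATOR}
    scans = Dataset("scans", embargo_until=date(2030, 1, 1))
    print(can_read(scans, {"proj-readers"}, group_roles, date(2029, 6, 1)))
    print(can_read(scans, {"proj-curators"}, group_roles, date(2029, 6, 1)))
```

Keeping the role lookup separate from the embargo rule lets a project change one without touching the other.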

Model-Driven Interface (Chaise)

DERIVA introspects your entity-relationship (ER) model and auto-generates rich search, edit, and visualization pages. Define or tweak the model, and the generated web UI follows.
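The general idea of deriving a UI from a data model can be shown with a small sketch that turns column definitions into form-field descriptors. It is not how Chaise is implemented; the `Column` class, the widget names, and the type-to-widget mapping are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    type: str                       # e.g. "text", "int", "date", "boolean"
    nullable: bool = True
    references: str | None = None   # "table.column" when this is a foreign key


# Assumed mapping from column type to an input widget.
WIDGETS = {
    "text": "text-input",
    "int": "number-input",
    "date": "date-picker",
    "boolean": "checkbox",
}


def form_fields(columns: list[Column]) -> list[dict]:
    """Derive one form-field descriptor per column, with no hand-written UI code."""
    fields = []
    for col in columns:
        if col.references is not None:
            widget = "record-picker"   # foreign keys pick from the referenced table
        else:
            widget = WIDGETS.get(col.type, "text-input")
        fields.append({
            "label": col.name.replace("_", " ").title(),
            "widget": widget,
            "required": not col.nullable,
        })
    return fields


if __name__ == "__main__":
    specimen = [
        Column("sample_id", "int", nullable=False),
        Column("collected_on", "date"),
        Column("species", "text", references="species.id"),
    ]
    for spec in form_fields(specimen):
        print(spec)

    # Adding a column to the model changes the generated form without UI edits.
    specimen.append(Column("is_control", "boolean"))
    print(form_fields(specimen)[-1])
```

The form is a pure function of the model, so a schema change is the only step needed to change what users see.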

Loosely-Coupled, Brandable Architecture

Microservices with public APIs, style and theming hooks, and an adaptive UI let each consortium stand up a branded portal yet share the same battle-tested core.

Proven Multi-Domain Track Record

Deployed in neuroscience, craniofacial biology, ophthalmology ML, and more—demonstrating adaptability and accelerated discovery across disciplines.