The Discovery Environment for Relational Information and Versioned Assets (DERIVA) is a data-centric platform that treats every file, table row, and analysis result as a first-class, version-controlled asset from the moment it is generated in the lab to its final citation in a paper. By combining a model-driven relational catalog with a versioned object store and open APIs, DERIVA lets scientists curate evolving datasets, automate quality checks, and publish continuously FAIR collections.
Every asset — raw, intermediate, or published — is instantly Findable, Accessible, Interoperable, and Reusable, with schema evolution tracked from experiment design to publication.
A Python library and a GPU-enabled JupyterHub pull curated datasets, track executions, capture configurations and results, and push them back with provenance, which makes the pair well suited to reproducible AI pipelines.
Faceted, full-text, and model-aware search lets scientists pinpoint records across billions of rows in milliseconds.
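The faceted part of that search can be illustrated with a small in-memory sketch. It is not DERIVA's implementation or API; the record fields (`species`, `assay`) and the function names are hypothetical and chosen only to show the usual facet semantics: values within one facet are OR-ed, and separate facets are AND-ed.

```python
from collections import Counter
from typing import Dict, List, Set

Record = Dict[str, str]

# Hypothetical sample rows; a real catalog would hold these in relational tables.
SAMPLE: List[Record] = [
    {"species": "mouse", "assay": "RNA-seq"},
    {"species": "mouse", "assay": "imaging"},
    {"species": "human", "assay": "RNA-seq"},
    {"species": "zebrafish", "assay": "imaging"},
]


def apply_facets(records: List[Record], selections: Dict[str, Set[str]]) -> List[Record]:
    """Keep records that match every selected facet.

    A record matches a facet when its value is any one of the selected
    values (OR within a facet); it must match all facets (AND across facets).
    """
    return [
        rec
        for rec in records
        if all(rec.get(column) in allowed for column, allowed in selections.items())
    ]


def facet_counts(records: List[Record], column: str) -> Counter:
    """Count how many records carry each value of one facet column."""
    return Counter(rec[column] for rec in records if column in rec)


if __name__ == "__main__":
    hits = apply_facets(SAMPLE, {"species": {"mouse", "human"}, "assay": {"RNA-seq"}})
    print(len(hits), "matching records")
    print(dict(facet_counts(hits, "species")))
```

Recomputing the counts on the filtered set is what lets a facet panel show how many records remain under each value after a selection is made.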
Built-in dashboards render interactive plots, heat maps, and dimension-reduced embeddings directly from live project data—no exporting required.
Scientists can load and publish their own data while hub curators review; FaceBase used this model to surpass 1,000 datasets and 30 projects in two years.
Command-line and Python tools bulk-load files of any type, attach controlled-vocabulary metadata, and trigger automated QC dashboards for “self-curation” at scale.
Hatrac object store + BagIt/BDBag packages + Minid persistent IDs guarantee fixity, trace every revision, and make dataset exchange reproducible and cache-friendly.
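Fixity here means that a file's content can be checked against a recorded checksum. The sketch below shows the idea with the Python standard library only: it records a SHA-256 digest per payload file, as a BagIt manifest does, and later reports any file that is missing or has changed. It does not use the BDBag tooling or write BagIt-format files, and the function names are illustrative assumptions.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(payload_dir: Path) -> Dict[str, str]:
    """Map each file's POSIX-style relative path to its SHA-256 hex digest."""
    return {
        path.relative_to(payload_dir).as_posix(): sha256_of(path)
        for path in sorted(payload_dir.rglob("*"))
        if path.is_file()
    }


def verify_manifest(payload_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Return the relative paths that are missing or no longer match."""
    failures = []
    for relative_path, expected in manifest.items():
        target = payload_dir / relative_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(relative_path)
    return failures
```

A recipient who receives both the files and the manifest can call `verify_manifest` to confirm that nothing was altered or dropped in transit; an empty list means every file still matches its recorded digest.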
Globus Auth & Groups let projects enforce reader/writer/curator roles, embargoes, and single-sign-on with ORCID, Google, or campus IDs, all of which are critical for cross-institution work.
Define or tweak your entity-relationship (ER) model, and DERIVA introspects it to auto-generate rich search, edit, and visualization pages.
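The model-driven idea can be sketched in a few lines: a description of a table is walked column by column, and each column's type decides which input widget an edit form would show. This is a toy illustration under stated assumptions, not DERIVA's code; the `Column` and `Table` classes, the widget names, and the type-to-widget mapping are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Column:
    name: str
    type: str                         # e.g. "text", "int4", "date", "boolean"
    nullable: bool = True
    references: Optional[str] = None  # referenced table name if this is a foreign key


@dataclass
class Table:
    name: str
    columns: List[Column]


# Hypothetical mapping from column type to form widget.
WIDGETS: Dict[str, str] = {
    "text": "text-input",
    "int4": "number-input",
    "date": "date-picker",
    "boolean": "checkbox",
}


def form_fields(table: Table) -> List[Dict[str, object]]:
    """Derive one form-field description per column of the table."""
    fields = []
    for col in table.columns:
        # Foreign keys get a picker over the referenced table; unknown
        # types fall back to a plain text input.
        widget = "record-picker" if col.references else WIDGETS.get(col.type, "text-input")
        fields.append({
            "column": col.name,
            "label": col.name.replace("_", " ").title(),
            "widget": widget,
            "required": not col.nullable,
        })
    return fields


if __name__ == "__main__":
    specimen = Table("specimen", [
        Column("id", "int4", nullable=False),
        Column("collection_date", "date"),
        Column("species", "text", references="species_vocab"),
    ])
    for spec in form_fields(specimen):
        print(spec)
```

Because the form is derived from the model rather than written by hand, adding a column to the table definition adds a field to the form with no separate UI change.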
Micro-services with public APIs, style/theming hooks, and adaptive UI let each consortium stand up a branded portal yet share the same battle-tested core.
Deployed in neuroscience, craniofacial biology, ophthalmology ML, and more—demonstrating adaptability and accelerated discovery across disciplines.