
Data Discovery Studio

Licensed according to this deed.

Published on


Data Discovery Studio indexes over 1.6 million geoscience resources from 40+ academic, government, and international scientific data repositories and catalogs. It works across geoscience domains, letting users find the data they need via text-based, spatio-temporal, and faceted search. The system relies on the CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability) metadata augmentation pipeline, which uses text analytics, a large geoscience ontology, and gazetteers to automatically improve resource metadata by adding keywords with ontology references, missing spatial and temporal extents, organization identifiers, and more. Besides search, Data Discovery Studio offers a working environment for researchers who want to contribute their own resource descriptions, edit existing metadata, and trace metadata provenance. It also serves as a gateway to additional computational resources: the metadata of any discovered resource can be passed to a collection of Jupyter notebooks residing on several JupyterHub servers for further data exploration, visualization, and modeling. Data Discovery Studio re-publishes all harvested resource metadata in standard formats, including ISO-19115 and markup, and makes them available via a standard API for inclusion in other systems. Data Discovery Studio has been funded through the NSF EarthCube initiative.
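To make the combination of text-based, spatio-temporal, and faceted search described above more concrete, here is a minimal in-memory sketch in Python. It does not use the Data Discovery Studio API or the CINERGI schema. The Resource record, the search function, the field names, and the sample catalog entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Optional, Set, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north) in degrees


@dataclass
class Resource:
    """Hypothetical metadata record; not the actual CINERGI schema."""
    title: str
    keywords: Set[str]
    bbox: BBox
    start: date
    end: date
    organization: str


def bbox_intersects(a: BBox, b: BBox) -> bool:
    # Two boxes overlap when they overlap on both the longitude and latitude axes.
    # Simplification: boxes crossing the antimeridian are not handled.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(
    resources: Iterable[Resource],
    text: Optional[str] = None,
    bbox: Optional[BBox] = None,
    period: Optional[Tuple[date, date]] = None,
    facets: Optional[Dict[str, str]] = None,
) -> List[Resource]:
    """Return resources matching every filter that was supplied."""
    hits = []
    for r in resources:
        if text is not None:
            needle = text.lower()
            in_title = needle in r.title.lower()
            in_keywords = needle in {k.lower() for k in r.keywords}
            if not (in_title or in_keywords):
                continue
        if bbox is not None and not bbox_intersects(r.bbox, bbox):
            continue
        # Temporal filter: the resource's time range must overlap the query period.
        if period is not None and not (r.start <= period[1] and period[0] <= r.end):
            continue
        # Facet filter: every requested field must equal the requested value.
        if facets and any(getattr(r, name) != value for name, value in facets.items()):
            continue
        hits.append(r)
    return hits


# Made-up sample records, for illustration only.
CATALOG = [
    Resource("Sierra Nevada snow depth", {"snow", "hydrology"},
             (-121.0, 36.0, -118.0, 40.0), date(2010, 1, 1), date(2015, 12, 31),
             "Example Hydrology Lab"),
    Resource("Pacific seafloor bathymetry", {"bathymetry", "oceanography"},
             (-160.0, 10.0, -130.0, 40.0), date(2001, 1, 1), date(2005, 12, 31),
             "Example Ocean Institute"),
    Resource("California stream gauge hydrology", {"hydrology", "streamflow"},
             (-124.0, 32.0, -114.0, 42.0), date(1990, 1, 1), date(2020, 12, 31),
             "Example Hydrology Lab"),
]

if __name__ == "__main__":
    results = search(
        CATALOG,
        text="hydrology",
        bbox=(-120.0, 37.0, -119.0, 38.0),
        period=(date(2012, 1, 1), date(2013, 1, 1)),
        facets={"organization": "Example Hydrology Lab"},
    )
    for r in results:
        print(r.title)
```

Each filter is optional and the filters are combined with a logical AND, which is one common way to narrow results step by step. A production catalog would normally delegate this to a search index rather than scanning records in memory.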



Java, JavaScript, Python

Email; @DataDiscStudio







Cinergi Organizational Source Code Repository:



Cite this work

Researchers should cite this work as follows:

  • (2024), "Data Discovery Studio,"
