Data Discovery Studio

Abstract

Data Discovery Studio indexes over 1.6 million geoscience resources from more than 40 academic, government, and international scientific data repositories and catalogs. It works across geoscience domains, letting users find the data they need through text-based, spatio-temporal, and faceted search. The system relies on the CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability) metadata augmentation pipeline, which uses text analytics, a large geoscience ontology, and gazetteers to improve resource metadata automatically, adding keywords with ontology references, missing spatial and temporal extents, organization identifiers, and more. Besides search, Data Discovery Studio offers a working environment for researchers who want to contribute their own resource descriptions, edit existing metadata, and trace metadata provenance. It also serves as a gateway to additional computational resources: the metadata of any discovered resource can be passed to a collection of Jupyter notebooks hosted on several JupyterHub servers for further data exploration, visualization, and modeling. Data Discovery Studio republishes all harvested resource metadata in standard formats, including ISO 19115 and schema.org markup, and makes it available through a standard API for inclusion in other systems. Data Discovery Studio has been funded through the NSF EarthCube initiative.
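
As a rough illustration of what an augmented record republished as schema.org markup might look like, the sketch below builds a schema.org Dataset description with keywords, a spatial extent, and a temporal extent. It is not taken from the Data Discovery Studio or CINERGI code base: the function name to_schema_org_dataset and all sample values are hypothetical, and only the schema.org property names (Dataset, keywords, spatialCoverage, GeoShape box, temporalCoverage) come from the public schema.org vocabulary.

```python
import json


def to_schema_org_dataset(name, description, keywords, bbox, start, end):
    """Build a schema.org Dataset record as a plain dict (hypothetical helper).

    bbox is (south, west, north, east) in decimal degrees;
    start and end are ISO 8601 date strings.
    """
    south, west, north, east = bbox
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "keywords": list(keywords),
        "spatialCoverage": {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                # schema.org box: lower corner then upper corner, "lat lon lat lon"
                "box": f"{south} {west} {north} {east}",
            },
        },
        # ISO 8601 interval: start/end
        "temporalCoverage": f"{start}/{end}",
    }


# All values below are made up for illustration.
record = to_schema_org_dataset(
    name="Example stream gauge observations",
    description="Illustrative record with added keywords and extents.",
    keywords=["hydrology", "stream discharge"],
    bbox=(32.5, -117.3, 33.1, -116.9),
    start="2010-01-01",
    end="2015-12-31",
)

print(json.dumps(record, indent=2))
```

A record of this shape can be embedded in a web page inside a script element of type application/ld+json, which is the usual way schema.org markup is exposed to crawlers.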

Integrations

Languages

Java, JavaScript, Python

Email

datadiscoverystudio@gmail.com; @DataDiscStudio

Software

CINERGI organization source code repository: https://github.com/cinergi

Users

cyoun

Cite this work

Researchers should cite this work as follows:

  • (2024), "Data Discovery Studio," https://sciencegateways.org/resources/datadiscoverystudio.
