1) Repository interoperability challenges (Jones) (20 minutes)
technical: identifier practices, mutability, duplication, versioning and derived data variants, built infrastructure
socio-cultural: open source & open communities, "not invented here" (NIH) syndrome, tech leapfrogging, so many standards to choose from
DataONE crosswalk/integration experiences
2) Case studies in interoperability challenges
EDI / BCO-DMO (Gries) (10 minutes)
BCO-DMO / R2R / NCEI (Shepherd) (10 minutes)
Arctic Data Center / IARC / EDI / LTER (Jones) (10 minutes)
3) Brainstorming, Discussion and Q&A (Shepherd moderates) (40 minutes)
What are the easy interoperability wins?
What are the hard interoperability challenges?
What does it take to build an open community where:
Many repositories implement the same API, share identifier and versioning models, can replicate content without creating new identifiers, and can be searched from a common system like DataONE?
The Environment Ontology (ENVO) is a community ontology for the machine-readable representation of environmental entities. ENVO has been built following the best practices of the Open Biological and Biomedical Ontology Foundry and Library, and thus reuses and aligns with a suite of existing ontologies to express environmental entities such as geographic, astronomical, and anthropogenic features, as well as the processes they participate in. The ontology’s initial uses were in the life sciences, so it first focused on entities such as biomes and ecosystems. It has become a standard resource in the genomics and microbiome communities, and is steadily being adopted in other disciplines. Most recently, ENVO has seeded and interoperates with ontologies in the domains of agronomy, food science, and, in collaboration with UN Environment, the Sustainable Development Goals. It also provides semantic expression for a number of existing and emerging standard vocabularies, extending their functionality.
Pier Luigi will discuss typical usage scenarios for the Environment Ontology, including its recent deployment in the UNESCO/IOC-IODE Ocean Best Practice repository and an example of combining ENVO and Gene Ontology to mobilise data in environmental genomics.
Jupyter Notebooks have been developed by the HDF Group and many others to help scientists and other users understand how to use HDF to create and access datasets in many disciplines. HDF Lab is a tool for bringing these resources together with data in the cloud. The Lab will include sample datasets and notebooks that use them to demonstrate HDF capabilities at many levels. It will also be a place for sharing data examples and related notebooks from users in many disciplines. ESIP members will play an important role in building this resource and ensuring that it is a useful forum for sharing community expertise. Please join us on the ground floor to help make sure it works.