Request for Comments for Information Management Code Registry

Request for Comments (RFC) — 11/01/2017

Implementing an Information Management Code Registry

To: Ecological Information Managers

From: Environmental Data Initiative (EDI)

Background:

A code registry for information management (IM) in the environmental sciences would be a valuable community resource.  Substantial code has been written for cleaning, manipulating, formatting, documenting, and archiving environmental datasets, but this work has most often been done in isolation and tailored to a single project, which duplicates effort and produces tools that do not generalize.  A code registry would help developers share their work and encourage the creation of robust, generalized, and shareable software tools.  Access to these resources would improve the efficiency of information processing for projects and organizations across the environmental sciences.

Recommendations:

  1. An online registry of data management code should be created.  The registry should link to the code wherever it is stored, whether on GitHub or elsewhere.
  2. Code included in the registry may range from code snippets (e.g., an R function that queries the ORCID API for the IDs of a list of scientists) to software with multiple functions (e.g., GCE LTER Matlab Toolbox).
  3. The IM Code Registry should be implemented using existing registry software to avoid duplicating efforts.
  4. Code should be useful for conducting tasks related to processing environmental data, not including ‘omics’ data, for which other code registries exist.
  5. Discoverability and usability of code should be ensured through appropriate documentation in the registry.
  6. A committee should be established to answer IM community questions about submitting code and to ensure that code submissions are appropriate for the IM Code Registry.

Implementation:

  1. Establish the Information Management Code Registry as a portal using the OntoSoft architecture and website (http://www.ontosoft.org/portals).
  2. Code contributors will document their code using the OntoSoft ontology, an ontology for scientific software metadata (http://ontosoft-earthcube.github.io/ontosoft/ontosoft%20ontology/v1.0.1/doc/).
  3. Extend the keywords used in OntoSoft to include terms related to IM tasks (e.g., quality assurance).
  4. Collaborate with the OntoSoft project to develop a crosswalk to, or implementation of, the emerging CodeMeta standard (https://github.com/codemeta/codemeta).
  5. Sponsor a hackathon to kick off the population of the IM Code Registry.
  6. Establish a Google Group to build an IM Code Registry community.
  7. Advertise the IM Code Registry through ESIP and other venues.
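
To illustrate the kind of metadata a contributor might supply, here is a minimal sketch of a CodeMeta-style record built in Python.  It is not the registry's actual submission format.  It assumes the CodeMeta 2.0 JSON-LD context and a handful of its terms (name, description, codeRepository, programmingLanguage, keywords).  The helper function, the example tool name, and the repository URL are hypothetical and serve only as placeholders.

```python
import json


def make_codemeta_record(name, description, repository_url, language, keywords):
    """Build a minimal CodeMeta-style JSON-LD record as a plain dict.

    Hypothetical helper for illustration; not part of OntoSoft or CodeMeta.
    """
    return {
        # Assumed context: the published CodeMeta 2.0 JSON-LD context.
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "description": description,
        "codeRepository": repository_url,
        "programmingLanguage": language,
        "keywords": list(keywords),
    }


# Hypothetical entry modeled on the ORCID example in the Recommendations.
record = make_codemeta_record(
    name="orcid-lookup",  # placeholder name
    description="R function that queries the ORCID API for the IDs of a list of scientists.",
    repository_url="https://example.org/placeholder/orcid-lookup",  # placeholder URL
    language="R",
    keywords=["information management", "quality assurance"],
)

# Serialize to JSON text, the form in which such a record is usually stored.
record_json = json.dumps(record, indent=2)
print(record_json)
```

A record like this could sit alongside the code in its repository, with IM-specific keywords (such as "quality assurance") drawn from whatever extended vocabulary the registry adopts.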

RFC Process:

Please send comments to Kristin Vanderbilt (krvander@fiu.edu) and Colin Smith (colin.smith@wisc.edu) before Nov. 15, 2017.