A first step toward harmonizing meteorological and hydrological data in the EDI data repository was the workshop “Next generation climate/hydrological data products” (March 12-14, 2019) at the University of New Mexico in Albuquerque, NM. The workshop was jointly organized by EDI, LTER and the Forest Service with the goal of developing a strategy for harmonizing weather, climate and hydrological data that currently reside in the EDI data repository and in ClimDB/HydroDB, a centralized server that provides open access to long-term meteorological and streamflow records from a collection of research sites.
The need arose because the ClimDB/HydroDB software is aging and has become too difficult to maintain, and because the available data are limited in terms of parameters and time resolution. The group decided to archive all data currently residing in ClimDB/HydroDB in the EDI data repository, to convert (harmonize) the data into a common data model, and to carry the ClimDB/HydroDB functionality into the future with other software products that support visualization, filtering and analysis of the data packages.
The harmonization will use a framework developed by EDI that is already applied successfully to designing data packages for community survey data: ecocomDP. Figure 1 shows a schematic of the concept. Archived raw data (level 0, L0) are converted to a common harmonized data model (level 1, L1). The L1 data allow for straightforward data discovery and conversion into derived data products (level 2, L2) in support of synthesis and other cross-site studies.
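The L0 to L1 to L2 progression can be illustrated with a minimal Python sketch. It is not the ecocomDP implementation or the data model chosen at the workshop; the column names, the `COLUMN_MAP` lookup, the record layout and the sample values are all hypothetical and serve only to show the idea of reshaping a site-specific table into a common long format and then deriving a product from it.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical L0 table: one row per day, one column per variable,
# with site-specific column names. The values are made up.
L0_CSV = """date,site,airtemp_c,precip_mm
2019-03-12,SEV,10.0,0.0
2019-03-13,SEV,14.0,2.5
2019-04-01,SEV,18.0,1.5
"""

# Hypothetical mapping from site-specific column names to
# harmonized variable names and units (an assumption, not a standard).
COLUMN_MAP = {
    "airtemp_c": ("air_temperature", "degC"),
    "precip_mm": ("precipitation", "mm"),
}


def l0_to_l1(l0_text):
    """Reshape a wide L0 table into long L1 records, one per observation."""
    records = []
    for row in csv.DictReader(io.StringIO(l0_text)):
        for column, (variable, unit) in COLUMN_MAP.items():
            records.append({
                "site": row["site"],
                "datetime": row["date"],
                "variable": variable,
                "value": float(row[column]),
                "unit": unit,
            })
    return records


def l1_to_l2_monthly_mean(records, variable):
    """Derive an L2 product: the monthly mean of one variable per site."""
    groups = defaultdict(list)
    for rec in records:
        if rec["variable"] == variable:
            month = rec["datetime"][:7]  # "YYYY-MM"
            groups[(rec["site"], month)].append(rec["value"])
    return {key: mean(values) for key, values in groups.items()}


l1 = l0_to_l1(L0_CSV)
l2 = l1_to_l2_monthly_mean(l1, "air_temperature")
```

Because every L1 record carries its own variable name and unit, the same derivation function works for any site whose data have been mapped, which is the point of harmonizing before building derived products.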
Several data models commonly used in the research community for harmonizing meteorological and hydrological data were reviewed and discussed. The group suggested evaluating the ODM data model for time series data as the L1 data model. The ODM was developed and is widely used by the Consortium of Universities for the Advancement of Hydrologic Science (CUAHSI). CUAHSI offers a data platform with a workspace that provides tools for visualization and analysis and might cover some of the ClimDB/HydroDB plotting functionality. The group's intent is to develop other software products for visualization through online hackathons.
A first draft of a workflow was designed for converting all ClimDB/HydroDB products (L0), as well as meteorological and hydrological data in the EDI repository (raw L0), and archiving them in the EDI data repository as L1 data packages. If the ODM data model is adopted, the data packages will also be available in CUAHSI, hopefully with functionality comparable to ClimDB/HydroDB (see figure 2 for the conceptual workflow).
Wade Sheldon demonstrated how the GCE Data Toolbox can be applied to convert L0 data packages to the L1 data model. Margaret O’Brien led the discussion on semantic mappings between important terms in the different vocabularies used for archiving meteorological and hydrological data (ClimDB/HydroDB, LTER, AMS, ENVO, CF, EnvThes, ODM, NCEI), to pick a candidate vocabulary for the L1 data model (initially the CUAHSI ODM vocabulary). Vocabularies are important for defining suitable keywords at the data package level, thereby enhancing data discoverability in the EDI repository and via Google’s data search.
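A semantic mapping of this kind can be thought of as a crosswalk table from each source vocabulary to the candidate L1 vocabulary. The Python sketch below is only an illustration of that idea: the vocabulary labels `vocab_a` and `vocab_b`, the terms, and the helper functions are hypothetical placeholders and have not been checked against the published vocabularies or the mappings discussed at the workshop.

```python
# Hypothetical crosswalk: source vocabulary -> {source term: candidate L1 term}.
# All labels and terms are illustrative placeholders, not verified entries.
CROSSWALK = {
    "vocab_a": {"AIRTEMP": "Temperature", "PRECIP": "Precipitation"},
    "vocab_b": {
        "air_temperature": "Temperature",
        "precipitation_amount": "Precipitation",
    },
}


def to_l1_term(vocabulary, term):
    """Return the candidate L1 term for a source term, or None if unmapped."""
    return CROSSWALK.get(vocabulary, {}).get(term)


def unmapped_terms(vocabulary, terms):
    """List source terms that still lack a mapping and need manual review."""
    return [t for t in terms if to_l1_term(vocabulary, t) is None]


def keywords_for_package(vocabulary, terms):
    """Collect the distinct mapped L1 terms to use as data package keywords."""
    mapped = {to_l1_term(vocabulary, t) for t in terms}
    mapped.discard(None)  # ignore terms without a mapping
    return sorted(mapped)


source_terms = ["AIRTEMP", "PRECIP", "WINDSP"]
keywords = keywords_for_package("vocab_a", source_terms)
needs_review = unmapped_terms("vocab_a", source_terms)
```

Keeping unmapped terms in a separate list, rather than dropping them silently, makes the gaps in a candidate vocabulary visible, which is useful when comparing vocabularies against each other.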
- Strategic workshop on “Next generation climate/hydrological data products” (organized by EDI, LTER, Forest Service) for developing a strategy on harmonizing weather, climate and hydrological data in the EDI data repository and ClimDB/HydroDB (University of New Mexico, Albuquerque, NM, March 2019).
- The results of the workshop were presented at an LTER Information Manager Water Cooler, April 9, 2019.
- EDI webinar on June 18, 2019 “CUAHSI Tools for Data Management”, speaker: Martin Seul, Technical Director CUAHSI.
- Working Session at ESIP Summer Meeting (Tacoma, WA, July 2019): “Preparing climate and hydrological time series data for submission to CUAHSI”. Moderator: Corinna Gries (EDI); Speakers: M. Seul, W. Sheldon, M. O’Brien (EDI), S. Remillard.