The Environmental Data Initiative (EDI) is continuing development of the Information Management Code Registry (IMCR) with a hackathon at UNM in Albuquerque, NM, from June 11 to June 13, 2019.
The theme of this hackathon is “Data Visualization for Assessing Data Fitness for Use.” We will collaboratively develop applications that accept a complete data package (an EML document along with its dataset) as input and output visualizations of the data. We may approach this problem in several ways:
- As a starting point, the application could produce a web page that shows all variables plotted against each other.
- We could analyze the set of terms used in the corpus of EDI metadata to label “datetime” variables, or variables designating treatments, or other common categorical variables, and make the application “smart” enough to use these variables appropriately as X axis variables.
- We could develop a web page that reads the EML document and lists all the variables in the dataset, from which the user could choose the variables to plot on the X and Y axes. For instance, a categorical variable could be selected for the X axis to produce a bar graph. The web page would then produce the desired plots.
The plots returned by any of these mechanisms will allow a user to see outliers, large gaps in the data, and other issues that affect the dataset’s fitness for reuse for a particular purpose.
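As a rough illustration of how an application might decide which variables belong on the X axis, the sketch below reads the attribute list of an EML document and proposes plots from each attribute's measurement scale. It is one possible starting point, not a planned design. The element names (`attribute`, `attributeName`, `measurementScale`, `dateTime`, `nominal`, `ordinal`, `interval`, `ratio`) follow the EML schema; the sample fragment, the function names `classify_attributes` and `plan_plots`, and the plot-type labels are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Hypothetical, heavily simplified EML fragment; a real document
# carries far more metadata around the attribute list.
SAMPLE_EML = """
<dataTable>
  <attributeList>
    <attribute>
      <attributeName>sample_date</attributeName>
      <measurementScale><dateTime><formatString>YYYY-MM-DD</formatString></dateTime></measurementScale>
    </attribute>
    <attribute>
      <attributeName>treatment</attributeName>
      <measurementScale><nominal/></measurementScale>
    </attribute>
    <attribute>
      <attributeName>biomass_g</attributeName>
      <measurementScale><ratio/></measurementScale>
    </attribute>
    <attribute>
      <attributeName>air_temp_c</attributeName>
      <measurementScale><interval/></measurementScale>
    </attribute>
  </attributeList>
</dataTable>
"""

# Map each EML measurement scale to a coarse kind used for plotting.
SCALE_TO_KIND = {
    "dateTime": "datetime",
    "nominal": "categorical",
    "ordinal": "categorical",
    "interval": "numeric",
    "ratio": "numeric",
}


def classify_attributes(eml_text):
    """Return {attribute name: kind} for every attribute in the EML text."""
    root = ET.fromstring(eml_text.strip())
    kinds = {}
    for attr in root.iter("attribute"):
        name = attr.findtext("attributeName")
        scale = attr.find("measurementScale")
        if name is None or scale is None or len(scale) == 0:
            continue  # skip attributes without a usable scale
        # The first child of measurementScale names the scale type.
        kinds[name.strip()] = SCALE_TO_KIND.get(scale[0].tag, "unknown")
    return kinds


def plan_plots(kinds):
    """Propose (plot type, x variable, y variable) tuples.

    Datetime variables become the X axis of line plots, categorical
    variables the X axis of bar graphs, and every pair of numeric
    variables is plotted against each other as a scatter plot.
    """
    numeric = [n for n, k in kinds.items() if k == "numeric"]
    plots = []
    for x, kind in kinds.items():
        if kind == "datetime":
            plots += [("line", x, y) for y in numeric]
        elif kind == "categorical":
            plots += [("bar", x, y) for y in numeric]
    plots += [("scatter", x, y) for x, y in combinations(numeric, 2)]
    return plots


if __name__ == "__main__":
    for plot in plan_plots(classify_attributes(SAMPLE_EML)):
        print(plot)
```

The resulting plan could be handed to any plotting library. Relying on the measurement scale alone is a simplification: labeling treatment variables or dates stored as text would likely need the term analysis described above.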
While code development is the primary goal of this hackathon, we also aim to build a community that supports sharing code and helps information managers work more efficiently.
Questions? Please contact us at email@example.com.