- All data package “landing pages” in our test environment Data Portal are now labeled with a “Test Data Package” watermark so they will not be mistaken for the real McCoy. Only data packages displayed on the primary Data Portal (https://portal.edirepository.org/nis) should be considered bona fide, published data packages (with a “real” DOI).
The EDI data portal has added a new informational check to the suite of data congruence checks it runs on every submitted data package. Because this check is informational only and the community felt it was valuable, we released it on July 11th, 2018, outside the regular twice-yearly schedule. For detailed information, contact email@example.com. Continue reading “New informational check released for EDI data portal”
PASTA’s date and time format quality check took a step out of the box this past month and used Python to assist in parsing preferred date and time formats and generating the regular expression strings used to validate date and time data. The check initially relied on Java 8’s new date and time library to interpret and validate data documented by the Ecological Metadata Language date and time schema, but inconsistencies in how Java 8 handled the ISO 8601 standard called for a different approach. Instead of relying strictly on the Java date and time library to validate data, EDI software developers used Python’s Parsimonious package to parse the preferred date and time format strings and generate regular expressions usable by Java 8. Java then uses these Python-generated “reg-exs” to validate date and time data being uploaded to the EDI Data Repository. This out-of-the-box approach provides a simple solution for handling one of the most common data formats seen by EDI.
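The idea of turning a format string into a regular expression can be illustrated with a minimal sketch. It is not EDI's implementation: it swaps Parsimonious for a plain tokenizer built on Python's standard `re` module to stay dependency-free, and the token table and the name `format_to_regex` are hypothetical, covering only a handful of format tokens.

```python
import re

# Hypothetical token table (an assumption, not PASTA's actual list):
# each date/time format token maps to a regex fragment.
TOKEN_PATTERNS = {
    "YYYY": r"\d{4}",
    "MM": r"(?:0[1-9]|1[0-2])",
    "DD": r"(?:0[1-9]|[12]\d|3[01])",
    "hh": r"(?:[01]\d|2[0-3])",
    "mm": r"[0-5]\d",
    "ss": r"[0-5]\d",
}

# Try longer tokens first so "YYYY" wins over any shorter alternative.
_TOKENIZER = re.compile(
    "|".join(sorted(TOKEN_PATTERNS, key=len, reverse=True))
)


def format_to_regex(fmt: str) -> str:
    """Translate a format string such as 'YYYY-MM-DD' into a regex string.

    Recognized tokens become regex fragments; everything between tokens
    is treated as a literal separator and escaped.
    """
    parts = []
    pos = 0
    for match in _TOKENIZER.finditer(fmt):
        parts.append(re.escape(fmt[pos:match.start()]))  # literal text
        parts.append(TOKEN_PATTERNS[match.group()])      # token fragment
        pos = match.end()
    parts.append(re.escape(fmt[pos:]))                   # trailing literal
    return "".join(parts)


if __name__ == "__main__":
    pattern = format_to_regex("YYYY-MM-DD hh:mm:ss")
    # The pattern carries no anchors, so apply it with a whole-string match.
    print(bool(re.fullmatch(pattern, "2018-07-11 13:45:00")))
    print(bool(re.fullmatch(pattern, "2018-13-11 13:45:00")))
```

The generated string is an ordinary regular expression, so in principle it could be handed to another regex engine for a whole-string match. Whether every fragment behaves identically across engines is something a real implementation would need to verify.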
A new feature has been developed for the EDI Repository that supports the association of data packages in the repository with journal articles that cite them. For a given data package (as specified by its package identifier), a logged-in user of the EDI Data Portal can enter information on a web form about a journal article in cases where the data package or its associated data are cited by the article. Journal articles that used the data package but did not cite its DOI may also be included. Only the Digital Object Identifier (DOI) or the URL of the journal article needs to be entered, though a user may optionally include the article title and the journal title. This information is permanently stored within the EDI Repository (unless later deleted by the same logged-in user) and is also incorporated into the DOI metadata for the data package that is sent to DataCite. It is also displayed on the summary page for the data package in the EDI Data Portal, e.g., https://portal.edirepository.org/nis/mapbrowse?scope=knb-lter-sbc&identifier=74, which has been used by three papers.
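The information collected by the web form can be pictured as a small record with one validation rule: an article DOI or an article URL is required, and the two titles are optional. The sketch below is illustrative only. The class name `JournalCitation` and its field names are hypothetical and do not reflect the repository's actual schema or API, and the DOI in the usage example is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JournalCitation:
    """Hypothetical record linking a data package to a journal article."""

    package_id: str                      # identifies the data package
    article_doi: Optional[str] = None    # DOI of the journal article
    article_url: Optional[str] = None    # URL of the journal article
    article_title: Optional[str] = None  # optional
    journal_title: Optional[str] = None  # optional

    def __post_init__(self) -> None:
        # Mirror the form's rule: a DOI or a URL must be supplied.
        if not (self.article_doi or self.article_url):
            raise ValueError("Provide the article DOI or the article URL")


if __name__ == "__main__":
    # Scope and identifier taken from the example URL above; the DOI is
    # a made-up placeholder, not a real article.
    citation = JournalCitation(
        package_id="knb-lter-sbc.74",
        article_doi="10.1234/placeholder",
    )
    print(citation)
```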
In 2016, EDI reconstituted a working group to define, prioritize, review, and test EML Congruence Checks, and to serve as the information conduit for the larger community. A report was given to the LTER IMC at their annual meeting in Bloomington, IN, in the summer of 2017. All ECC material is stored on GitHub: https://github.com/EDIorg/ecc. This update describes our experiences with checks introduced or considered in 2017. Continue reading “EML Congruence Checker (ECC)”