A pair of new EML Congruence Checks will ensure that the intended data files are supplied by comparing each file's exact size in bytes to the size specified in the metadata. The checker already compares checksums (as documented in the optional EML authentication element) to confirm that the correct file was received. For some sites, however, that check was not useful because authentication checksums were not practical to obtain as part of the data package metadata generation process. The new file size checks are offered as a more convenient option. Continue reading “EML Congruence Checker (ECC) adds new checks for file size”
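The two kinds of check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the ECC's actual implementation: the function names `size_matches` and `checksum_matches` are hypothetical, and the default hash algorithm is an assumption (in practice the algorithm would be whatever the metadata declares).

```python
import hashlib
import os


def size_matches(path, expected_bytes):
    """Return True when the file's size on disk equals the size declared in the metadata."""
    return os.path.getsize(path) == expected_bytes


def checksum_matches(path, expected_hex, method="sha256"):
    """Return True when the file's digest equals the checksum declared in the metadata.

    `method` is an assumed default; pass the algorithm named in the metadata.
    """
    digest = hashlib.new(method)
    with open(path, "rb") as handle:
        # Read in chunks so large data files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

The size comparison needs only a number that is cheap to record during metadata generation, which is why it can stand in where a checksum is impractical. It is a weaker test, though: two different files of the same length would pass it.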
The EDI Repository Data Package Manager and Audit Manager Application Programming Interfaces (APIs) facilitate automated data publication and science workflows. Thirty-six new R functions, supporting 60% of the available API calls, have been added to the EDIutils R library. The remainder will be added soon! Learn more about the functions and associated use cases at the EDIutils website.
The EDI Repository API facilitates automated data processing and publication workflows, thereby enabling reproducible and efficient data package management. Four new R functions have been added to the EDIutils R library, supporting package reservation (pkg_reserve_id.R), evaluation (pkg_evaluation.R), upload (pkg_upload.R), and update (pkg_update.R). The full suite of PASTA+ API calls will be available from the R environment soon!
EDI technical staff recently upgraded our virtualization infrastructure to the latest version of VMware’s ESXi software (from version 5 to 6.5). All of EDI’s servers, including those that run the PASTA+ data repository software, operate as virtual clients across six ESXi host systems. These hosts are configured to run between 6 and 12 clients at one time, with some room left over to shuffle systems and for testing. The ESXi host systems are located on the campus of the University of New Mexico and connect directly to a dedicated 10 Gb/s link through UNM’s Science DMZ research network. Wide-area Internet connectivity to UNM includes 100 Gb/s connections to the DOE Energy Sciences Network (ESNet) and the Western Regional Network, both through the Albuquerque Gigapop. EDI’s data storage capacity is currently 30 TB, with an equivalent 30 TB mirror storage device for near-time backups and smaller SSDs that are used for off-site backups. EDI also uses AWS Glacier storage as a long-term “cold data” archive.
The Environmental Data Initiative (EDI) has just released an experimental implementation of the sitemaps.org and schema.org metadata to support search engine discovery and indexing (often called Search Engine Optimization). Sitemaps metadata serves as a table of contents for high-value information found on websites so that search engines may more easily discover relevant web pages to index. For EDI, the sitemaps metadata points to the most recent data package landing pages accessible through the EDI Data Portal and is refreshed hourly. Continue reading “Experimental use of Sitemaps.org and Schema.org metadata for Search Engine Optimization”
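To make the "table of contents" idea concrete, here is a minimal sketch of how a sitemap listing landing pages could be generated with Python's standard library. It follows the public sitemaps.org XML format, but it is not EDI's implementation: the function name `build_sitemap` is hypothetical, and any URLs passed to it are up to the caller.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(landing_pages):
    """Build sitemap XML from (url, last_modified_date) pairs.

    `last_modified_date` is a datetime.date (or datetime) object.
    """
    # Emit the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, modified in landing_pages:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # lastmod uses the W3C date format YYYY-MM-DD.
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = modified.strftime("%Y-%m-%d")
    return ET.tostring(urlset, encoding="unicode")
```

A scheduled job could call such a function with the most recent landing pages and overwrite the published sitemap file on each run, which is one way to get an hourly refresh like the one described above.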