Phase 3: Describe (create EML metadata)
Ecological Metadata Language (EML) is a metadata specification developed by the ecology discipline specifically to let researchers document a typical dataset in the ecological sciences, and it works well for many other types of environmental data. EML is highly structured, easily parsed by computers, and essential for entering your data into the EDI Repository.
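To illustrate what "highly structured and easily parsed" means in practice, here is a minimal sketch in Python that reads a few fields from an EML-like document using only the standard library. The sample document is hypothetical and heavily trimmed (a real EML file contains many more required elements), and the function name `summarize_eml` is our own, not part of any EML tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed EML document for illustration only.
# It is NOT schema-valid: a real EML file carries many more elements.
EML_SAMPLE = """<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0"
         packageId="example.1.1" system="example">
  <dataset>
    <title>Example lake temperature observations</title>
    <creator>
      <individualName>
        <givenName>Jane</givenName>
        <surName>Doe</surName>
      </individualName>
    </creator>
    <dataTable>
      <entityName>temperature.csv</entityName>
      <attributeList>
        <attribute><attributeName>date</attributeName></attribute>
        <attribute><attributeName>temp_c</attributeName></attribute>
      </attributeList>
    </dataTable>
  </dataset>
</eml:eml>
"""


def summarize_eml(xml_text):
    """Return the package ID, title, creators, and column names of an EML document."""
    root = ET.fromstring(xml_text)
    # Only the root element is namespace-qualified; its children are not.
    dataset = root.find("dataset")

    creators = []
    for creator in dataset.iterfind("creator"):
        given = creator.findtext("individualName/givenName", default="")
        sur = creator.findtext("individualName/surName", default="")
        creators.append(f"{given} {sur}".strip())

    return {
        "package_id": root.get("packageId"),
        "title": dataset.findtext("title"),
        "creators": creators,
        # Each <attribute> describes one column of a data table.
        "attributes": [a.text for a in dataset.iterfind(".//attributeName")],
    }


if __name__ == "__main__":
    for key, value in summarize_eml(EML_SAMPLE).items():
        print(f"{key}: {value}")
```

Because every piece of information sits in a named element, a repository or search tool can extract titles, creators, and column names without any guesswork about free-text layout.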
The EML schema is available at the Knowledge Network for Biocomplexity (KNB) and is actively maintained and developed by a core group of users on GitHub. The best practices guide for making high-quality EML metadata for a dataset was created by the LTER information management community and is now maintained by EDI. The guide and its history are detailed here.
We recommend learning how to create EML metadata if you have several datasets to publish or expect data publication to become a common part of your data management routine. Creating EML is essential to self-managing your data packages in the EDI Data Repository, reduces data publication wait time, and improves metadata capture for your site/lab. Below is a set of resources to help you create EML.
- R to EML (EMLassemblyline R package) A user-friendly workflow to help craft high-quality EML metadata with R. Requires little operational knowledge of the R programming language and requires no familiarity with the EML schema or EML best practices. EMLassemblyline is useful for publishing one-off datasets or for setting up automated data publication workflows.
- R to EML (EML R package) The library of functions underlying EMLassemblyline.
- Database to EML (PostgreSQL and R) Organize metadata in a PostgreSQL database and output EML with an R-based application. This method is well suited to research sites/programs that have large data management needs and expect to publish many datasets. This method is under development.
- Excel to EML (Excel and R) Supply your metadata in an Excel spreadsheet and convert it to EML through an R-based application. This method is well suited to publishing one-off datasets. This method is under development.
Get help from EDI
If you have few datasets to publish or don’t have the time/need to learn how to create EML metadata yourself, we will create EML metadata for you. Simply fill out the EDI metadata template and send us your data. The template contains basic information about your data including a title, an abstract, who is involved with collecting the data, where and when the study was done, and detailed information about each attribute (column) in your data file(s).
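As a rough illustration of the kind of information the template asks for, the sketch below models those fields in Python and reports which ones are still empty. This is not EDI's template format or an official checker; the class names, field names, and `missing_fields` helper are all hypothetical and only mirror the categories listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """One column of a data file (hypothetical structure)."""
    name: str
    definition: str = ""
    unit: str = ""  # left empty for columns without a unit


@dataclass
class DatasetMetadata:
    """Basic dataset description (hypothetical structure)."""
    title: str = ""
    abstract: str = ""
    creators: list = field(default_factory=list)    # who collected the data
    location: str = ""                              # where the study was done
    start_date: str = ""                            # when the study began
    end_date: str = ""                              # when the study ended
    attributes: list = field(default_factory=list)  # one Attribute per column


def missing_fields(meta):
    """List the fields that are still empty, in declaration order."""
    required = ("title", "abstract", "creators", "location",
                "start_date", "end_date", "attributes")
    missing = [name for name in required if not getattr(meta, name)]
    # Every column should at least carry a definition.
    for attr in meta.attributes:
        if not attr.definition:
            missing.append(f"attributes.{attr.name}.definition")
    return missing


if __name__ == "__main__":
    draft = DatasetMetadata(
        title="Example lake temperature observations",
        creators=["Jane Doe"],
        attributes=[Attribute("date", "Sampling date"), Attribute("temp_c")],
    )
    print(missing_fields(draft))
```

Running a check like this before sending your data can catch undefined columns or missing dates early, which tends to be where most follow-up questions arise.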
- Download EDI’s metadata template (GitHub)
- Overview presentation of EDI’s metadata template (video)
- Please contact Kristin Vanderbilt (firstname.lastname@example.org) and/or Colin Smith (email@example.com) and let us know of your intention to submit data and metadata. We may be able to give you some pointers relative to your particular datasets that will save you time.
- EML schema
- EML schema development
- EML best practices
- Controlled vocabularies for describing data
- Controlled vocabulary best practices
- What are metadata and structured metadata? (video)
- Oxygen XML Editor for viewing EML files