Please contact Kristin Vanderbilt (firstname.lastname@example.org) and/or Colin Smith (email@example.com) to let us know that you intend to submit data and metadata. We may be able to offer pointers specific to your data sets that will save you time.
Ecological Metadata Language (EML) is a metadata specification developed by and for the ecology discipline. EML is highly structured, easily parsed by computers, and essential for entering your data into the EDI Repository.
EDI personnel will generate EML from metadata that you provide to us, or we will teach you how to generate EML files yourself. In either case, the process starts with describing your data set using the EDI Metadata Word Template. In this document, you will enter a title, an abstract, who is involved with collecting the data, where and when the study was done, and detailed information about each attribute (column) in your data file(s).
Although we accept most file formats, we strongly recommend tabular data (comma- or tab-delimited ASCII text) and geospatial data types. For multi-year observations, we strongly encourage you to compile your tabular data into a single comma- or tab-delimited file. If you have trouble with this, we are here to help. Geospatial data files should be compressed into one or more .zip archives.
Here are some best practices for preparing your data for archiving:
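If your yearly tables already share the same columns, compiling them into one file can be scripted. The following is a minimal Python sketch, not an EDI tool: the function name `combine_yearly_tables` is hypothetical, and the code assumes comma-delimited UTF-8 files that each begin with a single header row.

```python
import csv


def combine_yearly_tables(paths, out_path):
    """Concatenate CSV files that share one header into a single CSV file.

    Returns the number of data rows written. Raises ValueError if a file is
    empty or if its header differs from the header of the first file.
    """
    expected_header = None
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    raise ValueError(f"{path}: file is empty")
                if expected_header is None:
                    # The first file defines the column names and their order.
                    expected_header = header
                    writer.writerow(header)
                elif header != expected_header:
                    raise ValueError(
                        f"{path}: header {header} differs from {expected_header}"
                    )
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

For example, `combine_yearly_tables(["precip_2019.csv", "precip_2020.csv"], "precip_all.csv")` would write both years to one file; these file names are placeholders. The sketch stops at the first mismatched header rather than guessing how the columns line up.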
- Use consistent data organization. You may be submitting your data to us in several tables with one year of data in each table. It is essential that each table have the same structure; that is, the attributes must have the same order and identical names in all the tables. This allows us to write code to process your data.
- Format attributes consistently. Attributes must have the same units and formats across and within tables. If one table has the ‘date’ attribute formatted as YYYY-MM-DD, then all should have this format.
- Specify in the metadata how missing values are treated in your tables. Missing data can be coded as -9999 for numeric variables or NA for character variables, or table fields may be left blank.
- Do not mix character and numeric data in a column of data. Do not, for instance, enter ‘trace’ in a column that otherwise contains numbers representing nitrate measured in precipitation. Your data will go into a database, and databases will reject attributes containing mixed data types.
- Remove character formatting (e.g. superscript) or symbols (e.g. degree) within the data table. These may produce nonsensical characters when converted from your file type to another.
- If you are converting your data from a spreadsheet to a comma-delimited file format, please ensure that you do not have commas within cells of your spreadsheet. Commas are often used in ‘Comments’ fields or to format numbers, and they wreak havoc when the file is saved in a comma-delimited format.
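Several of these practices can be checked automatically before you submit. The sketch below is one possible approach in Python, not an EDI tool: the names `check_table`, `is_number`, and `MISSING_CODES` are hypothetical, and the set of missing-value codes is an assumption based on the codes listed above. It reports rows with the wrong number of fields, cells containing commas, non-ASCII characters such as a degree symbol, and columns that mix numeric and text values.

```python
import csv
import io

# Assumed missing-value codes: blank, NA, and -9999.
MISSING_CODES = {"", "NA", "-9999"}


def is_number(text):
    """Return True if the text can be parsed as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_table(csv_text):
    """Return a list of human-readable problems found in one CSV table."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    kinds = {name: set() for name in header}  # value kinds seen per column
    problems = []
    # The header is row 1, so data rows are numbered from 2.
    for row_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} fields, found {len(row)}"
            )
            continue
        for name, value in zip(header, row):
            value = value.strip()
            if "," in value:
                problems.append(f"row {row_no}: comma inside '{name}' cell")
            if not value.isascii():
                problems.append(f"row {row_no}: non-ASCII character in '{name}'")
            if value not in MISSING_CODES:
                kinds[name].add("numeric" if is_number(value) else "text")
    for name in header:
        if len(kinds[name]) > 1:
            problems.append(f"column '{name}': mixes numeric and text values")
    return problems
```

To try it, read a file into a string and print each reported problem, for instance `for problem in check_table(open("precip_all.csv", encoding="utf-8").read()): print(problem)`, where the file name is a placeholder. A column that holds only text, such as a date written as YYYY-MM-DD, is not flagged; only columns containing both numeric and text values are.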