Data curation and publishing can be divided into five phases: planning and organization, creation of data tables, metadata creation and packaging, submission to a repository, and citation. EDI provides advice and help with all phases.
Phase 1. Organize data and metadata
- Data should be organized into logical units based on structure and theme. EDI has recommendations for organizing data by theme.
- More about organizing different types of data
Phase 2. Clean and format data tables
- Some data types benefit from a specific format and from specific quality control (QC) checks. Keep track of the QC steps you take so that others will know what has been done to the data.
- More about formatting and QC
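One way to keep track of QC steps is to have each check record what it did as it runs. The following is a minimal sketch, not an EDI tool: the sample table, the column names, the two checks, and the `run_qc` helper are all hypothetical and only illustrate the idea of flagging rows while writing a log of every check applied.

```python
import csv
import io

# Hypothetical sample table; column names and values are illustrative only.
RAW_CSV = """site,date,water_temp_c
A,2020-06-01,14.2
A,2020-06-02,
B,2020-06-01,99.9
B,2020-06-02,15.1
"""


def is_missing(row):
    """Return True when the temperature cell is empty."""
    return row["water_temp_c"].strip() == ""


def out_of_range(row):
    """Return True when a non-empty temperature falls outside -5..40 C (assumed limits)."""
    value = row["water_temp_c"].strip()
    return value != "" and not (-5.0 <= float(value) <= 40.0)


# Each check is a (name, function) pair; the name is what ends up in the log.
CHECKS = [
    ("missing_value", is_missing),
    ("range_-5_to_40_c", out_of_range),
]


def run_qc(rows, checks):
    """Flag rows that fail each check and return (rows, qc_log)."""
    qc_log = []
    for row in rows:
        row["qc_flag"] = ""
    for name, fails in checks:
        flagged = 0
        for row in rows:
            if fails(row):
                # Append the check name so a row can carry several flags.
                row["qc_flag"] = name if not row["qc_flag"] else row["qc_flag"] + ";" + name
                flagged += 1
        qc_log.append({"check": name, "rows_checked": len(rows), "rows_flagged": flagged})
    return rows, qc_log


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
flagged_rows, qc_log = run_qc(rows, CHECKS)
for entry in qc_log:
    print(entry)
```

Flagging rows rather than deleting them keeps the original values available, and the `qc_log` entries can be copied into the methods description of the metadata.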
Phase 3. Create EML metadata
|Data managers are generally comfortable with basic metadata concepts and writing code. There are numerous resources, including “best practice” guides, templates, and code libraries.||If you don’t have access to coding or metadata expertise, contact EDI. We can provide you with a metadata template, and instructions for data table assembly, to get your dataset into our queue.|
|Metadata resources for DIY EML creation||Metadata resources for scientists|
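For readers taking the do-it-yourself route, the sketch below shows roughly what a bare-bones EML document looks like when built with Python's standard library. It is an illustration under stated assumptions, not EDI's recommended workflow: the `build_minimal_eml` helper, the package identifier `edi.0.1`, the `system` value, the title, and the person's name are all placeholders, and the output is not validated against the EML schema. A real data package needs considerably more metadata (abstract, methods, attribute descriptions, and so on).

```python
import xml.etree.ElementTree as ET

# Namespace of the EML 2.2.0 schema; only the root element is namespace-qualified.
EML_NS = "https://eml.ecoinformatics.org/eml-2.2.0"
ET.register_namespace("eml", EML_NS)


def build_minimal_eml(package_id, title, given_name, surname):
    """Build a skeletal EML tree with a title, one creator, and one contact."""
    root = ET.Element(
        f"{{{EML_NS}}}eml",
        {"packageId": package_id, "system": "edi"},  # placeholder values
    )
    dataset = ET.SubElement(root, "dataset")
    ET.SubElement(dataset, "title").text = title
    # For simplicity the same person is used as both creator and contact.
    for role in ("creator", "contact"):
        person = ET.SubElement(dataset, role)
        name = ET.SubElement(person, "individualName")
        ET.SubElement(name, "givenName").text = given_name
        ET.SubElement(name, "surName").text = surname
    return root


# All argument values below are hypothetical.
root = build_minimal_eml(
    package_id="edi.0.1",
    title="Example stream temperature measurements",
    given_name="Jane",
    surname="Doe",
)
eml_text = ET.tostring(root, encoding="unicode")
print(eml_text)
```

In practice, the templates and code libraries mentioned above handle this structure for you; the sketch is only meant to make the shape of an EML record less abstract.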
Phase 4. Upload data package to EDI repository
|Data managers generally have submitted data to a repository before, and are comfortable with the technical aspects||Scientists do not need to interact directly with the repository, but can work with EDI data managers (humans)|
|Instructions for EDI repository submission||Instructions for sending data/metadata and other inquiries|
Phase 5. Cite it
- You’re done. You can now retrieve your data citation and its DOI, or link to the data package from your web page.
- More about citing and linking data packages