Final Checklist
You can click on the assessment report
on the website to for a general check. Fix anything you see there.
Send the link over slack for peer review by your fellow datateam members. Usually we look for the following (the list is not exhaustive):
General EML
- Ethical Research Practice section is completed
- Publisher and system information have been added:
Title
- Include geographic and temporal coverage (when and where data was collected)
- Abbreviations are removed or defined
Abstract
- longer than 100 words
- Abbreviations are removed or defined. No garbled text
- Tags such as
<superscript>2</superscript>
and<subscript>2</subscript>
can be used for nicer formatting
DataTable / OtherEntity / SpatialVectors
- In the correct one: DataTable / OtherEntity / SpatialVector / SpatialRaster for the file type
- entityDescription - longer than 5 words and unique
- physical present and up-to-date and format is correct
Attribute Table
- Complete
- attributeDefinitions longer than 3 words
- Variables match what is in the file
- Measurement domain - if appropirate (ie dateTime correct)
- Missing Value Code - accounted for if applicable
- Semantic Annotation - appropriate semantic annotations added, especially for spatial and temporal variables: lat, lon, date etc.
- Custom units were created if necessary
People
- Complete information for each person in each section
- includes ORCID and e-mail address for all contacts
- people repeated across sections have consistent information
Geographic region
- the map looks correct and matches the geographic description
- check if negatives (-) are missing
Project
- If it is an NSF award you can use the helper function:
doc$dataset$project <- eml_nsf_to_project(awards)
- for other awards that need to be set manually, see the set project page
Check EML Version
- Currently using:
eml-2.2.0
(as of July 30 2020) - Review to see if the EML version is set correctly by reviewing the
doc$`@context`
that it is indeed 2.2.0 under eml - Re-run your code again and have the line
emld::eml_version("eml-2.2.0")
at the top - Make sure the system metadata is also 2.2.0
Access
- If necessary, granted access to PI using
set_rights_and_access()
- make sure it is
http://
(no s)
- make sure it is
- note if it is a part of portals there might be specific access requirements for it to be visible using
set_access()
SFTP Files
- If there are files transferred to us via SFTP, delete those files once they have been added to the dataset and when the ticket is resolved
Updated datasets
All the above applies. These are some areas to do a closer check when users update with a new file:
- New data was added
- Temporal Coverage and Title
- Confirm if new data used same methods or if methods need updating
- Files were replaced
- update physical and entityName
- double-check attributes are the same
- check for any new missing value codes that should be accounted for
- Was the dataset published before 2021?
- update project info, annotations
- Glance over entire page for any small mistakes (ie. repeated additionalMetadata, any missed &s, typos)