This function checks that the attributes described in the metadata are congruent with the data for a tabular data object. Supported objects include dataTable, otherEntity, and spatialVector entities. It can be called on its own, and qa_package() also calls it to check every tabular data object in a data package.

qa_attributes(entity, data, eml = NULL)

Arguments

entity

(eml) An EML dataTable, otherEntity, or spatialVector associated with the data object.

data

(data.frame) A data frame of the data object.

eml

(S4) The entire EML object. This is only necessary when the attributes being checked use references.

Value

NULL

Details

This function checks the following:

  • Names: Check that the attribute names in the attributeList match the column names in the data frame. Possible conditions to check for:

    • attributeList does not exist for data frame

    • Some of the attributes that exist in the data do not exist in the attributeList

    • Some of the attributes that exist in the attributeList do not exist in the data

    • Typos in attribute or column names resulting in nonmatches

  • Domains: Check that the attribute types in the EML match the column types in the data frame. Possible conditions to check for:

    • Measurement scales covered: nominal, ordinal, interval, ratio, dateTime

    • If the domain is an enumerated domain, all values in the data are accounted for in the enumerated definition

    • If the domain is an enumerated domain, all values in the enumerated definition are represented in the data

    • Type of data does not match attribute type

  • Values: Check that values in data are reasonable. Possible conditions to check for:

    • Accidental characters in the data (e.g., one character in a column of integers)

    • If missing values are present in the data, missing value codes are also defined in the metadata
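The name checks above can be sketched in base R as a pair of set differences. This is a minimal illustration and not the internals of qa_attributes(). The vector attribute_names stands in for the names taken from an attributeList, and both it and the example data frame are hypothetical.

```r
# Hypothetical stand-ins: attribute names as they might appear in an
# attributeList, and a small data frame for the data object.
attribute_names <- c("site_id", "sample_date", "temperature")
data <- data.frame(
  site_id = c("A", "B"),
  sample_date = c("2020-01-01", "2020-01-02"),
  temprature = c(1.5, 2.0),  # deliberate typo in the column name
  stringsAsFactors = FALSE
)

# Columns that exist in the data but are not described in the metadata
in_data_only <- setdiff(names(data), attribute_names)

# Attributes described in the metadata that do not exist in the data
in_metadata_only <- setdiff(attribute_names, names(data))

if (length(in_data_only) > 0) {
  message("In data but not in attributeList: ",
          paste(in_data_only, collapse = ", "))
}
if (length(in_metadata_only) > 0) {
  message("In attributeList but not in data: ",
          paste(in_metadata_only, collapse = ", "))
}
```

A typo usually shows up as one entry in each difference with nearly the same spelling, as with the temperature column here.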

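The enumerated-domain and value conditions can be illustrated in the same way. The sketch below uses base R only and is an assumption about how such checks might look, not the code qa_attributes() runs. The columns, the enumerated codes, and the missing value codes are all hypothetical.

```r
# Hypothetical column and enumerated definition, for illustration only
sex <- c("F", "M", "M", "X", NA)
enumerated_codes <- c("F", "M", "U")

# Distinct non-missing values that actually occur in the data
observed <- unique(sex[!is.na(sex)])

# Values in the data that the enumerated definition does not account for
undefined_values <- setdiff(observed, enumerated_codes)

# Codes in the enumerated definition that never occur in the data
unused_codes <- setdiff(enumerated_codes, observed)

# Hypothetical numeric column read as text, with one stray character
# (the letter O in place of a zero) and one declared missing value code
count_raw <- c("12", "7", "n/a", "3O")
missing_value_codes <- c("n/a")

# Entries that are neither numeric nor a declared missing value code
is_code <- count_raw %in% missing_value_codes
parsed <- suppressWarnings(as.numeric(count_raw))
stray_values <- count_raw[is.na(parsed) & !is_code]

# Missing values are present but no missing value code has been declared
declared_codes_for_sex <- character(0)
needs_missing_code <- anyNA(sex) && length(declared_codes_for_sex) == 0
```

Parsing the column as numeric and then excluding declared missing value codes keeps legitimate codes such as "n/a" from being reported as accidental characters.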
See also

qa_package()

Examples

# NOT RUN {
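# Checking a .csv file
# Assumes `eml` is an EML object that has already been read into the session
dataTable <- eml@dataset@dataTable[[1]]
data <- readr::read_csv("https://cn.dataone.org/cn/v2/resolve/urn:uuid:...")

# Compare the attributes described in the metadata with the data frame
qa_attributes(dataTable, data)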
# }