Create a new data package

adapted from the dataone and datapack vingettes

library(dataone)
library(datapack)
library(uuid)

Create a new data package - data package is a class that has slots for relations (provenance), objects(the metadata and data file(s)) and systemMetadata.

dp <- new("DataPackage")

Upload new data files

Create and add a metadata file

In this example we will use this previously written EML metadata. Here we are getting the file path from the dataone package and saving that as the object emlFile.

This is a bit of an unusual way to reference a local file path, but all this does is looks within the R package dataone and grabs the path to a metadata document stored within that package. If you print the value of emlFile you’ll see it is just a file path, but it points to a special place on the server where that package is installed. Usually you will just reference EML paths that are stored within your user file system.

emlFile <- system.file("extdata/strix-pacific-northwest.xml", package="dataone")

Create a new DataObject for the metadata and add it to the package.

metadataObj <- new("DataObject", format="https://eml.ecoinformatics.org/eml-2.2.0", filename=emlFile)
dp <- addMember(dp, metadataObj)

Check the dp object to see if the metadata was added correctly.

dp

Add some additional data files

sourceData <- system.file("extdata/OwlNightj.csv", package="dataone")
sourceObj <- new("DataObject", format="text/csv", filename=sourceData)
dp <- addMember(dp, sourceObj, metadataObj) # The third argument of addMember() associates the new DataObject to the metadata that was just added.

If you want to change the formatId please use updateSystemMetadata (more on this later in the book)

Upload the package

d1c <- dataone::D1Client("STAGING", "urn:node:mnTestARCTIC")

Make sure to give access privileges to the ADC admins:

myAccessRules <- data.frame(subject="CN=arctic-data-admins,DC=dataone,DC=org", permission="changePermission") 

Get necessary token from test.arcticdata.io to upload the dataset prior uploading the datapackage:

packageId <- uploadDataPackage(d1c, dp, public=TRUE, accessRules=myAccessRules, quiet=FALSE)