Replicate Datasets

Sometimes we get requests to have their datasets replicated to ADC. Usually it is a dataset on EDI where you can ge the Digital Object Identifier: which serves as the resource map on DataOne

If a link isn’t given, you can use the package ID (i.e. knb-lter-arc.20129.1) and add it to the end of this link: https://portal.edirepository.org/nis/mapbrowse?packageid=

If we are unsure about the identifer you can try querying for it on the CN:

cn <- CNode('PROD')

#find the DOIs
result <- query(cn, list(q = paste0("(id:*10.6073/pasta/d4f567844673857239eec0cb61c6f543","* *:* NOT obsoletedBy:*)"),
                         fl = "identifier,rightsHolder,formatId, fileName, dateUploaded, authoritativeMN, replicaMN",
                         sort = 'dateUploaded+desc',
                         start ="0",
                         rows = "1500"),
                as="data.frame")

Cloning the packages

  1. Try cloning to the test node first to see how it will look. Set up your nodes
from <- dataone::D1Client("PROD", "urn:node:LTER")
to <- dataone::D1Client("STAGING", "urn:node:mnTestARCTIC")

For LTER datasets on EDI, there are two possibilities for the mn: urn:node:LTER and urn:node:EDI. Try the other one if one isn’t working

  1. Use clone_package to copy it over to the test node
clone_package("doi:10.6073/pasta/d4f567844673857239eec0cb61c6f543", #example doi to replicate
              from = from,
              to = to,
              add_access_to = "http://orcid.org/0000-0001-8888-547X",
              change_auth_node = F,
              new_pid = T)
  1. Once you have verified that it clones properly to the test node, do the same in production but with a couple of minor modifications:
from <- dataone::D1Client("PROD", "urn:node:LTER")
to <- dataone::D1Client("PROD", "urn:node:ARCTIC")

clone_package("doi:10.6073/pasta/0af82d3c3d9d1710775cf9b1464ce70b",
              from = from,
              to = to,
              add_access_to = "http://orcid.org/0000-0001-8888-547X",
              change_auth_node = F,
              new_pid = F) #uses the same pid

make sure the new_pid argument is set to F when you publish to production