Navigate through EML
The first task when editing an EML file is navigating the EML file. An EML file is organized in a structure that contains many lists nested within other lists. The function View
allows you to get a crude view of an EML file in the viewer. It can be useful for exploring the file.
# Need to be in this member node to explore file
d1c_test <- dataone::D1Client("STAGING", "urn:node:mnTestARCTIC")
doc <- read_eml(getObject(d1c_test@mn, "urn:uuid:558eabf1-1e91-4881-8ba3-ef8684d8f6a1"))
The complex EML document is represented in R as as series of named, nested lists. We use lists all the time in R! A data.frame
is one example of a special kind of list that we use all the time. You may be familiar with the syntax dataframe$column_name
which allows us to select a particular column of a data.frame
. Under the hood, a data.frame
is a named list of vectors with the same length. You select one of those vectors using the $
operator, which is called the “list selector operator.”
Just like you navigate in a data.frame
, you can use the $
operator to navigate through the EML structure. The $
operator allows you to go deeper into the EML structure and to see what elements are nested within other elements. However, you have to tell R where you want to go in the structure when you use the $
symbol. For example, if you want to view the dataset element of your EML you would use the command doc$dataset
. If you want to view the creators of your data set you would use doc$dataset$creator
. Note here that creator
is contained within dataset
. If you aren’t sure where you want to go, hit the tab button on your keyboard after typing $
and a list of available elements in the structure will appear (e.g., doc$<TAB>
):
Note that if you hit tab, and nothing pops up, this most likely implies that you are trying to go into an EML element that can take a series items. For example doc$dataset$creator$<TAB>
will not show a pop-up menu. This is because creator
is a series-type object (i.e. you can have multiple creator
s). If you want to go deeper into creator
, you first must tell R which creator
you are interested in. Do this by writing [[i]]
first where i
is the index of the creator
you are concerned with. For example, if you want to look at the first creator
i = 1. Now doc$dataset$creator[[1]]$<TAB>
will give you many more options. Note, an empty autocomplete result sometimes means you have reached the end of a branch in the EML structure.
Below is the structure of doc$dataset
. There are a series of multiple creator
s, which can be accessed individually by index: doc$dataset$creator[[#]]
.
At this point stop and take a deep breath. The key takeaway is that EML is a hierarchical tree structure. The best way to get familiar with it is to explore the structure. Try entering doc$dataset
into your console, and print it. Now make the search more specific, for instance: doc$dataset$abstract
.