Introduction to vegbankr

This package is an R client for VegBank, the vegetation plot database of the Ecological Society of America’s Panel on Vegetation Classification, hosted by the National Center for Ecological Analysis and Synthesis (NCEAS). VegBank contains vegetation plot data, community types recognized by the U.S. National Vegetation Classification and others, and all ITIS/USDA plant taxa along with other taxa recorded in plot records.

As a VegBank API client, the vegbankr package currently supports querying and downloading vegetation plot records, in addition to validating and uploading new data to the VegBank database.

Summary of Functions

| Function | Description |
|---|---|
| vb_get_plot_observations() | Plot observation data |
| vb_get_community_concepts() | Community concepts (assertions) linked to community names through usages |
| vb_get_community_classifications() | Community classification events wherein one or more community concepts were applied to a plot observation |
| vb_get_community_interpretations() | Assignments of community names and authorities (i.e., community concepts) to specific plot observations, as part of a community classification event |
| vb_get_plant_concepts() | Plant concepts (assertions) linked to plant names through usages |
| vb_get_taxon_observations() | Data provider’s determination of taxa observed on a plot, and the overall cover of those taxa |
| vb_get_taxon_interpretations() | Assignments of taxon names and authorities (i.e., plant concepts) to specific taxon observations |
| vb_get_cover_methods() | Information about registered coverclass methods |
| vb_get_stratum_methods() | Information about registered strata sampling protocols |
| vb_get_references() | Information about references cited within VegBank |
| vb_get_projects() | Information about projects established to collect vegetation plot data |
| vb_get_parties() | Information about people and organizations who have contributed to the collection or interpretation of a plot |

Many of the functions utilize the following arguments:

| Argument | Description |
|---|---|
| vb_code | A VegBank code. For example, an observation code (ob.*) or project code (pj.*) |
| detail | Level of detail returned: "full" includes all available fields; the default returns a summary subset |
| with_nested | If TRUE, nested child records (e.g. taxon observations, stratum data) are included as list columns |
| limit | Maximum number of records to return (default: 100) |
| offset | Number of records to skip before returning results; useful for pagination |
| sort | Field name to sort by; prefix with - for descending order (e.g. "-obs_count") |
| search | Optional search string for filtering results |
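
Because vb_code accepts several kinds of codes, it can help to check what a code refers to before sending a request. The sketch below is a hypothetical helper (vb_code_type() is not part of vegbankr) that reads the prefix before the dot. The prefix meanings are inferred from the column names that appear in this vignette's output (ob_code, pj_code, py_code, and so on), not from an official list.

```r
# Hypothetical helper, not part of vegbankr: classify a VegBank code by prefix.
# The prefix-to-entity mapping is inferred from column names in this vignette.
vb_code_type <- function(code) {
  known <- c(
    ob = "plot observation",
    pl = "plot",
    pj = "project",
    py = "party",
    to = "taxon observation",
    pc = "plant concept",
    cc = "community concept"
  )
  # A code is expected to look like "<prefix>.<digits>", e.g. "ob.135454"
  well_formed <- grepl("^[a-z]+\\.[0-9]+$", code)
  prefix <- sub("\\..*$", "", code)
  type <- unname(known[prefix])
  # Anything malformed or with an unknown prefix is reported as NA
  type[!well_formed] <- NA_character_
  type
}

vb_code_type(c("ob.135454", "pj.10510", "py.1062", "not a code"))
```

Unknown or malformed codes come back as NA, which makes the helper easy to combine with stopifnot() or an early return.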

Plot Observation Data

The vb_get_plot_observations() function allows users to download vegetation plot data from VegBank.

Single Plot Observation by Observation Code

To retrieve a specific plot observation record using its “ob” code, use the following command:

# Retrieve a specific plot observation
ob.135454 <- vb_get_plot_observations(vb_code = "ob.135454", detail = "full",
  with_nested = TRUE)

# Preview the downloaded data
head(ob.135454)
## # A tibble: 1 × 122
##    area author_datum   author_e       author_location author_n   author_obs_code
##   <dbl> <chr>          <chr>          <chr>           <chr>      <chr>          
## 1   100 <confidential> <confidential> <confidential>  <confiden… DEWA.1.2003    
## # ℹ 116 more variables: author_plot_code <chr>, author_zone <chr>,
## #   auto_taxon_cover <lgl>, azimuth <lgl>, basal_area <lgl>,
## #   bryophyte_quality <lgl>, cm_code <chr>, confidentiality_status <int>,
## #   country <chr>, cover_dispersion <lgl>, cover_method_name <chr>,
## #   date_accuracy <chr>, date_entered <chr>, disturbances <lgl>,
## #   dominant_stratum <lgl>, dsg_poly <lgl>, effort_level <lgl>,
## #   elevation <dbl>, elevation_accuracy <lgl>, elevation_range <lgl>, …

Omitting detail = "full" or setting with_nested = FALSE returns a smaller result that is better suited to quick browsing or larger downloads.
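
With with_nested = TRUE, child records arrive as list columns, which usually need flattening before analysis. The following is a minimal base R sketch on a made-up data frame. The column name taxon_observations, its fields, and the cover values are hypothetical stand-ins, so check names() on a real result for the actual nested columns.

```r
# Made-up stand-in for a result with one nested list column.
# Column and field names here are hypothetical, not the real VegBank schema.
obs <- data.frame(ob_code = c("ob.1", "ob.2"), stringsAsFactors = FALSE)
obs$taxon_observations <- list(
  data.frame(plant_name = c("Acer rubrum", "Acer saccharum"), cover = c(40, 15)),
  data.frame(plant_name = "Galium mollugo", cover = 5)
)

# Flatten: one row per child record, tagged with the parent's ob_code
flat <- do.call(rbind, Map(
  function(code, child) data.frame(ob_code = code, child, stringsAsFactors = FALSE),
  obs$ob_code,
  obs$taxon_observations
))
rownames(flat) <- NULL

flat
```

The result has one row per nested record, so it can be filtered or summarized like any other data frame.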

Plot Observations for a Project

You can also download multiple plot observations for a specific project. For example, to retrieve the first 100 plot observations (the default limit) from the Southwest GAP, Nevada Project (pj.10510):

# Retrieve plot observations for a specific project (first 100 by default)
pj.10510 <- vb_get_plot_observations(vb_code = "pj.10510")

# Preview the data
head(pj.10510)
## # A tibble: 6 × 12
##   ob_code  author_obs_code pl_code  author_plot_code has_observation_synonym
##   <chr>    <chr>           <chr>    <chr>            <lgl>                  
## 1 ob.51505 NV010603LH01    pl.51570 NV010603LH01     FALSE                  
## 2 ob.51506 NV010603LH02    pl.51571 NV010603LH02     FALSE                  
## 3 ob.51507 NV010603LH03    pl.51572 NV010603LH03     FALSE                  
## 4 ob.51508 NV010603LH04    pl.51573 NV010603LH04     FALSE                  
## 5 ob.51509 NV010603LH05    pl.51574 NV010603LH05     FALSE                  
## 6 ob.51510 NV010603LH06    pl.51575 NV010603LH06     FALSE                  
## # ℹ 7 more variables: latitude <dbl>, longitude <dbl>, area <int>,
## #   elevation <dbl>, country <int>, state_province <chr>, year <dbl>

Project Data

vb_get_projects() returns information about projects established to collect vegetation plot data.

Search by Project Name

This example retrieves all projects whose name contains “GAP”, sorted in descending order by observation count so that the most data-rich projects appear first.

vb_get_projects(search = "GAP", sort = "-obs_count")
## # A tibble: 6 × 8
##   pj_code  project_name       project_description start_date stop_date obs_count
##   <chr>    <chr>              <chr>                    <int>     <int>     <dbl>
## 1 pj.10510 Southwest GAP, Ne… http://earth.gis.u…         NA        NA     17326
## 2 pj.10507 Southwest GAP, Ar… http://earth.gis.u…         NA        NA     12082
## 3 pj.10511 Southwest GAP, Ut… http://earth.gis.u…         NA        NA      9781
## 4 pj.10509 Southwest GAP, Ne… http://earth.gis.u…         NA        NA      5693
## 5 pj.10508 Southwest GAP, Co… http://earth.gis.u…         NA        NA      5286
## 6 pj.11044 Pennsylvania HP D… Plant Community/As…         NA        NA       251
## # ℹ 2 more variables: last_plot_added_date <dttm>, search_rank <dbl>

Plot Observations by Project Code

Once you have identified a project code, you can pass it directly to vb_get_plot_observations(). This example retrieves the first 100 plot observations associated with the project pj.11044 (Pennsylvania HPD Delaware Water Gap), sorted by the author’s observation code.

vb_get_plot_observations("pj.11044", sort = "author_obs_code")
## # A tibble: 100 × 12
##    ob_code   author_obs_code pl_code   author_plot_code has_observation_synonym
##    <chr>     <chr>           <chr>     <chr>            <lgl>                  
##  1 ob.135454 DEWA.1.2003     pl.136716 DEWA.1           FALSE                  
##  2 ob.135453 DEWA.10.2003    pl.136715 DEWA.10          FALSE                  
##  3 ob.135451 DEWA.100.2003   pl.136714 DEWA.100         FALSE                  
##  4 ob.135545 DEWA.101.2003   pl.136713 DEWA.101         FALSE                  
##  5 ob.135435 DEWA.102.2003   pl.136712 DEWA.102         FALSE                  
##  6 ob.135448 DEWA.103.2003   pl.136711 DEWA.103         FALSE                  
##  7 ob.135577 DEWA.104.2003   pl.136710 DEWA.104         FALSE                  
##  8 ob.134250 DEWA.105.2003   pl.136709 DEWA.105         FALSE                  
##  9 ob.135358 DEWA.106.2003   pl.136708 DEWA.106         FALSE                  
## 10 ob.135459 DEWA.107.2003   pl.136707 DEWA.107         FALSE                  
## # ℹ 90 more rows
## # ℹ 7 more variables: latitude <dbl>, longitude <dbl>, area <dbl>,
## #   elevation <dbl>, country <chr>, state_province <chr>, year <dbl>

Party Data

vb_get_parties() returns information about the people and organizations who have contributed to a plot, project, or plant/community interpretation.

# get people associated with a project
vb_get_parties(vb_code = "pj.11044")
## # A tibble: 1 × 9
##   py_code  party_label        salutation given_name middle_name surname  
##   <chr>    <chr>                   <int> <chr>            <int> <chr>    
## 1 py.91547 Zimmerman, Ephraim         NA Ephraim             NA Zimmerman
## # ℹ 3 more variables: organization_name <chr>, contact_instructions <int>,
## #   obs_count <dbl>
# get people associated with a plot observation
vb_get_parties(vb_code = "ob.3298")
## # A tibble: 4 × 9
##   py_code party_label           salutation given_name middle_name surname       
##   <chr>   <chr>                      <int> <chr>            <int> <chr>         
## 1 py.1062 Reed, Cindy                   NA Cindy               NA Reed          
## 2 py.1245 Faber-Langendoen, Don         NA Don                 NA Faber-Langend…
## 3 py.1246 Von Loh, Jim                  NA Jim                 NA Von Loh       
## 4 py.1248 West, Keldyn                  NA Keldyn              NA West          
## # ℹ 3 more variables: organization_name <chr>, contact_instructions <int>,
## #   obs_count <dbl>

Plot Observations by Party

Once you have identified a party code (py.*), you can use it to retrieve the plot observations associated with that person or organization:

vb_get_plot_observations(vb_code = "py.1062")
## # A tibble: 100 × 12
##    ob_code author_obs_code pl_code author_plot_code has_observation_synonym
##    <chr>   <chr>           <chr>   <chr>            <lgl>                  
##  1 ob.3062 FOLA.21         pl.3118 FOLA.21          FALSE                  
##  2 ob.3064 AGFO.2          pl.3120 AGFO.2           FALSE                  
##  3 ob.3065 AGFO.3          pl.3121 AGFO.3           FALSE                  
##  4 ob.3066 AGFO.4          pl.3122 AGFO.4           FALSE                  
##  5 ob.3079 AGFO.17         pl.3135 AGFO.17          FALSE                  
##  6 ob.3080 AGFO.18         pl.3136 AGFO.18          FALSE                  
##  7 ob.3081 AGFO.19         pl.3137 AGFO.19          FALSE                  
##  8 ob.3110 FOLA.25         pl.3166 FOLA.25          FALSE                  
##  9 ob.3111 FOLA.26         pl.3167 FOLA.26          FALSE                  
## 10 ob.3112 FOLA.27         pl.3168 FOLA.27          FALSE                  
## # ℹ 90 more rows
## # ℹ 7 more variables: latitude <dbl>, longitude <dbl>, area <dbl>,
## #   elevation <dbl>, country <chr>, state_province <chr>, year <dbl>

Taxon Observations

vb_get_taxon_observations() retrieves the individual plant taxon records associated with a given plot observation. Each row represents one taxon recorded in the plot.

Taxon Observations by Plot

This example retrieves the taxon (plant) observations associated with the plot observation ob.135454.
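
The code chunk that produced the output below does not appear in this vignette. Assuming vb_get_taxon_observations() takes a code as its first argument, as vb_get_plot_observations() and vb_get_plant_concepts() do in the other examples, the request can be sketched as follows. The wrapper get_plot_taxa() and its format check are illustrative additions, not part of vegbankr.

```r
# Illustrative wrapper (not part of vegbankr). It assumes that
# vb_get_taxon_observations() accepts an observation code as its first argument.
get_plot_taxa <- function(ob_code) {
  # Fail early on anything that does not look like "ob.<digits>"
  if (!grepl("^ob\\.[0-9]+$", ob_code)) {
    stop("Expected an observation code such as 'ob.135454', got: ", ob_code)
  }
  vegbankr::vb_get_taxon_observations(ob_code)
}

# Requires the vegbankr package and network access:
# taxa <- get_plot_taxa("ob.135454")
# head(taxa)
```

Checking the code format locally avoids a round trip to the API for an obviously wrong argument, such as a project code passed by mistake.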

## # A tibble: 32 × 6
##    ob_code   to_code    int_curr_pc_code int_orig_pc_code taxon_inference_area
##    <chr>     <chr>      <chr>            <chr>                           <int>
##  1 ob.135454 to.2712789 pc.199973        pc.199973                          NA
##  2 ob.135454 to.2712790 pc.199285        pc.199285                          NA
##  3 ob.135454 to.2712791 pc.190695        pc.190695                          NA
##  4 ob.135454 to.2712792 pc.190492        pc.190492                          NA
##  5 ob.135454 to.2712793 pc.187519        pc.187519                          NA
##  6 ob.135454 to.2712794 pc.183905        pc.183905                          NA
##  7 ob.135454 to.2712795 pc.183661        pc.183661                          NA
##  8 ob.135454 to.2712796 pc.176794        pc.176794                          NA
##  9 ob.135454 to.2740072 pc.198464        pc.198464                          NA
## 10 ob.135454 to.2740073 pc.198388        pc.198388                          NA
## # ℹ 22 more rows
## # ℹ 1 more variable: rf_code <int>

Plant Species Concepts

vb_get_plant_concepts() retrieves the plant species concepts associated with a plot observation or project, given its VegBank code (for example, vb_get_plant_concepts("ob.135454")).

## # A tibble: 32 × 19
##    pc_code   plant_name             plant_code plant_description concept_rf_code
##    <chr>     <chr>                  <chr>                  <int> <chr>          
##  1 pc.111478 Acer rubrum L.         ACRU                      NA rf.37          
##  2 pc.111496 Acer saccharum Marsh.  ACSA3                     NA rf.37          
##  3 pc.111540 Achillea millefolium … ACMI2                     NA rf.37          
##  4 pc.115355 Anthoxanthum odoratum… ANOD                      NA rf.37          
##  5 pc.132596 Cornus racemosa Lam.   CORA6                     NA rf.37          
##  6 pc.145537 Euthamia graminifolia… EUGR5                     NA rf.37          
##  7 pc.146730 Fraxinus L.            FRAXI                     NA rf.37          
##  8 pc.147454 Galium mollugo L.      GAMO                      NA rf.37          
##  9 pc.151514 Hieracium L.           HIERA                     NA rf.37          
## 10 pc.152125 Houstonia caerulea L.  HOCA4                     NA rf.37          
## # ℹ 22 more rows
## # ℹ 14 more variables: concept_rf_label <chr>, status_rf_code <chr>,
## #   status_rf_label <chr>, obs_count <dbl>, plant_level <chr>, status <chr>,
## #   start_date <dttm>, stop_date <int>, current_accepted <lgl>, py_code <chr>,
## #   party_label <chr>, plant_party_comments <int>, parent_pc_code <chr>,
## #   parent_name <chr>

Plant Community Concepts

The example below searches for community concepts that include the genus Sequoiadendron.

sequoia_communities <- vb_get_community_concepts(search = "sequoiadendron")

# Preview the matching community concepts
head(sequoia_communities)
## # A tibble: 6 × 20
##   cc_code  comm_name comm_code comm_description concept_rf_code concept_rf_label
##   <chr>    <chr>     <chr>     <chr>            <chr>           <chr>           
## 1 cc.781   A.101     A.101     This forest all… rf.32           EcoArt 2002     
## 2 cc.5316  CEGL0031… CEGL0031… NA               rf.32           EcoArt 2002     
## 3 cc.7909  CEGL0086… CEGL0086… NA               rf.32           EcoArt 2002     
## 4 cc.21484 Sequoiad… CEGL0086… NA               rf.9844         Western Ecology…
## 5 cc.21485 Sequoiad… A.101     This forest all… rf.9844         Western Ecology…
## 6 cc.31147 Sequoiad… CEGL0031… These giant for… rf.9844         Western Ecology…
## # ℹ 14 more variables: status_rf_code <chr>, status_rf_label <chr>,
## #   obs_count <dbl>, comm_level <chr>, status <chr>, start_date <dttm>,
## #   stop_date <dttm>, current_accepted <lgl>, py_code <chr>, party_label <chr>,
## #   comm_party_comments <int>, parent_cc_code <chr>, parent_name <chr>,
## #   search_rank <dbl>

Next, we can identify the concept with the most plot observations and retrieve those observations from VegBank by passing its community concept code directly to vb_get_plot_observations():

sequoia_plots <- sequoia_communities |>
  dplyr::arrange(-obs_count) |>
  dplyr::slice(1) |>
  dplyr::pull(cc_code) |>
  vb_get_plot_observations()

head(sequoia_plots)
## # A tibble: 2 × 12
##   ob_code author_obs_code pl_code author_plot_code has_observation_synonym
##   <chr>   <chr>           <chr>   <chr>            <lgl>                  
## 1 ob.5363 YOSE.98M67      pl.5419 YOSE.98M67       FALSE                  
## 2 ob.5655 YOSE.99S104     pl.5711 YOSE.99S104      FALSE                  
## # ℹ 7 more variables: latitude <dbl>, longitude <dbl>, area <dbl>,
## #   elevation <dbl>, country <chr>, state_province <chr>, year <dbl>
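
The arrange/slice/pull chain above picks the single concept with the highest obs_count. A base R equivalent is sketched below on made-up data (the codes and counts are invented for illustration). which.max() skips NA values and returns the first maximum, so ties resolve to the earliest row.

```r
# Invented data standing in for a vb_get_community_concepts() result
concepts <- data.frame(
  cc_code   = c("cc.1", "cc.2", "cc.3", "cc.4"),
  obs_count = c(0, 2, NA, 2),
  stringsAsFactors = FALSE
)

# which.max() ignores NA and returns the index of the first maximum
top_code <- concepts$cc_code[which.max(concepts$obs_count)]
top_code
```

The resulting code is a plain character string, so it can be passed to vb_get_plot_observations() in the same way as the piped version.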

Other Function Options

Changing Limit & Offset

To download more than 100 records, increase the limit argument. To page through a large result set in fixed-size chunks, combine limit and offset:

# Download up to 500 records
vb_get_plot_observations(vb_code = "pj.10510", limit = 500)
## # A tibble: 500 × 12
##    ob_code  author_obs_code pl_code  author_plot_code has_observation_synonym
##    <chr>    <chr>           <chr>    <chr>            <lgl>                  
##  1 ob.51505 NV010603LH01    pl.51570 NV010603LH01     FALSE                  
##  2 ob.51506 NV010603LH02    pl.51571 NV010603LH02     FALSE                  
##  3 ob.51507 NV010603LH03    pl.51572 NV010603LH03     FALSE                  
##  4 ob.51508 NV010603LH04    pl.51573 NV010603LH04     FALSE                  
##  5 ob.51509 NV010603LH05    pl.51574 NV010603LH05     FALSE                  
##  6 ob.51510 NV010603LH06    pl.51575 NV010603LH06     FALSE                  
##  7 ob.51511 NV010603LH07    pl.51576 NV010603LH07     FALSE                  
##  8 ob.51512 NV010603LH08    pl.51577 NV010603LH08     FALSE                  
##  9 ob.51513 NV010603LH09    pl.51578 NV010603LH09     FALSE                  
## 10 ob.51514 NV010603LH10    pl.51579 NV010603LH10     FALSE                  
## # ℹ 490 more rows
## # ℹ 7 more variables: latitude <dbl>, longitude <dbl>, area <int>,
## #   elevation <dbl>, country <int>, state_province <chr>, year <dbl>
# Download the second page of 100 records
vb_get_plot_observations(vb_code = "pj.10510", limit = 100, offset = 100)
## # A tibble: 100 × 12
##    ob_code  author_obs_code pl_code  author_plot_code has_observation_synonym
##    <chr>    <chr>           <chr>    <chr>            <lgl>                  
##  1 ob.51605 NV011003JS20    pl.51670 NV011003JS20     FALSE                  
##  2 ob.51606 NV011003LH01    pl.51671 NV011003LH01     FALSE                  
##  3 ob.51607 NV011003LH02    pl.51672 NV011003LH02     FALSE                  
##  4 ob.51608 NV011003LH03    pl.51673 NV011003LH03     FALSE                  
##  5 ob.51609 NV011003LH04    pl.51674 NV011003LH04     FALSE                  
##  6 ob.51610 NV011003LH05    pl.51675 NV011003LH05     FALSE                  
##  7 ob.51611 NV011003LH06    pl.51676 NV011003LH06     FALSE                  
##  8 ob.51612 NV011003LH07    pl.51677 NV011003LH07     FALSE                  
##  9 ob.51613 NV011003LH08    pl.51678 NV011003LH08     FALSE                  
## 10 ob.51614 NV011003LH09    pl.51679 NV011003LH09     FALSE                  
## # ℹ 90 more rows
## # ℹ 7 more variables: latitude <dbl>, longitude <dbl>, area <int>,
## #   elevation <dbl>, country <int>, state_province <chr>, year <dbl>
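
To collect every page rather than a single one, the limit and offset pattern can be wrapped in a loop. The sketch below is one possible approach, not a vegbankr feature: fetch_all_pages() is a hypothetical helper that accepts any function taking limit and offset, and it is demonstrated on an invented in-memory table so that it runs without network access.

```r
# Hypothetical helper: request pages until one comes back short or empty.
fetch_all_pages <- function(fetch_page, page_size = 100, max_pages = 50) {
  pages <- list()
  for (i in seq_len(max_pages)) {
    offset <- (i - 1) * page_size
    page <- fetch_page(limit = page_size, offset = offset)
    if (nrow(page) == 0) break                 # nothing left to fetch
    pages[[length(pages) + 1]] <- page
    if (nrow(page) < page_size) break          # a short page is the last page
  }
  if (length(pages) == 0) return(data.frame())
  do.call(rbind, pages)
}

# Invented stand-in for the remote service: 250 fake records
mock_records <- data.frame(ob_code = paste0("ob.", 1:250), stringsAsFactors = FALSE)
mock_fetch <- function(limit, offset) {
  first <- offset + 1
  if (first > nrow(mock_records)) return(mock_records[0, , drop = FALSE])
  last <- min(offset + limit, nrow(mock_records))
  mock_records[first:last, , drop = FALSE]
}

all_records <- fetch_all_pages(mock_fetch, page_size = 100)
nrow(all_records)
```

To use it against VegBank, pass a small function that forwards both arguments, for example function(limit, offset) vb_get_plot_observations(vb_code = "pj.10510", limit = limit, offset = offset). The max_pages cap is a safety limit for large projects and can be raised as needed.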

Sorting Results

Use the sort argument to order results by the following fields:

| Endpoint | Sortable Fields |
|---|---|
| plot-observations | default, author_obs_code |
| plant-concepts | default, plant_name, obs_count |
| community-concepts | default, comm_name, obs_count |
| projects | default, project_name, obs_count |
| parties | default, surname, organization_name, obs_count |

Prefix the field name with - for descending order. The example below retrieves plot observations for project pj.10510, sorted by author_obs_code in descending order:

vb_get_plot_observations(vb_code = "pj.10510", sort = '-author_obs_code')
## # A tibble: 100 × 12
##    ob_code  author_obs_code pl_code  author_plot_code has_observation_synonym
##    <chr>    <chr>           <chr>    <chr>            <lgl>                  
##  1 ob.68830 NV121902JS30    pl.68895 NV121902JS30     FALSE                  
##  2 ob.68829 NV121902JS29    pl.68894 NV121902JS29     FALSE                  
##  3 ob.68828 NV121902JS28    pl.68893 NV121902JS28     FALSE                  
##  4 ob.68827 NV121902JS27    pl.68892 NV121902JS27     FALSE                  
##  5 ob.68826 NV121902JS26    pl.68891 NV121902JS26     FALSE                  
##  6 ob.68825 NV121902JS25    pl.68890 NV121902JS25     FALSE                  
##  7 ob.68824 NV121902JS24    pl.68889 NV121902JS24     FALSE                  
##  8 ob.68823 NV121902JS23    pl.68888 NV121902JS23     FALSE                  
##  9 ob.68822 NV121902JS22    pl.68887 NV121902JS22     FALSE                  
## 10 ob.68821 NV121802JS30    pl.68886 NV121802JS30     FALSE                  
## # ℹ 90 more rows
## # ℹ 7 more variables: latitude <dbl>, longitude <dbl>, area <int>,
## #   elevation <dbl>, country <int>, state_province <chr>, year <dbl>
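
Since each endpoint supports only a few sort fields, a small local check can catch typos before a request is sent. The sketch below copies the sortable fields listed earlier in this section into a lookup. is_sortable() is a hypothetical helper, not part of vegbankr, and how the API itself responds to an unsupported field is not covered here.

```r
# Sortable fields per endpoint, as listed earlier in this section
sortable_fields <- list(
  "plot-observations"  = c("default", "author_obs_code"),
  "plant-concepts"     = c("default", "plant_name", "obs_count"),
  "community-concepts" = c("default", "comm_name", "obs_count"),
  "projects"           = c("default", "project_name", "obs_count"),
  "parties"            = c("default", "surname", "organization_name", "obs_count")
)

# Hypothetical helper: TRUE if `sort` names a field listed for `endpoint`
is_sortable <- function(endpoint, sort) {
  field <- sub("^-", "", sort)  # a leading "-" only flips the direction
  field %in% sortable_fields[[endpoint]]
}

is_sortable("projects", "-obs_count")
is_sortable("plot-observations", "obs_count")
```

The first call is accepted because the leading - is stripped before the lookup; the second is rejected because obs_count is not listed for plot observations.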

Manipulating Downloaded Data with dplyr

Because downloaded VegBank data is returned as data frames, it can be manipulated using base R or dplyr functions. Below we highlight a few possible manipulations.

Select a Subset of Columns

This example retrieves the plant concepts for plot observation ob.135454, and then uses the dplyr::select() function to keep only a subset of columns.

# Downloading plant concept data
plants <- vb_get_plant_concepts("ob.135454")

# Selecting only the plant_name, plant_code, and status columns
plants_small <- plants |> 
  dplyr::select(plant_name, plant_code, status)

Filter Rows by a Condition

This example first retrieves the first 100 records for project pj.11044 (Pennsylvania HPD Delaware Water Gap), sorted by the author’s observation code. It then keeps only the observations whose elevation is greater than 250 meters.

obs <- vb_get_plot_observations(
  vb_code = "pj.11044",
  sort    = "author_obs_code",
  limit   = 100
)

# filter where elevation column is greater than 250 meters
obs |>
  dplyr::filter(elevation > 250)
## # A tibble: 20 × 12
##    ob_code   author_obs_code pl_code   author_plot_code has_observation_synonym
##    <chr>     <chr>           <chr>     <chr>            <lgl>                  
##  1 ob.135454 DEWA.1.2003     pl.136716 DEWA.1           FALSE                  
##  2 ob.135422 DEWA.119.2003   pl.136694 DEWA.119         FALSE                  
##  3 ob.135421 DEWA.120.2003   pl.136692 DEWA.120         FALSE                  
##  4 ob.135591 DEWA.121.2003   pl.136691 DEWA.121         FALSE                  
##  5 ob.135592 DEWA.122.2003   pl.136690 DEWA.122         FALSE                  
##  6 ob.135443 DEWA.127.2003   pl.136733 DEWA.127         FALSE                  
##  7 ob.135442 DEWA.128.2003   pl.136732 DEWA.128         FALSE                  
##  8 ob.135432 DEWA.136.2003   pl.136723 DEWA.136         FALSE                  
##  9 ob.135430 DEWA.138.2003   pl.136721 DEWA.138         FALSE                  
## 10 ob.135357 DEWA.144.2003   pl.136685 DEWA.144         FALSE                  
## 11 ob.135449 DEWA.149.2003   pl.136669 DEWA.149         FALSE                  
## 12 ob.135383 DEWA.181.2004   pl.136633 DEWA.181         FALSE                  
## 13 ob.135382 DEWA.182.2004   pl.136632 DEWA.182         FALSE                  
## 14 ob.135381 DEWA.183.2004   pl.136631 DEWA.183         FALSE                  
## 15 ob.135380 DEWA.184.2004   pl.136630 DEWA.184         FALSE                  
## 16 ob.135377 DEWA.185.2004   pl.136629 DEWA.185         FALSE                  
## 17 ob.135447 DEWA.186.2004   pl.136628 DEWA.186         FALSE                  
## 18 ob.135376 DEWA.187.2004   pl.136627 DEWA.187         FALSE                  
## 19 ob.135375 DEWA.188.2004   pl.136626 DEWA.188         FALSE                  
## 20 ob.135373 DEWA.189.2004   pl.136625 DEWA.189         FALSE                  
## # ℹ 7 more variables: latitude <dbl>, longitude <dbl>, area <dbl>,
## #   elevation <dbl>, country <chr>, state_province <chr>, year <dbl>
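
The same selecting and filtering steps can be written in base R, as noted at the start of this section. The sketch below uses a small invented data frame rather than a live download. One difference is worth knowing: dplyr::filter() drops rows where the condition is NA, while logical indexing with [ keeps them as all-NA rows, so the base R version wraps the condition in which().

```r
# Invented stand-in for a plot observation download
obs_demo <- data.frame(
  ob_code   = c("ob.1", "ob.2", "ob.3", "ob.4"),
  elevation = c(120, 310, NA, 275),
  year      = c(2003, 2003, 2004, 2004),
  stringsAsFactors = FALSE
)

# Select columns: equivalent to dplyr::select(ob_code, elevation)
obs_small <- obs_demo[, c("ob_code", "elevation")]

# Filter rows: equivalent to dplyr::filter(elevation > 250).
# which() discards the NA comparison instead of returning an all-NA row.
high <- obs_small[which(obs_small$elevation > 250), ]

high
```

Without which(), the row with a missing elevation would appear in the result as a row of NA values, which is easy to overlook in larger downloads.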

Summarize Numeric Variables

This example downloads full plot observation records for project pj.10510 and computes the mean, minimum, and maximum slope gradient across the returned plots. Because limit is not set, only the first 100 records (the default) are included.

plot_data <- vb_get_plot_observations(vb_code = "pj.10510", detail = "full")

avg_slope <- plot_data |>
  dplyr::summarise(
    slope_mean = mean(slope_gradient, na.rm = TRUE),
    slope_min  = min(slope_gradient,  na.rm = TRUE),
    slope_max  = max(slope_gradient,  na.rm = TRUE)
  )

head(avg_slope)
## # A tibble: 1 × 3
##   slope_mean slope_min slope_max
##        <dbl>     <dbl>     <dbl>
## 1       11.1         0        70