Edit spatial data

Occasionally, you may encounter a third type of data object: spatialVector and spatialRaster. These objects contain spatial data (i.e., maps), such as a shapefile or geodatabase.

Editing a spatialVector or spatialRaster is similar to editing a dataTable or an otherEntity. A physical and an attributeList should be present. This section focuses on how to get the information unique to spatial data and how to create the spatialVector or spatialRaster.

File types

File extensions that may indicate spatial data: .kml, .geojson, .tif or .tiff (GeoTIFF), .dbf, .shp, and .shx

Additionally, spatial data that involve multiple files (e.g., a geodatabase) should typically be archived within a .zip file to ensure all related and interdependent files stay together. This is one of the exceptions to our rule regarding .zip files.

For example, a shapefile dataset should, at a minimum, consist of separate .dbf, .shp, and .shx files with the same prefix in the same directory. All of these files are required in order to use the data. Also note that shapefiles limit attribute names to 10 characters, so attribute names in the metadata may not exactly match attribute names in the data. Some spatial raster data come as standalone files (.tiff or .nc) and some come as a group of files. If you aren’t sure whether to unzip a file, ask Jasmine or Jeanette.
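As a rough aid for the check described above, the sketch below reports which required shapefile components are missing for each file prefix. The helper name check_shapefile_parts() and the example file names are hypothetical and not part of any package; the input is assumed to be a character vector of file names, such as the output of list.files() or unzip(zipfile, list = TRUE)$Name.

```r
# Hypothetical helper: report which required shapefile components are missing.
# `files` is a character vector of file names (paths are reduced to base names).
check_shapefile_parts <- function(files, required = c("shp", "shx", "dbf")) {
  files <- basename(files)
  ext <- tolower(tools::file_ext(files))
  prefix <- tools::file_path_sans_ext(files)
  keep <- ext %in% required
  # Group the required extensions that were found under each shared prefix
  found <- split(ext[keep], prefix[keep])
  # For each prefix, list the required extensions that are absent
  lapply(found, function(e) setdiff(required, e))
}

# Made-up file listing: "rivers" has no .shx file
files <- c("lakes.shp", "lakes.shx", "lakes.dbf", "lakes.prj",
           "rivers.shp", "rivers.dbf")
missing <- check_shapefile_parts(files)
print(missing)
```

An empty entry means that prefix has all three required files; any extension listed is missing and the shapefile is likely unusable without it.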

There are specific formatIds for these kinds of zipped files: application/vnd.shp+zip and image/geotiff+zip. Remember to check that the files have the correct formatId.

Reading Spatial Files

Read in the files to (1) help you create your attributes table and (2) in some cases, determine the coordinate reference system.

library(sf)
spatial_file <- sf::read_sf("example.kml")

When you read kml files, read_sf() sometimes shows additional columns that aren’t in the actual file. Always open kml files in a text editor to check whether the columns actually exist.

If it is a zipped shapefile, there is a handy function you can use: arcticdatautils::read_zip_shapefile(mn, pid)

Coordinate Systems

A coordinate system lets you work with spatial data using a shared frame of reference (a datum). A common coordinate system is “GCS_WGS_1984” (used in Google Maps!), which is suitable for plotting points distributed globally. Many others exist that may be better suited to particular areas of the world.

All latitude and longitude coordinates should have a coordinate system (like a frame of reference).

There are horizontal coordinate systems (earth’s surface) and vertical coordinate systems (depth). More information can be found here.

To find the horizCoordSysName you can use:

sf::st_crs(spatial_file)

Take the Datum and add GCS (Geographic Coordinate System) in front. For example: “GCS_WGS_1984”
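The naming rule above can be sketched as a small helper. The function name make_gcs_name() is hypothetical, and it assumes you have already read the datum name off the st_crs() output yourself. Datum names are not always printed in this form, so the result should still be checked against the list of accepted coordinate system names.

```r
# Hypothetical helper: build a horizCoordSysName from a datum name
# by prefixing "GCS_" and replacing spaces with underscores.
make_gcs_name <- function(datum) {
  datum <- trimws(datum)
  # Collapse any run of whitespace into a single underscore
  datum <- gsub("[[:space:]]+", "_", datum)
  paste0("GCS_", datum)
}

gcs_name <- make_gcs_name("WGS_1984")
print(gcs_name)
```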

spatialVector

Adding Geometry

One important difference is that a spatialVector object should also have a geometry slot that describes the geometry features of the data. The possible values include one or more (in a list) of ‘Point’, ‘LineString’, ‘LinearRing’, ‘Polygon’, ‘MultiPoint’, ‘MultiLineString’, ‘MultiPolygon’, or ‘MultiGeometry’. You will likely have to open the file itself in QGIS or R (e.g., with the sf package) to get the correct geometry value.
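One way to find the geometry value in R is sf::st_geometry_type(), which returns type names in upper case (for example POINT or MULTIPOLYGON). The sketch below translates those names into the geometry values listed above. The lookup table and the helper eml_geometry() are hypothetical, and mapping GEOMETRYCOLLECTION to ‘MultiGeometry’ is an assumption that should be confirmed for your data.

```r
# Assumed mapping from sf geometry type names to EML geometry values
sf_to_eml_geometry <- c(
  POINT = "Point",
  LINESTRING = "LineString",
  POLYGON = "Polygon",
  MULTIPOINT = "MultiPoint",
  MULTILINESTRING = "MultiLineString",
  MULTIPOLYGON = "MultiPolygon",
  GEOMETRYCOLLECTION = "MultiGeometry"  # assumption: closest EML equivalent
)

# Hypothetical helper: translate sf type names, dropping duplicates
eml_geometry <- function(sf_types) {
  sf_types <- unique(toupper(as.character(sf_types)))
  unknown <- setdiff(sf_types, names(sf_to_eml_geometry))
  if (length(unknown) > 0) {
    stop("No EML geometry value known for: ", paste(unknown, collapse = ", "))
  }
  unname(sf_to_eml_geometry[sf_types])
}

# With sf installed, the type names can come straight from the data
if (requireNamespace("sf", quietly = TRUE)) {
  pts <- sf::st_sfc(sf::st_point(c(0, 0)), sf::st_point(c(1, 1)), crs = 4326)
  print(eml_geometry(sf::st_geometry_type(pts)))
}

# The helper also works on type names typed in by hand
geometry_values <- eml_geometry(c("POLYGON", "MULTIPOLYGON", "POLYGON"))
print(geometry_values)
```

For a file read with read_sf(), the same call would be eml_geometry(sf::st_geometry_type(spatial_file)). If more than one value comes back, the file mixes geometry types and all of them should be listed.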

To add just a geometry slot use:

doc$dataset$spatialVector[[1]]$geometry <- "Polygon"

To add it using the data pid:

  1. Get the geometry and spatialReference
  2. Use pid_to_eml_entity() to generate the spatialVector

spatialVector <- pid_to_eml_entity(adc,
                                   pkg$data[n],
                                   entity_type = "spatialVector",
                                   entityName = "filename.kml",
                                   entityDescription = "some description",
                                   attributeList = attributeList,
                                   geometry = "Point",
                                   spatialReference = list(horizCoordSysName = "GCS_WGS_1984"))

  3. Add the spatialVector to the doc

doc$dataset$spatialVector[[1]] <- spatialVector

spatialRaster

Most often these come as GeoTIFF or TIFF files. The data are presented as a grid of “pixels”. For more information, ESRI has an in-depth article here.

To use the helper function eml_get_raster_metadata(), you need:

  1. the path of your raster file
  2. an attribute table
  3. a coordinate system

To get a coordinate system name, you can run the helper function once and read its output, which prints the coordinate reference system if one is defined. You can then search the return value of get_coord_list() (a large data.frame) to find the matching coordinate system name.

Another way to get the coordinate system name is to use rgdal::GDALinfo(path). This function can provide many details about your GeoTIFF or TIFF files, including the coordinate system name. More information can be found here.

rgdal::GDALinfo(path)
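The rgdal package was retired from CRAN in 2023, so GDALinfo() may not be available in a fresh R installation. As a possible substitute (not the method this guide was written around), sf::gdal_utils() can run GDAL's info utility on a raster file. The wrapper name raster_info() and the file name "example.tif" are placeholders.

```r
# Hypothetical wrapper: print GDAL's description of a raster file using sf
raster_info <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  if (!requireNamespace("sf", quietly = TRUE)) {
    stop("The sf package is required")
  }
  # util = "info" runs the equivalent of the gdalinfo command-line tool
  sf::gdal_utils(util = "info", source = path)
}

# "example.tif" is a placeholder; nothing runs unless the file exists
if (file.exists("example.tif")) {
  raster_info("example.tif")
}
```

The printed report includes the coordinate system definition, which can then be matched against the accepted coordinate system names.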

eml_get_raster_metadata(path, coord_name, attributes)