Edit spatial data
Occasionally, you may encounter a third type of data object: spatialVector and spatialRaster. These objects contain spatial data (i.e., maps), such as a shapefile or geodatabase.
Editing a spatialVector or spatialRaster is similar to editing a dataTable or an otherEntity. A physical and attributeList should be present. We will focus on how to get the information unique to spatial data and how to create the spatialVector/spatialRaster.
File types
File extensions to look for that might be spatial data: kml, geoJSON, geoTIFF, .dbf, .shp, and .shx
Additionally, spatial data that involve multiple files (e.g., a geodatabase) should typically be archived within a .zip file so that all related and interdependent files stay together. This is one of the exceptions to our rule regarding .zip files.
For example, a shapefile dataset should, at a minimum, consist of separate .dbf, .shp, and .shx files with the same prefix in the same directory. All of these files are required in order to use the data. Also note that shapefiles limit attribute names to 10 characters, so attribute names in the metadata may not exactly match the attribute names in the data. Some spatial raster data come as standalone files (.tiff or .nc) and some come as a group of files. If you aren’t sure whether to unzip a file, ask Jasmine or Jeanette.
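The minimum-file rule for shapefiles can be checked with a few lines of base R. This is only a sketch: missing_shapefile_parts() is a hypothetical helper written for this example (not part of arcticdatautils), and the file names are made up. It takes a vector of file names, for instance from list.files() or unzip(zipfile, list = TRUE)$Name, and reports which of the required extensions are missing for each prefix.

```r
# Hypothetical helper: report which required shapefile components are missing.
# `files` is a character vector of file names.
missing_shapefile_parts <- function(files, required = c("dbf", "shp", "shx")) {
  ext <- tolower(tools::file_ext(files))
  prefix <- tools::file_path_sans_ext(basename(files))
  keep <- ext %in% required
  # Group the required extensions that were found by file prefix
  found <- split(ext[keep], prefix[keep])
  # For each prefix, list the required extensions that were not found
  missing <- lapply(found, function(x) setdiff(required, x))
  # Keep only the prefixes that are incomplete
  missing[lengths(missing) > 0]
}

# Made-up file listing: "sites" is complete, "roads" lacks its .shx file
files <- c("sites.shp", "sites.shx", "sites.dbf", "sites.prj",
           "roads.shp", "roads.dbf")
missing_shapefile_parts(files)
```

An empty list means every prefix that has at least one of the required files has all three.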
There are specific formatIds for these kinds of zipped files: application/vnd.shp+zip and image/geotiff+zip. Remember to check that the files have the correct formatId.
Reading Spatial Files
Read in the files to (1) help you in creating your attributes table and (2) sometimes also figure out the coordinate reference system.
library(sf)
spatial_file <- sf::read_sf("example.kml")
When you read kml files, read_sf() sometimes shows additional columns that aren’t in the actual file. Always open kml files in a text editor to check whether the columns actually exist.
If it is a zipped shapefile, there is a handy function you can use:
arcticdatautils::read_zip_shapefile(mn, pid)
Coordinate Systems
A coordinate system lets you work with spatial data using the same frame of reference (a Datum). A common coordinate system is “GCS_WGS_1984” (used in Google Maps!), which is suitable for plotting points distributed globally. There are many others that may be better suited to certain areas of the world.
All latitude and longitude coordinates should have a coordinate system (like a frame of reference).
There are horizontal coordinate systems (earth’s surface) and vertical coordinate systems (depth). More information can be found here.
To find the horizCoordSysName you can use:
sf::st_crs(spatial_file)
Take the Datum and add GCS (Geographic Coordinate System) in front. For example: “GCS_WGS_1984”
spatialVector
Adding Geometry
One important difference is that a spatialVector object should also have a geometry slot that describes the geometry features of the data. The possible values include one or more (in a list) of ‘Point’, ‘LineString’, ‘LinearRing’, ‘Polygon’, ‘MultiPoint’, ‘MultiLineString’, ‘MultiPolygon’, or ‘MultiGeometry’. You will likely have to open the file itself in QGIS or R (e.g., with the sf package) to get the correct geometry value.
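One way to look up the geometry value in R is sketched below. sf::st_geometry_type() reports types in upper case (for example "POINT"), so the sketch maps them onto the values listed above. eml_geometry_types() and the lookup vector are hypothetical names introduced for this example, and pairing GEOMETRYCOLLECTION with ‘MultiGeometry’ is an assumption. ‘LinearRing’ has no entry because sf does not report it as a feature type. The example builds a small sf object in memory; in practice you would pass the object returned by read_sf().

```r
library(sf)

# Lookup from sf geometry type names to the EML geometry values.
# The GEOMETRYCOLLECTION -> "MultiGeometry" pairing is an assumption.
sf_to_eml_geometry <- c(
  POINT = "Point",
  LINESTRING = "LineString",
  POLYGON = "Polygon",
  MULTIPOINT = "MultiPoint",
  MULTILINESTRING = "MultiLineString",
  MULTIPOLYGON = "MultiPolygon",
  GEOMETRYCOLLECTION = "MultiGeometry"
)

# Hypothetical helper: distinct EML geometry values found in an sf object.
# Types without an entry in the lookup come back as NA.
eml_geometry_types <- function(x) {
  types <- unique(as.character(sf::st_geometry_type(x)))
  unname(sf_to_eml_geometry[types])
}

# Small in-memory example with two point features
pts <- sf::st_sf(
  id = 1:2,
  geometry = sf::st_sfc(sf::st_point(c(0, 0)), sf::st_point(c(1, 1)), crs = 4326)
)
eml_geometry_types(pts)
```

If the result contains more than one value, or an NA, inspect the file more closely before filling in the geometry slot.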
To add just a geometry slot use:
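A minimal sketch of setting only the geometry slot, assuming doc is the nested list returned by EML::read_eml() and the entity of interest is the first spatialVector. A stand-in list is used here so the snippet runs on its own; the index [[1]] and the value "Point" are placeholders for your own entity and geometry.

```r
# Stand-in for an EML document read with EML::read_eml() (assumption)
doc <- list(
  dataset = list(
    spatialVector = list(
      list(entityName = "filename.kml")
    )
  )
)

# Assign the geometry value to the first spatialVector
doc$dataset$spatialVector[[1]]$geometry <- "Point"

doc$dataset$spatialVector[[1]]$geometry
```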
To add it using the data pid:
1. Get the geometry and spatialReference
2. Use pid_to_eml_entity() to generate the spatialVector
spatialVector <- pid_to_eml_entity(adc,
                                   pkg$data[n],
                                   entity_type = "spatialVector",
                                   entityName = "filename.kml",
                                   entityDescription = "some description",
                                   attributeList = attributeList,
                                   geometry = "Point",
                                   spatialReference = list(horizCoordSysName = "GCS_WGS_1984"))
3. Add the spatialVector to the doc
doc$dataset$spatialVector[[1]] <- spatialVector
spatialRaster
Most often these come as GeoTiff or Tiff files. The data are presented as a grid of “pixels”. For more information, ESRI has an in-depth article here.
To use the helper function eml_get_raster_metadata(), you need:
- the path of your raster file
- an attribute table
- a coordinate system
To get a coordinate system name, you can use the output of the function on your first try (which will print the coordinate reference system, if it is defined). You can use the return value of get_coord_list() (a large data.frame) to find the correct coordinate system name.
Another way to get the coordinate system name is to use rgdal::GDALinfo(path). This function can provide many details about your GeoTiff or Tiff files, including the coordinate system name. More information can be found here.
rgdal::GDALinfo(path)
eml_get_raster_metadata(path, coord_name, attributes)