In this lesson we will learn how to perform some basic spatial analysis in R
. First we will review interpolation using the IDW interpolation method.
At the end of this lesson you will be able to:
writeRaster()
First, we load all the libraries that we plan to use in our code and ensure that the working directory is set.
# set working dir
setwd("~/Documents/data/oss/oss2017data")
# load libraries
library(dplyr)
library(tidyr)
library(gstat)
library(rgdal)
library(raster)
library(rgeos)
library(scales)
options(stringsAsFactors = FALSE)
Next, let’s get the data. The data that we are using today - similar to yesterday are stored on Figshare. We will use R to
# download the data from a figshare URL
download.file("https://ndownloader.figshare.com/files/8955466",
destfile = "./tues-data.zip")
#unzip the data into our oss2017 data directory
unzip("tues-data.zip")
Sometimes we want to extract values from a raster dataset and assign them to points or polygons. In the GIS works (Arc and QGIS) we do this via zonal statistics. In the R world we can perform the same tasks using the extract function.
In this segment of the lesson, we will use a set of points. At each point we will extract raster values from a buffer region around that point and calculate a summary stat (in this case a mean raster value). We will then plot the points on a map weighting the size according to relative mean value.
To begin let’s import some raster data and the same study area file that we used in class yesterday. Note that the data are in .nc
format. This is a hierarchical format which we will discuss more this afternoon.
study-area/study-area.shp
climate/air.sfc.mon.mean.nc
(HINT: this is a raster)study_area <- readOGR("study-area/study-area.shp")
## OGR data source with driver: ESRI Shapefile
## Source: "study-area/study-area.shp", layer: "study-area"
## with 1 features
## It has 1 fields
## Integer64 fields read as strings: id
# import climate data - netcdf format
climate <- raster("climate/air.sfc.mon.mean.nc")
# check out our data - notice any issues?
crs(study_area)
## CRS arguments:
## +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
extent(study_area)
## class : Extent
## xmin : -107.5188
## xmax : -66.48791
## ymin : 10.817
## ymax : 37.22675
crs(climate)
## CRS arguments:
## +proj=lcc +x_0=5632642.22547 +y_0=4612545.65137 +lat_0=50
## +lon_0=-107 +lat_1=50 +lat_2=50 +ellps=WGS84
extent(climate)
## class : Extent
## xmin : -16231.49
## xmax : 11313351
## ymin : -16231.5
## ymax : 8976020
Next – what do we do? Notice any issues? Our data are not in the same projection. We will need to reproject the data in order to perform any processing on it.
Generally I try to avoid reprojecting raster data if at all possible. In this case, let’s reproject our raster data because we are working with a relatively small area (relative to the globe) and we are only extracting values from the data - we won’t be using it for further processing steps.
Generally i’d prefer to reproject the vector data.
Do the following
# reproject study area layer climate data
climate_geog <- projectRaster(climate,
crs = crs(study_area))
# crop the data
climate_geog_cr <- crop(climate_geog,
study_area)
plot(climate_geog_cr)
plot(study_area,
add = TRUE)
If we wanted, we could export our reprojected and cropped climate data as a geotiff.
Next, let’s import our vector data that we downloaded from figshare.
# import vector data
sea_level_2000_sp <- readOGR("tues-data/sea_level_2000.shp")
## OGR data source with driver: ESRI Shapefile
## Source: "tues-data/sea_level_2000.shp", layer: "sea_level_2000"
## with 16 features
## It has 12 fields
crs(sea_level_2000_sp)
## CRS arguments:
## +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
extent(sea_level_2000_sp)
## class : Extent
## xmin : -97.21667
## xmax : -81.105
## ymin : 24.555
## ymax : 30.40333
Next we will extract climate surface temperature values for each point location in our sea_level_2000
layer. Rather than select a single value, we will create a circular BUFFER around each point and then extract raster values that fall within that circle. Finally, we will calculate a mean of all pixels values to create a final value.
# Note that below will return a data.frame containing the max height
# calculated from all pixels in the buffer for each plot
climate_mean <- raster::extract(climate_geog_cr, # the raster that you wish to extract values from
sea_level_2000_sp, # a point, or polygon spatial object
buffer = .5, # specify a .5 degree radius
fun = mean, # extract the MEAN value from each plot
sp = TRUE) # create spatial object
class(climate_mean)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
The default option when we extract data in R is to store all of the raster pixel values in a list. However, when we add a function argument to extract(), R summarizes the data for us. Also note that we are using the sp = TRUE argument to tell R to create a spatialPointsDataFrame.
This will allow us to plot our data!
Speaking of plot - let’s plot the points on top of our climate data! Notice below that i’ve tricked R to plot the points by surface temperature value. Because the range of the data is not large, i’ve tricked R by sending the cex = argument (which specifies the point size) a command which
cex = ((climate_mean$Monthly.air.temperature.at.Surface - 280)/10)
# view mean values
climate_mean$Monthly.air.temperature.at.Surface
## [1] 291.2770 291.7492 291.6662 291.4210 291.2210 287.7624 287.1931
## [8] 292.1241 283.0992 291.6178 283.5167 287.2808 294.3733 294.0009
## [15] 297.4615 297.5251
# scale the data
(climate_mean$Monthly.air.temperature.at.Surface - 280)/10
## [1] 1.1277020 1.1749156 1.1666199 1.1420974 1.1220982 0.7762443 0.7193124
## [8] 1.2124098 0.3099197 1.1617819 0.3516650 0.7280778 1.4373332 1.4000871
## [15] 1.7461509 1.7525143
Let’s plot the data.
# plot climate data
plot(climate_geog_cr,
main = "sample points sized by surface temperature")
plot(climate_mean,
pch = 19,
# size the points according to the temp value
cex = ((climate_mean$Monthly.air.temperature.at.Surface - 280)/10),
add = TRUE)
Let’s try to scale the points using the scales library. Here we scale the elevation data to to a range inclusive of 1:3.
# plot climate data
plot(climate_geog_cr,
main = "sample points sized by surface temperature")
plot(climate_mean,
pch = 19,
# size the points according to the temp value
cex = scales::rescale(climate_mean$Monthly.air.temperature.at.Surface - 280, to = c(1, 3)),
add = TRUE)