Introduction

In this lesson we will learn how to perform some basic spatial analysis in R. First we will review interpolation using the IDW interpolation method.

Learning outcomes

At the end of this lesson you will be able to:

  1. Interpolate data using Inverse Distance Weighted (IDW) interpolator
  2. Export a raster to geotiff format using writeRaster()

Setup

First, we load all the libraries that we plan to use in our code and ensure that the working directory is set.

# set working dir
setwd("~/Documents/data/oss/oss2017data")
# load libraries
library(dplyr)
library(tidyr)
library(gstat)
library(rgdal)
library(raster)
library(rgeos)
library(scales)
options(stringsAsFactors = FALSE)

Next, let’s get the data. The data that we are using today - similar to yesterday are stored on Figshare. We will use R to

  1. download the data and
  2. unzip it into the SAME data directory that we used yesterday
# download the data from a figshare URL
download.file("https://ndownloader.figshare.com/files/8955466",
              destfile = "./tues-data.zip")
#unzip the data into our oss2017 data directory
unzip("tues-data.zip")

Extract data from a raster in R

Sometimes we want to extract values from a raster dataset and assign them to points or polygons. In the GIS works (Arc and QGIS) we do this via zonal statistics. In the R world we can perform the same tasks using the extract function.

In this segment of the lesson, we will use a set of points. At each point we will extract raster values from a buffer region around that point and calculate a summary stat (in this case a mean raster value). We will then plot the points on a map weighting the size according to relative mean value.

Extract data from a buffer

Extract data from a buffer

To begin let’s import some raster data and the same study area file that we used in class yesterday. Note that the data are in .nc format. This is a hierarchical format which we will discuss more this afternoon.

  1. open study-area/study-area.shp
  2. open climate/air.sfc.mon.mean.nc (HINT: this is a raster)
study_area <- readOGR("study-area/study-area.shp")
## OGR data source with driver: ESRI Shapefile 
## Source: "study-area/study-area.shp", layer: "study-area"
## with 1 features
## It has 1 fields
## Integer64 fields read as strings:  id
# import climate data - netcdf format
climate <- raster("climate/air.sfc.mon.mean.nc")

# check out our data - notice any issues?
crs(study_area)
## CRS arguments:
##  +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
extent(study_area)
## class       : Extent 
## xmin        : -107.5188 
## xmax        : -66.48791 
## ymin        : 10.817 
## ymax        : 37.22675
crs(climate)
## CRS arguments:
##  +proj=lcc +x_0=5632642.22547 +y_0=4612545.65137 +lat_0=50
## +lon_0=-107 +lat_1=50 +lat_2=50 +ellps=WGS84
extent(climate)
## class       : Extent 
## xmin        : -16231.49 
## xmax        : 11313351 
## ymin        : -16231.5 
## ymax        : 8976020

Next – what do we do? Notice any issues? Our data are not in the same projection. We will need to reproject the data in order to perform any processing on it.

Generally I try to avoid reprojecting raster data if at all possible. In this case, let’s reproject our raster data because we are working with a relatively small area (relative to the globe) and we are only extracting values from the data - we won’t be using it for further processing steps.

Generally i’d prefer to reproject the vector data.

Do the following

  1. reproject the raster data (HINT: use the projectRaster function)
  2. crop the raster data using the study_area layer.
  3. finally - plot both layers together using the plot function
# reproject study area layer climate data
climate_geog <- projectRaster(climate,
                              crs = crs(study_area))
# crop the data
climate_geog_cr <- crop(climate_geog,
                        study_area)
plot(climate_geog_cr)
plot(study_area,
     add = TRUE)

If we wanted, we could export our reprojected and cropped climate data as a geotiff.

Import vector points layer

Next, let’s import our vector data that we downloaded from figshare.

# import vector data
sea_level_2000_sp <- readOGR("tues-data/sea_level_2000.shp")
## OGR data source with driver: ESRI Shapefile 
## Source: "tues-data/sea_level_2000.shp", layer: "sea_level_2000"
## with 16 features
## It has 12 fields
crs(sea_level_2000_sp)
## CRS arguments:
##  +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
extent(sea_level_2000_sp)
## class       : Extent 
## xmin        : -97.21667 
## xmax        : -81.105 
## ymin        : 24.555 
## ymax        : 30.40333

Next we will extract climate surface temperature values for each point location in our sea_level_2000 layer. Rather than select a single value, we will create a circular BUFFER around each point and then extract raster values that fall within that circle. Finally, we will calculate a mean of all pixels values to create a final value.

# Note that below will return a data.frame containing the max height
# calculated from all pixels in the buffer for each plot
climate_mean <- raster::extract(climate_geog_cr, # the raster that you wish to extract values from
                    sea_level_2000_sp, # a point, or polygon spatial object
                    buffer = .5, # specify a .5 degree radius
                    fun = mean, # extract the MEAN value from each plot
                    sp = TRUE) # create spatial object
class(climate_mean)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"

The default option when we extract data in R is to store all of the raster pixel values in a list. However, when we add a function argument to extract(), R summarizes the data for us. Also note that we are using the sp = TRUE argument to tell R to create a spatialPointsDataFrame.

This will allow us to plot our data!

Speaking of plot - let’s plot the points on top of our climate data! Notice below that i’ve tricked R to plot the points by surface temperature value. Because the range of the data is not large, i’ve tricked R by sending the cex = argument (which specifies the point size) a command which

  1. takes the climate_mean value and subtracts 280
  2. divides by 10 to make the points smaller

cex = ((climate_mean$Monthly.air.temperature.at.Surface - 280)/10)

# view mean values
climate_mean$Monthly.air.temperature.at.Surface
##  [1] 291.2770 291.7492 291.6662 291.4210 291.2210 287.7624 287.1931
##  [8] 292.1241 283.0992 291.6178 283.5167 287.2808 294.3733 294.0009
## [15] 297.4615 297.5251
# scale the data
(climate_mean$Monthly.air.temperature.at.Surface - 280)/10
##  [1] 1.1277020 1.1749156 1.1666199 1.1420974 1.1220982 0.7762443 0.7193124
##  [8] 1.2124098 0.3099197 1.1617819 0.3516650 0.7280778 1.4373332 1.4000871
## [15] 1.7461509 1.7525143

Let’s plot the data.

# plot climate data
plot(climate_geog_cr,
     main = "sample points sized by surface temperature")
plot(climate_mean,
     pch = 19,
     # size the points according to the temp value
     cex = ((climate_mean$Monthly.air.temperature.at.Surface - 280)/10),
     add = TRUE)

Let’s try to scale the points using the scales library. Here we scale the elevation data to to a range inclusive of 1:3.

# plot climate data
plot(climate_geog_cr,
     main = "sample points sized by surface temperature")
plot(climate_mean,
     pch = 19,
     # size the points according to the temp value
     cex = scales::rescale(climate_mean$Monthly.air.temperature.at.Surface - 280, to = c(1, 3)),
     add = TRUE)