-
Notifications
You must be signed in to change notification settings - Fork 4
/
index.Rmd
488 lines (360 loc) · 23.2 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
---
pagetitle: "Extra: Intro to vector"
author: "Jan Verbesselt, Sytze de Bruyn, Loïc Dutrieux, Valerio Avitabile, Dainius Masiliunas, David Swinkels"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
knitrBootstrap::bootstrap_document:
title: "Extra: Intro to vector"
theme: "simplex"
highlight: Tomorrow Night Bright
menu: FALSE
theme.chooser: TRUE
highlight.chooser: TRUE
---
<style type="text/css">
body {max-width: none;}
a:visited {color: #91170a;}
</style>
# [WUR Geoscripting](https://geoscripting-wur.github.io/) <img src="https://www.wur.nl/upload/854757ab-168f-46d7-b415-f8b501eebaa5_WUR_RGB_standard_2021-site.svg" alt="WUR logo" style="height: 35px;"/>
# Extra: Intro to vector
## Learning objectives
* Create point, line and polygon objects from scratch
* Explore the structure of `sf` classes for spatial vector data
* Plot spatial vector data
* Transform between datums and map projections
* Write and read spatial vector formats (e.g. KML, GML, GeoJSON, shapefile)
* Apply basic operations on vector data, such as buffering, intersection and area calculation
# Vector R basics
## Working with spatial vector data in R
In this tutorial we will use the *sf* package. The *sf* package focuses solely on vector data. It provides a standardized encoding of vector data and uses GDAL to read and write data, GEOS for geometrical operations and Proj for projection conversions and datum transformations.
The possiblities are huge. During the course (and also in this extra tutorial) we can only scratch the surface with some essentials, which hopefully invite you to experiment further and use them in your research. Details can be found in the book *Applied Spatial Data Analysis with R* and several vignettes authored by Roger Bivand, Edzer Pebesma and Virgilio Gomez-Rubio.
## Creating and manipulating geometries
The package *sf* has geometry types, such as `point` (for one point), `linestring` (for one line) and `polygon` (for one polygon). For sets of spatial classes it has the `multipoint` (multiple points), `multilinestring`, `multipolygon` and `geometrycollection`. The `geometrycollection` data class allows to have mixed geometries in one object. The package *sf* has even more special geometry types [(check here)](https://cran.r-project.org/web/packages/sf/vignettes/sf1.html). The most common geometry types for spatial vector data are shown below:
| Geometry | type | description |
|------------- |----------------- |------------- |
| point | *POINT* | single point ; zero-dimensional geometry |
| points | *MULTIPOINT* | set of points |
| line | *LINESTRING* | sequence of points connected by straight, non-self intersecting line pieces; one-dimensional geometry |
| lines | *MULTILINESTRING* | set of linestrings |
| rings | *POLYGON* | sequence of points form a closed, non-self intersecting ring; the first ring denotes the exterior ring, zero or more subsequent rings denote holes in this exterior ring; geometry with a positive area (two-dimensional); |
| rings | *MULTIPOLYGON* | set of polygons |
| mix | *GEOMETRYCOLLECTION* | set of geometries of any type except GEOMETRYCOLLECTION |
Table: Overview of *sf* package geometry types.
Simple feature objects are stored in data classes.
Three data classes are used in *sf* to represent features:
- simple feature geometry (`sfg`): the feature geometry of an individual simple feature;
- simple feature collections (`sfc`): a list or collection of simple feature geometries;
- simple features (`sf`): simple feature collections linked to a data.frame with feature attributes;
We will go through a few examples of creating geometries from scratch to familiarize yourself with these geometry types and data classes.
First, go to [Google Maps](https://www.google.nl/maps/?hl=en) on your computer, click on a location (e.g. a building or road), look at the bottom of the banner that pops up and copy the longitude and latitude of two points in Wageningen that are relevant to you. Use a decimal degree notation with at least 4 digits after the decimal point.
```{block, type="alert alert-success"}
> **Question 1:** In what coordinate reference system [CRS] are your points from Google Maps? What is the EPSG belonging to that CRS?
```
### Points
The example below shows how you can create spatial point objects from these coordinates. Type `?function name` (e.g. `?st_point`) for finding help on the functions used.
```{r}
# Check for the sf package and install if missing
if(!"sf" %in% installed.packages()){install.packages("sf")}
# Load the sf package
library(sf)
# Create two simple feature geometry points
# Get coordinates of two points identified in Google Maps, for example:
point1_sfg <- st_point(c(5.664190, 51.993843)) # enter your own lat/long
point2_sfg <- st_point(c(5.673310, 51.982752)) # enter your own lat/long
# Combine point features into simple feature collection with CRS: mypoints_sfc
mypoints_sfc <- st_sfc(point1_sfg, point2_sfg, crs = 4326)
```
```{r, eval=FALSE}
# Inspect the geometry types and data class of your objects
point1_sfg
class(point1_sfg)
mypoints_sfc
class(mypoints_sfc)
```
```{r, eval=TRUE}
# Create and display some attribute data and store in a data frame
mydata <- data.frame(cbind(
id = c(1, 2),
name = c("my first point","my second point")))
mydata
# Combine simple feature collection and dataframe to create simple feature: mypoints_sf
mypoints_sf <- st_set_geometry(mydata, mypoints_sfc)
```
```{r, eval=FALSE}
mypoints_sf
class(mypoints_sf) # notice how simple features are two classes
class(mypoints_sf) == "data.frame"
```
```{block, type="alert alert-success"}
> **Question 2:** What is the the difference between the objects `mypoints_sfc` and `mypoints_sf`?
```
```{r, eval=TRUE}
# When plotting a simple feature it makes a plot for every column
# Plot only the name column
plot(mypoints_sf[, "name"], col = c("orange", "blue"))
```
### Lines
Now let us connect the two points by a straight line. Consult how to make a line: `?st_linestring`.
```{r}
## Now make the line by creating a matrix and linestring
points_matrix <- rbind(point1_sfg, point2_sfg) # create matrix
myline_sfg <- st_linestring(points_matrix) # create sfg
myline_sfc <- st_sfc(myline_sfg, crs = 4326) # create sfc
```
```{r, eval=FALSE}
# Check the class and structure of the objects again.
class(points_matrix)
class(myline_sfg)
class(myline_sfc)
myline_sfc
```
```{r}
# Add feature attributes by giving the line an id and a name
myline_df <- data.frame(id = 1, name = "Straight line") # create data.frame
myline_sf <- st_set_geometry(myline_df, myline_sfc) # create sf
```
```{r, eval=FALSE}
# Check class and geometry type
class(myline_df)
class(myline_sf)
myline_sf
```
```{r, fig.height=6, fig.width=12}
plot(myline_sf[1], main = myline_sf$name, col = "red", lwd = 3, lty = 5, graticule = TRUE, axes = TRUE)
```
*Try to understand the above code and its results by studying help: `?plot_sf`*
*Try to add the points together with the lines on the same map.*
### Writing and reading spatial vector data
What now follows is a brief intermezzo before we continue with the classes for polygons.
We will use the OGR functionality of GDAL available through the package *sf*.
```{r eval = FALSE}
# Create subdirectory data within current working directory
dir.create("data", showWarnings = FALSE)
# Write to KML
st_write(obj = mypoints_sf, dsn = "data/mypoints.kml")
st_write(obj = myline_sf, dsn = "data/myline.kml")
```
The function `st_write` requires entries for the arguments `obj` (object of class sf) and `dsn` (data source name). The function detects what driver it needs based on the file extension of the dsn. Please study details in the help file: `?st_write`.
Similarly the function `st_read` allows reading OGR compatible data into a suitable simple feature object.
Check (in Google Maps) whether the attribute data were written correctly to the KML output (use [My Maps](https://www.google.com/maps/about/mymaps/) from Google). You need to be logged in on your Google account, go to [My Maps](https://www.google.com/maps/about/mymaps/), click `get started`, click the `+` button to create a new digital map. On the untitled layer click `import`, import the `mypoints.kml`, add a new layer and import `myline.kml`. Cool! You just created a webmap with your point and line features.
Now let us see if we can create a line in the webmap and import it to R. Digitize a route (e.g. a walking route) between two points of interest in Wageningen: click on the `draw a line` symbol (between the place marker and the direction symbol), click `Add walking route`, draw your route in Wageningen and name the new layer `myroute`.
```{r echo=FALSE, out.width='100%'}
knitr::include_graphics('figs/Route_map.png')
```
Click on the main menu in the left panel (top ... symbol), click `export to KML`, check box `export to .KML file`, select your route layer and download the kml (check [here](https://support.google.com/mymaps/answer/3109452?hl=en) how to export). Save the KML in the data directory under the name `myroute.kml`. We will read this file into a simple features object.
```{r}
## Read the KML as a simple feature
myroute <- st_read(dsn = "data/myroute.kml")
```
```{r}
# What is the geometry type of this object?
myroute
# Select name column only
names(myroute)
myroute = myroute[, "Name"]
# What geometry types do you see?
plot(myroute)
```
Now we have the data, but some of our friends want these exact data too. One friend of ours is a software engineer and he wants a GeoJSON. Another friend is a GIS-analyst in QGIS and as a backup he wants the file in two formats: Geographic Markup Language (GML) and a Shapefile. These fileformats (KML, GeoJSON, GML and Shapefile) are commonly used in spatial analysis. Let's try to give them the files in those formats!
```{r, eval=FALSE}
# Try to export the simple feature to a GeoJSON. What happens?
try(st_write(myroute, dsn = "data/route.geojson"))
```
Okay success! Let us continue with the GML.
```{r, eval=FALSE}
# Try to export the simple feature to a GML. What happens?
try(st_write(myroute, dsn = "data/route.gml"))
```
Works! Now transform to a Shapefile.
```{r, message=FALSE, warning=TRUE, eval = FALSE}
## Again export to a Shapefile. What happens now?
try(st_write(myroute, dsn = "data/route.shp"))
```
Writing the myroute object to a Shapefile worked only for the linestring, but did not work for the points. Shapefiles do not handle multiple geometry types well. We can solve this! Have a look at `?st_is`.
```{r, eval=TRUE}
# Select linestring route
myroute_line <- myroute[st_is(myroute, type = "LINESTRING"),]
# Select start and end point route
myroute_points <- myroute[st_is(myroute, type = "POINT"),]
```
```{r, eval=FALSE, message=FALSE, warning=TRUE}
# Let us try to write the simple features to a Shapefile
try(st_write(myroute_line, dsn = "data/route_line.shp"))
try(st_write(myroute_points, dsn = "data/route_points.shp"))
```
We run into another issue; Shapefiles do not support 3D line strings and points. We have fixed the geometry type, but now need to work on the dimensions of the data. Let us look into dimensions.
### Dimensions
All geometries are composed of points. Points are coordinates in a 2-, 3- or 4-dimensional space. All points in a geometry have the same dimensionality. In addition to X and Y coordinates, there are two optional additional dimensions:
- a Z coordinate, denoting altitude
- a M coordinate (rarely used), denoting a measure that is associated with the point; examples could be time of measurement, or measurement error of the coordinates
The four possible cases are:
1. two-dimensional points `XY`
2. three-dimensional points `XYZ`
3. three-dimensional points `XYM`
4. four-dimensional points `XYZM`
So we need to find a function that changes dimensions from XYZ to XY. Feel free to script your own function that can change the dimensions. Otherwise look into `?st_zm`.
```{r, eval=TRUE}
## Convert XYZ dimension to XY dimension
myroute_points_xy <- st_zm(myroute_points)
myroute_line_xy <- st_zm(myroute_line)
## Check your geometry type and dimension. Are they point/linestring and XY?
myroute_points_xy
myroute_line_xy
```
```{r, eval=FALSE, message = FALSE}
# Try to export the simple feature to a Shapefile
try(st_write(myroute_points_xy, dsn = "data/route_points.shp"))
# Overwrite your previous Shapefile by adding delete_layer = TRUE
try((st_write(myroute_line_xy, dsn = "data/route.shp", delete_layer = TRUE)))
```
Hooray, it worked! Now our friends have the data they wanted. Try to understand the above code and results. Feel free to display the data in QGIS.
### Transformation of coordinate system
Transformations between coordinate systems are crucial to many GIS applications. The **Keyhole Markup Language (KML)** used by Google Earth uses latitude and longitude in a polar WGS84 coordinate system (i.e. geographic coordinates). However, in some of the examples below we will use metric distances (i.e. projected coordinates).There are two types of coordinate systems that you need to recognise: **projected coordinate systems** and **geographic coordinate systems**.
One of the challenges of working with geo-spatial data is that geodetic locations (points on the Earth surface) are mapped into a two-dimensional cartesian plane using a cartographic projection. Projected coordinates are coordinates that refer to a point on a two-dimensional map that *represents* the surface of the Earth (i.e. **projected coordinate system**). Latitude and longitude values are an example of an **unprojected coordinate system**. These are coordinates that directly refer to a point on the Earth's surface. One way to deal with this is by transforming the data to a planar coordinate system. In R this can be achieved via bindings to the **PROJ - Cartographic Projections Library** ([https://proj.org/](https://proj.org/)), which are available in *sf*.
Central to spatial data in the *sf* package is that they have a coordinate reference system, which is coded as a numeric epsg code and a character string describing the projection by the *PROJ* projection library. Operations on different spatial data sets need to have compatible coordinate reference systems (i.e., identical). An interface to the *PROJ* library is available through the *sf* package.
We will transform our spatial data to the Dutch grid (Rijksdriehoekstelsel), often referred to as RD or RD_New.
Please note that:
- Some widely spread definitions of the Dutch grid (**EPSG: 28992**) are incomplete
(see e.g. [http://www.spatialreference.org](http://www.spatialreference.org) and search for the EPSG number);
- The transformation used below is approximate. Details can be found at [http://nl.wikipedia.org/wiki/Rijksdriehoekscoordinaten](http://nl.wikipedia.org/wiki/Rijksdriehoekscoordinaten).
- The PROJ details can be found here: [http://www.spatialreference.org/ref/epsg/28992/proj4/](http://www.spatialreference.org/ref/epsg/28992/proj4/)
```{r}
## Define CRS object for RD projection
# See how the transformation from unprojected WGS84 is defined to a projected coordinate system
prj_string_RD <- st_crs("+proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +towgs84=565.2369,50.0087,465.658,-0.406857330322398,0.350732676542563,-1.8703473836068,4.0812 +units=m +no_defs")
# Perform the coordinate transformation from WGS84 to RD
myroute_rd_string <- st_transform(myroute_line_xy, crs = prj_string_RD)
# Instead of having to define the projection ourselves, we can make use of the projection library by using an EPSG code
myroute_rd_num <- st_transform(myroute_line_xy, crs = 28992)
myroute_points_rd <- st_transform(myroute_points_xy, crs = 28992)
# Check if your transformed routes are the same
identical(myroute_rd_string, myroute_rd_num)
```
```{block, type="alert alert-success"}
> **Question 3:** Why are your routes not identical? Hint: inspect the data
```
After transforming the line, plot the line using the basic plot command.
```{r plotmyroute_rd_num, fig.width=3, fig.height=3, message=TRUE, eval=TRUE, fig.align='center',strip.white=TRUE}
plot(myroute_rd_num, main = myroute_rd_num$name, col = "red")
box()
```
Now that the geometries are projected to a planar coordinate system, the length can be computed. Let's use different methods to calculate the distance between the two points. Before we calculate distances we need the package geosphere.
```{r}
# Check for the geosphere package and install if missing
if(!"geosphere" %in% installed.packages()){install.packages("geosphere")}
# Load package
library(geosphere)
# Network distance: WGS84, RD_New
(dist_netw_wgs = st_length(myroute_line_xy)) # line with WGS84 CRS
(dist_netw_rd = st_length(myroute_rd_num)) # line with RD_new CRS
## Euclidean distance: WGS84, RD_New
# Get data from matrix position [1, 2]
(dist_eucl_wgs = st_distance(myroute_points_xy)[1, 2])
(dist_eucl_rd = st_distance(myroute_points_rd)[1, 2])
```
Okay as expected the network distance is bigger than a straight line between the two points, but a question remains:
```{block, type="alert alert-success"}
> **Question 4:** Why does the distance slightly differ between objects with CRS WGS84 and RD_New?
```
Now we know the network and euclidean distance between the two points. As you might have noticed the `st_length()` and `st_distance()` functions gave a unit behind the number. It was a meter in our case, but maybe we want to know how many feet or inch the distance was. The *sf* package allows unit conversions by bindings to the *units* package.
```{r, eval=TRUE}
units::set_units(dist_netw_rd, km)
units::set_units(dist_netw_rd, feet)
units::set_units(dist_netw_rd, inch)
```
Calculating units can also be performed for surfaces or volumes. Later you can give a try to calculate from square meters to hectares for a surface area.
### Polygons
After the intermezzo, we now continue with *sf* classes for polygon objects. The idea is to illustrate the classes; the data are meaningless. Let us create overlapping circles around our points.
```{r, eval=TRUE}
# Create sfc with geometry in RD New
# We need to have projected (RD_New) data to be able to measure features in 2D (e.g. meters)
point1_sfg_rd <- st_point(c(174023.833719082, 445086.995949553))
point2_sfg_rd <- st_point(c(174655.068279838, 443855.475880421))
# Combine point features into simple feature collection with CRS
mypoints_sfc_rd <- st_sfc(point1_sfg_rd, point2_sfg_rd, crs = 28992)
# Create individual points
pnt1_rd <- mypoints_sfc_rd[[1]]
pnt2_rd <- mypoints_sfc_rd[[2]]
# Calculate distance between points
dist_points_rd <- st_distance(mypoints_sfc_rd)[1, 2]
```
```{r, eval=TRUE}
# Make circles around points, with radius equal to distance between points
# Define a series of angles going from 0 to 2pi
ang <- pi * 0:200 / 100
# Distance needs to be converted from units m class to double class
circle1x <- pnt1_rd[1] + cos(ang) * as.double(dist_points_rd)
circle1y <- pnt1_rd[2] + sin(ang) * as.double(dist_points_rd)
circle2x <- pnt2_rd[1] + cos(ang) * as.double(dist_points_rd)
circle2y <- pnt2_rd[2] + sin(ang) * as.double(dist_points_rd)
c1 <- cbind(circle1x, circle1y)
c2 <- cbind(circle2x, circle2y)
```
You can plot these points using basic R plot commands:
```{r, fig.width=4, fig.height=6, message=TRUE, eval=TRUE, fig.align='center'}
plot(c1, pch = 19, cex = 0.2, col = "red", ylim = range(circle1y, circle2y), xlim = range(circle1x, circle2x))
points(c2, pch = 19, cex = 0.2, col = "blue")
plot(mypoints_sfc_rd, pch = 3, col = "darkgreen", add = TRUE)
```
See how we calculated a circle around every point. Although it looks like a closed ring, the visualized circle is not a polygon. Let's try to create a circular polygon. As before we step through the process of creating a simple feature:
- `Simple feature geometry (sfg)`
- `Simple feature collections (sfc)`
- `Dataframe (df)`
- `Simple features (sf)`
```{r}
# Bind columns together into matrix and create a list of matrix
# ?cbind combines columns and ?rbind combines rows
circle1_sfg <- st_polygon(list(cbind(circle1x, circle1y)))
circle2_sfg <- st_polygon(list(cbind(circle2x, circle2y)))
circles_sfc <- st_sfc(circle1_sfg, circle2_sfg, crs = 28992)
circles_df <- data.frame(name = c("circle1", "circle2"), row.names = c("1", "2"))
circles_sf <- st_set_geometry(circles_df, circles_sfc)
```
```{r}
## Plot the circle
plot(circles_sf)
```
Now we calculated the points of the circle to obtain a perfect circle and then made it into a polygon, but we can also create a circular polygon by using the function `st_buffer()`.
The *sf* package can do more spatial operations by spatially selecting certain features and changing the geometry. Examples of spatial selections are `st_intersects`, `st_contains`, `st_within`, `st_covers` etc.
The package can change geometries of features with `st_difference` (clip), `st_intersection` (intersect), `st_union`, `st_sym_difference`, `st_centroid`, `st_segmentize`, `st_convex_hull` and `st_buffer`. More methods can be found with the command: `methods(class = "sf")`.
The final results of `st_buffer` can be plotted using basic R plotting commands:
```{r, fig.height=6, fig.width=4, fig.cap="plot example of final results", fig.align='center'}
# Buffer
point_buff <- st_buffer(mypoints_sfc_rd, dist = dist_points_rd)
# Plot
plot(point_buff, col = c("#2b8cbe", "#31a354"), lwd = 2)
plot(mypoints_sfc_rd, add = TRUE, col = "red", pch = 19, cex = 1.5)
```
```{r buffering, eval = TRUE, fig.width=4, fig.height=4, fig.align='center'}
# Buffer again with the nquadsegs option
circle_quadsegs <- st_buffer(mypoints_sfc_rd[1], dist = dist_points_rd, nQuadSegs = 2)
circles_diff <- st_difference(circles_sfc[1], circle_quadsegs)
# Plot (see the difference?)
plot(circles_sfc[1], col = "red")
plot(circle_quadsegs, add = TRUE, lty = 3, lwd = 2, col = "grey")
```
```{block, type="alert alert-success"}
> **Question 5:** What happens if you change `nQuadSegs` to a higher number?
```
```{r, eval = TRUE, fig.width=4, fig.height=4, fig.align='center'}
# Check areal difference
st_area(circles_diff)
# What is the difference in hectares?
units::set_units(st_area(circles_diff), hectare)
```
```{r, eval = TRUE, fig.width=4, fig.height=4, fig.align='center'}
# Intersect
circles_intersect <- st_intersection(circles_sfc[1], circles_sfc[2])
# Plot
plot(circles_sfc, col = "grey")
plot(circles_intersect, col = "red", add = TRUE)
```
```{r outcome}
# What percentage of the two circles overlaps?
(paste(round(100 * st_area(circles_intersect) / st_area(circles_sfc[1]), 2), "%"))
```
```{block, type="alert alert-success"}
> **Question 6:** Do you understand the script? What is the difference between `st_difference` and `st_intersection` ?
```
# More info
* [Learning more about SF and Raster data via Data Camp](https://campus.datacamp.com/courses/spatial-analysis-in-r-with-sf-and-raster/)