-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
245 lines (173 loc) · 9.32 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
warning = FALSE, message = FALSE
)
```
# solong
[![R-CMD-check](https://github.com/SCAR/solong/workflows/R-CMD-check/badge.svg)](https://github.com/SCAR/solong/actions)
[![codecov](https://codecov.io/gh/SCAR/solong/branch/master/graph/badge.svg)](https://codecov.io/gh/SCAR/solong)
Overview
--------
This R package provides allometric equations that relate the body size of Southern Ocean taxa to their body part measurements. It is a component of the [Southern Ocean Diet and Energetics Database](https://www.scar.org/data-products/southern-ocean-diet-energetics/) project.
## Packaged equations
```{r include=FALSE}
devtools::load_all()
library(dplyr)
library(ggplot2)
```
The package currently includes `r nrow(sol_equations())` equations, covering mostly cephalopods and fish. A breakdown of the number of equations by taxonomic class and the allometric property that they estimate:
```{r echo=FALSE, cache=TRUE}
library(worrms)
uid <- unique(na.omit(sol_equations()$taxon_aphia_id))
ucl <- lapply(uid, wm_classification)
uid_cls <- tibble(taxon_aphia_id = uid, taxonomic_class = vapply(ucl, function(z)z$scientificname[z$rank == "Class"], FUN.VALUE = ""))
```
```{r echo=FALSE}
##knitr::kable(sol_equations() %>% left_join(uid_cls, by="taxon_aphia_id") %>% xtabs(~taxonomic_class+return_property, data=.))
ggplot(sol_equations() %>% left_join(uid_cls, by = "taxon_aphia_id") %>% dplyr::filter(!is.na(taxonomic_class)), aes(return_property, taxonomic_class)) + stat_bin_2d() +
labs(x = "Return property", y = "Taxonomic class") +
scale_fill_gradient(name = "Count", trans = "sqrt") +
theme_bw() + theme(axis.text.x = element_text(angle = 45, hjust = 1))
```
Installing
----------
```{r, eval=FALSE}
install.packages("devtools")
library(devtools)
install_github("SCAR/solong")
```
Usage
-----
```{r eval=FALSE, message=FALSE, warning=FALSE}
library(solong)
library(dplyr)
```
Let's say we have some measurements of *Architeuthis dux* squid beaks:
```{r}
x <- tibble(LRL = c(11.3, 13.9), species = "Architeuthis dux")
x
```
It doesn't matter what the column names are, but we do need to set the properties of the columns so that `solong` can find the appropriate data to use in each allometric equation. Here we've measured lower rostral length, so:
```{r}
x$LRL <- sol_set_property(x$LRL, "lower rostral length")
```
Now we can apply allometric equations to our data. What equations do we have available for our species of interest?
```{r}
sol_equations() %>% dplyr::filter(taxon_name == "Architeuthis dux") %>% summary
```
Here we use the equation with ID `342218_ML_Roel2000`, which is from Roeleveld (2000) and gives the mantle length of *Architeuthis dux* based on the lower rostral length.
This equation can be applied to to all rows:
```{r}
sol_allometry(x, "342218_ML_Roel2000")
```
Or we can apply a different equation to each row. Here we could use different allometric equations for mantle length:
```{r}
xa <- sol_allometry(x, c("342218_ML_Roel2000", "342218_ML_Clar1986"))
xa
```
The `allometric_value` column contains the values that have been estimated, and the `allometric_property` column gives the name of the property that has been estimated.
We can also see that the returned `allometric_value` is of that specific property, with appropriate units:
```{r}
sol_get_property(xa$allometric_value)
units(xa$allometric_value)
```
## Details
We can apply equations that use different inputs, provided that they estimate the same output property. For example, equation `342218_WW_Clar1986` estimates the body weight of the squid *Architeuthis dux* based on lower rostral length measurements:
```{r}
sol_equation("342218_WW_Clar1986") %>% summary
```
And equation `195932_WW_GaBu1988` estimates the weight of male Weddell seals based on their standard length:
```{r}
sol_equation("195932_WW_GaBu1988") %>% summary
```
Note that this equation estimates weight in kg, whereas `342218_WW_Clar1986` estimates weight in g. We can apply the two equations together to a single data set:
```{r}
x <- tibble(LRL = c(11.3, NA_real_),
species = c("Architeuthis dux", "Leptonychotes weddellii"),
SL = c(NA_real_, 175)) %>%
mutate(LRL = sol_set_property(LRL, "lower rostral length"),
SL = sol_set_property(SL, "standard length", "cm"))
xa <- sol_allometry(x, c("342218_WW_Clar1986", "195932_WW_GaBu1988"))
xa %>% dplyr::select(species, allometric_property, allometric_value)
```
The output values are of property "wet weight" and have all been provided in g (because the output column `allometric value` must have a single set of units):
```{r}
sol_get_property(xa$allometric_value)
units(xa$allometric_value)
```
If we try to apply equations that estimate different properties, we will get a warning:
```{r}
x <- tibble(LRL = c(11.3, 13.9), species = "Architeuthis dux") %>%
mutate(LRL = sol_set_property(LRL, "lower rostral length"))
xa <- sol_allometry(x, c("342218_ML_Roel2000", "342218_WW_Clar1986"))
xa %>% dplyr::select(species, allometric_property, allometric_value)
```
And while the `allometric_property` column still says which property was estimated for each row, the property type and units of the returned `allometric_value` will not be set, because they are not consistent across the different equations:
```{r}
sol_get_property(xa$allometric_value)
```
### Missing information
What happens if we don't have the required information in our data to use a particular equation? The `234631_SL~OL_WiMc1990` equation is for fish length, and requires otolith length (not present in our test data).
```{r}
tryCatch(
sol_allometry(x, "234631_SL~OL_WiMc1990"),
error = function(e) conditionMessage(e)
)
```
### Reliability of equations
Most equations have been published with the number of samples used to fit the equation (N) and the resulting goodness-of-fit of the equation to the data (R^2). These two quantities (if provided by the original source) can be found in the `reliability` component of an equation, and can be used to help decide if a given equation is appropriate for your data.
Some equations, typically from more recent publications, also provide the standard errors of the coefficients (or similar information) and thereby allow the `allometric_value_lower` and `allometric_value_upper` values to be estimated (the upper and lower bounds on the estimate). These should give a more reliable indicator of the precision of the estimated quantities.
```{r}
x <- tibble(TL = 10 %>% sol_set_property("carapace length", with_units = "mm"))
sol_allometry(x, "369214_WW_Lake2003") %>%
dplyr::select(allometric_value, allometric_value_lower, allometric_value_upper)
```
Attempts are made to avoid allowing an equation to extrapolate beyond its valid input data range. Some equations will explicitly return `NA` results for such inputs. The `inputs` component of the equation may also hold information about the range of the inputs used to fit the equation, which may help assess whether your data lie within its valid range.
### Other sources of equations
http://www.fishbase.org provides estimates of length-weight coefficients. These can be obtained via the `sol_fb_length_weight()` function (which uses the `rfishbase` package under the hood). For example:
```{r}
myeq <- sol_fb_length_weight("Electrona antarctica", input_properties = "standard length")
summary(myeq)
x <- tibble(SL = 10) %>%
mutate(SL = sol_set_property(SL, "standard length", with_units = "cm"))
sol_allometry(x, myeq)
```
### Adding your own equations
TODO: document, including what to do when a property is not part of the `sol_properties()` collection.
### Taxonomy
Equations are registered against *taxon_name* and *taxon_aphia_id* (the species identifier in the World Register of Marine Species). The *taxon_aphia_id* may be more reliable than species names, which can change over time. Users might like to look at the [worrms package](https://cran.r-project.org/package=worrms) for interacting with the World Register of Marine Species.
### Other random stuff
Is equation X included in the package? Call `sol_equations()` to get all equations that are part of the package, and have a rummage through that.
To see the references from which equations have been drawn, do something like:
```{r}
eqs <- sol_equations()
## the first few
head(unique(eqs$reference))
```
## More examples
Grab all equations from Eastman (2019):
```{r}
eq <- sol_equations() %>% dplyr::filter(grepl("East2019", equation_id)) %>%
## some equations have been included twice with different species names - drop the duplicates
dplyr::filter(!grepl("Accepted taxon name", notes))
```
Reproduce Figure 2 from that paper:
```{r}
## extract each maximum length
max_lengths <- sapply(eq$equation, function(z) z()$allometric_value)
## histogram
hist(max_lengths, 20)
```
## Related packages
- [units](https://cran.r-project.org/package=units) for handling units of measurement
- [worrms](https://cran.r-project.org/package=worrms) for taxonomy
- [shapeR](https://cran.r-project.org/package=shapeR) for collection and analysis of otolith shape data
- https://github.com/James-Thorson/FishLife for estimating fish life history parameters
- [rfishbase](https://cran.r-project.org/package=rfishbase) an interface to www.fishbase.org