-
Notifications
You must be signed in to change notification settings - Fork 0
/
gibson2023.Rmd
523 lines (409 loc) · 51 KB
/
gibson2023.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
---
title: Urban biogeography of fungal endophytes across San Francisco
preprint: false
author:
- name: Emma Gibson
affiliation: 1
email: [email protected]
- name: Naupaka Zimmerman
affiliation: 1
email: [email protected]
corresponding: true
affiliation:
- code: 1
address: Department of Biology, University of San Francisco
abstract: >
In natural and agricultural systems, the plant microbiome--the microbial organisms associated with plant tissues and rhizosphere soils--has been shown to have important effects on host physiology and ecology, yet we know little about how these plant-microbe relationships play out in urban environments. Here we characterize the composition of fungal communities associated with living leaves of one of the most common sidewalk trees in the city of San Francisco, California. We focus our efforts on endophytic fungi (asymptomatic microfungi that live inside healthy leaves), which have been shown in other systems to have large ecological effects on the health of their plant hosts. Specifically, we characterized the foliar fungal microbiome of *Metrosideros excelsa* (Myrtaceae) trees growing in a variety of urban environmental conditions. We used high-throughput culturing, PCR, and Sanger sequencing of the internal transcribed spacer nuclear ribosomal DNA (ITS nrDNA) region to quantify the composition and structure of fungal communities growing within healthy leaves of 30 *M. excelsa* trees from 6 distinct sites, which were selected to capture the range of environmental conditions found within city limits. Sequencing resulted in 854 high-quality ITS sequences. These sequences clustered into 85 Operational Taxonomic Units (97% OTUs). We found that these communities encompass relatively high alpha (within) and beta (between-site) diversity. Because the communities are all from the same host tree species, and located in relatively close geographical proximity to one another, these analyses suggest that urban environmental factors such as heat islands or differences in vegetation or traffic density (and associated air quality) may potentially be influencing the composition of these fungal communities. These biogeographic patterns provide evidence that plant microbiomes in urban environments can be as dynamic and complex as their natural counterparts. As human populations continue to transition out of rural areas and into cities, understanding the factors that shape environmental microbial communities in urban ecosystems stands to become increasingly important.
bibliography: references.bib
csl: peerj.csl
output:
bookdown::pdf_book:
base_format: rticles::peerj_article # for using bookdown features like \@ref()
includes:
in_header: preamble.tex
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
```
<!-- setup code -->
```{r load-packages}
# load packages
# tidyverse packages
library("dplyr")
library("tidyr")
library("ggplot2")
library("ggthemes")
# for community analyses
library("vegan")
library("broom")
# for mapping
library("ggmap")
library("rgdal")
library("sf")
# for document output and rendering
library("ggpubr")
library("ggstatsplot")
library("knitr")
library("kableExtra")
library("patchwork")
library("rticles")
# save run environment to logfile in output directory after loading all packages
devtools::session_info() %>%
capture.output(file = "output/log_files/r_session_info.txt")
```
```{r run-pre-rmd-scripts, include=FALSE}
# run R data processing scripts
source("scripts/02_join_metadata.R")
source("scripts/03_make_otu_table.R")
source("scripts/04_clean_and_join_tbas_taxonomy.R")
source("scripts/05_setup_maps_and_gis_coords.R")
```
```{r set-site-name-color-vectors}
# set the values for site name and colors for later use
site_names_vector_by_site_id <- c("Balboa", # S1B
"Downtown", # S2B
"Mt. Davidson", # S3B
"Bay", # S4B
"Freeway", # S5B
"Ocean") # S6B
# colorblind-friendly color palette generated via `scales::viridis_pal()(6)`
site_colors_vector_by_site_id <- c("#440154FF", # Balboa
"#414487FF", # Downtown
"#2A788EFF", # Mt. Davidson
"#22A884FF", # Bay
"#7AD151FF", # Freeway
"#FDE725FF") # Ocean
site_colors_alphabetical_by_site_name <- c("#440154FF", # Balboa
"#22A884FF", # Bay
"#414487FF", # Downtown
"#7AD151FF", # Freeway
"#2A788EFF", # Mt. Davidson
"#FDE725FF") # Ocean
site_id_west_to_east <- c("S6",
"S1",
"S3",
"S5",
"S2",
"S4")
sites_names_west_to_east <- c("Ocean",
"Balboa",
"Mt. Davidson",
"Freeway",
"Downtown",
"Bay")
```
# Introduction {-}
Although major cities only cover a small portion of the Earth's total geographic area, more than 50% of the human population lives in these urban centers, and these cities have major impacts on the biogeochemistry, hydrology, and climate of both their immediate surroundings and of the biosphere as a whole [@schneider2009new]. With increasing urbanization, understanding the ecology of cities and urban settings has become critical to human health and well being. Despite their comparatively small geographic size, the high density of human populations in these environments makes them distinct ecosystems with their own unique dynamics [@sukopp1998urban]. Urban environments represent the convergence of humans from around the world, any plant or animal species those humans might have brought with them, and infrastructures such as roads, sewers, and tall buildings. Despite the complexity that these ecosystems present, they are often overlooked by ecologists because more traditional ecology has focused on 'natural' systems. In urban systems, however, human influence is a primary ecological factor [@mcdonnell2011history]. In recent years, ecologists have begun studying the urban environment just as they would a natural environment, in order to understand the novel environmental conditions this setting presents to the organisms that live there [@wu2014urban].
In this study, we focus on the urban ecology of trees in San Francisco, California. Urban trees can play a major role in shaping the ecosystem of a city. Just as the trees in a forest have a considerable impact on its climate and ecology, trees in cities can have notable effects on a city's environment. For example, plant life in cities can impact temperature, air quality, and other aspects of human health [@Willis374]. The urban heat island effect, which occurs when 'islands' of heat form as heat gets trapped between tall buildings, is one of the most well-documented unique urban anthropogenic environmental conditions [@oke1973city]. Trees in urban environments have been shown to interact with these city-specific environmental factors [@salford40381]. For example, trees in cities have been shown to improve urban air quality by taking up significant amounts of carbon dioxide from city air [@NOWAK2014119]. As pollution generated in urban centers is one of the major contributing factors to worldwide pollution, trees in urban environments may have a role in managing the environmental impacts of urbanization [@alberti2003integrating].
Due to their importance, there is growing interest in understanding and maintaining sustainable urban ecologies. One potentially major factor influencing health of plants both in nature and cities is the plant microbiome. As the widespread availability of DNA sequencing has made it possible to characterize microbial communities more easily and comprehensively, the microbiome has become a major area of interest in numerous organisms [@kyrpides2016microbiome]. Just as the emerging field of human microbiome study has revealed that symbiotic, non-pathogenic microbes can have major impacts on human health [@david2014diet], plants also host numerous symbiotic microbes. Similar to the human microbiome, the plant microbiome contains great diversity and is comprised of multiple distinct communities in various tissue systems such as the roots (rhizosphere), leaf surface (phyllosphere), and leaf interior (endosphere) [@turner2013plant]. All of these microbiomes can have an impact on their host's physiology. For instance, bacterial root microbes have been shown to play a role in the growth of various plant species [@gaiero2013inside]. They can also play a variety of roles in host physiology, depending on their location within the plant [@schlaeppi2015plant]. These communities can be quite dynamic, and vary with factors such as plant age [@cavaglieri2009rhizosphere]. While each of these sets of interactions is important, here we focus on the microbial ecology of fungal microorganisms living in the leaf endosphere. The endosphere is an ideal system for biogeographic studies, both because it is highly diverse, well replicated, and because it is a major interface between the host plant and its environment [@meyer2012microbiology].
Endophytic microbes are naturally found in the interior of leaves, whether introduced by natural wounds or openings in the leaf surface, or through penetrating the plant surface with hydrolytic enzymes [@hallmann1997bacterial]. Although some of these microbes may be latent pathogens or decomposers waiting for the leaf to die, others are mutualists that may confer a benefit to their host [@carroll1988fungal]. Endophytes are commonly divided into classes, based on where they are found and what roles they are known to play in host tissues [@Rodriguez2009]. In wild grasses, Class 1 endophytic fungi (Clavicipitaceous endophytes) have been shown to protect their hosts by discouraging herbivory, and can even affect host reproductive viability in those same systems [@clay1988fungal]. In controlled settings, inoculation experiments with Class 3 endophytes (horizontally transmitted and localized to shoots) have shown that these specific taxa of endophytes can also have an impact on their host's overall health, including factors such as resistance and susceptibility to disease [@Busby2016]. In nature, Class 3 fungal endophytes can have impacts on their host's physiology, such as limiting pathogen damage [@arnold2003fungal]. In this study, we expected to primarily find Class 3 endophytes, which are known for having high diversity both within populations of plants and within plant tissues [@Rodriguez2009].
In the wild, endophytic communities display species diversity comparable to that of any macroscopic community, even among individual trees from the same species [@gazis2011]. However, determining the factors that influence this diversity across disparate environments remains an area of active research. The biodiversity of these communities can be quite high, especially in areas like tropical forests [@arnold2007diversity]. In such natural settings, plant-associated microbial community compositions can show clear biogeographic structure [@andrews2000ecology]. The urban setting, however, may be distinct, because factors such as rainfall and elevation will likely be less apparent over a smaller geographic area, even as new factors such as proximity to roads and tall buildings may introduce effects of their own. Studies of suburban forests in Japan have indicated that an urban setting has a notable impact on endophytic diversity [@MATSUMURA2013191]. However, the full impact of urban environmental factors on endophytic communities has yet to be completely understood.
In this study, we used culturing and barcode gene sequencing to identify the species makeup of endophytic communities in the New Zealand Christmas Tree (Māori: pōhutukawa), *Metrosideros excelsa* (Myrtaceae), throughout San Francisco, CA, to quantify the diversity and biogeographic patterns of foliar fungal communities in this urban environment. Based on studies in natural systems, we expected to find high community diversity both within and between sites, as well as a degree of biogeographic structure to these community compositions. We also anticipated that urban environmental factors might play a role in shaping the biogeography in these endophytic communities.
# Methods {-}
<!-- figures -->
```{r site-map, fig.pos="p!", fig.width=7, fig.cap="A map of the locations sampled across the city of San Francisco, California, USA. Five trees from each site were sampled. Sites were selected to span the range of environmental conditions across the city. Map data: Google, ©2018 TerraMetrics.", out.extra = ""}
# plot map and adjust legend as needed
ggmap(sf_map) +
geom_point(data = tree_coords_mean,
mapping = aes(x = mean_lon,
y = mean_lat),
color = "white",
pch = 16,
size = 8) +
geom_point(data = tree_coords_mean,
mapping = aes(x = mean_lon,
y = mean_lat,
color = Site_ID),
pch = 16,
size = 5) +
scale_color_manual(
name = "Sites",
labels = site_names_vector_by_site_id,
values = site_colors_vector_by_site_id) +
labs(x = "Longitude",
y = "Latitude")
```
## Host Selection {-}
We selected *Metrosideros excelsa* as a focal host tree. *Metrosideros excelsa* is widely planted throughout San Francisco and as such we were able to obtain samples from a large number of trees in a variety of locations throughout the city.
## Site Selection {-}
We selected six sites across the city for leaf sample collection (Fig. \@ref(fig:site-map)). When selecting sites, we took factors such as traffic levels, estimated temperature, proximity to tall buildings (greater than the 40-foot building-height limit imposed on most residential buildings in San Francisco), elevation, and proximity to the ocean and San Francisco Bay into account. In selecting the sites, we wanted to capture the wide range of potential urban climates. We used street tree location data available from the City of San Francisco (permalink: https://data.sfgov.org/City-Infrastructure/Street-Tree-List/tkzw-k3nq), which documents the location and species of every tree in San Francisco, to choose unique locations around the city with sufficient densities of *Metrosideros excelsa* individuals (Fig. \@ref(fig:site-map)).
## Sample Collection {-}
We collected small branches from 5 trees in each of these sites using a clipper pole, collecting at least 3 sun-exposed outer branches from each tree. Because *M. excelsa* is an evergreen tree and the newer leaves are likely to contain fewer fungi, we controlled for leaf age by only collecting branches which contained dark green leaves that appeared to be at least one year old. We collected all samples on the same day, August 26, 2017, to ensure that daily weather patterns and seasonal effects would not have an impact on the microbial community composition. Once collected, leaves were stored in labeled plastic bags and stored at 4$^{\circ}$C until culturing. All leaves were processed within 48 hours of collection. Permission to sample leaves was granted by the San Francisco Bureau of Urban Forestry.
## Culturing Methods {-}
From each sample, we selected a subset of dark green asymptomatic leaves for culturing. We surface-sterilized the leaves by first rinsing them with distilled water, then immersing them in a sterile petri dish with 95% ethanol for 10 seconds, 0.5% NaOCl for 2 minutes, then 70% ethanol for another 2 minutes following Arnold et al. [-@arnold2007b]. We emptied the dish between rinses and left it closed inside a sterile biosafety cabinet after the last rinse. We then cut the leaves into small (2mm x 2mm) pieces with flame-sterilized scissors and placed them into 1.5 mL microcentrifuge slant tubes partially filled with 2% VWR-brand malt extract agar (MEA). For each tree, we prepared 6 leaves and made 100 tubes, except for the trees from the downtown site. For the trees at this site, we prepared 140 tubes per tree because we found that they had low isolation frequencies in a preliminary sampling bout. All leaves were prepared this way within 48 hours of the initial leaf sampling, to prevent death of the leaf tissue from altering the fungal community composition.
After two weeks, we evaluated the tubes for fungal growth and subcultured the emergent fungi from tubes with growth onto 35mm plates with 2% MEA in order to better evaluate their morphotypes and accumulate sufficient tissue for future barcode gene sequencing and water voucher preparation. We re-evaluated and subcultured these tubes in the following months to capture any late-growing fungi. Each pure culture was vouchered in sterile dI water and stored in a living culture collection in the Zimmerman Lab at the University of San Francisco.
## Molecular Methods {-}
We extracted DNA from fungal mycelium using the Sigma RED Extract ‘n Amp DNA extraction kit and following previously published protocols [@uren2012a]. First, we added fungal tissue to sterile tubes filled with 1 mm zirconium oxide beads, then added 100 $\mu$L of Extract 'n' Amp DNA extraction solution. Next, we put the tubes in a bead-beater (Mini-Beadbeater-96, Bio Spec Products Inc.) for one minute. The samples were then placed on a heat block at 95$^{\circ}$C for 10 minutes. After the heating step, we added a dilution buffer to each tube and stored them at 4$^{\circ}$C until PCR.
We performed PCR using fungal-specific primers for the Internal Transcribed Spacer region, a commonly accepted fungal barcode locus [@schoch2012nuclear], using the ITS1F forward primer (5'-CTT GGT CAT TTA GAG GAA GTA A-3') and ITS4 reverse primer (5'-TCC TCC GCT TAT TGA TAT GC-3'). For each PCR reaction, we used 1 $\mu$L of template DNA, 10 $\mu$L Extract ‘n Amp Taq polymerase, 6.4 $\mu$L PCR-grade water, 1 $\mu$L bovine serum albumin, 0.8 $\mu$L ITS1F forward primer, and 0.8 $\mu$L ITS4 reverse primer [@uren2012a]. For the PCR reaction, we used a BioRAD T100 thermal cycler with the following cycles: 95$^{\circ}$C for 3 minutes; 35 cycles of 95$^{\circ}$C for 30 seconds, 54$^{\circ}$C for 30 seconds, 72$^{\circ}$C for 30 seconds; then 72$^{\circ}$C for 10 minutes. To ensure that the fungal DNA successfully amplified, and that the master mix was not contaminated, we ran 5 $\mu$L of each sample and a negative PCR control on a 1% agarose gel with 1X Tris-acetate-EDTA (TAE) buffer and SYBR Safe. We ran the gel at 120 volts for 20 minutes and visualized bands using UV transillumination. Successful PCR samples with clean negative controls were kept at 4$^{\circ}$C until sequencing preparation.
To prepare successfully amplified samples for Sanger sequencing, we first cleaned each sample with 1 $\mu$L Thermo Fisher Shrimp Alkaline Phosphatase Exonuclease (ExoSAP-IT). To clean the samples, we used the following cycle on a BioRAD T100 thermal cycler: 37$^{\circ}$C for 15 minutes, 80$^{\circ}$C for 15 minutes, then hold at 4$^{\circ}$C. After cleaning, samples were kept at 4$^{\circ}$C until they were ready to be sent for sequencing. Directly before being sent for sequencing, cleaned samples that showed bright bands on their gels were diluted with an additional 15 $\mu$L PCR water, although a small number of samples that had faint bands on the gels were not diluted. Cleaned samples were sent to MCLabs (South San Francisco, CA) for Sanger sequencing.
## Computational and Statistical Methods {-}
We used Geneious 11.1 to manually clean and trim the Sanger sequencing data, and to identify and remove failed and low-quality sequences [@geneious]. Sequences that appeared to have multiple strong signals at a given position were discarded because they may have originated from from a mixed culture. Usable sequences were cleaned by trimming the ends of low-quality nucleotides, clarifying any ambiguity codes, and resolving incorrect base assignment due to dye blobs. Cleaned sequences are available on GenBank (finalized genbank IDs to go here following acceptance).
After processing the cleaned sequences to add relevent metadata, we used mothur version 1.39.5 to determine Operational Taxonomic Units (OTUs), based on 97% ITS sequence similarity and abundance-based greedy clustering [@schloss2009introducing]. Next, we used R 4.1.1 [@r-citation] to analyze the resulting OTU table. We used vegan [@vegan] to: 1) calculate species accumulation curves to determine species richness from the OTU data, 2) estimate unknown total species richness using the Chao estimator with bias correction for small sample sizes [@chao1987a; @chiu2014a], 3) calculate Bray-Curtis distances between communities, 4) create a Non-Metric Multidimensional Scaling (NMDS) ordination, and 5) perform a PERMANOVA [@anderson2001a] to assess differences in communities between sites and between trees of different diameters at breast height (DBH) within sites. We used the Tree-Based Alignment Selector (TBAS) toolkit to assign taxonomic context to each ITS sequence [@tbas]. This toolkit matches unknown ITS sequences to the most similar ITS sequences in a large multi-gene phylogeny of confidently assigned taxa. We used the simple features [@sf] and ggmap [@ggmap] R packages to produce maps of study sites.
Significant differences in fungal leaf isolation frequency and tree diameter were determined using a nonparametric Kruskal–Wallis one-way test of variance and Dunn’s Test for multiple comparisons with Holm's correction for multiple hypothesis testing. All analyses, figures, and the final manuscript were generated using a makefile, bash scripts, and R and Rmarkdown. All relevant metadata, as well as analysis and manuscript code, are available on GitHub: https://github.com/ZimmermanLab/SF-metrosideros-endophytes/ and archived at (Zenodo doi here following acceptance). The complete computational environment for the analyses is documented in a Dockerfile based on rocker containers [@boettiger2014] and a renv.lock [@renv] file in that same repository.
## Methods for Comparing Fungi in San Fancisco Trees to New Zealand Trees {-}
In order to compare the foliar microbial populations found in *M. excelsa* throughout San Francisco to those in the tree's native range (New Zealand), we surveyed public GenBank records and performed a literature search. First, we searched Google Scholar on July 7th, 2021, for papers mentioning any of the following sets of search terms: "metrosideros endophytes", "metrosideros excelsa endophytes", "metrosideros excelsa foliage fungus", "metrosideros excelsa endophytes". After this search, we looked through the papers cited by those we originally found in order to find any additional studies.
Then, we searched GenBank for fungal ITS sequences that had *M. excelsa* listed as their host organism. We performed this search using the parameters "host="Metrosideros excelsa", organism=fungi", in order to limit our results to fungal sequences. This search yielded several new sequences that were not included in the previous search because they were unpublished. Using both of these methods, we were able to find about 30 taxa to include as a point of comparison, although many of these were described as pathogenic instead of endophytic.
# Results {-}
<!-- tables -->
```{r chao-richness-table}
# first calculate richness estimate across all trees/sites
overall_estimated_richness_chao <- specpool(otu_table_trees,
smallsample = TRUE)[1:3]
# then calculate for each site separately, to make a table
specpool(otu_table_trees,
smallsample = TRUE,
pool = rep(site_names_vector_by_site_id,
each = 5))[1:3] %>%
add_rownames() %>%
kable(col.names = c("Site", "OTU Count", "Chao Estimated Richness", "Chao Richness SE"),
caption = "Estimated richness of OTU counts across sites using the Chao estimator suggest that OTU richness is higher than observed and is characterized by large uncertainty at several sites.",
booktab = TRUE, digits = 1, linesep = "") %>%
kableExtra::kable_styling(latex_options = "hold_position")
```
```{r class-abundance-table}
tbas_with_site %>%
group_by(Class.level_assignment, Order.level_assignment) %>%
tally() %>%
arrange(Class.level_assignment, desc(n)) %>%
kable(col.names = c("Class", "Order", "Number of sequences"),
caption =
"Overall sequenced fungal isolate counts by Class and Order. All fungi successfully sequenced in this study were placed within the Ascomycota.",
booktabs = TRUE,
linesep = c('', '', '', '', '', '\\addlinespace',
'\\addlinespace',
'\\addlinespace',
'\\addlinespace',
'', '', '', '', '', '')) %>%
kableExtra::kable_styling(latex_options = "hold_position")
```
```{r permanova-table}
# set seet for permutations
set.seed(42)
# calculate PERMANOVA results for full model
tmp <- adonis(otu_table_trees ~ trees$Site_ID * trees$DBH_cm)$aov.tab %>%
broom::tidy() %>%
mutate(term = c("Site ID", "DBH (cm)", "Site x DBH", "Residuals", "Total")) %>%
kable(col.names = c("Term", "df", "Sum of Squares",
"Mean of Squares", "F", "RSQUARED", "p-value"),
caption =
"PERMANOVA model output based on Bray-Curtis distance. Foliar fungal community composition in leaves of \\emph{Metrosideros excelsa} across San Francisco is significantly related to both site of sampling and tree diameter at breast height (DBH), but the interaction between these parameters is not significant.",
booktabs = TRUE, digits = 3) %>%
kableExtra::kable_styling(latex_options = "hold_position")
# sort of hacky approach to get nice R2 formatting, from
# StackOverflow https://stackoverflow.com/a/62604838/3114071
knitr::asis_output(stringr::str_replace(tmp, "RSQUARED", "$R^{2}$"))
```
```{r permanova-table-dbh}
# set seet for permutations
set.seed(42)
# calculate PERMANOVA results for DBH but permuted only within sites
# and extract R2 and p-value for in-text use
dbh_results <- adonis(otu_table_trees ~ trees$DBH_cm,
strata = trees$Site_ID)$aov.tab
dbh_r2 <- round(dbh_results$R2[1], 2)
dbh_p <- round(dbh_results$`Pr(>F)`[1], 4)
```
<!-- figures -->
```{r rarefaction-plot, fig.width=6, fig.cap="Species accumulation curves showing fungal community OTU richness. Each line in A represents the OTU richness for a single host tree. Each line in B represents the combined OTU richness of all trees (n = 5) in one site."}
# two plots, side by side
par(mfrow = c(1, 2))
rarecurve(otu_table_trees,
col = rep(site_colors_vector_by_site_id, each = 5),
label = FALSE,
lwd = 2.5,
cex.main = 1,
xlab = "Number of fungal isolates",
ylab = "Number of fungal species (97% ITS OTUs)",
ylim = c(0, 40),
xlim = c(0, 250))
legend("bottomright",
legend = site_names_vector_by_site_id,
pch = 16,
cex = 0.6,
col = site_colors_vector_by_site_id)
mtext("A", side = 3, adj = -0.5, padj = -2.7, cex = 1.2)
rarecurve(otu_table_sites,
col = site_colors_vector_by_site_id,
label = FALSE,
lwd = 2.5,
cex.main = 1,
xlab = "Number of fungal isolates",
ylab = "Number of fungal species (97% ITS OTUs)",
ylim = c(0, 40),
xlim = c(0, 250))
legend("bottomright",
legend = site_names_vector_by_site_id,
pch = 16,
cex = 0.6,
col = site_colors_vector_by_site_id)
mtext("B", side = 3, adj = -0.5, padj = -2.7, cex = 1.2)
```
```{r isolation-dbh-freq-plots, fig.width=6, fig.height=8, fig.cap="Isolation frequencies (A) and tree diameters (B) at each site. Isolation frequency (A) is a measure of how many slant tubes showed signs of fungal growth, out of how many total slant tubes were made. Approximately 100 slant tubes were made for each tree except for the trees in the downtown site, which had 140 slant tubes per tree because they had low isolation frequencies during the initial sampling. Sites along x axis are arranged left to right in order of their geographic location from west (left) to east (right)."}
# do the isolation frequency calculations
trees_with_isofreq <- trees %>%
separate(isolation_freq,
into = c("grew", "total"),
sep = "/") %>%
mutate(grew = as.numeric(grew),
total = as.numeric(total)) %>%
mutate(isoFreq = grew / total) %>%
# for some reason can't do this with levels in scale_x_discrete
# so we have to do it here first
mutate(Site_ID = forcats::fct_relevel(Site_ID, c("S6B",
"S1B",
"S3B",
"S5B",
"S2B",
"S4B")))
# boxplot of tree dbh across sites from west to east
# see plotting tips
# here: https://github.com/IndrajeetPatil/ggstatsplot/issues/544
iso_freq_plot <- ggstatsplot::ggbetweenstats(
data = trees_with_isofreq,
x = Site_ID,
y = isoFreq,
type = "nonparametric",
plot.type = "box", # instead of violin
results.subtitle = TRUE, # add overall stats in subtitle
centrality.plotting = FALSE, # don't want medians too
ggsignif.args = list(vjust = -1, textsize = 2), # to prevent overplotting
point.args = list(color = "grey", position = "jitter")) +
ylim(c(0, 1)) + # always ranges between 0 and 1
scale_x_discrete(labels = sites_names_west_to_east) +
theme(axis.text.x = element_text(angle = 45,
vjust = 1,
hjust = 1)) +
labs(x = "Site",
y = "Isolation Frequency")
# boxplot of tree dbh across sites from west to east
dbh_plot <- ggstatsplot::ggbetweenstats(
data = trees_with_isofreq,
x = Site_ID,
y = DBH_cm,
type = "nonparametric",
plot.type = "box",
results.subtitle = TRUE,
centrality.plotting = FALSE,
ggsignif.args = list(vjust = -1, textsize = 2),
point.args = list(color = "grey", position = "jitter")) +
ylim(c(0, max(trees_with_isofreq$DBH_cm) + 20)) + # give it some space
scale_x_discrete(labels = sites_names_west_to_east) +
theme(axis.text.x = element_text(angle = 45,
vjust = 1,
hjust = 1)) +
labs(x = "Site",
y = "Diameter at Breast height (cm)")
# put the two plots together one on top of the other
iso_freq_plot / dbh_plot + plot_annotation(tag_levels = "A")
```
```{r taxonomy-by-site-plot, fig.width=6, fig.height=8, fig.cap="Normalized relative abundances of taxonomic groupings (Classes in A and Orders in B) in each site. Sites along x axis are arranged left to right in order of their geographic location from west (left) to east (right)."}
stacked_bar_class <- tbas_with_site %>%
inner_join(extractions, by = c("extraction_id" = "Label_ID")) %>%
group_by(site_id, Class.level_assignment) %>%
tally() %>%
ggplot(aes(x = site_id,
y = n,
fill = Class.level_assignment)) +
geom_col(position = position_fill()) +
scale_x_discrete(limits = site_id_west_to_east,
labels = sites_names_west_to_east) +
labs(y = "Relative abundance",
x = "Site") +
theme_few() +
scale_fill_viridis_d(name = "Class") +
theme(axis.text.x = element_text(angle = 45,
vjust = 1,
hjust = 1))
stacked_bar_order <- tbas_with_site %>%
inner_join(extractions, by = c("extraction_id" = "Label_ID")) %>%
group_by(site_id, Order.level_assignment) %>%
tally() %>%
ggplot(aes(x = site_id,
y = n,
fill = Order.level_assignment)) +
geom_col(position = position_fill()) +
scale_x_discrete(limits = site_id_west_to_east,
labels = sites_names_west_to_east) +
labs(y = "Relative abundance",
x = "Site") +
theme_few() +
scale_fill_viridis_d(name = "Order") +
theme(axis.text.x = element_text(angle = 45,
vjust = 1,
hjust = 1),
legend.text = element_text(size = 8),
legend.key.height = unit(10, "points"))
stacked_bar_class / stacked_bar_order + plot_annotation(tag_levels = "A")
```
```{r ordination_setup, include=FALSE}
set.seed(42)
ord_obj <- metaMDS(otu_table_trees, trymax = 100)
```
```{r nmds-plot, fig.width=6, fig.cap="NMDS ordination of community compositions. Each point represents the endophytic community of one tree, and the size of the point corresponds to that tree's DBH, while the color of said point corresponds to the site that tree is from. Points that are closer together indicate that the trees they represent have similar community compositions. The ellipses show the standard error around the centroid of all points within a site, and are also color-coded according to which site they represent."}
plot(ord_obj,
display = "sites",
type = "n",
xlab = "",
ylab = "",
tck = 0,
labels = FALSE)
points(ord_obj,
display = "sites",
col = rep(site_colors_vector_by_site_id, each = 5),
pch = 16,
cex = 1.2)
legend("bottomleft",
legend = site_names_vector_by_site_id,
pch = 16,
cex = 0.7,
pt.cex = 1.3,
col = site_colors_vector_by_site_id)
ordiellipse(ord_obj,
groups = rep(site_names_vector_by_site_id, each = 5),
label = FALSE,
col = site_colors_alphabetical_by_site_name, # note alphabetical
lwd = 5)
```
Overall, we found high diversity in these urban endophytic communities. From `r nrow(otu_table_trees)` different host trees, we successfully cultured and sequenced `r nrow(extractions)` fungal isolates, which made up `r ncol(otu_table_trees)` distinct 97% ITS OTUs. Some communities encompassed a low number of fungal isolates; species richness analyses showed that in nearly all of the trees analyzed, our sampling did not encompass the complete microbial diversity (Fig. \@ref(fig:rarefaction-plot)). While we found species diversity within some sites that exceeds 20 distinct fungal OTUs, there appears to be greater species diversity in these fungal endophytic communities than this study was able to document. Chao estimated richness overall was `r round(overall_estimated_richness_chao$chao, 1)` with a standard error of `r round(overall_estimated_richness_chao$chao.se, 1)` and is shown per site in Table \@ref(tab:chao-richness-table).
## Isolation Frequency and Host Tree Diameter at Breast Height {-}
```{r model-isofreq-dbh}
# calculate the significance of a linear regression model
# using dbh to predict isofreq
p_val_isof_vs_dbh <- round(anova(lm(data = trees_with_isofreq,
isoFreq ~ DBH_cm))[[5]][1], 3)
```
The isolation frequency, or the percentage of leaf pieces that yielded fungal isolates, varied significantly between sites (Kruskal Wallis p < 0.01, Fig. \@ref(fig:isolation-dbh-freq-plots)A). In most sites, the isolation frequency also varied between trees, especially in the Bay site. The diameter at breast height (DBH) of trees we sampled from differed significantly as well (Kruskal Wallis p < 0.01; Fig. \@ref(fig:isolation-dbh-freq-plots)B). Trees within the Bay site also show the greatest variation in this measure. However, there was not an overall significant linear relationship between tree DBH and measured isolation frequency (regression p = `r p_val_isof_vs_dbh`). The only site that shows consistently low variation in isolation frequency is the Downtown site, which is also the site with the lowest isolation frequencies overall.
## Diversity Patterns {-}
We found `r ncol(otu_table_trees)` total 97% nrITS OTUs among the `r nrow(otu_table_trees)` different trees. Both isolation frequency and the number of fungal species found varied notably between trees. Species accumulation curves showed that the total amount of fungal diversity was not completely sampled in any of the sites (Fig. \@ref(fig:rarefaction-plot)).
The most abundant fungal class that we found in our samples was Dothideomycetes, followed by Sordariomycetes; each had over 300 sequences assigned to them (Table \@ref(tab:class-abundance-table)). Dothideomycetes was the predominant class in most sites except for the Downtown and Bay sites (Fig. \@ref(fig:taxonomy-by-site-plot)). In both of these sites, Sordariomycetes was the most common class. There are several classes that are either absent or present in small numbers in most sites, but more abundant in one or several sites. For example, Eurotiomycetes are more common in the Downtown and Freeway sites, and Leotiomycetes are only abundant in the Mt. Davidson site. Only two sequences were not identified by T-BAS. We queried these two sequences against the NCBI nt database via BLASTn and found them to match with high confidence to the Pezizomycetes, based on similarity to a strain recently identified in Australia [@catcheside2017].
## Biogeographic Patterns {-}
```{r betadispersion}
# calculate multivariate dispersion values and test for site differences
betadispersion_result <- anova(
vegan::betadisper(vegdist(otu_table_trees,
method = "bray"),
group = rep(site_names_vector_by_site_id,
each = 5)))
betadisp_p_val <- round(betadispersion_result$`Pr(>F)`[1], 2)
```
While the NMDS ordination group ellipses (standard error around the centroid for each site) suggest that there might be differences in multivariate dispersion among sites, dispersions were not significantly different from one another (p = `r betadisp_p_val`; Fig. \@ref(fig:nmds-plot)), suggesting that PERMANOVA differences are likely to be caused by compositional differences rather than differences in dispersion. Both site and tree DBH were significant predictors of differences in community composition (Table \@ref(tab:permanova-table)), but their interaction was not significant (p > 0.05). Sites that were closer together geographically tended to also be closer to one another in NMDS ordination space. We also tested for the effect of tree DBH on community composition differences within sites, and found that it was significant (R^2^ = `r dbh_r2`, PERMANOVA p = `r dbh_p`).
# Discussion {-}
In this study, we sought to characterize the foliar microbiome of *Metrosideros excelsa* trees across the city of San Francisco. We found that the microbial composition of these urban trees' leaves varied in many aspects, from the isolation frequencies of fungi from their leaves to the taxonomic identities of their fungal isolates. Although *M. excelsa's* endophytic communities have been characterized in its native range (New Zealand), there have been few studies about these communities outside of its native environment or in an urban setting [@mckenzie1999]. In a Hawaiian endemic congener, *Metrosideros polymorpha*, the species makeup of foliar fungal endophyte communities has been shown to vary greatly across gradients of environmental factors such as elevation and rainfall [@Zimmerman13022]. While we did not explicitly quantify environmental variation in this study, the community variation we observed could be explained by either environmental or host physiological factors.
## Isolation Frequency and Tree size {-}
Both isolation frequency and DBH varied among sites. The trees in the Bay site show the greatest range in both isolation frequency and DBH (Fig. \@ref(fig:isolation-dbh-freq-plots)). The Mt. Davidson trees have a larger median and range of DBH than trees at the Balboa site (Fig. \@ref(fig:isolation-dbh-freq-plots)B), and also have a significantly higher median isolation frequency than the Downtown site. Just as it shows the greatest range of isolation frequencies and tree sites, the Bay site also has some of the least similarity between its fungal communities on the NMDS ordination (Fig. \@ref(fig:nmds-plot)). The trees in the Mt. Davidson site, which have some of the most similar communities (Fig. \@ref(fig:nmds-plot)), also have fairly large trees with similar DBH. Although there appears to be a general pattern with larger trees hosting a greater number of fungal endophytes, this realtionship was not significant; thus DBH cannot explain all of the variation in isolation frequencies.
## Diversity Patterns {-}
Endophytic microbiomes in tree leaves are known to be highly diverse in natural systems, and that also appears to be the case in these urban trees. The species accumulation curves indicate that even in the most well-represented sites, these are still many potentially undiscovered OTUs within these endophytic communities (Fig. \@ref(fig:rarefaction-plot)). Additionally, there is a considerable amount of taxonomic variation between certain sites, in addition to the generally high diversity of each site (Fig. \@ref(fig:taxonomy-by-site-plot)). In such cases, it is likely that environmental factors play a role in shaping the endophytic communities of these trees. While the impact of environmental factors may be less evident when focusing on fungal endophyte community richness, it becomes more apparent when looking at the identities of these endophytes. The composition of these communities can vary greatly among trees with similar isolation frequencies, as demonstrated by the taxonomic composition of the Mt. Davidson and Bay sites (Fig. \@ref(fig:taxonomy-by-site-plot)). It is possible that prevailing onshore weather dynamics, which generally lead the western part of the city to be colder and foggier than the eastern portion, explain some of the patterns we observed. In general, there was higher compositional similarity between trees from the same location than between trees of a similar size (Fig. \@ref(fig:nmds-plot) and Table \@ref(tab:permanova-table)).
## Biogeographic Patterns {-}
There appears to be a degree of biogeographic structure to the endophytic communities within these urban trees. Communities of trees from the same site generally cluster closer together than trees from different sites (Fig. \@ref(fig:nmds-plot) and Table \@ref(tab:permanova-table)). Although the Balboa site appears to have a reasonably high isolation frequency (Fig. \@ref(fig:isolation-dbh-freq-plots)A), many of the slant tubes that showed fungal growth failed to grow into a larger culture, leading to a lower number of usable sequences (Fig. \@ref(fig:rarefaction-plot)B). This low number of sequences per tree may indicate that the divergence in community composition that appears to be present might be due to the low sample size, rather than actual difference between the microbial communities between the Balboa trees (Fig. \@ref(fig:nmds-plot)).
However, even sites with low within-site diversity appear to be more similar to other trees from geographically close sites. For example, the Downtown trees do not cluster with the Mt. Davidson, Balboa, or Ocean trees in the ordination, indicating that they have fairly different fungal communities from those sites (Fig. \@ref(fig:nmds-plot)). Generally, the westernmost sites (Ocean and Balboa) cluster to one side of the ordination, while the Easternmost sites (Bay and Downtown) cluster to the other, and the sites closer to the middle of the city (Mt. Davidson and Freeway) cluster between them (Fig. \@ref(fig:nmds-plot)). Notably, the Downtown and Freeway sites cluster fairly close together (Fig. \@ref(fig:nmds-plot)) despite being fairly distant geographically (Fig. \@ref(fig:site-map)). In this case, it is possible that there are environmental factors that cause the communities to be more compositionally similar. One of the most likely environmental similarities between these two sites is that they are both exposed to high traffic levels, with one site located in a bustling downtown and the other located near the intersection of two large freeways (280 and 101). Carbon dioxide emissions, particulates, or both of these pollutants could possibly be having an effect on both these trees and their microbiomes. Endophytic bacteria have been shown to have a phytoremediating effect [@afzal2014endophytic], so it is possible that endophytic fungi may be playing a similar role in helping these plants persist in high-traffic and thus lower air quality conditions.
## Community composition compared to related trees in other locations {-}
In widely distributed plant species, fungal endophyte composition can differ quite considerably in a species from its native home to the places it has been introduced [@taylor_hyde_jones_1999]. *M. excelsa* is native to New Zealand, but it and its close relatives in the *Myrtaceae* are widespread across the world. Across all locations, the most abundant classes found within these *M. excelsa* fungal communities were Dothideomycetes and Sordariomycetes by a notable margin (Table \@ref(tab:class-abundance-table)). At broad taxonomic levels such as Class, the endophytes found in this study appear to be similar to those found in other Myrtaceae trees elsewhere. For instance, a study of *Myrtaceae* trees in South America, the first and second most common fungal isolates were also identified as Dothideomycetes followed by Sordariomycetes [@vaz2014fungal]. Dothideomycetes and Sordariomycetes are also the first and second most abundant classes in the endophytic communities of *M. excelsa's* close relative in Hawai'i, *Metrosideros polymorpha*, as well [@Zimmerman13022]. This could indicate that these fungal classes are well-adapted to living as endophytes within these trees, but it is also possible that these communities are actually different when considered at a finer taxonomic resolution.
In order to determine if the fungi we identified were simply members of ubiquitous taxa or if they were potentially more specific to *M. excelsa*, we compared our results to a study of endophytes in two species of trees in the nearby Northern California county of Mendocino. This study looked at two species of host plants: an angiosperm, *Vaccinium ovatum*, and a gymnosperm, *Pinus muricata*. Overall, urban *M. excelsa* shared few of its prominent endophytic classes with *P. muricata*, but it shared Dothideomycetes as a common class with *V. ovatum* [@Oono2020]. The most common class in *P. muricata* was Atractiellomycetes [@Oono2020], which did not appear prominently in any of the trees we studied. Based on this admittedly limited comparison, it appears that angiosperm species may have more endophytic similarities to each other than to gymnosperm species in the same geographic area. However, because we found that the second most common fungal class in our *M. excelsa* leaves, Sordariomycetes, was not present in large quantities in any of the trees in the study by Oono and colleagues, finer-scale host species filtering is also likely to be playing a role in shaping the communities that establish in *M. excelsa*. This conclusion has been suggested in other more comprehensive studies of fungal endophyte host plants as well [@uren2019; @darcy2020].
## Comparison of trees in San Francisco and New Zealand {-}
When comparing *M. excelsa's* endophytic taxa found in the literature to those that we collected, we found that the taxa between the regions were somewhat similar on an order level, and even more so on a class level. Only one of the orders that was extremely common in New Zealand (Mycosphaerellales) was absent in San Francisco, and only two of the most common orders in San Francisco (Capnodiales and Pezizales) were absent in New Zealand. However, most of the families that are found in New Zealand are absent in San Francisco, and vice versa. We only found four families in common: Mycosphaerellaceae, Glomerellaceae, Nectriaceae, and Helotiaceae. Of these, Mycosphaerellaceae is the only family that is present in New Zealand and was prevalent (over 50 isolates) in San Francisco. The other families that they had in common were only present in very small numbers in San Francisco, and we were unable to find documentation of three most prevalent families in San Francisco (Xylariaceae, Botryosphaeriaceae, and Cladosporiaceae) being isolated from leaves in New Zealand. Although this could be due to the small sample size, it suggests that *M. excelsa's* endophytic populations may be influenced primarily by their environment. However, additional research into both the endophytic populations of other trees in San Francisco and *M. excelsa* in its native range would be required to draw further conclusions.
It is important to note the difference in methods between how we collected our samples and how others studying *M. excelsa* in New Zealand collected theirs. Whereas we took care to only collect samples from unfallen, asymptomatic leaves, most of the papers referenced in this comparison use either leaf litter, dead leaves, or obviously symptomatic leaves. Therefore, it is reasonable to infer that the data we are comparing our results to is biased towards pathogenic and decomposing fungi, and underrepresents the non-harmful endophytic populations in *M. excelsa's* native range. Additionally, the number of fungal isolates we were able to find using literature and GenBank searches was much lower than the number of samples we collected ourselves, making quantitative comparison unfeasible. However, we think that such comparisons are still worthwhile as a preliminary look into the extent that *M. excelsa* in San Francisco may maintain its native endophytic microbiome.
# Conclusions {-}
These findings indicate that the endophytic microbiomes of urban trees are complex and diverse, and may show a degree of biogeographic structure that reflects their natural counterparts. Additionally, it is likely that urban environmental factors play a considerable role in shaping the endophytic communities of these trees. This study has demonstrated that the urban endophytic microbiome is highly diverse and has the potential to be structured by urban environmental factors. Our species accumulation curves indicate that the full diversity of these endophytic communities has yet to be sampled (Fig. \@ref(fig:rarefaction-plot)). Even in the small geographic area of San Francisco, we found notable trends in microbiome composition that may vary with geographic factors, host physical factors such as DBH and age, and uniquely urban environmental factors, such as traffic. A combination of environmental factors and perhaps host physiology therefore appear to be the driving force behind the diversity of these microbiomes. While it is difficult to determine, given the data we have, the exact mechanisms that influence the composition of these communities, the amount of species diversity and biogeographic structure we present here indicate that the foliar microbiomes of urban trees may be just as complex and dynamic as those of trees in nature, and are worth of further concerted study.
# Acknowledgements {-}
This study was conducted as an undergraduate Biology honors thesis project by EG under the guidance of NZ. Joshua Copeland and Julian Murdzek assisted with culture processing, vouchering, and DNA extractions. Jeff Oda contributed supplies and equipment expertise.
# References