Repository of the code, data, and text for the manuscript "Urban biogeography of fungal endophytes across San Francisco" published in the scientific journal PeerJ, written by Emma Gibson and Naupaka Zimmerman.
Use of Google Maps images follows the guidelines here:
https://about.google/brand-resource-center/products-and-services/geo-guidelines/
- You'll need to install Docker and git first if you don't have them already.
- Then, clone the repository, build the image, and start a container with RStudio server.
git clone https://github.com/ZimmermanLab/SF-metrosideros-endophytes
cd SF-metrosideros-endophytes
docker build -t sfendos .
docker run --rm -d -p 8787:8787 -e DISABLE_AUTH=true -e ROOT=true -v /home/rstudio/sfendos/renv sfendos
Or, if you have Docker installed and would rather use the prebuilt image from Docker Hub, skip the clone and build steps and run:
docker run --rm -d -p 8787:8787 -e DISABLE_AUTH=true -e ROOT=true -v /home/rstudio/sfendos/renv naupaka/sfendos
- Open http://localhost:8787 in your web browser.
- Click on the `sfendos` directory in the file pane (lower right of the window), and then on the `SF-metrosideros-endophytes.Rproj` file to open the project.
- Open a new bash terminal in the RStudio Server interface and type `make`.
- Go get a cup of coffee (it still has to install a good number of LaTeX packages), and when you come back the manuscript should be built.
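Before starting the steps above, it can help to confirm that git and Docker are actually on your PATH. The following is a minimal sketch, assuming a POSIX shell; the function name `require_cmd` is hypothetical and is not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): report whether each
# named command is available on PATH. Returns 0 only if all are found.
require_cmd() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# The setup steps above need both git and Docker.
if require_cmd git docker; then
    echo "prerequisites found"
else
    echo "install the missing tools before cloning and building" >&2
fi
```

If you only plan to use the prebuilt image, checking for `docker` alone should be enough.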
The directory structure is shown below. The most important files are the raw data in the `data/` directory, the `gibson2023.Rmd` Rmarkdown file that contains all of the text and a good bit of the code for the manuscript, the `references.bib` file that contains the references for the manuscript, and the scripts in the `scripts/` directory, which are called by the Rmd file or the `Makefile` to run the pipeline. The computational environment is specified by the `Dockerfile` and the `renv.lock` file. The `output/` directory contains a number of intermediate output files for the processing pipeline, and also includes the R session info for the last time the manuscript was generated, in `output/log_files/r_session_info.txt`. Other files that haven't been mentioned are mostly ancillary files for generating the manuscript using bookdown and LaTeX.
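To confirm that a clone contains the key files and directories named above, a quick check along these lines may be useful. This is a sketch only, assuming a POSIX shell; `check_layout` is a hypothetical helper and is not one of the repository's scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): verify that the key
# files and directories described above exist under a given project root.
# Returns 0 if everything is present, 1 otherwise.
check_layout() {
    root=${1:-.}
    status=0
    for path in data gibson2023.Rmd references.bib scripts Makefile \
                Dockerfile renv.lock output; do
        if [ ! -e "$root/$path" ]; then
            echo "not found: $root/$path" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage: run from the repository root, or pass the path to the clone.
check_layout . || echo "some expected files are missing" >&2
```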
├── Dockerfile
├── LICENSE.md
├── Makefile
├── README.md
├── SF-metrosideros-endophytes.Rproj
├── data
│   ├── blast_results_for_unknowns
│   │   ├── EUSF00620-genbank-blastn-alignment-2021-10-31.txt
│   │   └── EUSF00664-genbank-blastn-alignment-2021-10-31.txt
│   ├── metadata
│   │   ├── culturing_worksheet_emma.csv
│   │   ├── extraction_worksheet_emma.csv
│   │   ├── m_excel_tree_metadata_with_isolationfreq.csv
│   │   ├── tbas_taxonomies.csv
│   │   └── water_voucher_worksheet_emma.csv
│   └── sequences
│       ├── hand-cleaned_seqs
│       └── new_zealand_sequence_search.csv
├── gibson2023.Rmd
├── gibson2023.fff
├── gibson2023.log
├── gibson2023.pdf
├── gibson2023.tex
├── gibson2023.ttt
├── gibson2023_files
│   └── figure-latex
│       ├── isolation-dbh-freq-plots-1.pdf
│       ├── nmds-plot-1.pdf
│       ├── rarefaction-plot-1.pdf
│       ├── site-map-1.pdf
│       └── taxonomy-by-site-plot-1.pdf
├── init_docker.sh
├── output
│   ├── log_files
│   │   ├── r_session_info.txt
│   │   └── seq_script_log.txt
│   ├── metadata_tables
│   │   ├── groupfile.tsv
│   │   ├── m_excel_tree_metadata_with_isolationfreq_wgs_lat_long.csv
│   │   └── tbas_with_site.csv
│   ├── mothur_pipeline
│   │   ├── 01_all_seqs.fasta
│   │   ├── 02_good_seqs.fasta
│   │   ├── 04_good_seqs_short_names_checked.agc.0.03.fasta
│   │   ├── 04_good_seqs_short_names_checked.agc.list
│   │   ├── 04_good_seqs_short_names_checked.count_table
│   │   ├── 04_good_seqs_short_names_checked.fasta
│   │   ├── 04_good_seqs_short_names_checked.names
│   │   ├── 04_good_seqs_short_names_checked.unique.fasta
│   │   ├── 05_seq_names.txt
│   │   ├── 06_otu_temp.txt
│   │   ├── 07_seq_ids_temp.txt
│   │   ├── 08_seq_with_OTU_ID.txt
│   │   ├── 09_otu_seq_sorted.txt
│   │   ├── 10_names_sorted.txt
│   │   ├── 11_joined_otu_seqs.txt
│   │   └── 12_seq_with_OTU_ID.txt
│   ├── sf_basemap.Rdata
│   └── tbas_output
│       ├── 2019-06-17_tbas21_archiveYIYORC5W_.tar.gz
│       └── 2019-06-17_tbas_run
├── peerj.csl
├── preamble.tex
├── references.bib
├── renv
│   ├── activate.R
│   └── settings.dcf
├── renv.lock
├── scripts
│   ├── 01_process_fasta.sh
│   ├── 02_join_metadata.R
│   ├── 03_make_otu_table.R
│   ├── 04_clean_and_join_tbas_taxonomy.R
│   └── 05_setup_maps_and_gis_coords.R
└── wlpeerj.cls
16 directories, 60 files