This is a code repository to reproduce the following manuscript:
Scan-Centric, Frequency-Based Method for Characterizing Peaks from Direct Injection Fourier transform Mass Spectrometry Experiments
Robert M Flight, Joshua M Mitchell, Hunter N.B. Moseley
doi: https://doi.org/10.3390/metabo12060515 (final published version)
Copies of manuscript & supplemental materials:
If you are looking for the scan-centric peak characterization R package, it is available here: https://github.com/MoseleyBioinformaticsLab/ScanCentricPeakCharacterization
git clone https://github.com/MoseleyBioinformaticsLab/manuscript.peakCharacterization.git
cd manuscript.peakCharacterization
wget https://zenodo.org/record/6568098/files/manuscript.peakCharacterization-reproducible_v4.zip
unzip manuscript.peakCharacterization-reproducible_v4.zip
We also need the data files and the _targets directory.
wget https://zenodo.org/record/6568053/files/scancentric_manuscript_targets.zip
unzip scancentric_manuscript_targets.zip
wget https://zenodo.org/record/6568016/files/scancentric_manuscript_data.zip
unzip scancentric_manuscript_data.zip
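If the downloads are interrupted or repeated, it can help to skip archives that are already on disk. The following is only a sketch: `fetch_archive` is a hypothetical helper (not part of this repository) that wraps the same `wget` and `unzip` commands shown above and assumes both tools are on the `PATH`.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository: download an archive
# only if a file with the same name is not already present, then unpack it.
fetch_archive() {
    url="$1"
    zip_file="$(basename "$url")"
    if [ -f "$zip_file" ]; then
        echo "skip download: $zip_file"
    else
        wget "$url" || return 1
    fi
    # -n: never overwrite files that were already extracted
    unzip -n "$zip_file"
}

# Example calls, using the archive URLs listed above:
# fetch_archive https://zenodo.org/record/6568053/files/scancentric_manuscript_targets.zip
# fetch_archive https://zenodo.org/record/6568016/files/scancentric_manuscript_data.zip
```

Note that a partially downloaded file would also be skipped by this check, so delete any incomplete archive before re-running.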
Start R within the manuscript directory:
# depending on how it starts up, you may
# need to install renv first
# theoretically it should "just work"
install.packages("renv")
renv::install("[email protected]")
renv::install("[email protected]")
renv::install("[email protected]")
renv::install("bioc::[email protected]")
renv::install("bioc::[email protected]")
renv::install("bioc::[email protected]")
renv::restore()
Restart R, just to be sure we start from a clean slate.
source("./packages.R")
target_status = tar_network(targets_only = TRUE)
target_status$vertices %>%
  dplyr::filter(!(status %in% "uptodate"))
When I do this, the manuscript and some RSD pieces are listed as out of date (7 items total). The analysis pieces themselves appear mostly intact, so it should be straightforward to bring things up to date, or to go through and examine individual pieces of the overall workflow.
- _targets.R: the overall workflow for the analysis.
- _targets/: tracks the state of the workflow and stores the actual outputs.
- R/: the various functions necessary for running the analysis.
- doc/: the manuscript (in a couple of different styles), and supplemental materials.
- data/
  - data_input: the input data files.
  - data_output: where various output files go, including the generated scan-centric peak characterization outputs, and their assignments.
  - lung_data: outputs related to the lung data.
  - lungcancer_all: scripts related to running all the NSCLC files on remote machines.
  - assignments: all of the assignment files from the NSCLC samples in JSON form.
  - ftms_artifacts/peaklists: the peak lists from Xcalibur for all of the NSCLC samples in JSON form.
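After cloning and unpacking, a quick check that the top-level entries listed above are present can catch a missed archive. This is a minimal sketch: `check_layout` is a hypothetical helper, and it only tests for the five top-level names from the list, not for anything inside them.

```shell
#!/bin/sh
# Hypothetical helper: report which of the expected top-level entries
# are missing under ROOT (defaults to the current directory).
# Returns success only when nothing is missing.
check_layout() {
    root="${1:-.}"
    missing=0
    for entry in _targets.R _targets R doc data; do
        if [ ! -e "$root/$entry" ]; then
            echo "missing: $entry"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example call, from inside the manuscript directory:
# check_layout . && echo "layout looks complete"
```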
Each of the scan-centric characterization runs takes at least 30 minutes, and the ones with more peaks (noperc especially) take even longer. I personally ran them over the course of an afternoon across 5 different machines with shared network storage.
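How the runs were divided among the machines is not described here. One simple possibility, shown purely as an illustration, is to deal the items out round-robin. `split_round_robin` is a hypothetical helper, and the item names are whatever list of runs you supply on standard input.

```shell
#!/bin/sh
# Hypothetical helper: read one item per line on stdin and print
# "<worker number> <item>", cycling through workers 1..N.
# Blank lines are ignored.
split_round_robin() {
    awk -v n="$1" 'NF { printf "%d %s\n", (count++ % n) + 1, $0 }'
}

# Example call: assign the lines of a (hypothetical) runs.txt to 5 machines.
# split_round_robin 5 < runs.txt
```

Each machine would then process only the lines tagged with its own number, writing results to the shared storage.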
Unfortunately, the SMIRFE code for generating assignments is not generally available. If you absolutely need it, please contact one of the authors about getting access. The input zip files (and underlying JSON peak lists) are present, as are the assignments generated by SMIRFE.