
Marco Cacciabue, Melina Obregón, Axel N. Fenoglio

voRtex

voRtex is an R package

voRtex was created to help with the analysis of large sets of foot-and-mouth disease RNA sequences.

This package contains a collection of functions designed to manage and analyze sample data efficiently. It handles VCF files, BED files, FASTA files, DNAStringSet objects, and more.

Examples

VCFToDataFrame

VCFToDataFrame.R creates a data frame from VCF data (as read by VariantAnnotation::readVcf), containing the allele position, allele frequency, and depth of coverage.

  file <- system.file("extdata", "variant_file.vcf", package = "voRtex", mustWork = TRUE)
  vcf_data <- VariantAnnotation::readVcf(file)
  vcfdataframe <- VCFToDataFrame(vcf_data)
  vcfdataframe
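
To make the shape of such a table more concrete, here is a minimal base R sketch that builds a comparable data frame from a few in-memory VCF records. It is not the package's implementation: the helper name `vcf_lines_to_df`, the output column names, and the assumption that frequency and depth live in the INFO keys `AF` and `DP` are all illustrative.

```r
# Hypothetical helper (not part of voRtex): parse minimal VCF text lines
# into a data frame with position, allele frequency and depth.
# Assumes the INFO column carries AF and DP keys, which is common but not universal.
vcf_lines_to_df <- function(lines) {
  records <- lines[!startsWith(lines, "#")]        # drop header lines
  fields  <- strsplit(records, "\t", fixed = TRUE) # one character vector per record

  # Extract the numeric value stored under `key` in one INFO string
  info_value <- function(info, key) {
    entries <- strsplit(info, ";", fixed = TRUE)[[1]]
    hit <- entries[startsWith(entries, paste0(key, "="))]
    if (length(hit) == 0) return(NA_real_)
    as.numeric(sub(paste0("^", key, "="), "", hit[1]))
  }

  info <- vapply(fields, function(f) f[8], character(1))
  data.frame(
    position  = as.integer(vapply(fields, function(f) f[2], character(1))),
    ref       = vapply(fields, function(f) f[4], character(1)),
    alt       = vapply(fields, function(f) f[5], character(1)),
    frequency = vapply(info, info_value, numeric(1), key = "AF", USE.NAMES = FALSE),
    depth     = vapply(info, info_value, numeric(1), key = "DP", USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Toy records made up for this example
toy_vcf <- c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "ref1\t120\t.\tA\tG\t50\tPASS\tDP=800;AF=0.25",
  "ref1\t431\t.\tC\tT\t60\tPASS\tDP=1200;AF=0.05"
)
toy_df <- vcf_lines_to_df(toy_vcf)
print(toy_df)
```

For real files, reading with VariantAnnotation::readVcf as in the example above is the safer route, since it handles the full VCF header and field definitions.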

compute_coverage

With compute_coverage.R we can create a data frame containing the average position and coverage of a sequence from a .bed file, using a window size of our choosing.

FilePath <- system.file("extdata", "SRR12664421_full_coverage.bed",
                          package = "voRtex", mustWork = TRUE)

data <- read.table(FilePath, col.names = c("reference", "startpos", "endpos", "coverage"))

data_processed <- compute_coverage(data, 50, TRUE)
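
The windowing idea can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the voRtex code: the helper `window_coverage` is hypothetical, windows are taken as non-overlapping runs of `window` consecutive rows, and the third argument passed to compute_coverage above is not modeled because its meaning is not described here.

```r
# Illustrative only: average position and coverage over non-overlapping
# windows of `window` consecutive rows. Not the voRtex implementation.
window_coverage <- function(df, window) {
  # Assign each row to a window: rows 1..window -> 1, the next run -> 2, ...
  bin <- (seq_len(nrow(df)) - 1) %/% window + 1
  data.frame(
    position = as.numeric(tapply(df$startpos, bin, mean)),
    coverage = as.numeric(tapply(df$coverage, bin, mean))
  )
}

# Toy per-base coverage table with the same column names as the example above
toy_bed <- data.frame(
  reference = "ref1",
  startpos  = 0:5,
  endpos    = 1:6,
  coverage  = c(10, 20, 30, 40, 50, 60)
)
toy_windows <- window_coverage(toy_bed, window = 3)
print(toy_windows)
```

With six rows and a window of 3, the toy table collapses to two rows, one per window, which is what makes long genomes practical to plot.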

ggplot_heatmap

Then, with ggplot_heatmap.R, we can create a heat map from the data frame produced by compute_coverage.

color  <- c("#D53E4F","#F46D43","#FDAE61","#FEE08B","#E6F598","#ABDDA4","#66C2A5","#3288BD")

Rplot <- ggplot_heatmap(inputdata = data_processed,
                        color_pal = color)

Rplot

This produces the following heat map:

SRR12664421_Heatmap

NreadFilter

With NreadFilter we take a data frame read from a .bed file containing the coverage of a sequence and filter its rows by a threshold value. The function returns a list containing the filtered data frame and the percentage of bases that passed the filter.

 FilePath <- system.file("extdata", "SRR12664421_full_coverage.bed",
                         package = "voRtex", mustWork = TRUE)
 OGDataFrame <- read.table(FilePath,
                          col.names = c("reference","startpos","endpos","nreads"))
 salida <- NreadFilter(OGDataFrame, 5000)
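
As a rough picture of what this kind of filter does, here is a base R sketch on a toy table. It is not the voRtex implementation: the helper `filter_by_reads` and its list element names are hypothetical, and it assumes that each row is one base and that a row passes when `nreads` is greater than or equal to the threshold.

```r
# Illustrative only: keep rows whose read count reaches `min_reads` and
# report the share of rows that pass. Not the voRtex implementation.
filter_by_reads <- function(df, min_reads) {
  keep <- df$nreads >= min_reads          # assumption: inclusive threshold
  list(
    filtered    = df[keep, , drop = FALSE],
    percentPass = 100 * mean(keep)        # percentage of rows kept
  )
}

# Toy coverage table with the same column names as the example above
toy_cov <- data.frame(
  reference = "ref1",
  startpos  = 0:3,
  endpos    = 1:4,
  nreads    = c(7000, 4000, 5000, 100)
)
toy_result <- filter_by_reads(toy_cov, 5000)
print(toy_result$filtered)
print(toy_result$percentPass)
```

Returning both pieces in one list keeps the filtered table and its summary statistic together, so a single call is enough for downstream reporting.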
 

NCountFilter

With NCountFilter we take a DNAStringSet object and filter it according to the minimum number of "N" characters (bases that could not be read) we specify.

  FilePath <- system.file("extdata",
                         "renamed_all.fasta",
                         package = "voRtex",
                         mustWork = TRUE)

 DNASequence <- Biostrings::readDNAStringSet(FilePath)

 NCountFilter(DNASequence, 1000)
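
The counting step behind such a filter can be sketched in base R using plain strings in place of a DNAStringSet. This is an illustration, not the package's code: the helper `filter_by_n_count` is hypothetical, and the direction of the comparison (keeping sequences with fewer "N" characters than the threshold) is an assumption that may differ from what NCountFilter does.

```r
# Illustrative only: count "N" characters per sequence and keep those
# below a threshold. Uses plain strings instead of a DNAStringSet.
filter_by_n_count <- function(seqs, threshold) {
  # Remove everything except N, then measure what is left
  n_count <- nchar(gsub("[^N]", "", seqs))
  seqs[n_count < threshold]               # assumption: strict "fewer than"
}

# Toy sequences made up for this example
toy_seqs <- c(
  sample_a = "ACGTNNACGT",
  sample_b = "NNNNNNACGT",
  sample_c = "ACGTACGTAC"
)
toy_kept <- filter_by_n_count(toy_seqs, threshold = 3)
print(names(toy_kept))
```

On an actual DNAStringSet, the per-sequence count could be obtained with Biostrings::letterFrequency(x, "N") rather than string substitution.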