
R package for working with vcf files

License

MIT (LICENSE.md); an additional LICENSE file of unrecognized type is also present.

marcocacciabue/voRtex


Marco Cacciabue, Melina Obregón, Axel N. Fenoglio

voRtex

voRtex is an R package created to help analyze large sets of foot-and-mouth disease RNA sequences.

It contains a collection of functions designed to manage and analyze sample data efficiently. It works with VCF files, BED files, FASTA files, DNAStringSet objects, and more.

Examples

VCFToDataFrame

VCFToDataFrame.R creates a data frame from a VCF file, containing the allele position, allele frequency, and depth of coverage.

library(voRtex)

# Locate the example VCF file shipped with the package
file <- system.file("extdata", "variant_file.vcf", package = "voRtex", mustWork = TRUE)
vcf_data <- VariantAnnotation::readVcf(file)
vcfdataframe <- VCFToDataFrame(vcf_data)
vcfdataframe
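To make the idea of "position, frequency, and depth" concrete, here is a minimal base-R sketch that pulls those three values out of a couple of made-up VCF records. It is not the package's implementation: the helper `info_value`, the toy records, and the assumption that frequency and depth live in the `AF` and `DP` keys of the INFO column are all illustrative choices.

```r
# Toy VCF content (made-up records, for illustration only)
vcf_lines <- c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "ref\t120\t.\tA\tG\t50\tPASS\tDP=1500;AF=0.25",
  "ref\t481\t.\tC\tT\t60\tPASS\tDP=2300;AF=0.02"
)

# Drop meta and header lines, then split each record into its columns
records <- vcf_lines[!startsWith(vcf_lines, "#")]
fields <- strsplit(records, "\t", fixed = TRUE)

# Hypothetical helper: extract one numeric key from a "K=V;K=V" INFO string
info_value <- function(info, key) {
  pairs <- strsplit(info, ";", fixed = TRUE)[[1]]
  hit <- pairs[startsWith(pairs, paste0(key, "="))]
  as.numeric(sub("^[^=]*=", "", hit[1]))
}

# Column 2 is POS and column 8 is INFO in the VCF layout
info <- vapply(fields, `[`, character(1), 8)
vcf_sketch <- data.frame(
  position  = as.integer(vapply(fields, `[`, character(1), 2)),
  frequency = vapply(info, info_value, numeric(1), key = "AF", USE.NAMES = FALSE),
  depth     = vapply(info, info_value, numeric(1), key = "DP", USE.NAMES = FALSE)
)
print(vcf_sketch)
```

Real VCF files may store these values elsewhere (for example in per-sample FORMAT fields), which is one reason to rely on VariantAnnotation::readVcf and VCFToDataFrame rather than hand-parsing.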

compute_coverage

With compute_coverage.R we can create a data frame containing the average position and coverage of a sequence, computed from a .bed file over a window size of our choosing.

FilePath <- system.file("extdata", "SRR12664421_full_coverage.bed",
                        package = "voRtex", mustWork = TRUE)

data <- read.table(FilePath, col.names = c("reference", "startpos", "endpos", "coverage"))

data_processed <- compute_coverage(data, 50, TRUE)
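As a rough illustration of what averaging over a window means, the following base-R sketch groups consecutive rows into fixed-size windows and takes the mean position and mean coverage of each. The function `window_coverage_sketch` and the toy table are hypothetical; the actual compute_coverage may bin, name, or post-process its output differently.

```r
# Hypothetical helper, NOT voRtex's compute_coverage: averages position and
# coverage over consecutive groups of `window` rows.
window_coverage_sketch <- function(data, window) {
  # Rows 1..window fall in bin 1, the next `window` rows in bin 2, and so on
  bin <- (seq_len(nrow(data)) - 1) %/% window + 1
  data.frame(
    position = as.numeric(tapply(data$endpos, bin, mean)),
    coverage = as.numeric(tapply(data$coverage, bin, mean))
  )
}

# Toy per-base coverage table using the same column names as above
toy_cov <- data.frame(
  reference = "ref",
  startpos  = 0:9,
  endpos    = 1:10,
  coverage  = c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
)

binned <- window_coverage_sketch(toy_cov, window = 5)
print(binned)
```

With ten rows and a window of 5, the sketch returns two rows, one per window. Larger windows give a smoother but coarser coverage profile.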

ggplot_heatmap

Then, with ggplot_heatmap.R, we can create a heatmap from the data frame produced by compute_coverage.

color <- c("#D53E4F", "#F46D43", "#FDAE61", "#FEE08B", "#E6F598", "#ABDDA4", "#66C2A5", "#3288BD")

Rplot <- ggplot_heatmap(inputdata = data_processed,
                        color_pal = color)

Rplot

This produces the following heatmap:

SRR12664421_Heatmap

NreadFilter

NreadFilter takes a data frame holding the coverage of a sequence (read from a .bed file), filters its rows based on a filter value, and returns a list containing the filtered data frame and the percentage of bases that passed the filter.

FilePath <- system.file("extdata", "SRR12664421_full_coverage.bed",
                        package = "voRtex", mustWork = TRUE)
OGDataFrame <- read.table(FilePath,
                          col.names = c("reference", "startpos", "endpos", "nreads"))
salida <- NreadFilter(OGDataFrame, 5000)
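The filtering step described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: `nread_filter_sketch`, its list element names, and the choice to keep rows with `nreads` greater than or equal to the threshold are hypothetical and may not match what NreadFilter actually does.

```r
# Hypothetical helper, NOT the package's NreadFilter
nread_filter_sketch <- function(data, min_reads) {
  # Assumption: a row passes when its read count reaches the threshold
  keep <- data$nreads >= min_reads
  list(
    filtered       = data[keep, , drop = FALSE],
    percent_passed = 100 * mean(keep)  # share of rows (bases) that passed
  )
}

# Toy coverage table with the same column names as the example above
toy_reads <- data.frame(
  reference = "ref",
  startpos  = 0:3,
  endpos    = 1:4,
  nreads    = c(100, 6000, 5000, 7500)
)

result <- nread_filter_sketch(toy_reads, 5000)
result$percent_passed
nrow(result$filtered)
```

Returning both pieces in one list lets the caller inspect how much of the sequence survived before deciding whether the filtered data is worth analyzing further.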
 

NCountFilter

NCountFilter takes a DNAStringSet object and filters its sequences according to a chosen threshold on the number of “N” characters (bases that could not be read).

FilePath <- system.file("extdata",
                        "renamed_all.fasta",
                        package = "voRtex",
                        mustWork = TRUE)

DNASequence <- Biostrings::readDNAStringSet(FilePath)

NCountFilter(DNASequence, 1000)
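To show the kind of check involved, here is a minimal sketch that counts "N" characters in plain character strings and keeps the sequences at or below a threshold. It uses base R rather than Biostrings so it runs without extra packages. The helper `count_n`, the toy sequences, and the direction of the comparison (dropping sequences with more N than the threshold) are assumptions, not the documented behavior of NCountFilter.

```r
# Hypothetical helper: number of "N" characters in each sequence
count_n <- function(seqs) {
  vapply(strsplit(toupper(seqs), ""),
         function(chars) sum(chars == "N"),
         integer(1))
}

# Toy sequences (made up for illustration)
toy_seqs <- c(seq1 = "ACGTNNAC", seq2 = "NNNNNNAC", seq3 = "ACGTACGT")

n_counts <- count_n(toy_seqs)

# Assumption: keep sequences with at most `max_n` unread bases
max_n <- 2
kept <- toy_seqs[n_counts <= max_n]
names(kept)
```

On a real DNAStringSet, Biostrings::letterFrequency(x, "N") gives the same per-sequence counts without converting to character.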
 
