karinlag/spades_stats

#spades_stats

Calculates statistics for SPAdes assemblies based on the coverage and length found in each FASTA description line.

By Karin Lagesen | @karinlag

##How to run

###Requirements

  • Python 2.7
  • Biopython

###Command line

usage: spades_stats.py [-h] [-d DIRECTORY] [-p STRING]

optional arguments:
  -h, --help            show this help message and exit
  -d DIRECTORY, --directory DIRECTORY
                        Directory containing directories with scaffolds.fasta
                        in them
  -p STRING, --prefix STRING
                        Prefix string for output files

###Input

The input for this code is a directory containing SPAdes assemblies. The code locates all files named "scaffolds.fasta" under that directory and calculates statistics for each of them.
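The lookup described above can be sketched roughly as follows. This is a minimal illustration and not the code in spades_stats.py: the function names `find_scaffolds` and `read_length_and_coverage` are hypothetical, and plain-text parsing is used here instead of Biopython to keep the example self-contained. It assumes the usual SPAdes naming scheme for description lines, `NODE_<n>_length_<length>_cov_<coverage>`.

```python
import os
import re

# Assumed SPAdes header layout, e.g. ">NODE_1_length_5000_cov_12.5"
# (older releases append "_ID_<n>"). Only length and coverage are captured.
HEADER_RE = re.compile(r"length_(\d+)_cov_(\d+(?:\.\d+)?)")


def find_scaffolds(directory, filename="scaffolds.fasta"):
    """Return sorted paths of every file called `filename` below `directory`."""
    matches = []
    for root, _dirs, files in os.walk(directory):
        if filename in files:
            matches.append(os.path.join(root, filename))
    return sorted(matches)


def read_length_and_coverage(fasta_path):
    """Return one (length, coverage) tuple per description line in the file."""
    values = []
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # sequence lines carry no statistics
            match = HEADER_RE.search(line)
            if match:
                values.append((int(match.group(1)), float(match.group(2))))
    return values
```

Calling `find_scaffolds("assemblies")` and passing each returned path to `read_length_and_coverage` would give the per-scaffold values that the statistics are computed from.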

###Output

Statistics are calculated for each file, and the results are written to one output file. The output file name is prefixed with the string given to option -p.

The output contains one line per scaffolds file, with the following columns (the first line is a header line):

| Column | Description |
|---|---|
| Scaffolds_filename | name of input file, incl relative path |
| N50 | N50 value for assembly |
| #contigs>=N50 | the fewest number of contigs whose sum makes up N50 |
| coverage_contigs | average coverage over the #contigs>=N50 |
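As an illustration of these three columns, here is a minimal sketch that uses the common definition of N50: the length of the contig at which the running sum of lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50_summary` is hypothetical, and the unweighted coverage average is an assumption; the script itself may calculate these values differently.

```python
def n50_summary(contigs):
    """contigs: list of (length, coverage). Returns (n50, n_contigs, mean_cov)."""
    # Longest contigs first, so the running sum reaches half the total fastest.
    ordered = sorted(contigs, key=lambda c: c[0], reverse=True)
    half_total = sum(length for length, _cov in ordered) / 2.0
    running = 0
    for index, (length, _cov) in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            top = ordered[:index]
            # Assumption: plain (unweighted) mean of the coverage values.
            mean_cov = sum(cov for _length, cov in top) / len(top)
            return length, index, mean_cov
    raise ValueError("no contigs given")
```

For contigs of length 100, 80, 50 and 30 (total 260), the running sum reaches 130 at the second contig, so N50 is 80 and two contigs are counted. If several contigs share the N50 length, this count can be smaller than the number of contigs with length >= N50.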

Next come five summary statistics for length and for coverage. The five statistics calculated are (in order):

  • min: minimum value (length or coverage)
  • max: maximum value (length or coverage)
  • avg: average value (length or coverage)
  • median: median value (length or coverage)
  • std: standard deviation (length or coverage)
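The five values can be computed along these lines. This is a sketch only: the function name `five_stats` is hypothetical, and it assumes the population standard deviation (dividing by n), whereas the script may use the sample form (dividing by n - 1).

```python
import math


def five_stats(values):
    """Return (min, max, avg, median, std) for a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mean = sum(ordered) / float(n)
    mid = n // 2
    # Median: middle value, or the mean of the two middle values for even n.
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2.0
    # Assumption: population variance (divide by n).
    variance = sum((v - mean) ** 2 for v in ordered) / n
    return ordered[0], ordered[-1], mean, median, math.sqrt(variance)
```

The same function would be applied once to the list of scaffold lengths and once to the list of coverage values.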

##Issues

Please report problems here: https://github.com/karinlag/spades_stats/issues

##License

Please read the LICENSE file included in the repository.
