# Algorithms for Biosequence Comparison
This code is kept for my own future reference and so that prospective employers can judge my competence.
Name: CSE 584A Algorithms for Biosequence Comparison at Wash U (graduate level)
Instructor: Dr. Jeremy Buhler
Course website: http://classes.engineering.wustl.edu/cse584a/
Taken: Spring 2016
Languages used: Java, Python
This course surveys algorithms for comparing and organizing discrete sequential data, especially nucleic acid and protein sequences. Emphasis is on tools to support search in massive biosequence databases and to perform fundamental comparison tasks such as DNA short-read alignment. These techniques are also of interest for more general string processing and for building and mining textual databases. Algorithms are presented rigorously, including proofs of correctness and running time where feasible. Topics include classical string matching, suffix array string indices, space-efficient string indices, rapid inexact matching by filtering (including BLAST and related tools), and multiple alignment.
- Project 1: Implement the Aho-Corasick algorithm
- Project 2: Implement the linear-time DC3 suffix array construction algorithm
- Project 3: Implement the computation of distinct k-mers
- Final Project: k-mer Kernels for Genetic Distance
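
As a rough illustration of what "distinct k-mers" means in Project 3, here is a minimal Python sketch. It is not the project's implementation: the function name `distinct_kmers` is hypothetical, and the naive set-based approach is an assumption chosen for brevity. It takes O(nk) time and space, whereas an index-based method (for example, one built on a suffix array) can do better on large inputs.

```python
def distinct_kmers(seq, k):
    """Return the set of distinct length-k substrings (k-mers) of seq.

    Hypothetical helper for illustration only; a naive approach that
    stores every k-mer in a set.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # There are len(seq) - k + 1 windows of length k; if seq is shorter
    # than k, the range is empty and the result is the empty set.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


if __name__ == "__main__":
    # "ACGTACGT" has five 4-mer windows, but "ACGT" occurs twice.
    print(len(distinct_kmers("ACGTACGT", 4)))  # prints 4
```

The set of k-mers (or their counts) is also the usual starting point for k-mer based sequence comparison, which is the theme of the final project.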