HIV sequence MSAs

The files in this directory are the processed data files used to infer the Potts models. Each file is described below:

Protease (PR)

pr.exper.fullseq : Alignment of our processed drug-experienced HIV-1 subtype B Protease sequences (obtained from the Stanford HIVDB) in the full 20-letter amino acid alphabet.
pr.exper.reduce4.seq : Alignment of our processed drug-experienced HIV-1 subtype B Protease sequences in the reduced 4-letter (ABCD) amino acid alphabet, as used to infer the drug-experienced Potts model.
pr.exper.weights : Weights of the individual sequences in pr.exper.fullseq or pr.exper.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.
pr.naive.fullseq : Alignment of our processed drug-naive HIV-1 subtype B Protease sequences (obtained from the Stanford HIVDB) in the full 20-letter amino acid alphabet.
pr.naive.reduce4.seq : Alignment of our processed drug-naive HIV-1 subtype B Protease sequences in the reduced 4-letter (ABCD) amino acid alphabet, as used to infer the drug-naive Potts model.
pr.naive.weights : Weights of the individual sequences in pr.naive.fullseq or pr.naive.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.
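
The per-patient weighting described for the .weights files can be illustrated with a minimal sketch. It assumes the simplest scheme consistent with the description, namely that each of a patient's n sequences receives weight 1/n; the actual procedure used to produce the files may differ. The function name `patient_weights` and the patient identifiers are hypothetical and are not part of the data files.

```python
from collections import Counter


def patient_weights(patient_ids):
    """Return one weight per sequence so that each patient's weights sum to 1.

    Assumption: uniform weighting within a patient (1 / number of that
    patient's sequences). This is an illustration, not the authors' code.
    """
    counts = Counter(patient_ids)
    return [1.0 / counts[pid] for pid in patient_ids]


# Hypothetical patient identifiers, one per aligned sequence.
patient_ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p3"]
weights = patient_weights(patient_ids)

# The effective number of sequences is the sum of the weights, which under
# this scheme equals the number of distinct patients.
n_eff = sum(weights)
```

Under this assumption, the sum of all weights in a .weights file would equal the number of distinct patients contributing to the corresponding alignment.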

Reverse Transcriptase (RT)

rt.exper.fullseq : Alignment of our processed drug-experienced HIV-1 subtype B RT sequences (obtained from the Stanford HIVDB) in the full 20-letter amino acid alphabet.
rt.exper.reduce4.seq : Alignment of our processed drug-experienced HIV-1 subtype B RT sequences in the reduced 4-letter (ABCD) amino acid alphabet, as used to infer the Potts model.
rt.exper.weights : Weights of the individual sequences in rt.exper.fullseq or rt.exper.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.

Integrase (IN)

in.exper.fullseq : Alignment of our processed drug-experienced HIV-1 subtype B IN sequences (obtained from the Stanford HIVDB) in the full 20-letter amino acid alphabet.
in.exper.reduce4.seq : Alignment of our processed drug-experienced HIV-1 subtype B IN sequences in the reduced 4-letter (ABCD) amino acid alphabet, as used to infer the Potts model.
in.exper.weights : Weights of the individual sequences in in.exper.fullseq or in.exper.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.

Capsid (CA)

ca.naive.fullseq : Alignment of our processed drug-naive HIV-1 subtype B CA (capsid protein, p24) sequences (obtained from the Los Alamos HIV Sequence Database) in the full 20-letter amino acid alphabet.
ca.naive.reduce4.seq : Alignment of our processed drug-naive HIV-1 subtype B CA (capsid protein, p24) sequences in the reduced 4-letter (ABCD) amino acid alphabet, as used to infer the Potts model.
ca.naive.weights : Weights of the individual sequences in ca.naive.fullseq or ca.naive.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.
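
As a rough illustration of how an alignment and its weights file might be used together, the sketch below computes weighted per-column letter frequencies. It works on in-memory toy data and makes no claim about the on-disk format; it only assumes that the i-th weight belongs to the i-th aligned sequence. The function name `weighted_site_frequencies` and the example sequences and weights are hypothetical.

```python
from collections import defaultdict


def weighted_site_frequencies(seqs, weights):
    """Return, for each alignment column, a dict mapping letter -> weighted frequency.

    Assumption: seqs[i] is paired with weights[i], and all sequences have
    the same length (as in an MSA).
    """
    if len(seqs) != len(weights):
        raise ValueError("need exactly one weight per sequence")
    length = len(seqs[0])
    total = sum(weights)
    freqs = [defaultdict(float) for _ in range(length)]
    for seq, w in zip(seqs, weights):
        if len(seq) != length:
            raise ValueError("sequences must all have the same length")
        for i, letter in enumerate(seq):
            # Each sequence contributes its normalized weight to the column.
            freqs[i][letter] += w / total
    return [dict(f) for f in freqs]


# Toy alignment in a reduced 4-letter (ABCD) alphabet with made-up weights:
# the first two sequences are treated as coming from the same patient.
seqs = ["ABCA", "ABDA", "BBCA"]
weights = [0.5, 0.5, 1.0]
freqs = weighted_site_frequencies(seqs, weights)
```

With these toy values, the two half-weighted sequences together contribute as much to each column as the single fully weighted one.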
