HIV sequence MSAs

The files in this directory are the processed data files used to infer the Potts models. They are annotated below:

Protease (PR)

  • pr.exper.fullseq : Alignment of our processed drug-experienced HIV-1 subtype B Protease sequences (obtained from the Stanford HIVDB) in the full 20-letter amino acid alphabet.

  • pr.exper.reduce4.seq : Alignment of our processed drug-experienced HIV-1 subtype B Protease sequences in the reduced 4-letter (ABCD) amino acid alphabet as used to infer the drug-experienced Potts model.

  • pr.exper.weights : Sequence weights of individual sequences in pr.exper.fullseq or pr.exper.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.

  • pr.naive.fullseq : Alignment of our processed drug-naive HIV-1 subtype B Protease sequences (obtained from the Stanford HIVDB) in the full 20-letter amino acid alphabet.

  • pr.naive.reduce4.seq : Alignment of our processed drug-naive HIV-1 subtype B Protease sequences in the reduced 4-letter (ABCD) amino acid alphabet as used to infer the drug-naive Potts model.

  • pr.naive.exper.weights : Sequence weights of individual sequences in pr.naive.fullseq or pr.naive.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.
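
The weighting rule described above (each patient contributes an effective count of 1) can be illustrated with a minimal sketch. It assumes the simplest scheme consistent with that description, namely that every sequence from a patient with n sequences gets weight 1/n; the actual procedure used to produce the `.weights` files may differ. The patient labels and the function name `per_patient_weights` are hypothetical and are not part of this dataset.

```python
from collections import Counter


def per_patient_weights(patient_ids):
    """Return one weight per sequence so that each patient's weights sum to 1.

    Assumption: uniform down-weighting, i.e. weight = 1 / (number of
    sequences from the same patient).
    """
    counts = Counter(patient_ids)
    return [1.0 / counts[pid] for pid in patient_ids]


# Hypothetical patient labels, one per aligned sequence.
patient_ids = ["P1", "P1", "P2", "P3", "P3", "P3", "P3"]
weights = per_patient_weights(patient_ids)

# The effective number of sequences equals the number of distinct patients.
n_eff = sum(weights)
print(weights)
print(n_eff)
```

Under this scheme the sum of all weights is the number of distinct patients, which is the effective sample size of the alignment.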

Reverse Transcriptase (RT)

  • rt.exper.fullseq : Alignment of our processed drug-experienced HIV-1 subtype B RT sequences (obtained from the Stanford HIVDB) in the full 20-letter amino acid alphabet.

  • rt.exper.reduce4.seq : Alignment of our processed drug-experienced HIV-1 subtype B RT sequences in the reduced 4-letter (ABCD) amino acid alphabet as used to infer the Potts model.

  • rt.exper.weights : Sequence weights of individual sequences in rt.exper.fullseq or rt.exper.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.

Integrase (IN)

  • in.exper.fullseq : Alignment of our processed drug-experienced HIV-1 subtype B IN sequences (obtained from the Stanford HIVDB) in the full 20-letter amino acid alphabet.

  • in.exper.reduce4.seq : Alignment of our processed drug-experienced HIV-1 subtype B IN sequences in the reduced 4-letter (ABCD) amino acid alphabet as used to infer the Potts model.

  • in.exper.weights : Sequence weights of individual sequences in in.exper.fullseq or in.exper.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.

Capsid (CA)

  • ca.naive.fullseq : Alignment of our processed drug-naive HIV-1 subtype B CA (capsid protein, p24) sequences (obtained from the Los Alamos HIV Sequence Database) in the full 20-letter amino acid alphabet.

  • ca.naive.reduce4.seq : Alignment of our processed drug-naive HIV-1 subtype B CA sequences (capsid protein, p24) in the reduced 4-letter (ABCD) amino acid alphabet as used to infer the Potts model.

  • ca.naive.weights : Sequence weights of individual sequences in ca.naive.fullseq or ca.naive.reduce4.seq. Sequences are given weights such that the effective number of sequences obtained from a single patient is 1.
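
Each alignment is paired with a weights file, so a typical first step is to compute weighted single-site letter frequencies. The sketch below shows one way to do that. It assumes a plain-text layout with one aligned sequence per line and one numeric weight per line in the same order; that layout is an assumption and should be checked against the actual files. The helper names `read_lines` and `weighted_site_frequencies` are hypothetical, and the toy sequences are made up for illustration only.

```python
from collections import defaultdict


def read_lines(path):
    """Read non-empty, stripped lines from a text file (assumed layout)."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def weighted_site_frequencies(seqs, weights):
    """Return, for each alignment column, a dict letter -> weighted frequency."""
    if len(seqs) != len(weights):
        raise ValueError("need exactly one weight per sequence")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    total = sum(weights)
    freqs = [defaultdict(float) for _ in range(length)]
    for seq, w in zip(seqs, weights):
        for i, letter in enumerate(seq):
            freqs[i][letter] += w / total
    return [dict(f) for f in freqs]


# Toy data in the reduced 4-letter (ABCD) alphabet; not taken from the real files.
seqs = ["ABCA", "ABDA", "BBCA"]
weights = [0.5, 0.5, 1.0]  # first two sequences share one hypothetical patient
freqs = weighted_site_frequencies(seqs, weights)
print(freqs[0])
print(freqs[2])
```

With real data, and if the assumed layout holds, the inputs would come from something like `read_lines("pr.exper.reduce4.seq")` and `[float(x) for x in read_lines("pr.exper.weights")]`.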