layout	title	author	author_url	date
protocols	repp-assisted plasmid design	Shaohe Wang	https://shaohewanglab.org	2023-11-08

repp stands for repository-based plasmid design. It is a command-line tool that automates several steps in plasmid design. See Timmons, J.J. & Densmore D. Repository-based plasmid design. PLOS One. Because repp is currently only available as a command-line tool, the instructions below assume some familiarity with the command line interface. Otherwise, please see this page. We assume Windows users use the Git Bash program for their command line interface.

The original repp needs a multi-FASTA file to build a database and writes its output in JSON format, neither of which is convenient for a typical molecular cloning work flow. Cristian Goina from the Janelia Scientific Computing Software team helped us implement several important I/O features to fit our typical cloning work flow, including building a database from a directory of plasmid sequences in FASTA or GenBank format, re-using existing primers, and writing convenient CSV spreadsheets.

1. (Optional) Install and set up Git

Git is a powerful tool for version control. You don't need Git for installing and using repp, but I encourage you to learn about it if unfamiliar. I highly recommend the Software Carpentry Lesson for Git.

Learn basic concepts of Git here.
Install Git.
- If you are using a new Mac, you probably already have a fairly recent version of Git installed. Check by typing git --version in your terminal. If the version number is close to the current version, you can just use the pre-installed version.
- Otherwise, follow instructions here to install Git on Windows, Mac or Linux systems.
Set up Git on your computer.

2. Install the Janelia SciComp version of repp

Install 3 dependencies: go, primer3, and blast. Skip to the next step if you are updating repp.
- Install go (version >= 1.19) following instructions here.
- Install blast:
  - Go to the NCBI website to download the BLAST+ software.
    - From this website, follow this ftp download link.
  - For Mac (M1 or M2 chip OK), download "ncbi-blast-2.13.0+.dmg" for installation.
    - (Optional but good practice) Check the md5 sum of the downloaded installation package: cd into the download folder and run md5 ncbi-blast-2.13.0+.dmg. Make sure the md5 sum matches the one listed in ncbi-blast-2.13.0+.dmg.md5 in the ftp download list.
    - Open the dmg file to install blast.
    - macOS will warn you that this file is from an "unidentified developer" and cannot be opened. Go to the "Security and Privacy" settings and click "Open Anyway" to open the installer.
    - Check whether installation is OK by running which blastn, which should print out the path to the blastn program (e.g., /usr/local/ncbi/blast/bin/blastn).
  - For Windows, download "ncbi-blast-2.13.0+-win64.exe"
    - (Optional but good practice) Check the md5 sum of the downloaded installation package: cd into the download folder and run md5sum ncbi-blast-2.13.0+-win64.exe in Git Bash. Make sure the md5 sum matches the one listed in ncbi-blast-2.13.0+-win64.exe.md5 in the ftp download list.
    - Open the exe file to install blast.
    - Check whether installation is OK by running blastn -h in Git Bash, which should print out the help message of the blastn program.
- Install primer3.
  - If you use Git, run git clone https://github.com/primer3-org/primer3.git.
  - If you don't use Git, go to the Primer3 GitHub page, click on the green button Code and Download ZIP. Unzip it.
  - For Mac:
    - Assuming the source code of primer3 is in the Downloads folder, run the following to compile, test, and install primer3:
      cd ~/Downloads/primer3/src && make && make test && sudo make install
    - Check whether installation is OK by running which primer3_core, which should print out the path to the primer3_core program (e.g. /usr/local/bin/primer3_core).
  - For Windows:
    - Download and install TDM-GCC MinGW Compiler.
    - (Optional) Assuming the source code of primer3 is in the Downloads folder, run the following to compile and test primer3:
      cd ~/Downloads/primer3/src && mingw32-make TESTOPTS=--windows
    - Copy the "primer3/src" folder that contains the compiled binaries to your preferred location, e.g., "C:\Program Files\primer3\src".
    - Add the above folder to your Path Environment Variable.
      - Open Start Menu then type Advanced system settings and press Enter.
      - Click Environment Variables.
      - Select Path in the variable list and click Edit... to add the above directory.
Install the Janelia SciComp version of repp.
- If you use Git, run git clone https://github.com/JaneliaSciComp/repp.git.
- If you don't use Git, go to the Janelia SciComp GitHub page, click on the green button Code and Download ZIP. Unzip it.
- Assuming the source code of repp is in the Downloads folder, run the following to compile repp:
```
cd ~/Downloads/repp/cmd/repp
go build
```
- The above generates an executable in the same folder.
  - For Mac, run sudo mv repp /usr/local/bin/. to move the executable to /usr/local/bin or your preferred location. Type your password to give permission if prompted.
  - For Windows, copy the "repp.exe" file to your preferred location, e.g., "C:\Program Files\repp". Add this folder to your Path Environment Variable.
    - Open Start Menu then type Advanced system settings and press Enter.
    - Click Environment Variables towards the bottom of the dialogue.
    - Select Path in the variable list and click Edit... to add the above directory.
- Check whether installation is OK by running which repp, which should print out the path to the repp program (e.g. /usr/local/bin/repp).
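The individual installation checks above can be combined into one pass. This is a small sketch under the assumption that all four programs should be reachable through your PATH; `check_tools` is a hypothetical helper name. It relies on `command -v`, which also works in Git Bash.

```shell
# Hypothetical helper: report which of the given programs are on the PATH.
# Returns the number of missing programs (0 means everything was found).
check_tools() {
  missing=0
  for tool in "$@"; do
    if path=$(command -v "$tool" 2>/dev/null); then
      echo "found    $tool -> $path"
    else
      echo "MISSING  $tool"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

# The three dependencies plus repp itself.
check_tools go blastn primer3_core repp || echo "Install the missing tools before continuing."
```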

3. Use repp in your plasmid design work flow

Download repp_test.zip for testing.
(Optional) Add sequence databases from remote repositories (e.g., Addgene, iGEM, DNASU).

As direct synthesis of DNA fragments becomes more affordable, the advantage of PCR amplification from existing plasmids diminishes. Moreover, acquiring a plasmid from repositories introduces additional time costs. If you have access to basic plasmid backbones, you may omit this step.

The original repp author has assembled FASTA files from Addgene, iGEM, and DNASU, and made them available in an S3 bucket. Run the following commands to download them and add them to the repp sequence database on your computer:
```
# download repository FASTA files
for db in igem addgene dnasu; do
  curl -o "$db.fa.gz" "https://repp.s3.amazonaws.com/$db.fa.gz"
  gzip -d "$db.fa.gz"
done

# add sequence DBs with the cost of ordering a plasmid from each source
repp add database --name igem --cost 0.0 < igem.fa
repp add database --name addgene --cost 85.0 < addgene.fa
repp add database --name dnasu --cost 99.0 < dnasu.fa
```
Add sequence databases from local collections.

Put all sequence files (GenBank or FASTA format) in a directory. In the repp_test example, this directory is called "lab-plasmid-collection". Run the following command to add them:
```
# -n is shorthand for --name
# -c is shorthand for --cost
# run "repp add database --help" for more options
repp add database -n lab -c 0 lab-plasmid-collection
```
Note that adding a sequence database is a one-time operation that stores the database files in a hidden directory. Adding a new directory to a database with the same name (e.g., -n lab) will overwrite it.
- On Mac, they are in "~/.repp/dbs".
- On Windows, they are in "C:/users/username/.repp/dbs".
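Because adding a database overwrites any earlier one with the same name, it can help to check the collection directory before running repp add database. The sketch below is an illustration only, not a repp feature: `check_collection` is a hypothetical helper that classifies each file by its first non-empty line (FASTA records start with ">", GenBank records start with "LOCUS") and flags anything else.

```shell
# Hypothetical helper: classify every file in a directory as FASTA or GenBank
# by its first non-empty line. Returns the number of unrecognized files.
check_collection() {
  dir="$1"
  bad=0
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    first=$(awk 'NF { print substr($0, 1, 5); exit }' "$f")
    case "$first" in
      '>'*)  echo "FASTA    $f" ;;
      LOCUS) echo "GenBank  $f" ;;
      *)     echo "UNKNOWN  $f"; bad=$((bad + 1)) ;;
    esac
  done
  return "$bad"
}

# Example: only add the collection when every file was recognized.
#   check_collection lab-plasmid-collection && repp add database -n lab -c 0 lab-plasmid-collection
```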

(Recommended) Organize your primer database.

Although not strictly required, it is highly recommended to create an organized primer database for re-using primers. The primer database parameter "-m" of the Janelia version of repp accepts either a single spreadsheet or a folder containing multiple spreadsheets. When your primer database contains hundreds or thousands of primers, it can be cumbersome to scroll down to the bottom of the spreadsheet to add new primers. Instead, it is much easier to maintain a small active spreadsheet alongside one or more archived spreadsheets. The primer database spreadsheet must have the "primer_id" and "sequence" columns, and can optionally have other columns for additional notes.

In the repp_test example, the "primer_database" folder has two spreadsheets: "1_archived_primer.csv" and "2_active_primer.csv". This is what "2_active_primer.csv" looks like:

primer_id	sequence
oS41	ACTTTTCGGGGAAATGTGCG
oS42	GTGAGCAAAAGGCCAGCAAA
oS43	GTGCCAGTGGTCTCTTGTTG
oS44	CTATTACCATGGTGATGCGGTTTTGGCAGTAC
oS45	ACTGGATCTCTGCTGTCCCT
oS46	GGCATGGACGAGCTGTACAA
oS47	TTCAAGTCTGTTCACACGCC
oS48	CTTGCAGCAGATTCAGACCC
oS49	CCACGTGGGCTTTATCTTCC
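A quick sanity check of a primer spreadsheet can catch problems before repp reads it. The sketch below is a hypothetical helper (`check_primer_csv`), not part of repp. It assumes the file is comma-separated with a header row and that no field contains embedded commas. It checks for the two required columns, duplicate IDs, and empty sequences.

```shell
# Hypothetical helper: validate a comma-separated primer spreadsheet.
# Exit status is 0 when no problem was found, 1 otherwise.
check_primer_csv() {
  awk -F',' '
    NR == 1 {
      # Map each header name to its column index.
      for (i = 1; i <= NF; i++) {
        name = $i
        gsub(/\r|"| /, "", name)
        col[name] = i
      }
      if (!("primer_id" in col) || !("sequence" in col)) {
        print "missing primer_id or sequence column"
        bad = 1
        exit
      }
      next
    }
    {
      id = $(col["primer_id"]); seq = $(col["sequence"])
      gsub(/\r|"| /, "", id); gsub(/\r|"| /, "", seq)
      if (id == "" && seq == "") next            # skip blank rows
      if (seq == "") { print "empty sequence for " id; bad = 1 }
      if (seen[id]++) { print "duplicate primer_id: " id; bad = 1 }
    }
    END { exit bad + 0 }
  ' "$1"
}

# Example:
#   check_primer_csv primer_database/2_active_primer.csv && echo "primer database looks fine"
```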

(Optional) Organize your fragment database.

Following the same logic of re-using primers, you may also wish to re-use synthesized fragments. For this purpose, you can organize your synthesized fragment database in the same way as the primer database. Like the primer database parameter, the synthesized fragment database parameter "-s" accepts either a single spreadsheet or a folder containing multiple spreadsheets.

In the repp_test example, the "fragment database" folder has two spreadsheets: "1_archived_frag.csv" and "2_active_frag.csv". This is what "2_active_frag.csv" looks like:

frag_id	sequence
syn4	atgtca...(long sequence)...ataacc
syn5	CAGGGA...(long sequence)...TCAAAG

Put together the target plasmid sequence in GenBank format.

In the repp_test example, the target plasmid is called "pW256.gb".

We use ApE (A plasmid Editor) to edit and annotate DNA sequences. ApE is free software written by M. Wayne Davis at the University of Utah. You can use whatever software you prefer, but make sure to save the sequence in GenBank format. Note that the default .ape file uses GenBank format and is compatible with repp.

Run the repp make sequence command.

The simplest command is repp make sequence -i pW256.gb, which uses all available databases (in this case, "igem,addgene,dnasu,lab") and default parameters.

To specify the database (e.g., only use local collection "lab"), use repp make sequence -i pW256.gb -d lab.

To also specify the primer database and a synthesized fragment database, use repp make sequence -i pW256.gb -d lab -m primer_database -s frag_database. This produces two csv files: "pW256.output-strategy.csv" and "pW256.output-reagents.csv".

The first 5 columns of "pW256.output-strategy.csv" are shown below. The other columns (Match Pct, GC%, 50 low GC%, 50 high GC%, and Homopolymer) can help users decide whether certain fragments may be difficult to amplify by PCR or to synthesize.

# 2023/11/08 13:37:36
# Solution 1
# Fragments:5 (3 - pcr	2 - synth)
# Cost: 179.510000	Adjusted Cost: 179.510000
Frag ID	Fwd Primer	Rev Primer	Template	Size
pW256_1_pcr	oS44	oS45	pW212	1983
syn5	N/A	N/A	N/A	350
pW256_3_pcr	oS50	oS51	pR92	2876
syn6	N/A	N/A	N/A	766
pW256_5_pcr	oS52	oS53	pW212	5555

The "pW256.output-reagents.csv" is shown below. The priming region and Tm columns can help users optimize PCR amplification conditions.

# Solution 1
Reagent ID	Seq	Priming Region	Tm
*oS44	CTATTACCATGGTGATGCGGTTTTGGCAGTAC	TGATGCGGTTTTGGCAGTAC	59.12
*oS45	ACTGGATCTCTGCTGTCCCT	ACTGGATCTCTGCTGTCCCT	59.96
oS50	GCGGAGTGCAACATCAAAGT	GCGGAGTGCAACATCAAAGT	59.41
oS51	GACTGCTTGCCTCCACCAC	GACTGCTTGCCTCCACCAC	60.97
oS52	ACGCGTTAAGTCGACAATCA	ACGCGTTAAGTCGACAATCA	57.95
oS53	CAAAACCGCATCACCATGGTAATAGCGATGACTAA	ACCATGGTAATAGCGATGACTAA	57.2
*syn5	CAGGGA...(long sequence)...CAAAGTG	N/A	N/A
syn6	TGGTGG...(long sequence)...CAATCAA	N/A	N/A

Note that pre-existing primers and synthesized fragments are marked with an asterisk. The IDs of new primers and synthesized fragments are incremented from the last entry of the last spreadsheet of the database.
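To see which ID the next new primer or fragment would receive under this rule, you can look it up yourself. The sketch below only illustrates the rule as described above and is not repp's actual implementation: `next_id` is a hypothetical helper that assumes comma-separated files with a header row, the ID in the first column, and IDs that end in a number without leading zeros.

```shell
# Hypothetical helper: take the last ID in the last spreadsheet (by file name)
# of a database folder and increment its trailing number.
next_id() {
  last_file=""
  for f in "$1"/*.csv; do               # glob results are sorted by name
    if [ -f "$f" ]; then last_file="$f"; fi
  done
  [ -n "$last_file" ] || return 1
  # Last non-empty ID below the header row.
  last_id=$(awk -F',' 'NR > 1 && $1 != "" { id = $1 } END { print id }' "$last_file")
  prefix=$(printf '%s\n' "$last_id" | sed 's/[0-9]*$//')
  number=$(printf '%s\n' "$last_id" | sed 's/^.*[^0-9]//')
  echo "${prefix}$(awk -v n="$number" 'BEGIN { print n + 1 }')"
}

# Example:
#   next_id primer_database
```

With a last entry of oS49, as in the example primer spreadsheet, this prints oS50, which matches the first new primer in the reagents table.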

Sometimes repp generates multiple solutions, because some solutions may have a higher cost but fewer fragments. You can choose whichever solution you prefer to move forward.

Order new reagents in the "output-reagents.csv" file.

For primers, copy the first two columns of the new primers (those not marked with an asterisk) from the reagents file to the active primer database spreadsheet. These can then be ordered from your preferred supplier; our standard choice is IDT.

Likewise, for synthesized fragments, copy the first two columns of the new entries (again, those without an asterisk) to the active synthesized fragment database spreadsheet. You can order these from your favorite DNA fragment provider. We recommend Twist for its exceptionally low cost of 9 cents per base pair.
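The copy step can also be done from the command line. This is a minimal sketch, assuming the reagents file is comma-separated with the four columns shown earlier and that synthesized fragments are the rows whose Priming Region is "N/A". `new_reagents` is a hypothetical helper, not a repp command.

```shell
# Hypothetical helper: print "id,sequence" for reagents NOT marked with an asterisk.
# $1: path to an output-reagents.csv file; $2: "primer" or "fragment".
new_reagents() {
  awk -F',' -v kind="$2" '
    /^#/ || $1 == "Reagent ID" || NF < 3 { next }   # skip comments, header, blank rows
    $1 ~ /^\*/ { next }                             # already in a database
    (kind == "primer" && $3 != "N/A") || (kind == "fragment" && $3 == "N/A") { print $1 "," $2 }
  ' "$1"
}

# Example: append the new entries to the active spreadsheets.
#   new_reagents pW256.output-reagents.csv primer   >> primer_database/2_active_primer.csv
#   new_reagents pW256.output-reagents.csv fragment >> frag_database/2_active_frag.csv
```

Review the appended rows before ordering, since the sketch keeps only the ID and sequence columns.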

PCR amplification and Gibson Assembly.

Print the table in "output-strategy.csv" for bench work reference.
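If you prefer a plain-text printout over a spreadsheet, the first five columns can be aligned in the terminal. This is a small sketch assuming a comma-separated strategy file; `show_strategy` is a hypothetical helper, and the column widths are arbitrary choices.

```shell
# Hypothetical helper: print comment lines as they are and align the first
# five columns (Frag ID, Fwd Primer, Rev Primer, Template, Size) of each row.
show_strategy() {
  awk -F',' '
    /^#/ { print; next }
    NF >= 5 { printf "%-14s %-12s %-12s %-10s %s\n", $1, $2, $3, $4, $5 }
  ' "$1"
}

# Example:
#   show_strategy pW256.output-strategy.csv
```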

Follow this protocol for PCR amplification using a high-fidelity DNA polymerase.

Follow this protocol for assembling the fragments using Gibson Assembly.
