Substitution Matrices

A CRUD contract for substitution matrices such as BLOSUM50, BLOSUM62 and PAM250, which are commonly used in bioinformatics and evolutionary biology.

This has been built by ZENODE within the Hardhat environment and is licensed under the MIT license (see LICENSE.md).

Overview

Dependencies

- hardhat (npm module)
- web3 (npm module)
- Uses the zenode-contracts repository, which is automatically included as a Git submodule.

Features

- CRUD in Solidity; immutable code, but flexible by design.
- Modular; loose coupling and high cohesion promote easy implementation into other contracts.
- Re-usable; deploy only once and use in multiple contracts.
- Ownership; access control and administrative privilege management.

Dataset

AA (Amino acids; alphabet for Proteins)
- BLOSUM50
- BLOSUM62
- PAM40
- PAM120
- PAM250
NT (Nucleotides; alphabet for DNA — also known as the 'Nucleic acid notation')
- SIMPLE
- SMART

Hardhat

Scripts
- deploy.js - deploys the contract to the configured network.
- insert.js - reads, parses and inserts matrices or alphabets.
- delete.js - deletes matrices or alphabets.
Tasks for contract interaction (see 6. Interaction).

AWK

Text parsers that convert matrices and alphabets into Solidity code.

Getting Started

TL;DR

0. Clone --use the --recursive flag.
git clone --recursive https://github.com/zenodeapp/substitution-matrices.git <destination_folder>
1. Installation --use npm, yarn or any other package manager.
npm install
yarn install
2. Run the test node --do this in a separate terminal!
npx hardhat node
3. Deployment
npx hardhat run scripts/deploy.js
4. Configuration --add the contract address to zenode.config.js.
...
contracts: {
  substitutionMatrices: {
    name: "SubstitutionMatrices",
    address: "ADD_YOUR_CONTRACT_ADDRESS_HERE",
  },
},
...
5. Population
npx hardhat run scripts/alphabets/insert.js
npx hardhat run scripts/matrices/insert.js
6. Interaction --use the Hardhat tasks described in the Interaction phase.

0. Clone

To get started, clone the repository with the --recursive flag:

git clone --recursive https://github.com/zenodeapp/substitution-matrices.git <destination_folder>

This repository includes submodules, so the clone command needs the --recursive flag.

If you've already downloaded, forked or cloned this repository without including the --recursive flag, then run this command from the root folder:

git submodule update --init --recursive

Read more on how to work with submodules in the zenode-contracts repository.

1. Installation

Install all dependencies using a package manager of your choosing:

npm install

yarn install

2. Configure and run your (test) node

After having installed all dependencies, use:

npx hardhat node

Make sure to do this in a separate terminal!

This will create a test environment that we can deploy our contract(s) to. By default, this repository is configured to use Hardhat's local test node, but this can be changed in the hardhat.config.js file. For more information on how to do this, see Hardhat's documentation.

3. Deployment

Now that our node is up-and-running, we can deploy our contract using:

npx hardhat run scripts/deploy.js

You should see a message appear in your terminal, stating that the contract was deployed successfully.

4. Configuration

Our CRUD is deployed, but doesn't contain any data whatsoever. Before we go ahead and populate it with alphabets and matrices, we'll have to make a couple of changes to the zenode.config.js file.

4.1 Link contract address (required)

We add the address of our contract to the contracts object. That way the scripts and tasks know which deployed contract they should interact with.

...
contracts: {
  substitutionMatrices: {
    name: "SubstitutionMatrices",
    address: "ADD_YOUR_CONTRACT_ADDRESS_HERE",
  },
},
...

The contract address can be found in your terminal after deployment.
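A forgotten placeholder is an easy mistake to make here. The sketch below shows one way to check that the configured value at least looks like a contract address (0x followed by 40 hexadecimal characters). `looksLikeAddress` is a hypothetical helper and not part of this repository.

```javascript
// Hypothetical helper: true if the value has the shape of a 20-byte hex address.
// This only checks the format; it does not verify that a contract lives there.
function looksLikeAddress(value) {
  return typeof value === "string" && /^0x[0-9a-fA-F]{40}$/.test(value);
}

// Same shape as the contracts object in zenode.config.js.
const contracts = {
  substitutionMatrices: {
    name: "SubstitutionMatrices",
    address: "ADD_YOUR_CONTRACT_ADDRESS_HERE",
  },
};

if (!looksLikeAddress(contracts.substitutionMatrices.address)) {
  console.log("zenode.config.js: the contract address still needs to be filled in.");
}
```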

4.2 Editing insertions/deletions (Optional)

By default, all known alphabets and matrices will be inserted upon running the insert.js scripts (in the Population phase).

If you would like to change this behavior, edit the following key-value pairs:

{
  // You could also pass in a string instead of an array
  alphabetsToInsert: ["ALPHABET_ID_1", "ALPHABET_ID_2", ...],
  matricesToInsert: ["MATRIX_ID_1", "MATRIX_ID_2", ...],
}

and for the delete.js scripts:

{
  alphabetsToDelete: ["ALPHABET_ID_1", "ALPHABET_ID_2", ...],
  matricesToDelete: ["MATRIX_ID_1", "MATRIX_ID_2", ...],
}

NOTE: IDs are only valid if they are present in the alphabets or matrices objects (see 4.3).
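To make the two rules above concrete (a value may be a string or an array, and an ID must exist in the alphabets or matrices object), here is a small sketch. `toIdList` and `partitionIds` are hypothetical helpers; how the repository's scripts handle these values internally is not shown in this README.

```javascript
// Hypothetical helper: accept a single ID (string) or an array of IDs.
function toIdList(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Hypothetical helper: split the requested IDs into known and unknown ones,
// where "known" means the ID is a key of the given config object.
function partitionIds(ids, known) {
  const valid = [];
  const unknown = [];
  for (const id of toIdList(ids)) {
    const isKnown = Object.prototype.hasOwnProperty.call(known, id);
    (isKnown ? valid : unknown).push(id);
  }
  return { valid, unknown };
}

// Example config; "nucleotides" is deliberately missing from the alphabets object.
const alphabets = { amino_acids: "dataset/alphabets/aa.txt" };
const result = partitionIds(["amino_acids", "nucleotides"], alphabets);
console.log(result.valid);   // [ 'amino_acids' ]
console.log(result.unknown); // [ 'nucleotides' ]
```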

4.3 Adding new alphabets/matrices (Optional)

There are two steps to consider when adding new alphabets or matrices, namely:

1. The creation of the actual file that represents our new dataset, and
2. Creating a reference to this dataset in zenode.config.js.

For step one, it's important to know what data our text parser expects. The best way to find out is to look at the files we've already included in the dataset folder. I also suggest reading more about the formatting of Alphabets and Matrices in the Appendix.

For the second step we add our new dataset to one of the following objects:

alphabets

alphabets: {
  ALPHABET_ID_1: "ALPHABET_ID_1_RELATIVE_PATH",
  ALPHABET_ID_2: "ALPHABET_ID_2_RELATIVE_PATH",
  ...
},

or matrices

matrices: {
  MATRIX_ID_1: {
    alphabet: "ALPHABET_ID_2",
    file: "MATRIX_ID_1_RELATIVE_PATH",
  },
  MATRIX_ID_2: {
    alphabet: "ALPHABET_ID_1",
    file: "MATRIX_ID_2_RELATIVE_PATH",
  },
  ...
},

4.3.1 Remarks

The alphabets-object only requires an ID and RELATIVE_PATH.
The matrices-object on the other hand also requires you to add an ALPHABET_ID.
The IDs can be used in alphabetsToInsert, alphabetsToDelete, matricesToInsert and matricesToDelete (see 4.2).

4.3.2 Examples

alphabet amino_acids (protein sequence characters):

alphabets: {
  amino_acids: "dataset/alphabets/aa.txt",
}

matrix blosum100 using alphabet amino_acids:

matrices: {
  blosum100: {
    alphabet: "amino_acids",
    file: "dataset/matrices/blosum100.txt",
  },
}

IMPORTANT: adding a new alphabet or matrix doesn't mean it gets inserted into the contract in the Population phase. For this it has to be included in the alphabetsToInsert or matricesToInsert key-value pair! (see 4.2)
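One way to spot this situation is to compare the known datasets with the insertion lists. The sketch below does that for a config shaped like the examples above; `notScheduled` is a hypothetical helper and the config values are illustrative only.

```javascript
// Hypothetical helper: IDs that are defined in the config object
// but not listed in the corresponding *ToInsert value.
function notScheduled(known, toInsert) {
  const scheduled = new Set(Array.isArray(toInsert) ? toInsert : [toInsert]);
  return Object.keys(known).filter((id) => !scheduled.has(id));
}

// Illustrative config: blosum100 was added, but never scheduled for insertion.
const config = {
  alphabets: { amino_acids: "dataset/alphabets/aa.txt" },
  matrices: {
    blosum100: { alphabet: "amino_acids", file: "dataset/matrices/blosum100.txt" },
  },
  alphabetsToInsert: "amino_acids",
  matricesToInsert: [],
};

console.log(notScheduled(config.alphabets, config.alphabetsToInsert)); // []
console.log(notScheduled(config.matrices, config.matricesToInsert));   // [ 'blosum100' ]
```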

5. Population

Now that we've deployed our contract and configured our setup, we can start populating our CRUD with alphabets and matrices!

5.1 Insertion

To insert all the alphabets/matrices you've configured in the key-value pair alphabetsToInsert/matricesToInsert use:

npx hardhat run scripts/alphabets/insert.js

npx hardhat run scripts/matrices/insert.js

NOTE: you cannot insert a matrix before having inserted the alphabet it belongs to!
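The ordering rule can be checked before running the matrix insertion. The sketch below assumes you already have the list of inserted alphabet IDs (for instance from the getAlphabets task in the Interaction phase) and reports which matrices would be blocked. `blockedMatrices` and the `simple_nt` entry are hypothetical and only serve the example.

```javascript
// Hypothetical helper: matrices from matricesToInsert whose alphabet
// has not been inserted yet. Unknown matrix IDs are reported as blocked too.
function blockedMatrices(matrices, matricesToInsert, insertedAlphabets) {
  const inserted = new Set(insertedAlphabets);
  return matricesToInsert.filter((id) => {
    const entry = matrices[id];
    return !entry || !inserted.has(entry.alphabet);
  });
}

// Illustrative config; the simple_nt entry and its path are made up.
const matrices = {
  blosum100: { alphabet: "amino_acids", file: "dataset/matrices/blosum100.txt" },
  simple_nt: { alphabet: "nucleotides", file: "HYPOTHETICAL_RELATIVE_PATH" },
};

// Only the amino_acids alphabet has been inserted so far.
const blocked = blockedMatrices(matrices, ["blosum100", "simple_nt"], ["amino_acids"]);
console.log(blocked); // [ 'simple_nt' ]
```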

5.2 Deletion

To delete all the alphabets/matrices you've configured in the key-value pair alphabetsToDelete/matricesToDelete use:

npx hardhat run scripts/alphabets/delete.js

npx hardhat run scripts/matrices/delete.js

6. Interaction

Deployed, populated and ready to explore!

Here are a few Hardhat tasks (written in hardhat.config.js) to test our contract with:

getScore

Get the alignment score of two characters based on the given substitution matrix.
- input: --matrix string --a char --b char
- output: int
```
npx hardhat getScore --matrix "MATRIX_ID" --a "SINGLE_CHAR_A" --b "SINGLE_CHAR_B"
```
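Conceptually, a score lookup maps each character to its position in the alphabet and then reads the cell at that row and column of the matrix. The plain JavaScript sketch below illustrates the idea off-chain; it is not the contract's implementation, `lookupScore` is a hypothetical name, and the toy match/mismatch values are not taken from the dataset folder.

```javascript
// Toy alphabet and scoring grid (illustrative values only).
const alphabet = ["A", "C", "G", "T"];
const grid = [
  [1, -1, -1, -1],
  [-1, 1, -1, -1],
  [-1, -1, 1, -1],
  [-1, -1, -1, 1],
];

// Hypothetical helper: the position of a character in the alphabet is
// used as the row (for a) and column (for b) index into the grid.
function lookupScore(alphabet, grid, a, b) {
  const row = alphabet.indexOf(a);
  const col = alphabet.indexOf(b);
  if (row === -1 || col === -1) {
    throw new Error(`Character not in alphabet: ${row === -1 ? a : b}`);
  }
  return grid[row][col];
}

console.log(lookupScore(alphabet, grid, "A", "A")); // 1
console.log(lookupScore(alphabet, grid, "A", "G")); // -1
```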
getAlphabet

Returns an alphabet-object based on the given ALPHABET_ID.
- input: --id string
- output: struct Alphabet --see libraries/Structs.sol
```
npx hardhat getAlphabet --id "ALPHABET_ID"
```
getMatrix

Returns a matrix-object based on the given MATRIX_ID.
- input: --id string
- output: struct Matrix --see libraries/Structs.sol
```
npx hardhat getMatrix --id "MATRIX_ID"
```
getAlphabets

Returns the list of inserted ALPHABET_IDs.
- input: null
- output: string[]
```
npx hardhat getAlphabets
```
getMatrices

Returns the list of inserted MATRIX_IDs.
- input: null
- output: string[]
```
npx hardhat getMatrices
```

Appendix

A. Alphabets and Matrices

Alphabets and Matrices are the two main components of the SubstitutionMatrices contract. Alphabets include but are not limited to nucleotide and protein sequence characters (e.g. C, T, A and G), while matrices are 2-dimensional scoring grids (e.g. BLOSUM62, PAM40, PAM120, etc.). To get a better (visual) understanding, you should check out the alphabets and matrices included in the dataset folder.

These components are simple .txt files that abide by the following formatting rules:

- An alphabet is a single line of characters, where the position of a character represents its numeric value.
- A matrix is a 2-dimensional grid, where the first row and first column consist only of alphabetical characters.
- The remaining positions of a matrix are integers (zero, negative or positive).
- The order of the alphabetical characters inside a matrix should be the same as in the alphabet it belongs to (horizontally and vertically).
- Every character or integer, in both alphabets and matrices, is delimited by whitespace.
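The formatting rules above can be sketched as a small validator in JavaScript. This is not the repository's parser (that job is done by the AWK files); `parseAlphabet` and `parseMatrix` are hypothetical helpers, the sample data is a toy example, and the sketch assumes the header row contains only the alphabet characters, without a placeholder for the top-left corner.

```javascript
// Hypothetical helper: an alphabet is one line of whitespace-delimited characters.
function parseAlphabet(text) {
  const trimmed = text.trim();
  if (trimmed === "" || /\r?\n/.test(trimmed)) {
    throw new Error("An alphabet must be a single, non-empty line");
  }
  return trimmed.split(/\s+/);
}

// Hypothetical helper: checks the header row, the first column and the integer
// cells against the alphabet, and returns the scores as a 2-dimensional array.
function parseMatrix(text, alphabet) {
  const lines = text.split(/\r?\n/).map((l) => l.trim()).filter((l) => l !== "");
  if (lines.length !== alphabet.length + 1) {
    throw new Error("Unexpected number of rows");
  }
  const header = lines[0].split(/\s+/);
  if (header.join(" ") !== alphabet.join(" ")) {
    throw new Error("Header row does not match the alphabet");
  }
  return lines.slice(1).map((line, i) => {
    const [symbol, ...cells] = line.split(/\s+/);
    if (symbol !== alphabet[i]) {
      throw new Error(`Row ${i + 1} does not match the alphabet order`);
    }
    if (cells.length !== alphabet.length) {
      throw new Error(`Row ${symbol} has the wrong number of scores`);
    }
    return cells.map((cell) => {
      if (!/^[+-]?\d+$/.test(cell)) throw new Error(`Not an integer: ${cell}`);
      return Number(cell);
    });
  });
}

// Toy data (not taken from the dataset folder).
const parsedAlphabet = parseAlphabet("A C G T");
const parsedGrid = parseMatrix(
  ["   A  C  G  T", "A  1 -1 -1 -1", "C -1  1 -1 -1", "G -1 -1  1 -1", "T -1 -1 -1  1"].join("\n"),
  parsedAlphabet
);
console.log(parsedGrid[0][0], parsedGrid[0][2]); // 1 -1
```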

B. zenode.config.js

This is where most of the configuration for contract deployment and population takes place.

In the case of the substitution-matrices repository this includes:

- Choosing which alphabets/matrices get inserted or deleted in the Population phase.
- Configuring which contract we'll interact with in the Interaction phase.
- Expanding (or shrinking for that matter) the list of known alphabets and matrices.

Credits

Hardhat's infrastructure! (https://hardhat.org/)

— ZEN
