Added tests for CatalogTable.as_list_of_dicts (#1319)
* Implemented test case with empty and standard input.

* Updated README.md for input folder inside test.

* Changed the input file structure to organize test inputs better for different test cases.
Daud Ahmed committed Oct 23, 2024
1 parent b0b300e commit a3741d2
Showing 10 changed files with 147 additions and 33 deletions.
6 changes: 6 additions & 0 deletions package-lock.json

19 changes: 15 additions & 4 deletions test/inputs/README.md
@@ -1,16 +1,27 @@
# Inputs for testing
# Inputs for Testing

In this folder we outline different kinds of inputs for testing.
This folder contains various input files used for testing different components of the Ersilia project. Each test may require specific inputs in different formats, and these are organized to maintain clarity and separation.

## Input formats
## Folder Structure

- **Test-Specific Folders**: Each test has its own dedicated folder, named after the test. Inside these folders, you'll find input files specific to that test.
- For example, the folder `test_inputs/` contains inputs for the `test_inputs.py` test.

## Adding New Test Inputs

When adding new tests, create a folder named after the test and place all relevant input files inside. Make sure to follow the format conventions mentioned above.

## Input Formats

The input files are available in different formats to accommodate various needs during testing:

* `.csv`: Input in tabular format.
* `.json`: Input in JSON format.
* `.py`: These files are used to mimick Python variables, i.e. useful when we use Ersilia in the Python API.

## Chemistry

The molecules (drugs) considered in this files are the following (in SMILES format):
Some tests such as test_inputs require chemical structures of molecules (drugs), which are following (in SMILES format):

```
CC1C2C(CC3(C=CC(=O)C(=C3C2OC1=O)C)C)O # artemisin
Expand Down
102 changes: 102 additions & 0 deletions test/inputs/catalog_samples.json
@@ -0,0 +1,102 @@
[
    {
        "Identifier": "eos1579",
        "Slug": "metabokiller",
        "Title": "Carcinogenic potential of metabolites and small molecules"
    },
    {
        "Identifier": "eos157v",
        "Slug": "grover-freesolv",
        "Title": "Hydration free energy of small molecules in water"
    },
    {
        "Identifier": "eos18ie",
        "Slug": "antibiotics-ai",
        "Title": "Substructure-based search of novel antibiotics"
    },
    {
        "Identifier": "eos1af5",
        "Slug": "molgrad-caco2",
        "Title": "Coloring molecules for Caco-2 cell permeability"
    },
    {
        "Identifier": "eos1amn",
        "Slug": null,
        "Title": "3D pharmacophore descriptor"
    },
    {
        "Identifier": "eos1amr",
        "Slug": "grover-bbbp",
        "Title": "Blood-brain barrier penetration"
    },
    {
        "Identifier": "eos1bba",
        "Slug": null,
        "Title": "GeoGNN Molecular Representation Prediction"
    },
    {
        "Identifier": "eos1d7r",
        "Slug": "small-world-zinc",
        "Title": "Small World Zinc search"
    },
    {
        "Identifier": "eos1mxi",
        "Slug": "smiles-pe",
        "Title": "SmilesPE: tokenizer algorithm for SMILES, DeepSMILES, and SELFIES"
    },
    {
        "Identifier": "eos1n4b",
        "Slug": "hdac3-inh",
        "Title": "Identifying HDAC3 inhibitors"
    },
    {
        "Identifier": "eos1noy",
        "Slug": "chembl-sampler",
        "Title": "ChEMBL Molecular Sampler"
    },
    {
        "Identifier": "eos1pu1",
        "Slug": "cardiotox-dictrank",
        "Title": "Cardiotoxicity Classifier"
    },
    {
        "Identifier": "eos1ut3",
        "Slug": "molfeat-usrcat",
        "Title": "USR descriptors with pharmacophoric constraints"
    },
    {
        "Identifier": "eos1vms",
        "Slug": "chembl-multitask-descriptor",
        "Title": "Multi-target prediction based on ChEMBL data"
    },
    {
        "Identifier": "eos1xje",
        "Slug": "biogpt-embeddings",
        "Title": "BioGPT embeddings"
    },
    {
        "Identifier": "eos21q7",
        "Slug": "inter_dili",
        "Title": "InterDILI: drug-induced injury prediction"
    },
    {
        "Identifier": "eos22io",
        "Slug": "idl-ppbopt",
        "Title": "Human Plasma Protein Binding (PPB) of Compounds"
    },
    {
        "Identifier": "eos238c",
        "Slug": "mesh-therapeutic-use",
        "Title": "MeSH therapeutic use based on chemical structure"
    },
    {
        "Identifier": "eos2401",
        "Slug": "scaffold-decoration",
        "Title": "Scaffold decoration"
    },
    {
        "Identifier": "eos24ci",
        "Slug": "drugtax",
        "Title": "DrugTax: Drug taxonomy"
    }
]
8 changes: 0 additions & 8 deletions test/inputs/compound_list.csv

This file was deleted.

3 changes: 0 additions & 3 deletions test/inputs/compound_lists.csv

This file was deleted.

5 changes: 0 additions & 5 deletions test/inputs/compound_pair_of_lists.csv

This file was deleted.

3 changes: 0 additions & 3 deletions test/inputs/compound_pairs_of_lists.csv

This file was deleted.

2 changes: 0 additions & 2 deletions test/inputs/compound_single.csv

This file was deleted.

8 changes: 0 additions & 8 deletions test/inputs/compound_singles.csv

This file was deleted.

24 changes: 24 additions & 0 deletions test/test_catalog.py
@@ -0,0 +1,24 @@
import json
import os
import pytest
from ersilia.hub.content.catalog import CatalogTable

@pytest.fixture
def catalog_samples():
    file_path = os.path.join(os.path.dirname(__file__), 'inputs', 'catalog_samples.json')
    with open(file_path, 'r') as f:
        samples = json.load(f)
    return samples

def test_as_list_of_dicts(catalog_samples):
    columns = ['Identifier', 'Slug', 'Title']

    # Test with standard catalog samples
    catalog_table = CatalogTable(data=[list(item.values()) for item in catalog_samples], columns=columns)
    result = catalog_table.as_list_of_dicts()
    assert result == catalog_samples, "The result does not match the expected catalog samples"

    # Test with empty catalog data
    catalog_table_empty = CatalogTable(data=[], columns=columns)
    result_empty = catalog_table_empty.as_list_of_dicts()
    assert result_empty == [], "The result should be an empty list for empty input data"
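
The round trip this test relies on can be sketched without Ersilia installed. The class below, `CatalogTableSketch`, is a hypothetical stand-in and not the actual `CatalogTable` implementation; it only assumes that the table stores rows as lists alongside a list of column names, and that `as_list_of_dicts` pairs each row with those names. The two sample records are copied from `catalog_samples.json`.

```python
class CatalogTableSketch:
    """Hypothetical stand-in for CatalogTable (assumption, not Ersilia's code)."""

    def __init__(self, data, columns):
        self.data = data          # list of rows, each row a list of values
        self.columns = columns    # list of column names, one per row position

    def as_list_of_dicts(self):
        # Pair each row's values with the column names, one dict per row.
        return [dict(zip(self.columns, row)) for row in self.data]


columns = ["Identifier", "Slug", "Title"]

# Two records taken from catalog_samples.json; JSON null loads as None.
samples = [
    {
        "Identifier": "eos1579",
        "Slug": "metabokiller",
        "Title": "Carcinogenic potential of metabolites and small molecules",
    },
    {
        "Identifier": "eos1amn",
        "Slug": None,
        "Title": "3D pharmacophore descriptor",
    },
]

# Build rows by column name so the order does not depend on dict key order.
rows = [[item[name] for name in columns] for item in samples]

round_trip = CatalogTableSketch(data=rows, columns=columns).as_list_of_dicts()
empty = CatalogTableSketch(data=[], columns=columns).as_list_of_dicts()
```

In this sketch the rows are built by indexing each record with the column names rather than with `list(item.values())`. The committed test works because every JSON record lists its keys in the same order as `columns`; indexing by name would keep the comparison valid if a record's keys were ever reordered.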
