Added tests for CatalogTable.as_list_of_dicts (#1319)

* Implemented test case with empty and standard input.
* Updated README.md for input folder inside test.
* Changed the input file structure to organize test inputs better for different test cases.
ersilia-os · Oct 23, 2024 · a3741d2
1 parent b0b300e

Showing 10 changed files with 147 additions and 33 deletions.
diff --git a/package-lock.json b/package-lock.json
diff --git a/test/inputs/README.md b/test/inputs/README.md
@@ -1,16 +1,27 @@
-# Inputs for testing
+# Inputs for Testing
 
-In this folder we outline different kinds of inputs for testing.
+This folder contains various input files used for testing different components of the Ersilia project. Each test may require specific inputs in different formats, and these are organized to maintain clarity and separation.
 
-## Input formats
+## Folder Structure
+
+- **Test-Specific Folders**: Each test has its own dedicated folder, named after the test. Inside these folders, you'll find input files specific to that test.
+  - For example, the folder `test_inputs/` contains inputs for the `test_inputs.py` test.
+
+## Adding New Test Inputs
+
+When adding new tests, create a folder named after the test and place all relevant input files inside. Make sure to follow the format conventions described in the Input Formats section below.
+
+## Input Formats
+
+The input files are available in different formats to accommodate various needs during testing:
 
 * `.csv`: Input in tabular format.
 * `.json`: Input in JSON format.
 * `.py`: These files are used to mimic Python variables, which is useful when using Ersilia through the Python API.
 
 ## Chemistry
 
-The molecules (drugs) considered in this files are the following (in SMILES format):
+Some tests, such as `test_inputs`, require chemical structures of molecules (drugs). These are listed below in SMILES format:
 
 ```
 CC1C2C(CC3(C=CC(=O)C(=C3C2OC1=O)C)C)O # artemisin

diff --git a/test/inputs/catalog_samples.json b/test/inputs/catalog_samples.json
@@ -0,0 +1,102 @@
+[
+    {
+        "Identifier": "eos1579",
+        "Slug": "metabokiller",
+        "Title": "Carcinogenic potential of metabolites and small molecules"
+    },
+    {
+        "Identifier": "eos157v",
+        "Slug": "grover-freesolv",
+        "Title": "Hydration free energy of small molecules in water"
+    },
+    {
+        "Identifier": "eos18ie",
+        "Slug": "antibiotics-ai",
+        "Title": "Substructure-based search of novel antibiotics"
+    },
+    {
+        "Identifier": "eos1af5",
+        "Slug": "molgrad-caco2",
+        "Title": "Coloring molecules for Caco-2 cell permeability"
+    },
+    {
+        "Identifier": "eos1amn",
+        "Slug": null,
+        "Title": "3D pharmacophore descriptor"
+    },
+    {
+        "Identifier": "eos1amr",
+        "Slug": "grover-bbbp",
+        "Title": "Blood-brain barrier penetration"
+    },
+    {
+        "Identifier": "eos1bba",
+        "Slug": null,
+        "Title": "GeoGNN Molecular Representation Prediction"
+    },
+    {
+        "Identifier": "eos1d7r",
+        "Slug": "small-world-zinc",
+        "Title": "Small World Zinc search"
+    },
+    {
+        "Identifier": "eos1mxi",
+        "Slug": "smiles-pe",
+        "Title": "SmilesPE: tokenizer algorithm for SMILES, DeepSMILES, and SELFIES"
+    },
+    {
+        "Identifier": "eos1n4b",
+        "Slug": "hdac3-inh",
+        "Title": "Identifying HDAC3 inhibitors"
+    },
+    {
+        "Identifier": "eos1noy",
+        "Slug": "chembl-sampler",
+        "Title": "ChEMBL Molecular Sampler"
+    },
+    {
+        "Identifier": "eos1pu1",
+        "Slug": "cardiotox-dictrank",
+        "Title": "Cardiotoxicity Classifier"
+    },
+    {
+        "Identifier": "eos1ut3",
+        "Slug": "molfeat-usrcat",
+        "Title": "USR descriptors with pharmacophoric constraints"
+    },
+    {
+        "Identifier": "eos1vms",
+        "Slug": "chembl-multitask-descriptor",
+        "Title": "Multi-target prediction based on ChEMBL data"
+    },
+    {
+        "Identifier": "eos1xje",
+        "Slug": "biogpt-embeddings",
+        "Title": "BioGPT embeddings"
+    },
+    {
+        "Identifier": "eos21q7",
+        "Slug": "inter_dili",
+        "Title": "InterDILI: drug-induced injury prediction"
+    },
+    {
+        "Identifier": "eos22io",
+        "Slug": "idl-ppbopt",
+        "Title": "Human Plasma Protein Binding (PPB) of Compounds"
+    },
+    {
+        "Identifier": "eos238c",
+        "Slug": "mesh-therapeutic-use",
+        "Title": "MeSH therapeutic use based on chemical structure"
+    },
+    {
+        "Identifier": "eos2401",
+        "Slug": "scaffold-decoration",
+        "Title": "Scaffold decoration"
+    },
+    {
+        "Identifier": "eos24ci",
+        "Slug": "drugtax",
+        "Title": "DrugTax: Drug taxonomy"
+    }
+]
diff --git a/test/inputs/compound_list.csv b/test/inputs/compound_list.csv
diff --git a/test/inputs/compound_lists.csv b/test/inputs/compound_lists.csv
diff --git a/test/inputs/compound_pair_of_lists.csv b/test/inputs/compound_pair_of_lists.csv
diff --git a/test/inputs/compound_pairs_of_lists.csv b/test/inputs/compound_pairs_of_lists.csv
diff --git a/test/inputs/compound_single.csv b/test/inputs/compound_single.csv
diff --git a/test/inputs/compound_singles.csv b/test/inputs/compound_singles.csv
diff --git a/test/test_catalog.py b/test/test_catalog.py
@@ -0,0 +1,24 @@
+import json
+import os
+import pytest
+from ersilia.hub.content.catalog import CatalogTable
+
+@pytest.fixture
+def catalog_samples():
+    file_path = os.path.join(os.path.dirname(__file__), 'inputs', 'catalog_samples.json')
+    with open(file_path, 'r') as f:
+        samples = json.load(f)
+    return samples
+
+def test_as_list_of_dicts(catalog_samples):
+    columns = ['Identifier', 'Slug', 'Title']
+
+    # Test with standard catalog samples
+    catalog_table = CatalogTable(data=[list(item.values()) for item in catalog_samples], columns=columns)
+    result = catalog_table.as_list_of_dicts()
+    assert result == catalog_samples, "The result does not match the expected catalog samples"
+
+    # Test with empty catalog data
+    catalog_table_empty = CatalogTable(data=[], columns=columns)
+    result_empty = catalog_table_empty.as_list_of_dicts()
+    assert result_empty == [], "The result should be an empty list for empty input data"
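
The round trip checked by `test_as_list_of_dicts` can be sketched without the Ersilia package. The `SimpleTable` class and the `rows_from_records` helper below are hypothetical stand-ins written for illustration; they are not Ersilia's `CatalogTable` implementation, and they only assume that a table holds row lists plus column names. The sketch also builds rows by column name instead of `list(item.values())`, so the result does not depend on the key order of each JSON record.

```python
from typing import Any, Dict, List


class SimpleTable:
    """Hypothetical stand-in for a table made of rows and column names."""

    def __init__(self, data: List[List[Any]], columns: List[str]) -> None:
        self.data = data
        self.columns = columns

    def as_list_of_dicts(self) -> List[Dict[str, Any]]:
        # Pair each row's values with the column names, in order.
        return [dict(zip(self.columns, row)) for row in self.data]


def rows_from_records(
    records: List[Dict[str, Any]], columns: List[str]
) -> List[List[Any]]:
    # Look values up by column name so key order in a record is irrelevant.
    return [[record[column] for column in columns] for record in records]


columns = ["Identifier", "Slug", "Title"]
records = [
    {
        "Identifier": "eos1579",
        "Slug": "metabokiller",
        "Title": "Carcinogenic potential of metabolites and small molecules",
    },
    # Keys deliberately out of order, with a null slug as in the sample file.
    {"Title": "3D pharmacophore descriptor", "Slug": None, "Identifier": "eos1amn"},
]

table = SimpleTable(data=rows_from_records(records, columns), columns=columns)
round_trip = table.as_list_of_dicts()
empty = SimpleTable(data=[], columns=columns).as_list_of_dicts()
```

Under these assumptions, `round_trip` compares equal to `records` and `empty` is an empty list, which mirrors the two assertions in the test. Entries with `"Slug": null` in `catalog_samples.json` load as `None` and pass through unchanged.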