Cluster the cluster data #18

ialarmedalien · 2020-08-18T23:02:35Z

Part 1 of the changes in this old PR in the relation_engine_spec repo.

Merge all cluster fields in the djornl_node collection into a single field.
Update parser and tests accordingly.

I updated the README.md docs to reflect this change. -- N/A
This is not a breaking API change
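
For readers unfamiliar with the change, the merge described above can be sketched roughly as follows. This is an illustration only: the old per-system field names (`cluster_I2`, `cluster_I4`, `cluster_I6`), their integer values, and the helper `merge_cluster_fields` are assumptions, not code from this PR.

```python
# Hypothetical mapping from old per-system fields to the new ID prefixes.
# Only the cluster_I2 -> markov_i2 rename is visible in the diff; the
# other two entries are assumed by analogy.
OLD_TO_NEW_PREFIX = {
    "cluster_I2": "markov_i2",
    "cluster_I4": "markov_i4",
    "cluster_I6": "markov_i6",
}


def merge_cluster_fields(node):
    """Return a copy of `node` with the per-system fields folded into `clusters`."""
    # Copy every field that is not one of the old cluster fields.
    merged = {k: v for k, v in node.items() if k not in OLD_TO_NEW_PREFIX}
    clusters = []
    for old_field, prefix in OLD_TO_NEW_PREFIX.items():
        cluster_id = node.get(old_field)
        if cluster_id is not None:
            clusters.append(f"{prefix}:{cluster_id}")
    # Only add the field when the node belongs to at least one cluster.
    if clusters:
        merged["clusters"] = clusters
    return merged


print(merge_cluster_fields({"_key": "n1", "cluster_I2": 1, "cluster_I4": 5}))
```

A node without any cluster assignment comes back unchanged, with no `clusters` field added.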

ialarmedalien · 2020-08-18T23:03:09Z

importers/djornl/parser.py

@@ -43,15 +46,15 @@ def _configure(self):

        _CLUSTER_BASE = os.path.join(configuration['ROOT_DATA_PATH'], 'cluster_data')
        configuration['_CLUSTER_PATHS'] = {
-            'cluster_I2': os.path.join(
+            'markov_i2': os.path.join(


Renamed these keys to something more informative.

ialarmedalien · 2020-08-18T23:04:26Z

importers/djornl/parser.py

+
+        self.check_deltas(edge_data=edge_data, node_metadata=node_metadata, cluster_data=clusters)
+
+    def check_deltas(self, edge_data={}, node_metadata={}, cluster_data={}):


Adds a brief dataset summary for sanity checking.
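
A summary of this kind might look roughly like the sketch below. It is not the body of `check_deltas`, which the diff does not show; `summarise_dataset` is a hypothetical helper, and it assumes each dataset is a dict mapping a name to a list of records.

```python
def summarise_dataset(edge_data=None, node_metadata=None, cluster_data=None):
    """Count the records under each key of each parsed dataset."""
    # None is treated as an empty dataset.
    datasets = {
        "edge_data": edge_data or {},
        "node_metadata": node_metadata or {},
        "cluster_data": cluster_data or {},
    }
    return {
        name: {key: len(items) for key, items in data.items()}
        for name, data in datasets.items()
    }


print(summarise_dataset(edge_data={"nodes": ["a", "b"], "edges": ["a__b"]}))
```

The sketch uses `None` defaults rather than `{}`, since a mutable default argument in Python is created once and shared between calls.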

ialarmedalien · 2020-08-18T23:05:42Z

importers/test/test_djornl_parser.py

+        for data_structure in [edge_data, expected]:
+            for k in data_structure.keys():
+                data_structure[k] = sorted(data_structure[k], key=lambda n: n['_key'])


Sort the data before comparing, as it won't necessarily be ordered when it comes out of the parser.
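
A small standalone illustration of why the sort is needed, using made-up records rather than the real test fixtures (`sort_by_key` is a hypothetical helper, not part of the test file):

```python
def sort_by_key(data_structure):
    """Return a copy with every list of records sorted by its `_key` field."""
    return {
        k: sorted(records, key=lambda n: n["_key"])
        for k, records in data_structure.items()
    }


# Same records, different order: a plain comparison fails.
parsed = {"nodes": [{"_key": "b"}, {"_key": "a"}]}
expected = {"nodes": [{"_key": "a"}, {"_key": "b"}]}

print(parsed == expected)
print(sort_by_key(parsed) == sort_by_key(expected))
```

Unlike the loop in the diff, this version builds new dicts instead of sorting in place, which leaves the fixtures untouched.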

ialarmedalien · 2020-08-18T23:06:43Z

spec/collections/djornl/djornl_node.yaml

+    clusters:
+      type: array
+      title: Clusters
+      description: Clusters to which the node has been assigned
+      items:
+        type: string
+        format: regex
+        pattern: ^\w+:\d+$
+      examples: [["markov_i2:1", "markov_i4:5"], ["markov_i6:3"]]


The important bit
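
To make the pattern concrete, here is a minimal sketch that checks a `clusters` value against the same regular expression using only the standard library. It stands in for, and is not, the schema validation the project actually runs; `is_valid_cluster_list` is a hypothetical name.

```python
import re

# Same pattern as in the schema above.
CLUSTER_ID = re.compile(r"^\w+:\d+$")


def is_valid_cluster_list(clusters):
    """True if `clusters` is a list of strings that all match the pattern."""
    return isinstance(clusters, list) and all(
        isinstance(c, str) and CLUSTER_ID.search(c) is not None
        for c in clusters
    )


print(is_valid_cluster_list(["markov_i2:1", "markov_i4:5"]))
print(is_valid_cluster_list(["markov_i2"]))
```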

ialarmedalien · 2020-08-18T23:07:48Z

spec/test/stored_queries/test_djornl.py

    # results are in the form {"nodes": [...], "edges": [...]}
    # nodes are represented as a list of node[_key]
    # edges are objects with keys _to, _from, edge_type and score

-    def test_fetch_phenotypes_no_results(self):


The tests for queries with no results have been merged into the other tests.

jayrbolton · 2020-08-19T19:31:48Z

spec/stored_queries/djornl/djornl_fetch_clusters.yaml

+      title: Cluster IDs
+      description: Cluster IDs, in the form "clustering_system_name:cluster_id"
+      items: {type: string}
+      examples: [['markov_i2:5', 'markov_i6:2'],['markov_i6:1']]


Should this be an object so we don't have to parse these entries?

I guess if the client is already using string parameters like "markov_i2:5", then it doesn't matter.
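
For comparison, parsing one of these strings into an object is short. The sketch below assumes the "clustering_system_name:cluster_id" form from the description; `parse_cluster_id` and the `system`/`id` key names are hypothetical.

```python
def parse_cluster_id(cluster_id):
    """Split 'clustering_system_name:cluster_id' into its two parts."""
    system, _, number = cluster_id.partition(":")
    if not system or not number.isdecimal():
        raise ValueError(f"not a valid cluster ID: {cluster_id!r}")
    return {"system": system, "id": int(number)}


print(parse_cluster_id("markov_i2:5"))
```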

ialarmedalien requested a review from jayrbolton as a code owner August 18, 2020 23:02

Reformat clusters to be a single field in the djornl_node collection.

b7780a0

Update parser and tests accordingly

ialarmedalien force-pushed the cluster_the_clusters branch from 65ffbff to b7780a0 Compare August 18, 2020 23:09

Base automatically changed from spec_loader_refactor to develop August 19, 2020 19:25

jayrbolton reviewed Aug 19, 2020

jayrbolton merged commit 7e9165b into develop Aug 19, 2020

jayrbolton deleted the cluster_the_clusters branch August 19, 2020 19:37
