Cluster the cluster data #18
@@ -43,15 +46,15 @@ def _configure(self):

  _CLUSTER_BASE = os.path.join(configuration['ROOT_DATA_PATH'], 'cluster_data')
  configuration['_CLUSTER_PATHS'] = {
-     'cluster_I2': os.path.join(
+     'markov_i2': os.path.join(
rename these to something more informative
        self.check_deltas(edge_data=edge_data, node_metadata=node_metadata, cluster_data=clusters)

    def check_deltas(self, edge_data={}, node_metadata={}, cluster_data={}):
brief dataset summary for sanity checking
for data_structure in [edge_data, expected]:
    for k in data_structure.keys():
        data_structure[k] = sorted(data_structure[k], key=lambda n: n['_key'])
order data as it won't necessarily be sorted when coming out of the parser
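A minimal sketch of why the sort matters: two structures holding the same documents in a different order compare unequal until both are ordered by `_key`. The helper name `sort_by_key` and the sample data are hypothetical and are not part of this PR.

```python
def sort_by_key(data_structure):
    """Return a copy in which every list of documents is sorted by '_key'."""
    return {
        name: sorted(docs, key=lambda doc: doc["_key"])
        for name, docs in data_structure.items()
    }


# Hypothetical parser output and expected fixture: same content, different order.
parsed = {"nodes": [{"_key": "B"}, {"_key": "A"}]}
expected = {"nodes": [{"_key": "A"}, {"_key": "B"}]}

# List comparison is order-sensitive, so the raw structures differ...
assert parsed != expected
# ...but they match once both sides are sorted the same way.
assert sort_by_key(parsed) == sort_by_key(expected)
```

Unlike the in-place loop in the diff, this variant returns new dictionaries and leaves its inputs untouched, which can be convenient when the same fixture is reused across tests.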
clusters:
  type: array
  title: Clusters
  description: Clusters to which the node has been assigned
  items:
    type: string
    format: regex
    pattern: ^\w+:\d+$
  examples: [["markov_i2:1", "markov_i4:5"], ["markov_i6:3"]]
The important bit
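As a rough illustration of what the `pattern` constraint accepts, the sketch below checks a list of cluster IDs against the same regular expression using Python's standard `re` module. The function name `invalid_cluster_ids` is an assumption made for this example; the repository's own validation presumably goes through its schema tooling rather than a helper like this.

```python
import re

# Same pattern as in the schema: a word-character name, a colon, then digits.
CLUSTER_ID_PATTERN = re.compile(r"^\w+:\d+$")


def invalid_cluster_ids(clusters):
    """Return the entries that do not look like 'clustering_system:cluster_id'."""
    return [entry for entry in clusters if not CLUSTER_ID_PATTERN.search(entry)]


# The examples from the schema all pass.
assert invalid_cluster_ids(["markov_i2:1", "markov_i4:5"]) == []
assert invalid_cluster_ids(["markov_i6:3"]) == []

# A missing ID, a space in the name, and an empty name are all rejected.
bad = invalid_cluster_ids(["markov_i6:3", "markov_i6", "markov i2:1", ":4"])
assert bad == ["markov_i6", "markov i2:1", ":4"]
```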
# results are in the form {"nodes": [...], "edges": [...]}
# nodes are represented as a list of node[_key]
# edges are objects with keys _to, _from, edge_type and score

def test_fetch_phenotypes_no_results(self):
the queries with no results have been merged in with the other tests
Update parser and tests accordingly
65ffbff to b7780a0
title: Cluster IDs
description: Cluster IDs, in the form "clustering_system_name:cluster_id"
items: {type: string}
examples: [['markov_i2:5', 'markov_i6:2'],['markov_i6:1']]
Should this be an object so we don't have to parse these entries?
I guess if the client is using string parameters like "markov_i2:5" then it doesn't matter
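For comparison, here is a sketch of what a consumer would need to do if it wanted the object form after all: split each string entry on the colon and group the IDs by clustering system. The function `group_cluster_ids` is hypothetical and only illustrates the parsing cost discussed above; it is not code from this PR.

```python
from collections import defaultdict


def group_cluster_ids(clusters):
    """Turn ['system:id', ...] into {'system': [id, ...]}."""
    grouped = defaultdict(list)
    for entry in clusters:
        # Split on the last colon: the ID is the numeric part after it.
        system, _, cluster_id = entry.rpartition(":")
        grouped[system].append(int(cluster_id))
    return dict(grouped)


# Using the example values from the schema.
assert group_cluster_ids(["markov_i2:5", "markov_i6:2"]) == {
    "markov_i2": [5],
    "markov_i6": [2],
}
```

If clients pass and compare the string form directly, as the comment above suggests, none of this parsing is needed.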
Part 1 of the changes in this old PR in the relation_engine_spec repo.
Merge all cluster fields in the djornl_node collection into a single field. Update parser and tests accordingly.
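The merge described above might look roughly like the following sketch. It assumes, purely for illustration, that each node previously carried one field per clustering system, named after that system; the actual previous field layout is not shown in this PR. The function `merge_cluster_fields` and the sample node are hypothetical. The merged field name `clusters` and the `system:id` format follow the schema in this PR.

```python
def merge_cluster_fields(node, cluster_systems):
    """Collapse per-system cluster fields into a single 'clusters' list.

    `cluster_systems` lists the (assumed) per-system field names to merge.
    """
    # Keep every field that is not a per-system cluster field.
    merged = {k: v for k, v in node.items() if k not in cluster_systems}
    # Encode each assignment as "clustering_system_name:cluster_id".
    merged["clusters"] = [
        f"{system}:{node[system]}" for system in cluster_systems if system in node
    ]
    return merged


# Hypothetical node with assignments in two of three clustering systems.
node = {"_key": "node_1", "markov_i2": 5, "markov_i6": 2}
result = merge_cluster_fields(node, ["markov_i2", "markov_i4", "markov_i6"])

assert result == {"_key": "node_1", "clusters": ["markov_i2:5", "markov_i6:2"]}
```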