Implementation of BETA BCO Ranking systems #329

Closed
HadleyKing opened this issue Jun 17, 2024 · 4 comments

@HadleyKing (Collaborator)

Implement the ideas from #328 into the BCO Scoring function:

def bco_score(bco_instance: Bco) -> Bco:
    """BCO Score
    Process and score BioCompute Objects (BCOs).
    """
    contents = bco_instance.contents
    if "usability_domain" not in contents:
        bco_instance.score = 0
        return bco_instance
    try:
        usability_domain_length = sum(len(s) for s in contents['usability_domain'])
        score = {"usability_domain_length": usability_domain_length}
    except TypeError:
        score = {"usability_domain_length": 0}
        usability_domain_length = 0
    bco_instance.score = usability_domain_length
    return bco_instance
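
For reference, a minimal sketch of the behavior, assuming the bco_score function above is in scope. FakeBco below is a hypothetical stand-in for the project's Bco model; it only duck-types against the two attributes the function touches:

from dataclasses import dataclass

@dataclass
class FakeBco:
    contents: dict   # BCO document contents
    score: int = 0   # populated by bco_score

bco = FakeBco(contents={
    "usability_domain": [
        "Pipeline for calling germline variants from WGS data.",
        "Intended for downstream clinical review.",
    ]
})

# score ends up as the summed character length of the usability_domain strings
print(bco_score(bco).score)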

@HadleyKing HadleyKing added the enhancement New feature or request label Jun 17, 2024
@HadleyKing HadleyKing added this to the 24.06.27 milestone Jun 17, 2024
@Kirans0615 (Collaborator)

@seankim658 (Member)

Spoke with Hadley and we had some ideas for the representation of the scores in the data model. I've been implementing scores in the biomarker project for scoring "trustworthy" biomarkers, and a few things we've done there have made the scores easier to track:

  1. Have some sort of internal versioning for the scores. Scoring is an iterative process that changes over time, and having a way to delineate which scores came from which version of the formula is very helpful.
  2. When calculating the scores, create an object with the formula breakdown. This can be used both internally when investigating scores and on the frontend to show users how the score was calculated and where the weights come from. The biomarker project keeps that information in our data schema and returns it on API requests. It looks like this:
{
  "score": 3.4,
    "score_info": {
      "contributions": [
        {
          "c": "first_pmid",
          "w": 1,
          "f": 1
        },
        {
          "c": "other_pmid",
          "w": 0.2,
          "f": 7
        },
        {
          "c": "first_source",
          "w": 1,
          "f": 1
        },
        {
          "c": "other_source",
          "w": 0.1,
          "f": 0
        },
        {
          "c": "generic_condition_pen",
          "w": -4,
          "f": 0
        },
        {
          "c": "loinc",
          "w": 1,
          "f": 0
        }
      ],
      "formula": "sum(w*f)",
      "variables": {
        "w": "weight",
        "c": "condition",
        "f": "frequency"
      }
    }
}

This shows that the score was calculated as the sum of the weights times the frequencies. For example, having one PMID associated with the biomarker carries a weight of 1, each additional PMID carries a weight of 0.2, and so on. So the calculation for this score was 1(1) + 0.2(7) + 1(1) = 3.4.
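
To make the formula concrete, here is a small sketch that recomputes the score from a contributions breakdown like the one above; the helper name and the score_version field are hypothetical (the version tag just illustrates where point 1 could live), not part of the biomarker codebase:

# Hypothetical helper: recompute a score from a score_info breakdown
# using the stated formula sum(w * f).
def score_from_contributions(score_info: dict) -> float:
    return round(sum(entry["w"] * entry["f"] for entry in score_info["contributions"]), 2)

score_info = {
    "score_version": "2024.06",  # hypothetical internal version tag (see point 1)
    "contributions": [
        {"c": "first_pmid", "w": 1, "f": 1},
        {"c": "other_pmid", "w": 0.2, "f": 7},
        {"c": "first_source", "w": 1, "f": 1},
        {"c": "other_source", "w": 0.1, "f": 0},
        {"c": "generic_condition_pen", "w": -4, "f": 0},
        {"c": "loinc", "w": 1, "f": 0},
    ],
    "formula": "sum(w*f)",
    "variables": {"w": "weight", "c": "condition", "f": "frequency"},
}

print(score_from_contributions(score_info))  # 3.4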

@tiwa1154 (Contributor)

Write a FAQ on how the ranking system works, criteria, etc?

@tiwa1154 (Contributor)

tiwa1154 commented Oct 2, 2024

FAQ created in #446

@tiwa1154 tiwa1154 closed this as completed Oct 2, 2024