Ingest ~2,000 high-quality models into Terarium #363
Comments
@bigglesandginger @YohannParis
@liunelson Have you used the API? When I try
You are right about the API. It seems to return the search page itself rather than a JSON listing of the model IDs. Once you have the model IDs, you can use this endpoint to get the model SBML file. Does this make sense?
I can write a script to crawl these and pull models & metadata if you'd like. I should have time on Monday/Tuesday, and I think it should only take a couple of hours at most. Just let me know the schema you need for the output.
This was a little bit annoying, since I had to render the JavaScript instead of just building a crawler with simple requests and HTML parsing, but it is done. I managed to pull all 2435 model href tags, the URL for the source publication, and the download link for the model file.
How do we add these to Terarium? Update: URLs, models and reference links can be found in the JSON here: https://drive.google.com/file/d/1Upv84-fWmSqBvTxSzRpJqSEQ3OQ61GTc/view?usp=share_link
@liunelson to convert these to AMR and upload to Terarium
I've ad-hoc converted to AMR JSON the 2k SBML models that Julian scraped from the BioModels repository: see here. I've also used the Open Access Button API to find the download link of the associated paper PDF: I was only able to download 10.8% of the open-access URLs, but I didn't spend any time trying to figure out why the other ~60% of open-access URL downloads didn't work. For some, a GET on the OA URL just returns a 404. However, for example, model
Charles, can you have Terarium ingest all these models with their paper (if available)?
The goal is to pre-populate Terarium with a significant number of "high quality" models from an existing repository such as BioModels.
The models that we want to ingest are the ~2432 models returned by the BioModels search interface with only the filter "model format = SBML". We should use the REST API to download the SBML file of each model.
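A minimal sketch of how the model IDs and SBML download links could be collected through the REST API. The base URL, the `/search` and `/model/download` paths, the `offset`/`numResults`/`format` parameters, and the `models`/`id` keys of the response are assumptions about the BioModels REST API and should be checked against its documentation; the helper names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumed base URL of the BioModels REST API.
BASE_URL = "https://www.ebi.ac.uk/biomodels"


def search_url(query: str, offset: int = 0, num_results: int = 100) -> str:
    """Build a paginated search URL that asks for a JSON response (assumed parameters)."""
    params = urlencode(
        {"query": query, "offset": offset, "numResults": num_results, "format": "json"}
    )
    return f"{BASE_URL}/search?{params}"


def download_url(model_id: str, filename: str) -> str:
    """Build the download URL for one file of one model (assumed path)."""
    return f"{BASE_URL}/model/download/{quote(model_id)}?{urlencode({'filename': filename})}"


def fetch_json(url: str) -> dict:
    """GET a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iter_model_ids(query: str, fetch=fetch_json, page_size: int = 100):
    """Yield model IDs page by page until the search returns an empty page."""
    offset = 0
    while True:
        page = fetch(search_url(query, offset, page_size))
        models = page.get("models", [])  # assumed response key
        if not models:
            return
        for model in models:
            yield model["id"]  # assumed response key
        offset += len(models)
```

The `fetch` argument is injectable so the pagination logic can be exercised offline with a stub before pointing it at the live service. The query string that reproduces the "model format = SBML" filter is left to the caller, since its exact syntax is not given in this thread.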
Each SBML model file (extension = xml or sbml) should go through the following script (requires the MIRA package) to convert it from SBML format to PetriNet AMR JSON format:
I've tested ~200 models and ~60% can be successfully converted into a PetriNet AMR JSON.
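Since a sizeable share of conversions fail, a batch run should record failures instead of stopping at the first one. The sketch below is one way to do that, not the conversion script itself: `convert` is a hypothetical callable that wraps whatever MIRA call turns one SBML file into an AMR JSON-compatible dict, and `convert_directory` is an illustrative name.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Tuple


def convert_directory(
    sbml_dir: Path,
    out_dir: Path,
    convert: Callable[[Path], dict],
) -> Tuple[List[str], Dict[str, str]]:
    """Convert every .xml/.sbml file in sbml_dir and write one JSON file per success.

    `convert` is a hypothetical wrapper around the actual SBML -> AMR conversion.
    Returns the names of converted files and a map from failed file name to error.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    converted: List[str] = []
    failed: Dict[str, str] = {}

    sbml_files = sorted(
        p for p in sbml_dir.iterdir() if p.suffix.lower() in {".xml", ".sbml"}
    )
    for path in sbml_files:
        try:
            amr = convert(path)
            (out_dir / f"{path.stem}.json").write_text(json.dumps(amr, indent=2))
            converted.append(path.name)
        except Exception as exc:  # keep going; one bad model should not stop the batch
            failed[path.name] = f"{type(exc).__name__}: {exc}"

    return converted, failed
```

The returned `failed` map gives a per-model error message, which makes it easier to see which models fall outside the successfully converted fraction and why.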
We'll need @j2whiting's help to subsequently populate the "Model Card" associated with each model.