FAIR data page #374

Merged
merged 7 commits into from
Dec 16, 2024
Conversation

@bedroesb (Member) commented Nov 26, 2024

  • Add contributors
  • Add event
  • Add to sidebar
  • Add content
  • Add tools

@bedroesb bedroesb requested a review from a team as a code owner November 26, 2024 15:49
@bedroesb bedroesb linked an issue Nov 26, 2024 that may be closed by this pull request
@bedroesb (Member Author) commented:

@EvaGarciaAlvarez I don't have time to add the tools now. Either I do it later or, if you want to start, you are always welcome!

@EvaGarciaAlvarez (Contributor) left a comment

Added minor comments to the page. In general it looks good!


## Findability

Findability is a crucial aspect of infectious diseases research, as it ensures that relevant data and resources can be easily located and accessed by researchers and other stakeholders.
Contributor

Maybe "easily discovered and located" instead of "located and accessed", as the latter already touches on Accessibility.


* Provide detailed metadata for infectious disease datasets, including the source, collection date, location, and any protocols performed (e.g. nasal swab as the method of isolation: [EFO:0010741](http://www.ebi.ac.uk/efo/EFO_0010741)). Even when the granularity of the (meta)data varies, you should always use descriptive fields with broadly understandable values.
* Use controlled vocabularies and ontologies to describe human data and infectious diseases (e.g. [EFO:0007182](http://www.ebi.ac.uk/efo/EFO_0007182) for Brill-Zinsser disease). Furthermore, do not forget contextual data that must meet intercommunity standards, for example: time, temperature, pressure, chemical components…
* Controlled vocabulary refers to a set of terms, standardised by the field community, used to describe and categorise concepts, ensuring consistency and accuracy in data organisation and retrieval. For example, when a disease (e.g. Alport syndrome) has multiple names in use (e.g. Alport deafness-nephropathy), it is recommended to use the one designated in the ontologies, so that redundancy is kept to a minimum.
Contributor

I think Alport syndrome is not an infectious disease. I'd either replace it with an example of an infectious disease or drop it.

Examples of ontologies related to infectious diseases, human data and human diseases are EFO (Experimental Factor Ontology), MONDO (Mondo Disease Ontology), HP (Human Phenotype Ontology), CIDO (Ontology of Coronavirus Infectious Disease), IDO (Infectious Disease Ontology), IDO-COVID-19 (The COVID-19 Infectious Disease Ontology), VIDO (The Virus Infectious Disease Ontology), DOID (Human Disease Ontology), OBI (Ontology for Biomedical Investigations), and VO (Vaccine Ontology).
* Share any recommendations on how to choose “good” ontologies, as this contributes to a better understanding of well-used and widely recognised terminologies in related fields. Some ideas can be found in: [Identifying, naming and interoperating data in a Phenotyping platform network : the good, the bad and the ugly.](https://doi.org/10.5281/zenodo.3539259)
* To aid with the taxonomic classification of your samples (human source, xenografts, tissue cultures, viral agents, etc.) you can make use of the [NCBI Taxonomy Browser](https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi).
* Please refer to RDA covid19 recommendation (and others) to help you to use most recognized terminologies adapted to your case: RDA COVID-19 Working Group. (2020). [RDA COVID-19 Recommendations and Guidelines on Data Sharing (1.0)](https://doi.org/10.15497/rda00052)
Contributor

covid19 --> COVID-19


* Drafting and interpreting data reuse policies is a complex and tedious task, especially when time is the main bottleneck of the research. For this reason, Data Use Conditions ({% tool "the-data-use-ontology" %}) were created (search for yours at {% tool "ols" %}). These allow datasets to be annotated with usage restrictions, enabling:
* Automatic discovery of the data based on user authorization level or intended use.
* A quick and easy interpretation, from the perspective of the users, of the conditions to be met for data usage. (e.g. use very well and open licences like [Creative Commons](https://creativecommons.org/) and repositories that permit public licences and embargos like {% tool "zenodo" %})
Contributor

use very well known (?) and open

* Make these controls in an iterative way and publish your metadata!
* Keep track of data o reuses, and if publicly available, give a perspective of what was done with your dataset
Contributor

data of reuses (?)

* Make your dataset citable!
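
The metadata recommendations in the list above can be illustrated with a minimal Python sketch. It builds one metadata record that pairs human-readable labels with the two EFO terms quoted in the page, checks that the descriptive fields named in the first bullet are present, and expands a compact identifier into the IRI form used in the page's links. The field names, the helper functions `curie_to_iri` and `missing_fields`, and the sample values for source, date and location are assumptions made for illustration; they are not part of the page or of any existing tool.

```python
import re
from datetime import date

# Compact identifiers (CURIEs) such as "EFO:0010741".
CURIE_PATTERN = re.compile(r"^(?P<prefix>[A-Za-z]+):(?P<local_id>\d+)$")

# IRI prefix per ontology; the EFO entry matches the links used in the page.
IRI_PREFIXES = {
    "EFO": "http://www.ebi.ac.uk/efo/EFO_",
}

# Hypothetical field names for the items listed in the first bullet.
REQUIRED_FIELDS = ("source", "collection_date", "location", "protocol")


def curie_to_iri(curie: str) -> str:
    """Expand a CURIE like 'EFO:0010741' into a resolvable IRI."""
    match = CURIE_PATTERN.match(curie)
    if match is None:
        raise ValueError(f"Not a valid CURIE: {curie!r}")
    prefix = match.group("prefix")
    if prefix not in IRI_PREFIXES:
        raise ValueError(f"Unknown ontology prefix: {prefix!r}")
    return IRI_PREFIXES[prefix] + match.group("local_id")


def missing_fields(record: dict) -> list[str]:
    """Return the required descriptive fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Illustrative record: source, date and location are made-up sample values.
record = {
    "source": "Homo sapiens",
    "collection_date": date(2024, 11, 26).isoformat(),
    "location": "Belgium",
    "protocol": {"label": "nasal swab", "term": "EFO:0010741"},
    "disease": {"label": "Brill-Zinsser disease", "term": "EFO:0007182"},
}

print(missing_fields(record))                      # []
print(curie_to_iri(record["protocol"]["term"]))    # http://www.ebi.ac.uk/efo/EFO_0010741
```

Keeping the label next to the ontology term is a design choice in this sketch: the label stays broadly understandable for human readers, while the term is what machines can match on.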
Contributor

just an idea: Maybe it would be great to say how it can be made citable

Implemented Eva's review comments
@bedroesb (Member Author) commented:

@EvaGarciaAlvarez, I think @hedi-ee has fixed everything you spotted. Can you approve the PR when you think it is good to go? Then we will also update the date in the news item.

@bedroesb bedroesb merged commit 76385fc into main Dec 16, 2024
4 checks passed
@bedroesb bedroesb deleted the fair-data branch December 16, 2024 08:26
Development

Successfully merging this pull request may close these issues.

New FAIRness page
3 participants