Implement Dropdown Selector for NIH Controlled Vocabulary in Keywords #267

Saixel · 2024-04-09T23:13:54Z

Background

As part of our ongoing collaboration with the CAFE project team, we have identified a need to refine the user experience in selecting controlled vocabulary terms for dataset keywords. This initiative aims to align the Dataverse metadata input process with standard vocabularies and enhance data discoverability and consistency.

Feature Request

Implement a selector (dropdown, box options, widget, etc.) that lets users select terms from the NIH controlled vocabulary glossary and add them as keywords.

Current State:

The 'Keyword' metadata section allows for manual text entry, with the option to add additional input fields.
There is a textual prompt directing users to the NIH glossary website for keyword selection.

Desired Functionality:

Replace the manual entry system with a dropdown selector or a similar UI component.
Dynamically populate options within the selector based on the NIH controlled vocabulary glossary.

Justification

The CAFE project team requires a more standardized and error-proof method for keyword selection to ensure metadata quality and consistency. This enhancement will support users in accurately tagging datasets, thus facilitating better data curation and searchability.

Implementation Considerations

Explore the use of "Controlled Vocabulary URL" for dynamic term loading from the NIH glossary.
Consider the integration of a resource that has been prepared with each keyword, a description, and a URL, possibly using this as a CSV or similar format to load selector options.
The selector UI must be intuitive and should support multiple term additions as per dataset requirements.
Backend integration must ensure correct storage and handling of the selected vocabulary terms.
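
As a rough illustration of the CSV-based option above, the sketch below parses a file with keyword, description, and URL columns into a sorted list of selector options. The column names (`keyword`, `description`, `url`), the `VocabularyTerm` type, and the sample rows are all assumptions made for this example; the real CAFE list may be laid out differently.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabularyTerm:
    """One selectable option (hypothetical structure)."""
    keyword: str
    description: str
    url: str


def load_terms(csv_text: str) -> list[VocabularyTerm]:
    """Parse CSV text whose header is assumed to be keyword,description,url."""
    reader = csv.DictReader(io.StringIO(csv_text))
    terms = []
    seen = set()
    for row in reader:
        keyword = (row.get("keyword") or "").strip()
        # Skip blank rows and repeated keywords (compared case-insensitively).
        if not keyword or keyword.lower() in seen:
            continue
        seen.add(keyword.lower())
        terms.append(
            VocabularyTerm(
                keyword=keyword,
                description=(row.get("description") or "").strip(),
                url=(row.get("url") or "").strip(),
            )
        )
    # Sort alphabetically so the selector is easy to scan.
    return sorted(terms, key=lambda term: term.keyword.lower())


# Placeholder rows, not real glossary entries.
SAMPLE_CSV = """keyword,description,url
Example term B,Placeholder description,https://example.org/term-b
Example term A,Placeholder description,https://example.org/term-a
"""

if __name__ == "__main__":
    for term in load_terms(SAMPLE_CSV):
        print(term.keyword, term.url)
```

Dropping duplicates and sorting at load time keeps the selector itself simple, whichever UI component ends up displaying the options.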

Additional Context

This request is driven by user feedback and the project's commitment to improving data quality and curation practices within the CAFE project's use of Dataverse. We have already compiled a comprehensive list, which includes each keyword, its description, and the associated URL, ready to be utilized for the selector feature.

pdurbin · 2024-04-10T13:55:43Z

Feature Request/Idea: Add "Term URI" metadata in Keyword block dataverse#10288

However, I checked with @Saixel and he plans to implement this using a custom metadata block for the CAFE project rather than attempting to modify the keyword field in the citation block (which is what the issue above is about).

He said there are almost 300 controlled vocabulary values.

scolapasta · 2024-04-16T19:29:22Z

First, this should be moved to the harvard dataverse repo, as it should not require any code in the core.

Second, we're wondering about this:
"Dynamically populate options within the selector based on the NIH controlled vocabulary glossary"

Is the idea to have these values read from an existing API? If so, we would use the external CV functionality, and the best next step would be a spike to use this API and make sure there is no unexpected behavior. (That spike would likely be a size 10 for someone who already has experience with the external CV functionality.)
If not, and it's just using our external CV functionality, then all that needs to be done is to add the values to the appropriate TSV file, which can be sized as a 3.
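
For the TSV route, a small script could generate the controlled vocabulary rows from the prepared list. The sketch below assumes the `#controlledVocabulary` section uses the columns `DatasetField`, `Value`, `identifier`, and `displayOrder`, with data rows starting with an empty first column, as in the metadata block TSV files shipped with Dataverse. That layout should be checked against the installation's metadata customization documentation. The field name `cchTerms` and the sample values are hypothetical.

```python
def controlled_vocabulary_rows(field_name: str, values: list[str]) -> str:
    """Build the controlled vocabulary section of a metadata block TSV.

    Assumed column layout: DatasetField, Value, identifier, displayOrder.
    """
    lines = ["#controlledVocabulary\tDatasetField\tValue\tidentifier\tdisplayOrder"]
    for order, value in enumerate(values):
        # Data rows leave the first column empty; the identifier is left blank here.
        lines.append(f"\t{field_name}\t{value}\t\t{order}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values, not real vocabulary terms.
    print(controlled_vocabulary_rows("cchTerms", ["Example term A", "Example term B"]))
```

The output would be pasted below the `#datasetField` section of the block's TSV file, which this sketch does not generate.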

Saixel · 2024-04-24T23:33:34Z

@pdurbin Thanks for pointing out the related issue. My initial approach was to use a custom metadata block to avoid changing the current keyword block structure. However, I see that a comment in IQSS/dataverse#10288 suggests handling a similar case with an autocomplete function. Our goal is to present a list of options for keyword selection from the terms we prepared in a CSV, so either a dropdown or autocomplete could be a viable solution. If it's okay with you, we can dig deeper into this topic as we work on this implementation.
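
To make the dropdown versus autocomplete comparison concrete, here is a minimal sketch of the matching an autocomplete could perform over the prepared terms. It is not the approach proposed in IQSS/dataverse#10288; the `suggest` function, its ranking (prefix matches before substring matches), and the sample terms are assumptions for illustration only.

```python
def suggest(terms: list[str], typed: str, limit: int = 10) -> list[str]:
    """Return up to `limit` terms matching what the user has typed so far."""
    needle = typed.strip().casefold()
    if not needle:
        return []
    # Terms that start with the typed text are listed first.
    prefix_matches = [t for t in terms if t.casefold().startswith(needle)]
    # Terms that merely contain the typed text follow.
    other_matches = [
        t for t in terms
        if needle in t.casefold() and not t.casefold().startswith(needle)
    ]
    return (prefix_matches + other_matches)[:limit]


if __name__ == "__main__":
    # Placeholder terms, not real vocabulary entries.
    print(suggest(["Air quality", "Heat wave", "Extreme heat"], "heat"))
```

With a list of about 300 terms, a linear scan like this is fast enough that no index is needed.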

Saixel · 2024-04-24T23:37:23Z

@scolapasta The issue has been moved to the Harvard Dataverse repo as per your guidance (thanks for pointing this out). Regarding the "Dynamically populate options within the selector based on the NIH controlled vocabulary glossary" feature, I'd like to clarify that we don't have an API. Instead, we have a CSV with a list of almost 300 terms. If we can use the external CV functionality you mentioned for this purpose, I would appreciate any documentation or pointers to existing implementations to explore and test this further.

pdurbin · 2024-04-25T10:23:53Z

I would recommend experimenting with configuring Author Affiliation to look up values from ROR. For config advice, please see IQSS/dataverse#10331 (comment)

That said, this feature depends on an external API (like the ROR API). So you'd need to build and host that API somehow.

It might be easier to use the database and put the 300 values in a controlled vocabulary. But if you have a plan for how to build an API and where to host it, it should be doable. 😄

Saixel · 2024-07-31T18:40:16Z

After further discussion, we've decided to expedite the NIH controlled vocabulary integration by creating a new custom metadata block with a dropdown for CCH terms, using our prepared list. This approach will help us avoid the complexity and longer development time of modifying the existing keyword metadata block.

jggautier · 2024-10-15T19:03:51Z

Hi all. In a Zoom meeting today, Sonia, Emily, Alexis, Ceilyn and I talked about this, and I was asked to write up in this GitHub issue what I mentioned during the meeting.

There's evidence that other collection administrators (in addition to our colleagues at CAFE) would like their depositors and curators to be able to choose values from a list, like a drop down menu, instead of typing in terms and pasting in term URIs, as well as being able to enter their own values if nothing in the list is appropriate.

So letting collection admins adjust a field, like the keyword field, so that it suggests terms from a particular vocabulary, would be a helpful feature for other groups who manage collections.

But this can be more complex and take longer, like @Saixel wrote earlier in this issue. So in some cases where the collection is within an installation that has other collections with different needs, like collections in Harvard Dataverse, custom metadata blocks have been created, like what's being discussed in this GitHub issue.

In other cases, the collection admins ask depositors to enter metadata in a user interface that's separate from the one that the Dataverse repository uses, where a field like the keyword field has been changed to let depositors select terms suggested from a particular vocabulary and let depositors enter their own terms. And then Dataverse APIs are used to push that metadata to the Dataverse installation.
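
For the separate-interface approach, the external tool has to turn the selected terms into the JSON that the Dataverse native API expects before pushing it. The sketch below builds only the keyword field of that payload and makes no network call. The child field names (`keywordValue`, `keywordVocabulary`, `keywordVocabularyURI`) follow the citation block's keyword field as I understand it and should be verified against the target installation; the vocabulary name, URI, and term are placeholders.

```python
import json


def primitive(type_name: str, value: str) -> dict:
    """Wrap a single value in the shape used for primitive fields."""
    return {
        "typeName": type_name,
        "multiple": False,
        "typeClass": "primitive",
        "value": value,
    }


def keyword_field(terms: list[str], vocabulary_name: str, vocabulary_uri: str) -> dict:
    """Build the compound keyword field for a dataset JSON payload."""
    values = [
        {
            "keywordValue": primitive("keywordValue", term),
            "keywordVocabulary": primitive("keywordVocabulary", vocabulary_name),
            "keywordVocabularyURI": primitive("keywordVocabularyURI", vocabulary_uri),
        }
        for term in terms
    ]
    return {
        "typeName": "keyword",
        "multiple": True,
        "typeClass": "compound",
        "value": values,
    }


if __name__ == "__main__":
    # Placeholder term and vocabulary details.
    field = keyword_field(
        ["Example term A"], "Example vocabulary", "https://example.org/vocabulary"
    )
    print(json.dumps(field, indent=2))
```

The resulting object would sit in the citation block's list of fields inside the full dataset JSON, which the tool would then submit through the installation's API with an API token.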

There are challenges with both of these approaches, too. In one of the GitHub issues I used to track work on one of CAFE's custom metadata blocks, I wrote about how fields in CAFE's custom metadata blocks overlap with fields in other metadata blocks, and that we'd want to resolve this design debt eventually.

Saixel added the Type: Feature and NIH CAFE (issues associated with the NIH CAFE project) labels Apr 9, 2024

Saixel self-assigned this Apr 9, 2024

cmbz mentioned this issue Apr 10, 2024 in Project: NIH CAFE IQSS/dataverse-pm#161 (Open, 15 tasks)

Saixel transferred this issue from IQSS/dataverse Apr 24, 2024

Saixel added the Size: 3 (a percentage of a sprint) label Apr 25, 2024

Saixel moved this from SPRINT- NEEDS SIZING to SPRINT READY in IQSS Dataverse Project Apr 25, 2024

Saixel added Size: 10 and removed Size: 3 labels Jul 17, 2024

Saixel added Size: 33 and removed Size: 10 labels Aug 14, 2024

Saixel moved this from SPRINT READY to This Sprint 🏃‍♀️ 🏃 in IQSS Dataverse Project Aug 14, 2024

cmbz added the FY25 Sprint 4 label Aug 14, 2024

Saixel moved this from This Sprint 🏃‍♀️ 🏃 to In Progress 💻 in IQSS Dataverse Project Aug 28, 2024

cmbz added the FY25 Sprint 5 label Aug 28, 2024

Saixel added Size: 10 and removed Size: 33 labels Sep 11, 2024

cmbz added the FY25 Sprint 6 label Sep 11, 2024

Saixel added the Status: Needs Input (applied to issues in need of input from someone currently unavailable) label Sep 13, 2024

cmbz added the FY25 Sprint 7 (2024-09-25 - 2024-10-09) label Sep 25, 2024

pdurbin removed the Status: Needs Input label Oct 7, 2024

Saixel added Size: 3 and removed Size: 10 labels Oct 9, 2024

cmbz added the FY25 Sprint 8 (2024-10-09 - 2024-10-23) label Oct 9, 2024

Saixel added Size: 0.5 and removed Size: 3 labels Oct 23, 2024

Saixel pinned this issue Oct 23, 2024

cmbz added the FY25 Sprint 9 (2024-10-23 - 2024-11-06) label Oct 23, 2024

Saixel linked a pull request Oct 24, 2024 that will close this issue: Create customCAFEAdditionalData metadata block with CCH Terms field #318 (Open)

cmbz removed this from IQSS Dataverse Project Oct 25, 2024

cmbz added this to IQSS Dataverse Project Nov 23, 2024
