Extract metadata for ims scorm cp #4258

Open
manavagr1108 wants to merge 4 commits into unstable from extract-metadata-for-IMS-SCORM-cp

Conversation

manavagr1108
Contributor

@manavagr1108 manavagr1108 commented Aug 21, 2023

This code extracts metadata from an IMS content package.
This PR is part of the GSoC project linked to issue #4081.

Summary

Description of the change(s) you made

  • Check whether imsmanifest.xml is present in the uploaded file
  • Metadata may also be present in imsmetadata.xml, so this file needs to be checked as well
  • Extract metadata through extractIMSMetadata
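
A minimal sketch of the presence check described in the bullets above, assuming the archive's entry names are already available as an array of strings (with JSZip they could be read from `Object.keys(zip.files)`). The helper name `findImsMetadataFiles` and its return shape are hypothetical, not the PR's actual code, and the sketch assumes both files sit at the archive root.

```javascript
// Hypothetical helper: given the entry names of a ZIP archive, report
// which IMS metadata files are present at the archive root.
function findImsMetadataFiles(entryNames) {
  const names = new Set(entryNames);
  const hasManifest = names.has('imsmanifest.xml');
  return {
    // Without imsmanifest.xml the archive is not treated as an IMS content package.
    isImsPackage: hasManifest,
    manifest: hasManifest ? 'imsmanifest.xml' : null,
    // imsmetadata.xml is optional; only consider it when the manifest exists.
    metadata: hasManifest && names.has('imsmetadata.xml') ? 'imsmetadata.xml' : null,
  };
}
```

Keeping the check on plain strings separates it from the ZIP library, so it can be unit-tested without building an archive.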

Manual verification steps performed

  1. An IMS content package is a ZIP file, and the following format presets accept .zip files:
    • QTI
    • HTML5_DEPENDENCY
    • HTML5_ZIP
  2. Use jszip to load the uploaded ZIP file asynchronously.
  3. Once loaded, locate the imsmanifest.xml and imsmetadata.xml files and read each one as text.
  4. Then parse the text into a JSON object so that the data can be extracted.
  5. Example of metadata:
{
  title,
  description,
  language,
  folders: [
    {
      title,
      files: [
        {
          identifierref,
          resourceHref,
          title,
        },
        {
          identifierref,
          resourceHref,
          title,
        }
      ]
    },
    {
      title,
      files: [
        {
          identifierref,
          resourceHref,
          title,
        },
      ],
    },
  ],
}
  6. Once the metadata is extracted, map each topic node to its resource nodes in the frontend.
  7. Use extra_fields to link topic nodes to resource nodes and render them in the frontend.
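
The metadata shape and the mapping steps above can be sketched as a function that flattens the extracted metadata into a list of node descriptors, with each folder becoming a topic node and each file a resource node. The function name `planNodes`, the `kind`, `key`, and `parentKey` fields, and the contents of `extra_fields` are assumptions made for illustration; the PR itself creates nodes through `createNode`.

```javascript
// Hypothetical planner: turn extracted IMS metadata into a flat list of
// node descriptors (one root topic, one topic per folder, one resource per file).
function planNodes(metadata) {
  const nodes = [{ kind: 'topic', key: 'root', parentKey: null, title: metadata.title }];
  (metadata.folders || []).forEach((folder, folderIndex) => {
    const folderKey = `folder-${folderIndex}`;
    nodes.push({ kind: 'topic', key: folderKey, parentKey: 'root', title: folder.title });
    (folder.files || []).forEach((folderFile, fileIndex) => {
      nodes.push({
        kind: 'resource',
        key: `${folderKey}-file-${fileIndex}`,
        parentKey: folderKey,
        title: folderFile.title,
        // Assumed layout: extra_fields carries what is needed to locate
        // the resource inside the package.
        extra_fields: {
          identifierref: folderFile.identifierref,
          resourceHref: folderFile.resourceHref,
        },
      });
    });
  });
  return nodes;
}
```

Planning the tree as plain data first makes the parent/child relationships easy to assert in tests before any node is actually created.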

Comments

Checking for sub-manifest files present in the content package is yet to be implemented.

Contributor's Checklist

PR process:

  • If this is an important user-facing change, the CHANGELOG label has been added to this PR or the related issue. Note: items with this label will be added to the CHANGELOG at a later time
  • If this includes an internal dependency change, a link to the diff is provided
  • The docs label has been added if this introduces a change that needs to be updated in the user docs
  • If any Python requirements have changed, the updated requirements.txt files are also included in this PR
  • Opportunities for using Google Analytics here are noted
  • Migrations are safe for a large db

Studio-specific:

  • All user-facing strings are translated properly
  • The notranslate class has been added to elements that shouldn't be translated by Google Chrome's automatic translation feature (e.g. icons, user-generated text)
  • All UI components are LTR and RTL compliant
  • Views are organized into pages, components, and layouts directories as described in the docs
  • Users' storage used is recalculated properly on any changes to main tree files
  • If there are new ways this uses user data that need to be factored into our Privacy Policy, they have been noted.

Testing:

  • Code is clean and well-commented
  • Contributor has fully tested the PR manually
  • If there are any front-end changes, before/after screenshots are included
  • Critical user journeys are covered by Gherkin stories
  • Any new interactions have been added to the QA Sheet
  • Critical and brittle code paths are covered by unit tests

Reviewer's Checklist

This section is for reviewers to fill out.

  • Automated test coverage is satisfactory
  • PR is fully functional
  • PR has been tested for accessibility regressions
  • External dependency files were updated if necessary (yarn and pip)
  • Documentation is updated
  • Contributor is in AUTHORS.md

- Create topic and resouce node for the metadata
@manavagr1108 manavagr1108 force-pushed the extract-metadata-for-IMS-SCORM-cp branch 3 times, most recently from cb939ec to 33b1680 Compare August 22, 2023 23:33
@manavagr1108 manavagr1108 force-pushed the extract-metadata-for-IMS-SCORM-cp branch from 33b1680 to 90102de Compare August 22, 2023 23:36
- adding test cases for extractIMSMetadata
Member

@rtibbles rtibbles left a comment

Some minor tweaks needed, but this is looking very good!

Note: we will hold off merge until unstable has been released, so that we can release the H5P metadata extraction sooner, then this will be merged and released in a later release!

@input="trackSelect"
@removed="handleRemoved"
/>
<div v-if="getChildren !== undefined">
Member

Should change this to a length check on the array!
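
A small sketch of why the length check matters, assuming `getChildren` resolves to either an array (possibly empty) or `undefined`; `hasChildren` is a hypothetical helper name. An empty array passes the `!== undefined` test, so the wrapper would render with nothing inside it.

```javascript
// Hypothetical helper: true only for a non-empty array of children.
function hasChildren(children) {
  return Array.isArray(children) && children.length > 0;
}

// An empty array is "defined" but has nothing to render.
const emptyChildren = [];
const passesUndefinedCheck = emptyChildren !== undefined; // true
const passesLengthCheck = hasChildren(emptyChildren); // false
```

In the template this could be expressed along the lines of `v-if="getChildren && getChildren.length"`, depending on how the getter is defined.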

});
} else if (file.metadata.folders) {
this.createNode('topic', file.metadata).then(newNodeId => {
file.metadata.folders.forEach(org => {
Member

Update org here to folder.

this.createNode('topic', file.metadata).then(newNodeId => {
file.metadata.folders.forEach(org => {
this.createNode('topic', org, newNodeId).then(topicNodeId => {
org.files.forEach(orgFile => {
Member

orgFile to folderFile

return File.uploadUrl({
checksum: file.checksum,
size: file.file_size,
type: 'application/zip',
Member

Let's double check if we need to specify this explicitly, or if it should be inferred by existing functionality.

total: file.size,
};
if (index === 0) {
this.selected = [resourceNodeId];
Member

Let's change this to set this.selected if we haven't already set this.selected to something - so only the first finalized node gets selected.
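
A sketch of the suggested behaviour, using a plain object as a stand-in for the component state; `selectFirstFinalized` is a hypothetical name. Because node creation is asynchronous, the node at `index === 0` is not guaranteed to finish first, so the check is made against the current selection instead of the loop index.

```javascript
// Hypothetical stand-in for the component logic in the snippet above.
function selectFirstFinalized(state, resourceNodeId) {
  // Only assign when nothing has been selected yet, regardless of the
  // order in which node creation finishes.
  if (!state.selected || state.selected.length === 0) {
    state.selected = [resourceNodeId];
  }
  return state;
}
```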

});
});
it('extractIMSMetadata should extract metadata from imsmanifest.xml', async () => {
// const manifestFile = get_imsmanifest_file({ title: 'Test file' });
Member

Clean up comments!

<imsmd:lom>
<imsmd:general>
<imsmd:title>
<imsmd:langstring xml:lang="en">Test File</imsmd:langstring>
Member

lang should be und here.

) {
metadata.language = xmlDoc
.getElementsByTagName('lomes:idiom')[0]
.children[0].textContent.replace(/ {2}|\r\n|\n|\r/gm, '');
Member

Just double check whether trim will do the same job here!
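
To help with that check, here is a short comparison of the two approaches on sample strings. The function names are hypothetical. The regular expression removes spaces in pairs and line breaks anywhere in the string, while `trim` removes all leading and trailing whitespace and leaves the interior untouched, so the two agree only for some inputs.

```javascript
// The pattern from the snippet above: removes pairs of spaces and line
// breaks wherever they occur in the string.
const stripPattern = / {2}|\r\n|\n|\r/gm;

function cleanWithReplace(text) {
  return text.replace(stripPattern, '');
}

function cleanWithTrim(text) {
  return text.trim();
}

// Even indentation: both approaches give 'en'.
const evenIndent = '\n    en\n  ';
// Odd indentation: replace leaves one leading space, trim does not.
const oddIndent = '\n   en\n';
// Tab indentation: replace leaves the tab, trim removes it.
const tabIndent = '\ten\n';
```

For pretty-printed XML with unpredictable indentation, `trim` is the more robust of the two for a single language code.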

Member

The assertions in the tests should validate that this is working properly!

.getElementsByTagName('lomes:idiom')[0]
.textContent.replace(/ {2}|\r\n|\n|\r/gm, '') !== 'und'
) {
metadata.language = xmlDoc
Member

For language extraction let's replicate the validation we are doing in H5P to check this is a supported language code.

Member

Can also add some more tests for unhappy paths!

const IMS_PRESETS = [
FormatPresetsNames.QTI,
FormatPresetsNames.HTML5_DEPENDENCY,
FormatPresetsNames.HTML5_ZIP,
Member

Note for me - I may need to consider an IMSCP preset format.

@manavagr1108 manavagr1108 force-pushed the extract-metadata-for-IMS-SCORM-cp branch from 2db6a02 to 6c94668 Compare September 19, 2023 12:46
xmlDoc
.getElementsByTagName('lomes:idiom')[0]
.textContent.replace(/ {2}|\r\n|\n|\r/gm, '') !== 'und'
LanguagesMap.has(xmlDoc.getElementsByTagName('lomes:idiom')[0].textContent.trim()) &&
Member

Could avoid a bit of repetition here and define outside of the if statement:

const language = xmlDoc.getElementsByTagName('lomes:idiom').length ? xmlDoc.getElementsByTagName('lomes:idiom')[0].textContent.trim() : 'und';

(defaulting to the disallowed und)

then you can do the checks and assignment against this value instead.
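
One possible shape for this, written so that it runs without a browser: `readLanguage` is a hypothetical function that accepts any object exposing `getElementsByTagName`, and `exampleLanguages` is a small stand-in for the project's `LanguagesMap`, whose real contents are not shown here.

```javascript
// Stand-in for LanguagesMap; the entries are illustrative only.
const exampleLanguages = new Map([
  ['en', 'English'],
  ['es', 'Español'],
]);

// Hypothetical helper: read the language once, then validate it.
function readLanguage(xmlDoc, languagesMap) {
  const idiomElements = xmlDoc.getElementsByTagName('lomes:idiom');
  // Default to the disallowed 'und' when the element is missing.
  const language = idiomElements.length ? idiomElements[0].textContent.trim() : 'und';
  // Return null unless the code is both defined and supported.
  return language !== 'und' && languagesMap.has(language) ? language : null;
}
```

The caller would then assign `metadata.language` only when the result is not `null`, which keeps the lookup, the `und` check, and the supported-language check in one place.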
