Logic to convert packages into artifact names and namespaces #2458

Closed
ccerv1 opened this issue Nov 10, 2024 · 1 comment · Fixed by #2547
Assignees
Labels
c:data Gathering data (e.g. indexing)

Comments


ccerv1 commented Nov 10, 2024

What is it?

Using the SBOM data, we need to create consistent logic for deriving the artifact_name and artifact_namespace for a package and linking it to the project / GitHub repo that owns it.

For now, the priorities are mainly NPM and CRATES.


For example, this model grabs NPM packages:

select distinct
  abp.project_source,
  abp.project_name,
  sbom.package,
  sbom.package_source
from `ossd.sbom` as sbom
join `oso.artifacts_by_project_v1` as abp
  on
    abp.artifact_namespace = lower(sbom.artifact_namespace)
    and abp.artifact_name = lower(sbom.artifact_name)
join `oso.projects_by_collection_v1` as pbc
  on abp.project_name = pbc.project_name
where
  abp.artifact_source = 'GITHUB'
  and sbom.package_source = 'NPM'

And here is some quick logic to extract the naming fields:

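def extract_artifacts(pkg):
    """Return (namespace, name) for a package string, or (None, None)."""
    # Drop relative-path prefixes such as './' and '../'.
    pkg = pkg.replace('../', '').replace('./', '')
    # Packages without a '/' use the full string for both fields.
    namespace = pkg
    name = pkg
    if '/' in pkg:
        splt = pkg.split('/')
        if len(splt) > 2:
            # Deeper paths are only handled for the aztec-packages monorepo.
            if 'aztec-packages' in pkg:
                namespace = splt[0]
                name = splt[1]
            else:
                return (None, None)
        else:
            namespace, name = splt
        # Strip the leading '@' of a scope (e.g. '@scope' -> 'scope');
        # leave namespaces without an '@' untouched.
        if namespace.startswith('@'):
            namespace = namespace[1:]
    return (namespace, name)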
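
As a possible next step toward the NPM and CRATES priorities, here is a minimal sketch of a source-aware variant. The function name `derive_artifact_fields`, the lowercasing, and the choice to reuse the package name as the namespace for unscoped packages and crates are all assumptions for illustration (the last one mirrors the quick logic above); none of this is the merged implementation.

```python
from typing import Optional, Tuple


def derive_artifact_fields(package: str, source: str) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: return (artifact_namespace, artifact_name)."""
    # Assumption: names are compared case-insensitively, as in the SQL join.
    pkg = package.strip().lower()
    if not pkg:
        return (None, None)
    if source.upper() == "NPM" and pkg.startswith("@"):
        # Scoped NPM package: "@scope/name" -> ("scope", "name").
        scope, sep, name = pkg[1:].partition("/")
        if not sep or not scope or not name or "/" in name:
            return (None, None)
        return (scope, name)
    if "/" in pkg:
        # Path-like or otherwise unexpected name: leave it unresolved
        # rather than guess.
        return (None, None)
    # Unscoped NPM package or crate: the registry has no namespace, so
    # reuse the name for both fields (an assumption, not a confirmed rule).
    return (pkg, pkg)


if __name__ == "__main__":
    for pkg, src in [("@Babel/Core", "NPM"), ("serde", "CRATES"), ("../local/pkg/x", "NPM")]:
        print(pkg, src, derive_artifact_fields(pkg, src))
```

Returning `(None, None)` for anything that does not look like a registry name keeps relative paths and monorepo-internal references out of the join instead of mapping them to a wrong artifact.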
@github-project-automation github-project-automation bot moved this to Backlog in OSO Nov 10, 2024
@ccerv1 ccerv1 self-assigned this Nov 10, 2024
@ccerv1 ccerv1 added the c:data Gathering data (e.g. indexing) label Nov 10, 2024

ccerv1 commented Nov 18, 2024

See also here

@Jabolol Jabolol moved this from In Progress to Up Next in OSO Nov 27, 2024
@ccerv1 ccerv1 linked a pull request Nov 29, 2024 that will close this issue
@github-project-automation github-project-automation bot moved this from Up Next to Done in OSO Nov 29, 2024
Projects
Status: Done
2 participants