docs: Update oss-directory python package directions (#1821)
ryscheng authored Jul 19, 2024
1 parent a12f788 commit a931184
Showing 1 changed file with 47 additions and 21 deletions.
68 changes: 47 additions & 21 deletions apps/docs/docs/integrate/oss-directory.md
@@ -4,13 +4,11 @@ sidebar_position: 11
---

:::info
[oss-directory](https://github.com/opensource-observer/oss-directory) contains structured data on as many open source projects as possible, enumerating all artifacts related to the project, from source code repositories to published packages and deployments. You can get the data via our npm library or by downloading the data directly from GitHub.
[oss-directory](https://github.com/opensource-observer/oss-directory) contains structured data on as many open source projects as possible, enumerating all artifacts related to the project, from source code repositories to published packages and deployments. You can get the data via our npm library, via our Python package, or by downloading the data directly from GitHub.
:::

## Directory Structure

---

The OSS Directory is organized into two main folders:

- `./data/projects` - each file represents a single open source project and contains all of the artifacts for that project.
@@ -22,26 +20,29 @@ The OSS Directory is organized into two main folders:
- See `./src/resources/schema/collection.json` for the expected JSON schema
- Collections are identified by their unique `name`

## npm library
## Using as a library

---
We have also published this repository as a library that you can use in your own projects. This is useful if you want to build a tool that uses the data in this repository or perform your own custom analysis.

We have libraries for JavaScript and Python. Neither package bundles the dataset. Under the hood, fetching the data clones the repository into a temporary directory, reads all the data files, validates them against the schema, and returns the objects. This way, you know you're getting the latest data, even if the package hasn't been updated in a while.
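The clone, read, validate, return flow described above can be sketched roughly as follows. This is a minimal illustration, not the libraries' actual implementation. The helper names (`load_folder`, `load_directory`, `clone_and_load`) are hypothetical, the `./data/collections` folder name and the JSON file format are assumptions made to keep the sketch dependency-free, and the `name` check is only a stand-in for real schema validation.

```python
import json
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

REPO_URL = "https://github.com/opensource-observer/oss-directory"


def load_folder(folder: Path, parse: Callable[[str], dict]) -> List[dict]:
    """Parse every file under `folder` and check that each item has a `name`."""
    items: List[dict] = []
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        item = parse(path.read_text(encoding="utf-8"))
        # Stand-in for schema validation: only require a `name` field.
        if not isinstance(item, dict) or "name" not in item:
            raise ValueError(f"{path} is missing a `name` field")
        items.append(item)
    return items


def load_directory(
    root: Path, parse: Callable[[str], dict] = json.loads
) -> Dict[str, List[dict]]:
    """Read both data folders from a local checkout (folder names assumed)."""
    return {
        "projects": load_folder(root / "data" / "projects", parse),
        "collections": load_folder(root / "data" / "collections", parse),
    }


def clone_and_load(
    parse: Callable[[str], dict] = json.loads,
) -> Dict[str, List[dict]]:
    """Shallow-clone the repository into a temporary directory, then load it."""
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(
            ["git", "clone", "--depth", "1", REPO_URL, tmp], check=True
        )
        return load_directory(Path(tmp), parse)
```

Passing the parser in as an argument keeps the sketch independent of the repository's actual file format: if the data files are not JSON, supply a different `parse` function.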

[oss-directory](https://www.npmjs.com/package/oss-directory)
is a library that you can use in your own projects.
This may be useful if you want to build a tool that uses the data
in this repository or perform your own custom analysis.
_Note: These libraries do not work in a browser environment._

### Installation
### JavaScript library

[npm page](https://www.npmjs.com/package/oss-directory)

#### Installation

Install the library

```bash
npm install --save oss-directory
npm install oss-directory
# OR yarn add oss-directory
# OR pnpm add oss-directory
```

### Fetch all of the data
#### Fetch all of the data

You can fetch all of the data in this repo with the following:

@@ -53,17 +54,42 @@ const projects: Project[] = data.projects;
const collections: Collection[] = data.collections;
```

:::note
We don't store the entire dataset with the npm package. Under the hood, this will clone the repository into a temporary directory, read all the data files, validate the schema, and return the objects. This way, you know you're getting the latest data, even if the npm package hasn't been updated in a while.
If we make a breaking change in a
[schema update](../how-oso-works/oss-directory/schema-updates.md)
you will need to update this library or an exception will throw
due to a typing mismatch with the old library version.
:::
#### Utility functions

## Direct Download from GitHub
We also include functions for casting and validating data:

---
- `validateProject`
- `validateCollection`
- `safeCastProject`
- `safeCastCollection`

### Python library

[PyPI page](https://pypi.org/project/oss-directory/)

#### Installation

Install the library

```bash
pip install oss-directory
# OR poetry add oss-directory
```

#### Fetch all of the data

You can fetch all of the data in this repo with the following:

```python
from typing import List

from ossdirectory import fetch_data
from ossdirectory.fetch import OSSDirectory

# Clones the repository, validates the data files, and returns the parsed objects
data: OSSDirectory = fetch_data()
projects: List[dict] = data.projects
collections: List[dict] = data.collections
```
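Since collections are identified by their unique `name`, one convenient next step is to index the returned dictionaries by that field. The sketch below is illustrative only: `index_by_name` is a hypothetical helper, not part of the package, and the sample data (including its `projects` field) is made up so the example runs without the library installed.

```python
from typing import Dict, Iterable


def index_by_name(items: Iterable[dict]) -> Dict[str, dict]:
    """Build a name -> item lookup, rejecting duplicate names."""
    index: Dict[str, dict] = {}
    for item in items:
        name = item["name"]
        if name in index:
            raise ValueError(f"duplicate name: {name}")
        index[name] = item
    return index


# Hypothetical sample shaped like `data.collections`; the values are invented.
sample = [
    {"name": "example-collection", "projects": ["project-a", "project-b"]},
    {"name": "another-collection", "projects": ["project-b"]},
]
collections_by_name = index_by_name(sample)
print(collections_by_name["example-collection"]["projects"])
```

With the real package, the same helper could be applied to `data.collections` or `data.projects` in place of `sample`.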

## Direct Download from GitHub

All of the data is accessible directly from GitHub. You can download
[oss-directory](https://github.com/opensource-observer/oss-directory)