Merge pull request #6 from kuwala-io/kuwala/osm-poi
OSM-POI Pipeline
Matti committed May 4, 2021
2 parents 31e15a7 + b750ad4 commit 5f12280
Showing 95 changed files with 22,932 additions and 379 deletions.
55 changes: 42 additions & 13 deletions README.md
@@ -1,28 +1,52 @@
![Logo Banner](./docs/images/Kuwala%20Title%20Banner.png)
![Logo Banner](./docs/images/kuwala_title_banner.png)

![License](https://img.shields.io/github/license/kuwala-io/kuwala)

[![Slack](https://img.shields.io/badge/slack-chat-orange.svg)](
https://join.slack.com/t/kuwala-community/shared_invite/zt-l5b2yjfp-pXKFBjbnl7_P3nXtwca5ag)

### The Vision of a Global Liquid Data Economy

With Kuwala, we want to enable the global liquid data economy. You probably also envision a future of smart cities, autonomously driving cars, and sustainable living. For all of that, we need to leverage the power of data. Unfortunately, many promising data projects fail, however. That's because too many resources are necessary for gathering and cleaning data. Kuwala supports you as a data engineer, data scientist, or business analyst to create a holistic view of your ecosystem by integrating third-party data seamlessly.
With Kuwala, we want to enable the global liquid data economy. You probably also envision a future of smart cities,
autonomously driving cars, and sustainable living. For all of that, we need to leverage the power of data.
Unfortunately, many promising data projects fail because too many resources are necessary for
gathering and cleaning data. Kuwala supports you as a data engineer, data scientist, or business analyst in creating a
holistic view of your ecosystem by integrating third-party data seamlessly.

### How Kuwala works

Kuwala explicitly focuses on integrating third-party data, so data that is not under your company's influence, e.g., weather or population information. To easily combine several different domains, we further narrow it down to data with a geo-component which still includes many sources. For matching data on different aggregation levels, such as POIs to a moving thunderstorm, we leverage [Uber's H3](https://eng.uber.com/h3/) spatial indexing.
Kuwala explicitly focuses on integrating third-party data, that is, data that is not under your company's influence,
e.g., weather or population information. To easily combine several domains, we further narrow it down to data with a
geo-component, which still includes many sources. For matching data on different aggregation levels, such as POIs to a
moving thunderstorm, we leverage [Uber's H3](https://eng.uber.com/h3/) spatial indexing.

![H3 Overview](./docs/images/h3_overview.png)

Connectors wrap individual data sources. Within the connector, raw data is cleaned and preprocessed. Based on that, the connector exposes query functions through REST and GraphQL endpoints. Through this, you can easily combine connectors and build further applications and pipelines on top of them. We plan on releasing an open-source solution specifically for this purpose.
Pipelines wrap individual data sources. Within the pipeline, raw data is cleaned and preprocessed. Based on that, the
pipeline exposes query functions through REST and GraphQL endpoints. Through this, you can easily combine pipelines and
build further applications and pipelines on top of them. We plan on releasing an open-source solution specifically for
this purpose.

### How you can contribute

The best first step to get involved is to [join](https://join.slack.com/t/kuwala-community/shared_invite/zt-l5b2yjfp-pXKFBjbnl7_P3nXtwca5ag) the Kuwala Community on Slack. There we discuss everything related to data integration and new connectors. Every connector will be open-source. We entirely decide, based on you, our community, which sources to integrate. You can reach out to us on Slack or [email](mailto:[email protected]) to request a new connector or contribute yourself. If you want to contribute yourself, you can use your choice's programming language and database technology. We have the only requirement that it is possible to run the connector locally, query the data through REST-API endpoints, and use [Uber's H3](https://eng.uber.com/h3/) functionality to handle geographical transformations. We will then take the responsibility to maintain your connector.
The best first step to get involved is to
[join](https://join.slack.com/t/kuwala-community/shared_invite/zt-l5b2yjfp-pXKFBjbnl7_P3nXtwca5ag) the Kuwala Community
on Slack. There we discuss everything related to data integration and new pipelines. Every pipeline will be open-source.
We decide which sources to integrate based entirely on you, our community. You can reach out to us on Slack or
[email](mailto:[email protected]) to request a new pipeline or contribute yourself. If you want to contribute
yourself, you can use the programming language and database technology of your choice. Our only requirement is that it
is possible to run the pipeline locally, query the data through REST-API endpoints, and use
[Uber's H3](https://eng.uber.com/h3/) functionality to handle geographical transformations. We will then take
responsibility for maintaining your pipeline.

### Liberating the work with data

By working together as a community of data enthusiasts, we can create a network of seamlessly integratable connectors. It is now causing headaches to integrate third-party data into applications. But together, we will make it straightforward to combine, merge and enrich data sources for powerful models.
By working together as a community of data enthusiasts, we can create a network of seamlessly integrable pipelines.
Integrating third-party data into applications currently causes headaches. But together, we will make it
straightforward to combine, merge, and enrich data sources for powerful models.

### What's coming next for the connectors?
Based on the use-cases we have discussed in the community and potential users, we have identified a variety of data sources to connect with next:
### What's coming next for the pipelines?
Based on the use cases we have discussed with the community and potential users, we have identified a variety of data
sources to connect with next:

#### Semi-structured data
Already structured data but not adapted to the Kuwala framework:
@@ -44,9 +68,14 @@ Data we would like to integrate, but a scalable approach is still missing:

---

## Using existing connectors
## Using existing pipelines

To use our published connectors clone this repository and navigate to ```kuwala-connectors```. There is a separate README for each connector on how to get started with it.
To use our published pipelines, clone this repository and navigate to
[`kuwala-pipelines`](https://github.com/kuwala-io/kuwala/tree/master/kuwala-pipelines). There is a separate README
for each pipeline on how to get started with it.

We currently have the following connectors published:
- ```population-density```: Detailed population and demographic data
We currently have the following pipelines published:
- [`osm-poi`](https://github.com/kuwala-io/kuwala/tree/master/kuwala-pipelines/osm-poi):
Global collection of points of interest (POIs)
- [`population-density`](https://github.com/kuwala-io/kuwala/tree/master/kuwala-pipelines/population-density):
Detailed population and demographic data
Binary file added docs/images/h3_overview.png
File renamed without changes
Binary file added docs/images/population_density_overview.png
94 changes: 0 additions & 94 deletions kuwala-connectors/population-density/README.md

This file was deleted.

1 change: 0 additions & 1 deletion kuwala-connectors/population-density/config/h3/index.js

This file was deleted.

39 changes: 0 additions & 39 deletions kuwala-connectors/population-density/src/app/routes/cell/index.js

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

File renamed without changes.
10 changes: 10 additions & 0 deletions kuwala-pipelines/osm-poi/.gitignore
@@ -0,0 +1,10 @@
.idea
.DS_Store
**/.DS_Store
node_modules

config/env/*
!config/env/.env.local

tmp/*
!tmp/.gitkeep
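
The two `config/env` rules above ignore every environment file except `.env.local`. One way to check that behavior is to replay the rules in a throwaway repository. The sketch below does so, assuming `git` is installed. The file name `.env.production` is hypothetical and only stands in for any other environment file.

```shell
# Sketch: replay the env-file ignore rules in a temporary repository.
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q .

# Only the two config/env rules from the .gitignore above.
printf '%s\n' 'config/env/*' '!config/env/.env.local' > .gitignore

mkdir -p config/env
# .env.production is a hypothetical file used only for this check.
touch config/env/.env.local config/env/.env.production

# Stage everything that is not ignored, then list what was staged.
git add .
tracked="$(git ls-files)"
echo "$tracked"
```

The listing should contain `.gitignore` and `config/env/.env.local` but not `config/env/.env.production`. The negation works because `config/env/*` ignores the directory's contents rather than the directory itself. Git cannot re-include a file whose parent directory is excluded.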