Add vida google-microsoft buildings to get-buildings #26

Open
cholmes opened this issue Oct 9, 2023 · 2 comments
Labels
get_buildings Issues related to the get_buildings operations help wanted Extra attention is needed

@cholmes
Collaborator

cholmes commented Oct 9, 2023

The VIDA dataset on Source Cooperative combines the Google and Microsoft building footprints, so it should return the most buildings of the available options. It should be relatively easy to add, but it uses S2 for spatial partitioning instead of 'quadkey'. The one to add is https://beta.source.coop/vida/google-microsoft-open-buildings/geoparquet/by_country_s2, as it is more finely partitioned and will likely perform much better (though it's worth trying both).

The main task is to support a different 'spatial' column: the current setup assumes quadkey, as that's what the first two datasets used. Ideally the download_buildings function would take an argument for the spatial index type, initially 'quadkey' or 's2', and we could later add h3, geohash, etc. The get_buildings CLI should just have an option to use this dataset, and then pass the right arguments into download_buildings.
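As a rough illustration of the argument described above, here is a minimal sketch of how a spatial index type could select the filter column for a query. Everything in it is hypothetical: `SPATIAL_COLUMNS`, `build_where_clause`, the `s2_id` column name, and the `IN (...)` filter shape are assumptions for illustration, not the project's actual code or the dataset's actual schema.

```python
from typing import Sequence

# Hypothetical mapping from index type to the partition column name.
# The real column names in each dataset would need to be checked.
SPATIAL_COLUMNS = {
    "quadkey": "quadkey",
    "s2": "s2_id",
}


def build_where_clause(index_type: str, keys: Sequence[str]) -> str:
    """Return a SQL WHERE fragment that filters on the chosen spatial column."""
    if index_type not in SPATIAL_COLUMNS:
        supported = ", ".join(sorted(SPATIAL_COLUMNS))
        raise ValueError(
            f"Unsupported spatial index {index_type!r}; expected one of: {supported}"
        )
    if not keys:
        raise ValueError("At least one spatial key is required")
    column = SPATIAL_COLUMNS[index_type]
    # Escape single quotes so the keys are safe to embed as SQL string literals.
    quoted = ", ".join("'" + key.replace("'", "''") + "'" for key in keys)
    return f"{column} IN ({quoted})"
```

Adding h3 or geohash later would then mean adding one entry to the mapping plus a client-side function that computes the keys.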

The quadkey is computed client side, and it's likely similarly easy to compute the s2 key, and then use that in the query.
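For reference, a client-side quadkey computation can be sketched with only the standard library, using the standard Web Mercator tile formulas from the Bing Maps tile system. This is not necessarily how the project computes it today, and the function names are made up for the example. An equivalent S2 computation would most likely need a dedicated S2 library and is not shown.

```python
import math

MAX_LATITUDE = 85.05112878  # Web Mercator latitude limit


def lat_lon_to_tile(lat: float, lon: float, zoom: int) -> tuple:
    """Convert a WGS84 coordinate to XYZ tile indices at the given zoom."""
    lat = max(min(lat, MAX_LATITUDE), -MAX_LATITUDE)
    n = 1 << zoom  # number of tiles along each axis
    x = int((lon + 180.0) / 360.0 * n)
    sin_lat = math.sin(math.radians(lat))
    y = int((0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * n)
    # Clamp so that lon = 180 or the latitude limits stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_to_quadkey(x: int, y: int, zoom: int) -> str:
    """Interleave the bits of x and y into a base-4 quadkey string."""
    digits = []
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def lat_lon_to_quadkey(lat: float, lon: float, zoom: int) -> str:
    x, y = lat_lon_to_tile(lat, lon, zoom)
    return tile_to_quadkey(x, y, zoom)
```

A quadkey at a coarser zoom is a prefix of every quadkey inside it, which is what makes a single fixed-level key convenient to query on.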

@cholmes cholmes added this to the 0.10.0 milestone Oct 9, 2023
@cholmes cholmes added get_buildings Issues related to the get_buildings operations help wanted Extra attention is needed labels Oct 9, 2023
@cholmes
Collaborator Author

cholmes commented Oct 26, 2023

It would also be great to add the version that isn't split by S2, to see how the performance compares: https://beta.source.coop/vida/google-microsoft-open-buildings/geoparquet/by_country/

@mdjong1

mdjong1 commented Jun 28, 2024

I think one challenge here is that our (VIDA) S2 cells are based upon the number of rows within a file, and are therefore not at a fixed level. This makes it difficult to determine the S2 cell id at a single level that can then be queried across all files.
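One possible way around the mixed levels, sketched here under assumptions, is to skip the single-level lookup and test containment between cell IDs directly. In the standard 64-bit S2 cell ID encoding, each cell covers a contiguous range of IDs, so a cell contains another exactly when the other's ID falls inside that range, regardless of level. The sketch assumes the partitions are identified by standard 64-bit S2 cell IDs, which is not confirmed in this thread. The function names are hypothetical, and converting a latitude/longitude to a cell ID is left to an S2 library.

```python
MAX_LEVEL = 30  # leaf level in the S2 hierarchy


def s2_level(cell_id: int) -> int:
    """Level of a cell, derived from the position of its lowest set bit."""
    lsb = cell_id & -cell_id
    trailing_zeros = lsb.bit_length() - 1
    return MAX_LEVEL - trailing_zeros // 2


def s2_range(cell_id: int) -> tuple:
    """Inclusive range of cell IDs covered by this cell."""
    lsb = cell_id & -cell_id
    return cell_id - (lsb - 1), cell_id + (lsb - 1)


def s2_contains(outer: int, inner: int) -> bool:
    """True if `outer` contains `inner` (or they are the same cell)."""
    low, high = s2_range(outer)
    return low <= inner <= high


def s2_intersects(a: int, b: int) -> bool:
    """Two S2 cells intersect only if one contains the other."""
    return s2_contains(a, b) or s2_contains(b, a)


def matching_partitions(query_cell: int, partition_cells: list) -> list:
    """Return the partition cells, at any level, that overlap the query cell."""
    return [p for p in partition_cells if s2_intersects(p, query_cell)]
```

With this approach the client would need the list of partition cell IDs up front (for example from file names or metadata), which is a different access pattern from the current quadkey filter.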
