Create ETL that joins crashes and Moped components #1596

Open
wants to merge 3 commits into
base: main

Charlie-Henry
Contributor

Associated issues

cityofaustin/atd-data-tech#19453

Testing

Steps to test:
First, fill out the env_template for Hasura using the values from 1Password. Make sure you use the Moped production and VZ staging credentials.

Note: if you are on Apple Silicon, you may need to add `--platform linux/amd64` to the docker build command to get GDAL to install correctly.

docker build . -t atddocker/vz-moped-join
docker run -it --env-file env_file atddocker/vz-moped-join /bin/bash
python moped_project_components_spatial_join.py

The log output should make it clear whether each step succeeded.

You can then check whether the moped_component_crashes lookup table was populated. To confirm the script actually did something, consider deleting everything in that table yourself before running it.
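
One way to do that check from Python is to ask Hasura for the row count before and after a run. This is only a rough sketch: the env var names `HASURA_ENDPOINT` and `HASURA_ADMIN_SECRET` are hypothetical (use whatever your env file defines), and it assumes the table is tracked in Hasura with the default `<table>_aggregate` root field.

```python
import json
import os
import urllib.request


def build_count_query(table: str) -> dict:
    """Build a GraphQL payload asking Hasura for the row count of `table`."""
    # Assumes Hasura's default aggregate root field naming: <table>_aggregate
    query = f"query {{ {table}_aggregate {{ aggregate {{ count }} }} }}"
    return {"query": query}


def fetch_row_count(endpoint: str, admin_secret: str, table: str) -> int:
    """POST the aggregate query and return the count from the response."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(build_count_query(table)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-hasura-admin-secret": admin_secret,
        },
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["data"][f"{table}_aggregate"]["aggregate"]["count"]


# Only hit the network when the (hypothetical) env vars are set.
endpoint = os.environ.get("HASURA_ENDPOINT")
secret = os.environ.get("HASURA_ADMIN_SECRET")
if endpoint and secret:
    print(fetch_row_count(endpoint, secret, "moped_component_crashes"))
```

A count of zero after the script finishes would suggest the upload did not happen.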


Ship list

  • Check migrations for any conflicts with latest migrations in main branch
  • Confirm Hasura role permissions for necessary access
  • Code reviewed
  • Product manager approved

@johnclary changed the title from "moped project-vz crashes ETL" to "Create ETL that joins crashes and Moped components" on Nov 7, 2024
Member

@johnclary left a comment

Awesome—this worked like a charm for me 🙌

I have a suggestion about the crash query filter and the Python version we're using. Otherwise this looks fantastic!!!

Thanks, Charlie! 🚢 🚢 🚢 🚢

etl/moped_projects/queries.py (outdated, resolved)
etl/moped_projects/Dockerfile (outdated, resolved)
Member

@chiaberry left a comment

2024-11-13 16:36:06,309 INFO: 511591 rows uploaded...

The logging is nice; it makes it easy to follow what's happening. 🚢

Contributor

@mddilley left a comment

I tested locally, and I saw the staging table truncate and fill back up with records. Always fun to learn more about geo python! 🙌 🚀


It is recommended to run this script using the docker container. You can build it using:

Note, if you are on Apple Silicon you may need to add `--platform linux/amd64` to get GDAL to install correctly.
Contributor

🙏 thanks for this detail because this was the case for me

query_moped = """
{
component_arcgis_online_view(where: { geometry: { _is_null: false } }) {
project_component_id,
Contributor

Commas not creating a syntax error in GraphQL is a TIL for me. Cool to know! https://spec.graphql.org/June2018/#sec-Insignificant-Commas
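
For anyone who wants to see this in action without a GraphQL server, here is a small sketch that treats commas outside string literals as whitespace, which is roughly how the spec describes them. The helper name and the example queries (`things`, `field_a`, `field_b`) are made up for illustration, and the string handling is simplified: it ignores block strings and comments, and it also collapses whitespace runs, so it is only meant for comparing queries like these.

```python
def strip_insignificant_commas(query: str) -> str:
    """Replace commas outside string literals with spaces, then collapse whitespace."""
    out = []
    in_string = False
    escaped = False
    for ch in query:
        if in_string:
            # Inside a string literal every character, commas included, is kept.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == ",":
            # Outside strings a comma behaves like whitespace.
            out.append(" ")
        else:
            out.append(ch)
    return " ".join("".join(out).split())


# Hypothetical queries: same selection, with and without commas.
with_commas = "{ things(where: {a: 1, b: 2}) { field_a, field_b, } }"
without_commas = "{ things(where: {a: 1 b: 2}) { field_a field_b } }"

same = strip_insignificant_commas(with_commas) == strip_insignificant_commas(without_commas)
print(same)
```

Both forms reduce to the same text, which matches the behavior described in the linked spec section.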

components.set_index("project_component_id", inplace=True)
crashes.set_index("id", inplace=True)
logger.info(f"Joining crashes spatially to project geometry")
crashes_near_projects = gpd.sjoin(crashes, components, how="inner")
Contributor

nice! this is a pretty powerful line, and a TIL for me. 😎
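
For context on what that line is doing: with `how="inner"` and geopandas' default `intersects` predicate, `gpd.sjoin` keeps one row per (crash, component) pair whose geometries intersect and drops crashes that match nothing. Below is a dependency-free sketch of the same idea, not the script's implementation. It assumes point crashes and simple polygon component areas, uses a basic ray-casting test in place of geopandas' spatial index, and all ids and coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # ring of (x, y) vertices, first vertex not repeated


def point_in_polygon(pt: Point, ring: Polygon) -> bool:
    """Ray casting: count edges crossed by a ray going right from pt."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through pt?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def inner_spatial_join(
    points: Dict[int, Point], polygons: Dict[int, Polygon]
) -> List[Tuple[int, int]]:
    """Return (point_id, polygon_id) for every point that falls inside a polygon."""
    return [
        (pid, gid)
        for pid, pt in points.items()
        for gid, ring in polygons.items()
        if point_in_polygon(pt, ring)
    ]


# Hypothetical data: crash ids -> coordinates, component ids -> square areas
crashes = {1: (0.5, 0.5), 2: (5.0, 5.0), 3: (2.5, 0.5)}
components = {
    10: [(0, 0), (1, 0), (1, 1), (0, 1)],
    11: [(2, 0), (3, 0), (3, 1), (2, 1)],
}
pairs = inner_spatial_join(crashes, components)
print(pairs)
```

Crash 2 falls outside both areas, so it does not appear in the result, which is the inner-join behavior. The nested loop compares every pair; geopandas avoids that cost with a spatial index, which matters at the row counts mentioned above.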

4 participants