On trips crossing the hour boundary: are we suspecting that this code double-counts a trip if it crosses an hour boundary, even though vid is aggregated as a set?
I think that's the code. I guess maybe vid is unique only for a given hour, but it could appear in another hour for the same trip. It does seem strange though.
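To make the suspicion concrete, here is a toy sketch of how it could happen. It is not the project's actual code: the column names timestamp and hour and the sample pings are assumptions made up for illustration. One vehicle runs a single trip from 08:50 to 09:10, and vid is deduplicated within each hour only.

```python
import pandas as pd

# Hypothetical pings for one vehicle (vid) on a single trip, 08:50 to 09:10.
pings = pd.DataFrame({
    "vid": ["1234", "1234", "1234"],
    "timestamp": pd.to_datetime([
        "2022-05-20 08:50",
        "2022-05-20 08:58",
        "2022-05-20 09:10",
    ]),
})
pings["hour"] = pings["timestamp"].dt.hour

# Deduplicate vid as a set within each hour, then count the set members.
per_hour = pings.groupby("hour")["vid"].agg(lambda s: len(set(s)))

# Summing hourly counts counts the trip once for every hour it touches.
trips_from_hourly = int(per_hour.sum())

# Deduplicating over the whole period counts it once.
trips_overall = int(pings["vid"].nunique())

print(trips_from_hourly, trips_overall)
```

If the real aggregation behaves like this, a set only protects against duplicates inside one hour bucket, so a trip spanning two buckets would still contribute to both.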
Investigate routes with ratio > 1
There are some routes that have a ratio of actual trips to scheduled trips greater than one, and it would be good to know why.
Access the data
Jupyter Notebook
To access the data, run the notebook compare_scheduled_and_rt.ipynb. Add a cell at the bottom with %store summary and run it. The %store magic command allows you to share variables between notebooks: https://stackoverflow.com/questions/31621414/share-data-between-ipython-notebooks.
Next, run static_gtfs_analysis.ipynb. Add a cell at the bottom with %store -r summary and run it to read the summary DataFrame from the compare_scheduled_and_rt.ipynb notebook. Merge the summary DataFrame with the final_gdf GeoDataFrame from the compare_scheduled_and_rt.ipynb using summary_gdf = summary.merge(final_gdf, how="right", on="route_id")
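As a rough sketch of what the merge step does, the example below uses small made-up tables. The ratio values and route IDs are invented, and final_gdf is replaced by a plain pandas DataFrame with a placeholder geometry column so that the snippet runs without geopandas.

```python
import pandas as pd

# Made-up stand-in for the summary DataFrame restored with %store -r.
summary = pd.DataFrame({
    "route_id": ["1", "2"],
    "ratio": [0.8, 1.2],
})

# Made-up stand-in for final_gdf; the real object is a GeoDataFrame
# whose geometry column holds shapes rather than strings.
final_gdf = pd.DataFrame({
    "route_id": ["1", "2", "3"],
    "geometry": ["shape-1", "shape-2", "shape-3"],
})

# how="right" keeps every row of final_gdf, matched or not.
summary_gdf = summary.merge(final_gdf, how="right", on="route_id")

# Routes with a geometry but no summary row end up with NaN in ratio.
unmatched = summary_gdf.loc[summary_gdf["ratio"].isna(), "route_id"].tolist()
print(unmatched)
```

With a right join, routes that have no row in summary are kept with missing values, so it may be worth checking for NaN before filtering on ratio. Note also that when a plain DataFrame is on the left of the merge, the result is generally a plain DataFrame and not a GeoDataFrame.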
Python
Run the following in an interpreter from the project root:
Find routes with ratio > 1
Filter summary_gdf down to the rows with ratio > 1.
A few things to look for:
ratio > 1 after reaggregating the data at a different frequency, e.g. daily; see "[Data] Adjust realtime/scheduled data comparison to cover multiple time frames or averages (for example, average over entire period vs. avg by day vs. avg by hour across entire period)" #12.
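The filtering and reaggregation ideas above could be sketched as follows. This is only an illustration on invented numbers: the column names date, hour, trip_count_rt and trip_count_sched are assumptions and may not match the columns in the notebooks.

```python
import pandas as pd

# Invented hourly counts of observed (rt) and scheduled trips per route.
hourly = pd.DataFrame({
    "route_id": ["1", "1", "2", "2"],
    "date": ["2022-05-20"] * 4,
    "hour": [8, 9, 8, 9],
    "trip_count_rt": [3, 1, 2, 2],
    "trip_count_sched": [2, 2, 2, 3],
})

# Hourly ratio, then a boolean mask to keep rows with ratio > 1.
hourly["ratio"] = hourly["trip_count_rt"] / hourly["trip_count_sched"]
over_hourly = hourly[hourly["ratio"] > 1]

# Reaggregate to daily totals and recompute the ratio from the sums.
daily = hourly.groupby(["route_id", "date"], as_index=False)[
    ["trip_count_rt", "trip_count_sched"]
].sum()
daily["ratio"] = daily["trip_count_rt"] / daily["trip_count_sched"]
over_daily = daily[daily["ratio"] > 1]

print(over_hourly[["route_id", "hour", "ratio"]])
print(over_daily)
```

In this toy data, route 1 exceeds a ratio of 1 in one hour but not over the full day. A pattern like that would point toward trips being attributed to a neighbouring hour, rather than toward more trips running than were scheduled.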