consider moving some utilities to the DataSetResults class #61

Open
ericnost opened this issue Jan 25, 2024 · 1 comment

ericnost commented Jan 25, 2024

Currently, there are several functions in utilities.py that seem like they could be methods of the DataSetResults class because they are mostly used on data that has already been loaded in a DataSetResults instance.

For instance, instead of this:

from ECHO_modules.make_data_sets import make_data_sets
from ECHO_modules.utilities import aggregate_by_facility, point_mapper

ds = make_data_sets(["CWA Inspections"])  # Create a DataSet for handling the data
# Store results for this DataSet as a DataSetResults object
buffalo_cwa_inspections = ds["CWA Inspections"].store_results(
    region_type="Zip Code", region_value=["14201", "14202", "14303"]
)

# Aggregate the records by facility using this function
aggregated_results = aggregate_by_facility(
    records=buffalo_cwa_inspections, program=buffalo_cwa_inspections.dataset.name, other_records=True
)
# Map the aggregated results
point_mapper(
    aggregated_results["data"], aggregated_results["aggregator"], quartiles=True, other_fac=aggregated_results["diff"]
)

We could do something like:

ds = make_data_sets(["CWA Inspections"])  # Create a DataSet for handling the data
# Store results for this DataSet as a DataSetResults object.
# Note: it would be really neat to also retrieve *spatial data* in the store_results
# request. Instead of just getting CWA inspections for ZIPs 14201, 14202, and 14303,
# we could get the outlines of those geographies. This currently exists in some form
# in the `reorganization` branch.
buffalo_cwa_inspections = ds["CWA Inspections"].store_results(
    region_type="Zip Code", region_value=["14201", "14202", "14303"]
)

# This utilities.py function would become a DataSetResults method that stores the
# aggregated data in a `self` variable for later use.
buffalo_cwa_inspections.aggregate_by_facility()

# A basic map of each facility (with inspections). This currently exists in some form
# in the `reorganization` branch. It would rely on aggregate_by_facility() to work properly.
buffalo_cwa_inspections.show_facility_map()

# Basically what is currently called `point_mapper()`. It would symbolize facilities with
# inspections by circle size. If the spatial data (e.g. ZIP code boundaries) is already
# available, it could map those as well.
buffalo_cwa_inspections.show_data_map()
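
To make the "store the aggregated data in a `self` variable" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `DataSetResultsSketch`, the `REGISTRY_ID` field, and `facility_summary()` are hypothetical stand-ins, and the records are plain dicts rather than the DataFrame the real class holds. The real `aggregate_by_facility()` also returns more than a per-facility count.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataSetResultsSketch:
    """Hypothetical stand-in for DataSetResults; not the real ECHO_modules class."""

    dataset_name: str
    records: list  # one dict per inspection record (the real class holds a DataFrame)
    aggregated: Optional[dict] = field(default=None, init=False)

    def aggregate_by_facility(self, id_field="REGISTRY_ID"):
        """Count records per facility and cache the result on the instance."""
        self.aggregated = dict(Counter(row[id_field] for row in self.records))
        return self.aggregated

    def facility_summary(self):
        """Placeholder for show_facility_map(): reuses the cached aggregation."""
        if self.aggregated is None:
            self.aggregate_by_facility()  # aggregate lazily so call order does not matter
        return sorted(self.aggregated.items(), key=lambda item: item[1], reverse=True)


results = DataSetResultsSketch(
    dataset_name="CWA Inspections",
    records=[
        {"REGISTRY_ID": "A", "ZIP": "14201"},
        {"REGISTRY_ID": "A", "ZIP": "14201"},
        {"REGISTRY_ID": "B", "ZIP": "14202"},
    ],
)
summary = results.facility_summary()
```

Caching on the instance means the mapping methods never need the aggregation passed in as arguments, which is the main simplification over the current `point_mapper()` call.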

Eventually, perhaps even other utilities like get_active_facilities() and get_top_violators() could move too. Currently, I believe that would break the report card generation process. These functions also have less to do with the program-specific data that's usually stored in a DataSetResults instance. However, an area's facilities can be loaded using ds = make_data_sets(["Facilities"]), so get_active_facilities() and get_top_violators() could then become methods for those specific DataSetResults instances.
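
A rough sketch of what those two utilities might look like as methods on a Facilities result. This is only an illustration: `FacilityResultsSketch` and the `ACTIVE` and `VIOLATIONS` fields are hypothetical, the data is a list of dicts instead of a DataFrame, and the real functions in utilities.py take different arguments.

```python
class FacilityResultsSketch:
    """Hypothetical stand-in for a DataSetResults holding "Facilities" data."""

    def __init__(self, records):
        # The real class wraps a DataFrame; a list of dicts keeps this runnable anywhere.
        self.records = records

    def get_active_facilities(self, flag_field="ACTIVE"):
        """Return only the facilities whose (hypothetical) active flag is set."""
        return [row for row in self.records if row.get(flag_field)]

    def get_top_violators(self, count_field="VIOLATIONS", n=2):
        """Return the n active facilities with the highest violation counts."""
        active = self.get_active_facilities()
        return sorted(active, key=lambda row: row[count_field], reverse=True)[:n]


facilities = FacilityResultsSketch([
    {"NAME": "Plant A", "ACTIVE": True, "VIOLATIONS": 3},
    {"NAME": "Plant B", "ACTIVE": False, "VIOLATIONS": 9},
    {"NAME": "Plant C", "ACTIVE": True, "VIOLATIONS": 7},
    {"NAME": "Plant D", "ACTIVE": True, "VIOLATIONS": 1},
])
top_names = [row["NAME"] for row in facilities.get_top_violators()]
```

Keeping standalone copies in utilities.py that delegate to these methods would be one way to avoid breaking the report cards while the move happens.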

@ericnost ericnost added the enhancement New feature or request label Jan 25, 2024
@ericnost
Member Author

Something like this is, I think, a more straightforward way of getting facilities. If we moved `get_active_facilities()`, or a copy of it, to DataSetResults, then we could also create an `active=True` flag.

from ECHO_modules.make_data_sets import make_data_sets
ds = make_data_sets(["Facilities"])  # Create a DataSet for handling the data
# Store results for this DataSet as a DataSetResults object
erie_facs = ds["Facilities"].store_results(
    region_type="County", region_value=["ERIE"], state="NY", active=True
)
erie_facs.dataframe  # Show the results as a dataframe
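
For the flag itself, a simplified sketch of how `active=True` could be applied at storage time. The function name `store_results_sketch` and the `COUNTY`, `STATE`, and `ACTIVE` fields are hypothetical, and the filtering runs over a list of dicts rather than a database query, so treat it as the shape of the idea rather than the actual implementation.

```python
def store_results_sketch(records, region_value, state=None, active=False):
    """Hypothetical, simplified version of store_results() with an `active` flag."""
    selected = [
        row for row in records
        if row["COUNTY"] in region_value and (state is None or row["STATE"] == state)
    ]
    if active:
        # Same filter that get_active_facilities() would otherwise apply after the fact.
        selected = [row for row in selected if row["ACTIVE"]]
    return selected


all_facilities = [
    {"NAME": "Plant A", "COUNTY": "ERIE", "STATE": "NY", "ACTIVE": True},
    {"NAME": "Plant B", "COUNTY": "ERIE", "STATE": "NY", "ACTIVE": False},
    {"NAME": "Plant C", "COUNTY": "ERIE", "STATE": "PA", "ACTIVE": True},
]
erie_active = store_results_sketch(all_facilities, ["ERIE"], state="NY", active=True)
erie_all = store_results_sketch(all_facilities, ["ERIE"], state="NY")
```

Defaulting `active` to `False` keeps existing calls to `store_results()` returning the same rows as before.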

@ericnost ericnost added this to the v1.0.0 milestone Jan 25, 2024
@ericnost ericnost moved this to Todo in ECHO_modules Jan 25, 2024