BellaIngenue/SienaRivera_COGS108

COGS108: Group 009 Project

  • Removed Other Names for Privacy Reasons
  • Siena Rivera

Overview of this Project:

Research Question

Did air quality in California significantly improve across different regions as a result of the COVID-19 pandemic and the accompanying change in car traffic volume?

Hypothesis

We hypothesize that COVID-19 had an overall positive effect on air quality from 2020 to 2022, as measured by AQI (Air Quality Index). We think that the pandemic led to more WFH (work-from-home) and remote opportunities for the general population of California, reducing commuter car traffic and therefore the air pollution it causes. We acknowledge that there may be some confounding variables, including regional industries, population per capita, and lockdown restrictions.
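As a rough illustration of how this hypothesis could be checked descriptively, the sketch below compares each county's mean AQI before and during the pandemic. The function name `mean_aqi_change`, the row layout, the 2020 cutoff, and every value in `SAMPLE_ROWS` are assumptions made for illustration; the numbers are placeholders, not real measurements, and this is not necessarily the method used in the project.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical placeholder rows: (county, year, annual median AQI).
# These numbers are made up for illustration and are NOT real measurements.
SAMPLE_ROWS = [
    ("County A", 2018, 60.0),
    ("County A", 2019, 58.0),
    ("County A", 2020, 50.0),
    ("County A", 2021, 52.0),
    ("County B", 2018, 40.0),
    ("County B", 2019, 42.0),
    ("County B", 2020, 41.0),
    ("County B", 2021, 43.0),
]


def mean_aqi_change(rows, pandemic_start=2020):
    """Return {county: mean AQI during the pandemic minus mean AQI before it}."""
    before = defaultdict(list)
    during = defaultdict(list)
    for county, year, aqi in rows:
        (during if year >= pandemic_start else before)[county].append(aqi)
    # Only counties with data in both periods can be compared.
    return {
        county: mean(during[county]) - mean(before[county])
        for county in before
        if county in during
    }


changes = mean_aqi_change(SAMPLE_ROWS)
for county, delta in sorted(changes.items()):
    print(f"{county}: {delta:+.1f} AQI")
```

Because a lower AQI means cleaner air, a negative difference would be consistent with the hypothesis. A difference in means is only descriptive; judging whether an improvement is significant would still require a statistical test and attention to the confounders listed above.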

Ethics & Privacy

Do we have permission to use this data?

As all of the data we have collected is open to the public for fair use, our group has permission to use this data for this purpose. All of the datasets for our variables (Air Quality, COVID-19 Rates, Tax Income, and Job Titles) come from Kaggle, EPA, and EIA, all of which are open to the public for viewing, education, and use.

Are there privacy concerns regarding your datasets that you need to deal with, and/or terms of use that you need to comply with?

The air quality data that will be used for our project was collected by AirNow, which is "A partnership of the U.S. Environmental Protection Agency, National Oceanic and Atmospheric Administration (NOAA), National Park Service, NASA, Centers for Disease Control, and tribal, state, and local air quality agencies" (1). This data is publicly available through government websites and contains no personally identifiable information, which spares us many of the privacy concerns we would otherwise need to address. Given that none of the air quality data was ever associated with an individual, we believe it is legally safe to use.

As for the data regarding industries across the state of California, any data we find will be cleaned and sorted so that any personally identifiable information that could associate an individual with a particular job is removed or anonymized. We are looking for this data on websites such as Kaggle or Our World in Data and on government websites that provide open-source, free-use datasets, so we do not expect to run into many complications with privacy, as much of the data has already been released in a form that adheres to data privacy laws.
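The cleaning step described above could look something like the following minimal sketch, which drops identifying fields from each record before analysis. The field names in `PII_FIELDS`, the `drop_pii` helper, and the sample record are all hypothetical; the real datasets may use different columns or contain no such fields at all.

```python
# Hypothetical column names; the real dataset's fields may differ.
PII_FIELDS = {"name", "email", "address"}


def drop_pii(records, pii_fields=PII_FIELDS):
    """Return copies of the records without any personally identifying fields."""
    return [
        {key: value for key, value in record.items() if key not in pii_fields}
        for record in records
    ]


# Made-up example record, for illustration only.
raw = [{"name": "Example Person", "county": "County A", "industry": "Agriculture"}]
cleaned = drop_pii(raw)
print(cleaned)
```

The helper builds new dictionaries rather than editing the input in place, so the raw records stay untouched until they are deliberately discarded.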

Are there potential biases in your dataset(s), in terms of who it composes, and how it was collected, that may be problematic in terms of it allowing for equitable analysis?

Looking ahead at the kinds of datasets we will be using, potential biases could exist. In air quality data for the state of California, some areas may be underrepresented or misrepresented due to either a) a lack of adequate air quality measurement tools or b) air quality measurement tools being placed in locations that do not give a fair assessment of the actual air quality of the local region. Depending on the average wealth and population of the region, there could be differences in how carefully the data was gathered, which may lead to biased results.
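One simple way to look for the underrepresentation described above is to count monitoring sites per county and flag counties with few or none. This is a sketch only: the `flag_sparse_counties` helper, the `min_monitors` threshold of 2, and the site list are assumptions for illustration, not part of the project's data.

```python
from collections import Counter


def flag_sparse_counties(monitor_counties, all_counties, min_monitors=2):
    """Return counties with fewer than min_monitors monitoring sites, sorted by name."""
    counts = Counter(monitor_counties)
    # Counter returns 0 for counties that have no sites at all.
    return sorted(c for c in all_counties if counts[c] < min_monitors)


# Hypothetical site list: one entry per monitoring site, naming its county.
sites = ["County A", "County A", "County A", "County B"]
sparse = flag_sparse_counties(sites, ["County A", "County B", "County C"])
print(sparse)
```

Passing the full list of counties separately matters here, because a county with no monitors never appears in the site list and would otherwise be missed. A count cannot detect poorly placed monitors, so it addresses only the first of the two concerns.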

For data about industries in California, we will likely use census information or data gathered by a third party associating certain regions with certain industries, so the biases we must be cognizant of are who collected the data and for what purpose. We think that census data should be largely free of bias given that the point is simply to gather information about the citizenry, but if we are looking into datasets collected by third parties, we must also be aware of why the data was gathered in the first place. If the data was gathered for a biased purpose, such as an agricultural group looking into which industries are "best" in California, we should be wary of using it because of the biases that may come with it. Thus, we will look for datasets about California regional industries that were gathered for a non-biased purpose.

What will we be doing with this data?

We will be using this data only for our COGS 108 Final assignment. We do not intend to ever use this project for monetary gain; it is strictly for educational purposes.

  1. https://www.airnow.gov/about-airnow/

About

Final Project Copied and Migrated for Public Access
