This repository has been archived by the owner on Jul 18, 2024. It is now read-only.
Madison J Myers edited this page Oct 31, 2017 · 8 revisions

Welcome to the Visualizing-Food-Insecurity-with-Pixie-Dust-and-Watson-Analytics wiki!

Visualizing Food Insecurity

This journey guides you through downloading, cleaning and visualizing data with several tools. In particular, it showcases food insecurity in the US and its associated factors.

Short Description

Run a data science pipeline that preprocesses and visualizes data, using a notebook, libraries and an analytics platform to build charts and plots that communicate your findings to your audience.

Offering Type

Cognitive & Data Analytics

Introduction

In data science we often do a great deal of work to glean insights that affect society or a subset of it, and yet we end up not communicating our findings, or communicating them ineffectively, to audiences outside data science. That's where visualizations are most powerful. By visualizing our insights and predictions, we, as data scientists and data lovers, can make a real impact and educate those around us who might not have had the opportunity to work on a project on the same subject. By visualizing the findings and insights that have the most power to do social good, we can bring awareness and maybe even change. This journey walks you through how to do just that with IBM's Data Science Experience (DSX), pandas, Pixie Dust and Watson Analytics.

This journey focuses on food insecurity throughout the US. Using open government data, it considers low access, diet-related diseases, race, poverty, geography and other factors. For context, the problem is increasingly relevant for the United States as obesity and diabetes rise: two out of three adult Americans are considered overweight or obese, one third of American minors are considered overweight or obese, nearly ten percent of Americans have diabetes and nearly fifty percent of African American adults have some form of cardiovascular disease. Moreover, cardiovascular disease is the leading global cause of death, accounting for 17.3 million deaths per year, and rising. More often than not, Native American populations do not have a grocery store on their reservation. All of these trends are on the rise. The problem lies not only in low access to fresh produce, but also in food culture, limited education on healthy eating, and racial and income inequality.

Author

Madison J. Myers https://www.linkedin.com/in/madisonjmyers/

Code

https://github.com/MadisonJMyers/Visualizing-Food-Insecurity-with-Pixie-Dust-and-Watson-Analytics/blob/master/Diet-Related%20Disease%20Exploratory.ipynb

Overview and Included Components

The user will learn:

  • How to use DSX.
  • How to remove NaNs and 0s from a pandas dataframe.
  • How to visualize correlations and other findings using matplotlib, bokeh, seaborn and Pixie Dust.
  • How to download your pandas dataframe from DSX.
  • How to upload your data into Watson Analytics.
  • How to use Watson Analytics to generate visualizations and share them with others.
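
As a rough illustration of the cleaning step listed above, the following minimal sketch removes NaNs and 0s from a pandas dataframe. The column names and values are hypothetical, not taken from the Food Environment Atlas, and treating every 0 as missing is an assumption that may not suit every column; the notebook linked under Code may handle this differently.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names and values are illustrative only.
df = pd.DataFrame({
    "county": ["A", "B", "C", "D", "E"],
    "pct_low_access": [12.5, 0.0, np.nan, 30.1, 22.0],
    "pct_diabetes": [9.8, 11.2, 10.4, 0.0, 13.1],
})

numeric_cols = ["pct_low_access", "pct_diabetes"]

# Assumption: a 0 in these columns means "not reported", so convert it to NaN.
cleaned = df.copy()
cleaned[numeric_cols] = cleaned[numeric_cols].replace(0, np.nan)

# Drop every row that is missing a value in any of the numeric columns.
cleaned = cleaned.dropna(subset=numeric_cols).reset_index(drop=True)

print(cleaned)
```

Converting zeros to NaN first means a single `dropna` call handles both cases. If a column can legitimately contain 0, leave it out of `numeric_cols`.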

This journey was created for data scientists and data lovers who are interested in social justice issues, as well as those who are new to DSX and Watson Analytics. It guides the user through the power of visualizations, how to select them and how to share them.

Link to architecture diagram:

Included components

This journey takes you through starting a DSX notebook, uploading your data and using it in the notebook, cleaning the data, and visualizing it with different libraries, including pandas and Pixie Dust. You then download your dataframe as a CSV file and use it in IBM Watson Analytics to visualize your findings further and share them with your audience.
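
The last two steps of that flow, checking correlations and exporting the dataframe as CSV, can be sketched with pandas alone. This is a minimal sketch under stated assumptions: the column names and values are hypothetical, and the CSV is written to an in-memory buffer so the example stays self-contained. How DSX exposes a saved file for download is not shown here.

```python
import io

import pandas as pd

# Hypothetical cleaned data; column names and values are illustrative only.
df = pd.DataFrame({
    "pct_low_access": [10.0, 20.0, 30.0, 40.0],
    "pct_poverty": [8.0, 12.0, 16.0, 20.0],
    "pct_diabetes": [12.0, 9.0, 11.0, 10.0],
})

# Pairwise Pearson correlations between the numeric columns.
corr = df.corr()
print(corr.round(2))

# Serialize the dataframe as CSV text, without the index column.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

In a notebook you would more likely pass a file name, for example `df.to_csv("food_insecurity.csv", index=False)` (the file name is a placeholder), and then upload that file to Watson Analytics.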

Featured technologies

DSX: a browser-based platform where you can use notebooks or RStudio for your data science projects. DSX automatically starts a Spark instance for you, so you can work in the cloud without any extra setup.
IBM Watson Analytics: another browser-based platform, which lets you upload your data, analyze it and then visualize your findings. If you're new to data science, Watson Analytics recommends connections and visualizations based on the data you give it.
Pandas: an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
Pixie Dust: a visualization library you can use in DSX or a Jupyter notebook.

Blog: https://github.com/MadisonJMyers/Visualizing-Food-Insecurity-with-Pixie-Dust-and-Watson-Analytics

Links

DSX: https://datascience.ibm.com/docs/content/analyze-data/creating-notebooks.html
IBM Watson Analytics: https://www.ibm.com/watson-analytics
Pandas: http://pandas.pydata.org/
Pixie Dust: https://ibm-watson-data-lab.github.io/pixiedust/displayapi.html#introduction
Data: https://www.bls.gov/cex/ ; https://www.ers.usda.gov/data-products/food-environment-atlas/data-access-and-documentation-downloads/