Today the world generates data on the scale of zettabytes; by comparison, all the data produced before 2010 possibly adds up to less than 0.1 zettabyte. Data science is therefore emerging as an important discipline, because someone has to work out how best to use this massive wave of data, which will only grow larger and faster in the coming years.
This course introduces students to computational methods for knowledge discovery and data mining. Topics covered include Python programming (with a focus on data science), data preparation, data visualization, predictive modeling, model evaluation, clustering, and association analysis techniques.
After completing this course, students should be able to:
- Wrangle, analyze, and visualize data (primarily in Python).
- Practice and apply supervised and unsupervised knowledge discovery techniques.
- Identify and build models using common data mining techniques.
- Evaluate models for their effectiveness and appropriateness.
- Communicate findings using effective visualizations.