Title: Data Analytics

Description: A path to become a data analyst.

Links

Sections:

  1. Statistics
  2. Data Profiling
  3. Data Collection / Gathering / Munging
  4. Data Wrangling / Cleaning
  5. Data Exploration
  6. Data Visualization
  7. Data Storytelling

Statistics

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data.
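As a minimal sketch of the "analysis" part of that definition, the snippet below computes a few common descriptive statistics with Python's standard `statistics` module. The `daily_orders` sample is made up for illustration.

```python
import statistics

# Hypothetical sample: daily order counts for one week.
daily_orders = [12, 15, 11, 15, 20, 18, 14]

mean_orders = statistics.mean(daily_orders)      # arithmetic average
median_orders = statistics.median(daily_orders)  # middle value when sorted
mode_orders = statistics.mode(daily_orders)      # most frequent value
stdev_orders = statistics.stdev(daily_orders)    # sample standard deviation

print(f"mean={mean_orders} median={median_orders} "
      f"mode={mode_orders} stdev={stdev_orders:.2f}")
```

The mean, median, and mode describe where the data is centered, while the standard deviation describes how spread out it is.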

Data Profiling

Data profiling is the process of examining the data available from an existing information source (e.g. a database or a file) and collecting statistics or informative summaries about that data.
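One way to picture this is a small per-column summary. The sketch below assumes the data is already loaded as a list of dictionaries; the `rows` sample and the `profile` helper are hypothetical names chosen for illustration, not part of any library.

```python
# Hypothetical records, as they might come from a file or a database table.
rows = [
    {"id": 1, "city": "Lagos", "age": 34},
    {"id": 2, "city": "Pune", "age": None},
    {"id": 3, "city": "Lagos", "age": 29},
    {"id": 4, "city": None, "age": 41},
]

def profile(rows):
    """Return per-column counts of total, missing, and distinct values."""
    columns = rows[0].keys() if rows else []
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        summary[col] = {
            "total": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return summary

report = profile(rows)
for col, stats in report.items():
    print(col, stats)
```

A summary like this is usually enough to spot columns with many missing values or suspiciously few distinct values before any deeper analysis.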

Data Wrangling

Data wrangling, sometimes referred to as data munging, is the process of transforming and mapping data from a "raw" form into another format that is more appropriate and valuable for downstream purposes such as analytics. The goal of data wrangling is to ensure that the data is of high quality and useful. Data analysts typically spend more of their time wrangling data than actually analyzing it.
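A rough sketch of what such a transformation can look like: the snippet below trims and normalizes names, parses dates, converts numeric text to numbers, and drops duplicates. The `raw` records and the `clean` helper are invented for this example, and the date format is assumed to be ISO (`YYYY-MM-DD`).

```python
from datetime import date, datetime

# Hypothetical raw records with inconsistent formatting.
raw = [
    {"name": "  alice ", "signup": "2023-01-15", "spend": "1,200.50"},
    {"name": "BOB", "signup": "2023-02-01", "spend": ""},
    {"name": "  alice ", "signup": "2023-01-15", "spend": "1,200.50"},  # duplicate
]

def clean(record):
    """Normalize one raw record into typed, consistently formatted fields."""
    spend_text = record["spend"].replace(",", "")
    return {
        "name": record["name"].strip().title(),
        "signup": datetime.strptime(record["signup"], "%Y-%m-%d").date(),
        "spend": float(spend_text) if spend_text else None,  # blank means missing
    }

cleaned = []
seen = set()
for record in raw:
    row = clean(record)
    key = (row["name"], row["signup"], row["spend"])
    if key not in seen:  # drop exact duplicates after normalization
        seen.add(key)
        cleaned.append(row)

for row in cleaned:
    print(row)
```

Deduplicating after normalization matters here: the two "alice" rows only become recognizably identical once whitespace and casing are cleaned up.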

Data Exploration

Data exploration is an approach similar to initial data analysis, whereby a data analyst uses visual exploration, rather than traditional data management systems, to understand what is in a dataset and the characteristics of the data. These characteristics can include the size or amount of data, the completeness and correctness of the data, and possible relationships among data elements or among the files and tables in the data.
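In practice exploration is mostly done with plots, but the characteristics listed above can also be checked numerically. The sketch below measures size, completeness, and one possible relationship (a Pearson correlation) on a made-up `campaigns` dataset; the `pearson` helper is written out by hand so the example depends only on the standard library.

```python
import math

# Hypothetical dataset: advertising spend and resulting sales per campaign.
campaigns = [
    {"ad_spend": 10.0, "sales": 25.0},
    {"ad_spend": 20.0, "sales": 45.0},
    {"ad_spend": 30.0, "sales": None},  # incomplete row
    {"ad_spend": 40.0, "sales": 85.0},
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

size = len(campaigns)                                        # amount of data
complete = [c for c in campaigns if None not in c.values()]  # fully populated rows
completeness = len(complete) / size                          # share of complete rows

# Relationship between the two columns, using complete rows only.
r = pearson([c["ad_spend"] for c in complete],
            [c["sales"] for c in complete])

print(f"rows={size} completeness={completeness:.0%} correlation={r:.2f}")
```

A correlation near 1 or -1 suggests a strong linear relationship worth plotting; it does not by itself show that one column causes the other.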

Data Visualization

Data visualization is the presentation of data in a pictorial or graphical format. It involves producing images that communicate relationships among the represented data to viewers of the images. This communication is achieved through the use of a systematic mapping between graphic marks and data values in the creation of the visualization. The goal is to communicate information clearly and efficiently through graphical means.
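The "systematic mapping between graphic marks and data values" can be shown even without a plotting library. In the sketch below, each `#` character is a graphic mark and the length of a bar is proportional to the value it represents. The `units_sold` data and the `bar_chart` helper are hypothetical; a real project would more likely use a dedicated charting library.

```python
# Hypothetical data: units sold per region.
units_sold = {"North": 40, "South": 30, "East": 10}

def bar_chart(data, width=20, mark="#"):
    """Render a horizontal text bar chart; bar length is proportional to value."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = mark * round(value / largest * width)  # value -> number of marks
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return "\n".join(lines)

chart = bar_chart(units_sold)
print(chart)
```

Because the mapping is linear, a bar that is twice as long always represents a value that is twice as large, which is what lets a viewer compare regions at a glance.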

Data Storytelling

Data storytelling is a methodology for communicating information, tailored to a specific audience, with a compelling narrative. It is the final stretch of your data analysis and arguably the most important one.

Optional Knowledge

Machine Learning

Machine learning is the study of computer algorithms that improve automatically through experience. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop conventional algorithms to perform the needed tasks.
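To make "predictions without being explicitly programmed" concrete, here is a minimal sketch of a 1-nearest-neighbor classifier, one of the simplest machine learning methods, applied to a toy email-filtering task. The features (number of links and exclamation marks), the `training_data` values, and the `predict` helper are all invented for illustration.

```python
import math

# Hypothetical training data: (links, exclamation marks) -> label.
training_data = [
    ((0, 0), "ham"),
    ((1, 0), "ham"),
    ((7, 5), "spam"),
    ((9, 8), "spam"),
]

def predict(features):
    """Label a new example with the label of its nearest training example."""
    nearest = min(training_data, key=lambda item: math.dist(item[0], features))
    return nearest[1]

print(predict((8, 6)))  # close to the spam examples
print(predict((0, 1)))  # close to the ham examples
```

No rule such as "more than five links means spam" is written anywhere in the code; the decision comes entirely from the training examples, so changing the training data changes the behavior.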