
ETL pipeline project that extracts crypto and traditional stock prices via file reading + web scraping, transforms that into clean, aggregated data, and then loads it into a series of .sql tables, ready for analysis.


cdenq/etl-pipeline-on-crypto-data


Data Wrangling And Cleaning

About ETL Pipeline on Crypto Data

ETL Pipeline on Crypto Data is a team-based Python data analytics project built on daily price data for 6 crypto coins and 14 index funds. It:

  1. extracts raw data in three different ways: direct file reading with pandas (.csv), API requests, and web scraping
  2. transforms that data into aggregates by month (average, max, min, max growth)
  3. loads that data into a final .sql database
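
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`date`, `symbol`, `close`), the sample rows, the table name `monthly_prices`, and the helper functions are all hypothetical. "Max growth" is assumed here to mean the largest day-over-day percent change within a month, and an in-memory SQLite database stands in for PostgreSQL so the sketch runs without a server.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical raw daily prices; the column names are assumptions.
RAW_CSV = """date,symbol,close
2021-01-01,BTC,100.0
2021-01-02,BTC,110.0
2021-01-03,BTC,99.0
2021-02-01,BTC,120.0
2021-02-02,BTC,150.0
"""


def extract(source) -> pd.DataFrame:
    """Read daily prices from a CSV path or file-like object."""
    return pd.read_csv(source, parse_dates=["date"])


def transform(daily: pd.DataFrame) -> pd.DataFrame:
    """Aggregate daily closes into one row per symbol and month."""
    daily = daily.dropna(subset=["close"]).sort_values(["symbol", "date"]).copy()
    # Assumed definition of growth: day-over-day percent change per symbol.
    daily["growth"] = daily.groupby("symbol")["close"].pct_change()
    daily["month"] = daily["date"].dt.strftime("%Y-%m")
    return (
        daily.groupby(["symbol", "month"])
        .agg(
            avg_close=("close", "mean"),
            max_close=("close", "max"),
            min_close=("close", "min"),
            max_growth=("growth", "max"),
        )
        .reset_index()
    )


def load(monthly: pd.DataFrame, conn: sqlite3.Connection, table: str) -> None:
    """Write the monthly aggregates to a SQL table, replacing any old copy."""
    monthly.to_sql(table, conn, if_exists="replace", index=False)


# SQLite stands in for PostgreSQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
monthly = transform(extract(io.StringIO(RAW_CSV)))
load(monthly, conn, "monthly_prices")
rows = conn.execute(
    "SELECT symbol, month, avg_close, max_close, min_close "
    "FROM monthly_prices ORDER BY month"
).fetchall()
```

Keeping each stage as its own function means the API and web-scraping sources could feed the same `transform` step, as long as they return a frame with the same columns.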

Along with the analysis, the project also involved a written report.

Built with

  • Python
    • Pandas
  • PostgreSQL
    • pgAdmin
  • Google
    • Google Docs

Technical Skills

  • Python reading and API requesting
  • Python web scraping
  • Cleaning, sorting, filtering
  • Summary statistics, aggregating
  • Loading data into SQL

Qualitative Skills

  • Synthesizing results for tentative conclusions
  • Acknowledging potential pitfalls with results and techniques

Screenshots

image 1
