ManikHossain08/Bixi-Cloud-ETL-Data-Pipeline-using-Scala-Hive-AWS_Athena_JDBC-Driver

ETL-Data-Pipeline-using-Scala-Hive-AWS-Athena-JDBC-Driver

An automated ETL data pipeline that extracts complex JSON data from a web API service (GBFS Bixi data) and converts it to CSV for loading into HDFS, which serves as the data warehouse storage. Hive then processes the data further through external and managed tables. The same procedure is also applied on AWS using S3 and Athena.
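
The JSON-to-CSV step can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes the JSON has already been parsed into a hypothetical `Station` case class, and the chosen fields (`station_id`, `name`, `lat`, `lon`, `capacity`, `rental_methods`) are an assumption about which parts of the feed are kept. The nested array is joined with `|` so that each record fits on one CSV row.

```scala
// Hypothetical model of one station record, already parsed from the JSON feed.
final case class Station(
    stationId: String,
    name: String,
    lat: Double,
    lon: Double,
    capacity: Int,
    rentalMethods: List[String] // nested array in the JSON
)

object StationCsv {
  val Header: String = "station_id,name,lat,lon,capacity,rental_methods"

  // Quote a field when it contains a comma, a double quote, or a line break,
  // doubling any embedded quotes (RFC 4180 style).
  def escape(field: String): String =
    if (field.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
      "\"" + field.replace("\"", "\"\"") + "\""
    else field

  // Flatten one record into a single CSV row.
  def toRow(s: Station): String =
    List(
      s.stationId,
      s.name,
      s.lat.toString,
      s.lon.toString,
      s.capacity.toString,
      s.rentalMethods.mkString("|") // assumption: '|' never occurs in the values
    ).map(escape).mkString(",")

  // Header line followed by one row per station.
  def toCsv(stations: Seq[Station]): String =
    (Header +: stations.map(toRow)).mkString("\n")
}
```

Calling `StationCsv.toCsv(stations)` yields a string that can be written to a local file and then copied to HDFS or uploaded to S3.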

Two types of ETL pipelines

  • On-premise ETL data pipeline using HDFS, Hive, Scala
  • AWS cloud-based ETL data pipeline using S3, Athena, Lambda
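
The external and managed table step shared by both pipelines might look like the sketch below. It is an illustration under stated assumptions, not the project's code: the object `HiveTables`, the column list, and the ORC storage format for the managed table are hypothetical choices. The external table statement takes its location as a parameter, so the same builder can point at an HDFS directory for Hive or at an `s3://` prefix for Athena. Executing the statements requires a `java.sql.Connection` obtained from a Hive or Athena JDBC driver, which is not shown here.

```scala
import java.sql.Connection

object HiveTables {
  // Column list shared by both tables; names and types are assumptions.
  private val Columns: String =
    """  station_id STRING,
      |  name STRING,
      |  lat DOUBLE,
      |  lon DOUBLE,
      |  capacity INT""".stripMargin

  // External table: the engine reads the CSV files in place at `location`.
  // Plain delimited text assumes that no field contains a quoted comma.
  def externalTableDdl(table: String, location: String): String =
    s"""CREATE EXTERNAL TABLE IF NOT EXISTS $table (
       |$Columns
       |)
       |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
       |STORED AS TEXTFILE
       |LOCATION '$location'
       |TBLPROPERTIES ('skip.header.line.count'='1')""".stripMargin

  // Managed table: Hive owns the data files and stores them as ORC.
  def managedTableDdl(table: String): String =
    s"""CREATE TABLE IF NOT EXISTS $table (
       |$Columns
       |)
       |STORED AS ORC""".stripMargin

  // Copy rows from the external table into the managed one.
  def loadManaged(managed: String, external: String): String =
    s"INSERT OVERWRITE TABLE $managed SELECT * FROM $external"

  // Run the statements in order over an already opened JDBC connection.
  def runAll(conn: Connection, statements: Seq[String]): Unit = {
    val stmt = conn.createStatement()
    try statements.foreach(sql => stmt.execute(sql))
    finally stmt.close()
  }
}
```

For the on-premise pipeline, a call such as `HiveTables.externalTableDdl("ext_station_information", "/data/bixi/station_information")` would produce the statement to run; both the table name and the path are placeholders.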

Project Description

[Three images in the original README; their content is not reproduced in this text.]
