An automated ETL data pipeline that extracts nested JSON data from a web API (the BIXI GBFS feed), converts it to CSV, and loads it into HDFS, which serves as the data warehouse storage layer. Hive then processes the data further through external and managed tables. The same procedure is also implemented on AWS using S3 and Athena.
- On-premises ETL data pipeline using HDFS, Hive, and Scala
- AWS cloud-based ETL data pipeline using S3, Athena, and Lambda
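
The JSON-to-CSV step that both pipelines share can be sketched as follows. This is a minimal illustration, not the project's actual code: it is written in Python for brevity (the on-premises pipeline itself uses Scala), the sample payload is made up and only shaped like a GBFS `station_information` feed, and the `FIELDS` list and `stations_to_csv` helper are hypothetical names chosen for this example.

```python
import csv
import io
import json

# Hypothetical sample shaped like a GBFS station_information feed.
# The values are invented for illustration, not real BIXI data.
SAMPLE_FEED = """
{
  "last_updated": 1700000000,
  "ttl": 10,
  "data": {
    "stations": [
      {"station_id": "1", "name": "Station A", "lat": 45.5, "lon": -73.6, "capacity": 19},
      {"station_id": "2", "name": "Station B, East", "lat": 45.51, "lon": -73.57}
    ]
  }
}
"""

# Assumed set of columns to keep; the real pipeline may select different ones.
FIELDS = ["station_id", "name", "lat", "lon", "capacity"]


def stations_to_csv(feed_json, fields=FIELDS):
    """Flatten the nested data.stations array into CSV text with a header row."""
    feed = json.loads(feed_json)
    stations = feed.get("data", {}).get("stations", [])

    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for station in stations:
        # Missing keys become empty cells so every row has the same columns.
        writer.writerow({f: station.get(f, "") for f in fields})
    return buf.getvalue()


if __name__ == "__main__":
    print(stations_to_csv(SAMPLE_FEED), end="")
```

Keeping a fixed column list gives every row the same shape, which is what a Hive external table or an Athena table defined over the CSV files would expect. Fields containing commas are quoted by the `csv` module, so the table definition would need a CSV-aware SerDe or a different delimiter to read them correctly.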