PySpark methods to enhance developer productivity 📣 👯 🎉
Updated Jul 6, 2024 - Python
the portable Python dataframe library
Analyzes book review data from Amazon and the Amazon Vine program using PySpark and Amazon Web Services' Relational Database Service (AWS RDS)
This is a repo to keep track of all my projects related to data engineering
This is my profile, which gives an overview of the projects I have created using various tools and frameworks.
Failure prediction on Google's Borg cluster traces with Apache Spark, made for a master's thesis
A Comprehensive Framework for Building End-to-End Recommendation Systems with State-of-the-Art Models
An open source, standard data file format for graph data storage and retrieval.
👷🌇 Set up and build a big data processing pipeline with Apache Spark and 📦 AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to set up the infrastructure and Airflow to automate workflows 🥊
State of the Art Natural Language Processing
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Open Targets python framework for post-GWAS analysis
Projects on Cloud Computing ☁, Kubernetes ☸️, Docker 🐳
Simple and Distributed Machine Learning
Unofficial PySpark implementation of the study "Home monitoring for older singles: A gas sensor array system"