Spark JDBC Read Tutorial

This repository contains the code and examples for my Medium article, "Spark: Parallelization of Reading Data from JDBC Sources", which explains how to parallelize reading data from JDBC sources in Apache Spark.

Summary of the Article:

The article shows how to read data from JDBC sources into Apache Spark and how to parallelize the extraction. Key topics covered include:

Introduction to JDBC in Spark: Learn the basics of reading data from JDBC sources in Spark.
Parallelizing Data Reads: Step-by-step instructions on how to parallelize data reads from JDBC sources using partitioning techniques.
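As a rough illustration of the partitioning technique mentioned above, the following sketch builds the options for a partitioned JDBC read in PySpark. The helper names (`jdbc_partition_options`, `read_partitioned`), the connection URL, the table, and the column used in the usage note are hypothetical and are not taken from the article or this repository. The option keys themselves (`partitionColumn`, `lowerBound`, `upperBound`, `numPartitions`) are standard Spark JDBC data source options.

```python
from typing import Dict


def jdbc_partition_options(
    url: str,
    table: str,
    user: str,
    password: str,
    partition_column: str,
    lower_bound: int,
    upper_bound: int,
    num_partitions: int,
    driver: str = "org.postgresql.Driver",
) -> Dict[str, str]:
    """Build the option map for a partitioned JDBC read (hypothetical helper)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": driver,
        # Numeric, date, or timestamp column used to split the read.
        "partitionColumn": partition_column,
        # The bounds only determine the partition stride; they do not filter rows.
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        # Upper limit on parallel reads, and so on concurrent JDBC connections.
        "numPartitions": str(num_partitions),
    }


def read_partitioned(spark, options: Dict[str, str]):
    """Load a DataFrame through the JDBC source using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()
```

With an active `SparkSession` named `spark`, a call might look like `read_partitioned(spark, jdbc_partition_options("jdbc:postgresql://localhost:5432/demo", "public.orders", "demo_user", "demo_password", "id", 1, 1000000, 8))`. Rows whose `id` falls outside the bounds are still read; they end up in the first or last partition.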

The code in this repository lets you follow along with the examples in the article and provides a hands-on demonstration of reading data from JDBC sources in Apache Spark jobs.

Execution Parameters

Pass the PostgreSQL JDBC driver JAR when launching the job (for example, with spark-submit):

--jars "path to jar/postgresql-42.5.0.jar" --driver-class-path "path to jar/postgresql-42.5.0.jar"
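To show where these parameters sit in a full launch command, here is a small sketch that assembles a `spark-submit` invocation in Python. The helper name `spark_submit_command` and the script path `src/main.py` are assumptions made for illustration; this README does not name the entry-point script. The `path to jar` placeholder is kept as written above and should be replaced with the real location of the driver JAR.

```python
import shlex
from typing import List


def spark_submit_command(jar_path: str, script: str) -> List[str]:
    """Return the argv list for spark-submit with the JDBC driver attached."""
    return [
        "spark-submit",
        "--jars", jar_path,               # ship the driver JAR with the job
        "--driver-class-path", jar_path,  # make it visible to the driver JVM
        script,
    ]


# "src/main.py" is a hypothetical script name, not taken from the repository.
command = spark_submit_command("path to jar/postgresql-42.5.0.jar", "src/main.py")

# shlex.join quotes arguments that contain spaces, so the line is safe to paste.
print(shlex.join(command))
```

Building the command as a list keeps a JAR path that contains spaces as a single argument, which also makes the list suitable for `subprocess.run(command, check=True)`.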
