This repository contains the code and examples for my article on Medium, which explains how to parallelize reading data from JDBC sources in Apache Spark.

SA01/spark-read-jdbc-tutorial

Spark JDBC Read Tutorial

This repository contains the code and examples for my article on Medium, which explains how to parallelize reading data from JDBC sources in Apache Spark. You can read the full article here:
Spark: Parallelization of Reading Data from JDBC Sources

Summary of the Article:

The article shows how to read data from JDBC sources into Apache Spark and how to parallelize the extraction. Key topics include:

  • Introduction to JDBC in Spark: Learn the basics of reading data from JDBC sources in Spark.
  • Parallelizing Data Reads: Step-by-step instructions on how to parallelize data reads from JDBC sources using partitioning techniques.
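As a rough illustration of the partitioning technique mentioned above, the following Python sketch builds the options for a partitioned JDBC read. The option names (`url`, `dbtable`, `partitionColumn`, `lowerBound`, `upperBound`, `numPartitions`) are Spark's documented JDBC options. The connection URL, table, column name, and all three helper functions are hypothetical and not taken from the article. `approximate_predicates` only approximates how the range is split into strides; Spark's own boundary arithmetic may differ between versions.

```python
def partitioned_jdbc_options(url, table, column, lower, upper, num_partitions):
    """Build the option map for a partitioned JDBC read (hypothetical helper)."""
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": column,   # numeric, date, or timestamp column
        "lowerBound": str(lower),    # used only to compute the stride
        "upperBound": str(upper),    # used only to compute the stride
        "numPartitions": str(num_partitions),
    }


def approximate_predicates(column, lower, upper, num_partitions):
    """Approximate the per-partition WHERE clauses for an integer column.

    Illustrative only: the first partition is open below (and takes NULLs),
    the last partition is open above, so no rows are filtered out.
    """
    if num_partitions < 2:
        raise ValueError("need at least 2 partitions to split the range")
    stride = (upper - lower) // num_partitions
    predicates = []
    for i in range(num_partitions):
        start = lower + i * stride
        end = start + stride
        if i == 0:
            predicates.append(f"{column} < {end} OR {column} IS NULL")
        elif i == num_partitions - 1:
            predicates.append(f"{column} >= {start}")
        else:
            predicates.append(f"{column} >= {start} AND {column} < {end}")
    return predicates


def read_partitioned(spark, options):
    """Run the read on an existing SparkSession (not invoked in this sketch)."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical connection details, for illustration only.
options = partitioned_jdbc_options(
    "jdbc:postgresql://localhost:5432/exampledb", "public.orders", "id", 0, 100, 4
)
predicates = approximate_predicates("id", 0, 100, 4)
for p in predicates:
    print(p)
```

Note that `lowerBound` and `upperBound` do not filter rows. They only determine the partition stride, so values outside the range still land in the first or last partition, which can skew partition sizes if the bounds are chosen poorly.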

The code in this repository lets you follow along with the examples in the article and provides a hands-on demonstration of reading data from JDBC sources in Apache Spark jobs.

Execution Parameters

Pass the PostgreSQL JDBC driver jar to Spark when launching the job:

--jars "path to jar/postgresql-42.5.0.jar" --driver-class-path "path to jar/postgresql-42.5.0.jar"
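These flags are accepted by `spark-submit` (and by the `pyspark` and `spark-shell` launchers). A minimal sketch of assembling the full command from Python might look like the following. The script name `read_jdbc.py` and the helper `spark_submit_command` are assumptions for illustration, and the jar location is left as the placeholder used above.

```python
import shlex


def spark_submit_command(jar_path, script):
    """Assemble a spark-submit argument list (hypothetical helper)."""
    return [
        "spark-submit",
        "--jars", jar_path,               # ship the JDBC driver with the job
        "--driver-class-path", jar_path,  # make the driver class visible to the Spark driver
        script,
    ]


# "read_jdbc.py" is a made-up script name; replace the placeholder jar path with a real one.
cmd = spark_submit_command("path to jar/postgresql-42.5.0.jar", "read_jdbc.py")
print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal
```

Keeping the command as a list rather than a single string means it can be handed to `subprocess.run(cmd)` without the spaces in the jar path being split into separate arguments.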
