Name		Name	Last commit message	Last commit date
parent directory ..
cdk-infra		cdk-infra
img		img
src		src
README.md		README.md

README.md

MSK Serverless to S3 (Python Table API)

This blueprint deploys a KDA app that reads from MSK Serverless using IAM auth and writes to S3 using the Python Table API:

Project details

Flink version: 1.15.2
Python version: 3.8

Key components used

New (in Flink 1.13) KafkaSource connector (FlinkKafkaSource is slated to be deprecated).
FileSink (StreamingFileSink is slated to be deprecated).

High-level deployment steps

Package and deploy Python to S3
Deploy associated infra (MSK and KDA) using CDK script
- If using existing resources, you can simply update app properties in KDA.
Perform data generation

Prerequisites

Maven
AWS SDK v2
AWS CDK v2 - for deploying associated infra (MSK and KDA app)

Step-by-step deployment walkthrough

First, let's set up some environment variables to make the deployment easier. Replace these values with your own S3 bucket, app name, etc.

export AWS_PROFILE=<<profile-name>>
export APP_NAME=<<name-of-your-app>>
export S3_BUCKET=<<your-s3-bucket-name>>
export S3_FILE_KEY=<<your-jar-name-on-s3>>

Package Python app folder into zip package

Assuming you want to name your zip file kdaApp.zip, run:

zip kdaApp.zip * lib/*

For more information on packaging PyFlink applications, please follow the instructions detailed here. And for additional information on developing PyFlink applications locally and then deploying to Kinesis Data Analytics, please see Getting Started with PyFlink.

Copy zip package to S3 so it can be referenced in CDK deployment

aws s3 cp target/<<your app zip>> ${S3_BUCKET}/{S3_FILE_KEY}

Follow instructions in the cdk-infra folder to deploy the infrastructure associated with this app - such as MSK Serverless and the Kinesis Data Analytics application.
Follow instructions in orders-datagen to create topic and generate data into MSK.
Start your Kinesis Data Analytics application from the AWS console.
Do a Flink query or S3 Select Query against S3 to view data written to S3.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

msk-serverless-to-s3-tableapi-python

msk-serverless-to-s3-tableapi-python

README.md

MSK Serverless to S3 (Python Table API)

Project details

Key components used

High-level deployment steps

Prerequisites

Step-by-step deployment walkthrough

Files

msk-serverless-to-s3-tableapi-python

Directory actions

More options

Directory actions

More options

Latest commit

History

msk-serverless-to-s3-tableapi-python

Folders and files

parent directory

README.md

MSK Serverless to S3 (Python Table API)

Project details

Key components used

High-level deployment steps

Prerequisites

Step-by-step deployment walkthrough