[FLINK-36769] support fury serializer for pyflink #25672

kaori-seasons · 2024-11-21T13:46:34Z

What is the purpose of the change

Hi, community. In the batch verification scenario for our algorithm data, we use PyFlink and see low transmission efficiency caused by the poor performance of pickle protocol 4 encoding. After some research, we decided to adopt Apache Fury, a serialization framework based on pickle protocol 5 encoding. Fury's Python implementation defines the transmission buffer size in the protocol, which improves performance when transferring large data.
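For readers unfamiliar with the difference between the two pickle protocols mentioned above, here is a minimal sketch using only Python's standard `pickle` module. It does not use Fury's API and is not the code in this PR. It only illustrates, under the assumption of a hypothetical 1 MB payload, how protocol 5 can hand large buffers to the transport out-of-band instead of copying them into the pickle stream as protocol 4 does.

```python
import pickle

# Hypothetical large record; the size is an assumption chosen for illustration.
payload = bytearray(b"x" * 1_000_000)

# Protocol 4: the payload bytes are copied into the pickle stream itself.
in_band = pickle.dumps(payload, protocol=4)

# Protocol 5: wrapping the data in PickleBuffer and passing a buffer_callback
# keeps the payload out of the stream; the callback collects the buffers so
# the transport can send them separately without an extra copy.
buffers = []
out_of_band = pickle.dumps(
    pickle.PickleBuffer(payload),
    protocol=5,
    buffer_callback=buffers.append,
)

# The receiver supplies the same buffers, in order, when unpickling.
restored = pickle.loads(out_of_band, buffers=buffers)

print(len(in_band), len(out_of_band), len(buffers))
```

With out-of-band buffers the pickle stream carries only a few bytes of metadata, so the cost of serializing a large record is dominated by how the buffers themselves are transmitted.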

Related discussion with Fury community members can be found here

Brief change log

(for example:)

The TaskInfo is stored in the blob store on job creation time as a persistent artifact
Deployments RPC transmits only the blob storage reference
TaskManagers retrieve the TaskInfo from the blob cache

Verifying this change

Please make sure both new and modified tests in this PR follow the conventions for tests defined in our code quality guide.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

Added integration tests for end-to-end deployment with large payloads (100MB)
Extended integration test for recovery after master (JobManager) failure
Added test that validates that TaskInfo is transferred only once across recoveries
Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): (yes / no)
The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
The serializers: (yes / no / don't know)
The runtime per-record code paths (performance sensitive): (yes / no / don't know)
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
The S3 file system connector: (yes / no / don't know)

Documentation

Does this pull request introduce a new feature? (yes / no)
If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

flinkbot · 2024-11-21T14:08:35Z

CI report:

3159c22 Azure: FAILURE

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot run azure re-run the last Azure build

kaori-seasons · 2024-11-25T03:56:23Z

@flinkbot run azure

support fury serializer for pyflink

c8a1d53

kaori-seasons changed the title ~~[ISSUE#36769] support fury serializer for pyflink~~ [FLINK-36769] support fury serializer for pyflink Nov 21, 2024

kaori-seasons marked this pull request as draft November 21, 2024 13:48

flinkbot added the component=API/Python label Nov 21, 2024

solve signtrue incompatibility

eede8f9

kaori-seasons added 2 commits November 25, 2024 12:04

add java doc

d4697a1

checkstyle fix

3159c22
