kindbgen
[Optimization][CDCSOURCE] Optimized CDCSOURCE from Mysql to Doris, with support for light_schema_change
Feb 6, 2024
Description
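This optimizes schema evolution for whole-database synchronization from MySQL to Doris via CDCSOURCE. The main improvements are:
1. Optimize 'sink.connector' = 'datastream-doris-schema-evolution' and fix the error: org.dinky.data.exception.MetaDataException: Missing DataSource Type:【datastream-doris-schema-evolution】
2. Enable light_schema_change=true by default when creating Doris tables, so that table structures synchronized from whole MySQL, Oracle, SQLServer, and PostgreSQL databases support schema change in Doris.
3. Optimize Doris schema change: synchronize column names, column types, default values, and comments, and support multi-column modification and column renaming. This must be enabled with the use-new-schema-change option.
4. Fix DorisWriter failing to write to Doris when the labelPrefix set by Dinky is too long.
5. Add custom Debezium converters for MySQL, Oracle, SQLServer, and PostgreSQL to fix datetime values being off by 8 hours after whole-database synchronization to Doris, and support datetime formats with millisecond precision.
The optimized FlinkSQL is as follows: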
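To make the light_schema_change and schema change items more concrete, here is a minimal sketch of the kind of Doris DDL involved. The table `test.demo_orders`, its columns, the `UNIQUE KEY` model, and `replication_num` are hypothetical and chosen only for illustration; the statements actually generated by Dinky may differ, and `DATETIME(3)` assumes a Doris version that supports datetime precision.

```sql
-- Hypothetical Doris table, roughly what an auto-created table could look like.
-- The point of interest is the light_schema_change property.
CREATE TABLE IF NOT EXISTS test.demo_orders (
    id BIGINT NOT NULL COMMENT "primary key",
    amount DECIMAL(10, 2) NULL COMMENT "order amount",
    created_at DATETIME(3) NULL COMMENT "creation time, millisecond precision"
)
UNIQUE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES (
    "replication_num" = "1",
    "light_schema_change" = "true"
);

-- Schema changes of the kind a source-side change would need to be mapped to
-- (illustrative only).
ALTER TABLE test.demo_orders ADD COLUMN note VARCHAR(255) DEFAULT "" COMMENT "free-form note";
ALTER TABLE test.demo_orders RENAME COLUMN note remark;
```

With light_schema_change enabled, Doris can apply column additions and renames like these as metadata changes, which is what makes it practical to apply them while a synchronization job is running.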
EXECUTE CDCSOURCE demo_doris_schema_evolution WITH (
'connector' = 'mysql-cdc',
'hostname' = '127.0.0.1',
'port' = '3306',
'username' = 'root',
'password' = '123456',
'checkpoint' = '10000',
'scan.startup.mode' = 'initial',
'parallelism' = '1',
'debezium.skipped.operations'='d',
'jdbc.properties.tinyInt1isBit'= 'false',
'source.server-time-zone' = 'Asia/Shanghai',
'source.schema.changes' = 'true',
'table-name' = 'test..*',
'sink.connector' = 'datastream-doris-schema-evolution',
'sink.url' = 'jdbc:mysql://127.0.0.1:9030',
'sink.fenodes' = '127.0.0.1:8030',
'sink.username' = 'root',
'sink.password' = '123456',
'sink.doris.batch.size' = '1000',
'sink.sink.max-retries' = '1',
'sink.sink.batch.interval' = '60000',
'sink.sink.db' = 'test',
'sink.table.identifier' = '#{schemaName}.#{tableName}',
'sink.auto.create' = 'true',
'sink.timezone' = 'Asia/Shanghai',
-- Sync column names, column types, default values, and comments; supports multi-column changes and column renames
'sink.sink.use-new-schema-change' = 'true',
-- Work around the Flink CDC time zone issue and datetime values that cannot be parsed
'debezium.converters' = 'datetime',
'debezium.datetime.type' = 'org.dinky.cdc.debezium.converter.MysqlDebeziumConverter',
'debezium.datetime.database.type' = 'mysql',
'debezium.datetime.format.date' = 'yyyy-MM-dd',
'debezium.datetime.format.time' = 'HH:mm:ss',
'debezium.datetime.format.datetime' = 'yyyy-MM-dd HH:mm:ss.SSS',
'debezium.datetime.format.timestamp' = 'yyyy-MM-dd HH:mm:ss.SSS',
'debezium.datetime.format.timestamp.zone' = 'Asia/Shanghai'
);
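One possible way to exercise the job above is to run a few statements on the source MySQL instance while it is running. This is only a sketch: the table `test.demo_orders` and its columns are hypothetical, picked so that the table matches the 'table-name' = 'test..*' pattern.

```sql
-- Run on the source MySQL instance (hypothetical table and data).
CREATE TABLE IF NOT EXISTS test.demo_orders (
    id BIGINT PRIMARY KEY,
    amount DECIMAL(10, 2),
    created_at DATETIME(3)
);

INSERT INTO test.demo_orders VALUES (1, 9.90, '2024-02-06 12:34:56.789');

-- Add a column with a default value and a comment, then rename it.
ALTER TABLE test.demo_orders
    ADD COLUMN note VARCHAR(255) DEFAULT '' COMMENT 'free-form note';
ALTER TABLE test.demo_orders
    CHANGE COLUMN note remark VARCHAR(255) DEFAULT '' COMMENT 'free-form note';

INSERT INTO test.demo_orders
    VALUES (2, 19.90, '2024-02-06 12:35:00.123', 'after schema change');
```

If the optimizations behave as described, the Doris table `test.demo_orders` should pick up the added and renamed column, and `created_at` should show the same wall-clock value as in MySQL, including milliseconds, rather than a value shifted by 8 hours. Checking with `DESC test.demo_orders;` and a plain `SELECT` on the Doris side is one way to confirm this.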