Releases: aliyun/aliyun-odps-java-sdk

v0.50.6-public

27 Nov 10:07
Update version to 0.50.6-public.

v0.51.0-public.rc1

22 Nov 08:07
Pre-release

Changelog

[0.51.0-public.rc1] - 2024-11-22

Features and Changes

  • Column: ColumnBuilder adds a new withGenerateExpression method for constructing auto-partition columns.
  • TableSchema
    • Added generatePartitionSpec method, used to generate partition information from Record
    • The setPartitionColumns method now accepts List<Column> instead of ArrayList<Column>
  • TableCreator
    • Added support for GenerateExpression and introduced the method autoPartitionBy, which allows for the creation of AutoPartition tables.
    • Added support for ClusterInfo, enabling the creation of Hash/Range Cluster tables.
    • Added the option to specify TableFormat, allowing for the creation of tables in APPEND, TRANSACTION, DELTA, EXTERNAL, and VIEW formats.
    • Introduced the selectStatement parameter for create table as and create view as scenarios.
    • Added the getSql method to obtain the SQL statement for table creation.
    • Now quotes all Comment parameters to support those that contain special characters.
    • Integrated DataHub-related table creation parameters (hubLifecycle, shardNum) into DataHubInfo.
    • Renamed the withJars method to withResources to indicate it can use resources other than JAR files.
    • Renamed the withBucketNum method to withDeltaTableBucketNum to indicate this method is for Delta Tables only.
    • Modified the logic of withHints, withAlias, withTblProperties, and withSerdeProperties methods, now overwriting previous values instead of merging.
    • Removed the createExternal method; you can now use the create method instead.
  • Table
    • Introduced the getSchemaVersion method, which returns the current schema version of the table. The version number is updated each time a Schema Evolution occurs; at present the value is only used when creating a StreamTunnel.
    • Added setLifeCycle, changeOwner, changeComment, touch, changeClusterInfo, rename, addColumns, dropColumns methods to support modification of table structure.
  • StreamTunnel: Modified the initialization logic. If allowSchemaMismatch is set to false, initialization automatically retries until it uses the latest version of the table structure (timeout: 5 minutes).
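
The change from merging to overwriting in withHints, withAlias, withTblProperties, and withSerdeProperties can affect code that called these methods more than once. The following is a minimal sketch of the two behaviors, not the SDK's implementation: HintsBuilderSketch and its methods are hypothetical stand-ins for TableCreator, and merged is one way a caller might combine maps before a single call.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a builder such as TableCreator; not an SDK class. */
final class HintsBuilderSketch {
    private Map<String, String> hints = new HashMap<>();

    /** Overwrite semantics: each call replaces whatever was set before. */
    HintsBuilderSketch withHints(Map<String, String> newHints) {
        this.hints = new HashMap<>(newHints);
        return this;
    }

    /** Returns a copy of the hints currently held by the builder. */
    Map<String, String> hints() {
        return new HashMap<>(hints);
    }

    /** Caller-side merge for code that relied on the old merging behavior. */
    static Map<String, String> merged(Map<String, String> first, Map<String, String> second) {
        Map<String, String> all = new HashMap<>(first);
        all.putAll(second); // on duplicate keys, the second map wins
        return all;
    }
}
```

Under these assumptions, calling withHints twice keeps only the second map, while withHints(merged(a, b)) keeps the entries of both.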

Fixes

  • GenerationExpression: Fixed an issue where reloading a table threw an exception if TruncTime was written in uppercase when the table was created.
  • TypeInfoParser: Can now correctly parse a Struct TypeInfo whose field names are quoted with backticks.

v0.51.0-public.rc0

18 Nov 12:18
Pre-release

Changelog

[0.51.0-public.rc0] - 2024-11-18

Features

  • GenerateExpression: Added support for generate expressions on partition columns, along with the first such expression, TruncTime. For usage, please refer to the Example.
  • UpsertStream supports writing values with primary keys of type TIMESTAMP_NTZ
  • Table added new methods for querying CDC-related data: getCdcSize(), getCdcRecordNum(), getCdcLatestVersion(), getCdcLatestTimestamp()
  • SQLExecutor: MCQA 2.0 jobs now support retrieving InstanceProgress information.

Changes

  • Quote: Names in Struct-type TypeInfo, and in other methods that assemble SQL, are now quoted with backticks.
  • AutoCloseable: To remind users to close resources properly, close() methods were added to the following resource classes:
    • UpsertStream in the odps-sdk-core package,
    • LocalOutputStreamSet, ReduceDriver.ReduceContextImpl, MapDriver.DirectMapContextImpl, LocalRecordWriter in the odps-sdk-impl package
    • VectorizedOutputer, VectorizedExtractor, RecordWriter, RecordReader, Outputer, Extractor in the odps-sdk-udf package
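
A class that exposes close() through java.lang.AutoCloseable can be used in a try-with-resources statement, which closes it even when the body throws. The sketch below shows the pattern with a hypothetical DemoWriter; it assumes, without confirmation from the changelog, that the listed SDK classes implement AutoCloseable (or Closeable) rather than merely declaring a close() method.

```java
/** Hypothetical resource standing in for classes such as UpsertStream; not an SDK type. */
final class DemoWriter implements AutoCloseable {
    private boolean closed = false;
    private int written = 0;

    void write(String row) {
        if (closed) {
            throw new IllegalStateException("writer is closed");
        }
        written++;
    }

    int written() {
        return written;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class TryWithResourcesSketch {
    /** Writes all rows, then returns the (now closed) writer so callers can inspect it. */
    static DemoWriter writeAll(String... rows) {
        DemoWriter writer = new DemoWriter();
        try (DemoWriter w = writer) {
            for (String row : rows) {
                w.write(row);
            }
        } // close() runs here, whether or not write() threw
        return writer;
    }
}
```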

v0.50.5-public

13 Nov 08:48

Changelog

[0.50.5-public] - 2024-11-13

Features

  • TableAPI added retry logic for errors in network requests that can be safely retried, improving the stability of the interface. A new configuration option, retryWaitTimeInSeconds, has been added to RestOptions to specify the retry wait time.
  • SQLTask added an overload of the run method that supports passing in the mcqaConnHeader parameter for submitting MCQA 2.0 jobs.
  • SQLExecutor now supports specifying the odps.task.wlm.quota hint to set the interactive quota when submitting MCQA 2.0 jobs.
  • RestClient introduced a new retryWaitTime parameter along with corresponding getter and setter methods to configure the retry wait time for network requests.
  • Configuration added a new socketRetryTimes parameter with corresponding getter and setter methods to configure the retry wait time for Tunnel network requests. If not set, it will use the configuration in RestClient; otherwise, this configuration will be used.
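
The retry options above combine a bounded number of attempts with a wait between them. As a rough illustration of that general pattern only (not the SDK's retry code), the sketch below retries an operation when it fails with an UncheckedIOException and sleeps a fixed time between attempts. RetrySketch, callWithRetry, and the choice of UncheckedIOException as the "safely retryable" error are all assumptions made for this example.

```java
import java.io.UncheckedIOException;
import java.util.function.Supplier;

final class RetrySketch {
    /**
     * Calls op up to maxAttempts times. Only UncheckedIOException is treated
     * as retryable here; any other exception propagates immediately.
     */
    static <T> T callWithRetry(Supplier<T> op, int maxAttempts, long waitMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        UncheckedIOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (UncheckedIOException e) {
                last = e; // remember the failure and try again if attempts remain
            }
            if (attempt < maxAttempts) {
                sleep(waitMillis);
            }
        }
        throw last; // every attempt failed
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }
}
```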

Changes

  • Instances: Removed the overload get(String projectName, String id, String quotaName, String regionId), which was added in 0.50.2-public to retrieve MCQA 2.0 instances. The get method no longer needs to know whether a job is an MCQA 2.0 job, so use get(String projectName, String id) to retrieve any instance.

Fixes

  • Table.read fixed an issue where the configured network-related parameters (such as timeout and retry logic) did not take effect correctly during data preview.
  • Streams fixed an issue where specifying the version in the create method would cause an error. A default value of 1 has also been added for version, indicating the initial version of the table.

v0.50.4-public

29 Oct 09:34

Changelog

[0.50.4-public] - 2024-10-29

Features

  • PartitionSpec Added a new constructor (String, boolean) that uses a boolean parameter to specify whether to trim partition values. This caters to scenarios (such as using char type as a partition field) where users may not want to trim partition values.
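
To see why trimming matters, consider a partition value that legitimately carries trailing spaces. The sketch below is a simplified stand-in, not the SDK's PartitionSpec parser: it handles only comma-separated key=value pairs without quoting, and PartitionSpecSketch.parse is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PartitionSpecSketch {
    /**
     * Parses "k1=v1,k2=v2" into an ordered map. Keys are always trimmed;
     * values are trimmed only when trim is true.
     */
    static Map<String, String> parse(String spec, boolean trim) {
        Map<String, String> kv = new LinkedHashMap<>();
        for (String part : spec.split(",")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Invalid partition spec: " + spec);
            }
            String key = part.substring(0, eq).trim();
            String value = part.substring(eq + 1);
            kv.put(key, trim ? value.trim() : value);
        }
        return kv;
    }
}
```

With this sketch, parse("pt=ab  ,ds=1", true) maps pt to "ab", while parse("pt=ab  ,ds=1", false) keeps the two trailing spaces.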

Changes

  • Instance The OdpsException thrown when calling the stop method will no longer be wrapped a second time.

Fixes

  • SQLExecutor
    • Fixed an issue in MCQA 1.0 mode where the user-specified fallbackPolicy.isFallback4AttachError did not take effect correctly.
    • Fixed an issue in MCQA 2.0 mode where the cancel method threw an exception when the job failed.
    • Fixed an issue in MCQA 2.0 mode where using instanceTunnel to fetch results resulted in an error when the isSelect check was incorrect.
  • Table: Fixed an issue where getPartitionSpecs trimmed partition values, which made it impossible to retrieve existing partitions.

v0.50.3-public

23 Oct 09:38

Changelog

[0.50.3-public] - 2024-10-23

Features

  • SQLExecutor: In MCQA 1.0 mode, custom fallback policies can now be added through the new class FallbackPolicy.UserDefinedFallbackPolicy.

v0.50.2-public

23 Oct 03:38

Changelog

[0.50.2-public] - 2024-10-23

Features

  • SQLExecutor Enhanced MCQA 2.0 functionality:
    • isActive will return false, indicating that there are no active Sessions in MCQA 2.0 mode.
    • Added a cancel method to terminate ongoing jobs.
    • getExecutionLog now returns a deep copy of the current log and clears the current log,
      preventing duplicates.
    • New quota method in SQLExecutorBuilder allows reusing already loaded Quota, reducing
      load times.
    • New regionId method in SQLExecutorBuilder allows specifying the region where the quota is
      located.
  • Quotas Added getWlmQuota method with regionId parameter to fetch quota for a specified regionId.
  • Quota Introduced setMcqaConnHeader method to allow users to override quota using a custom
    McqaConnHeader, supporting MCQA 2.0.
  • Instances Added get method applicable for MCQA 2.0 jobs, requiring additional parameters for QuotaName
    and RegionId.
  • Instance Further adapted for MCQA 2.0 jobs.
  • TableSchema basicallyEquals method will no longer strictly check for identical Class types.
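
The getExecutionLog change means each call hands back only the lines produced since the previous call. A minimal sketch of that "copy, then clear" behavior follows; ExecutionLogSketch, append, and drain are hypothetical names, and the real method may differ in types and thread-safety.

```java
import java.util.ArrayList;
import java.util.List;

final class ExecutionLogSketch {
    private final List<String> log = new ArrayList<>();

    /** Buffers one log line. */
    synchronized void append(String line) {
        log.add(line);
    }

    /** Returns a copy of the buffered lines, then clears the buffer. */
    synchronized List<String> drain() {
        List<String> copy = new ArrayList<>(log);
        log.clear();
        return copy;
    }
}
```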

Optimization

  • SQLExecutor: The run method now deep-copies its hints, leaving the user-provided Map unmodified and
    supporting immutable types (e.g., ImmutableMap).
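
Copying the hints before modifying them is what makes immutable maps safe to pass in. The sketch below illustrates the idea under stated assumptions: HintsCopySketch, effectiveHints, and the key sketch.internal.hint are made up for this example and do not reflect what the SDK actually adds to the hints.

```java
import java.util.HashMap;
import java.util.Map;

final class HintsCopySketch {
    /** Returns the effective hints without touching the caller's map. */
    static Map<String, String> effectiveHints(Map<String, String> userHints) {
        Map<String, String> copy = new HashMap<>();
        if (userHints != null) {
            copy.putAll(userHints); // String keys and values are immutable, so this copy is sufficient
        }
        copy.put("sketch.internal.hint", "true"); // hypothetical entry added by the executor
        return copy;
    }
}
```

Writing directly into an unmodifiable map would throw UnsupportedOperationException; writing into the copy does not.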

Fixes

  • Stream Fixed potential SQL syntax errors in the create method.

v0.50.1-public

11 Oct 02:24

Changelog

[0.50.1-public] - 2024-10-11

Fixes

  • TableAPI Fixed an issue where ArrayRecord could not correctly invoke toString when using SplitRecordReaderImpl to retrieve results.
  • TableAPI Fixed an issue where a get operation would throw an array index out of bounds exception when the number of Records corresponding to a Split is 0 while using SplitRecordReaderImpl to retrieve results.
  • TableAPI Fixed an issue with composite predicates CompositePredicate that could lead to an additional operator being added when encountering an empty predicate.
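
The CompositePredicate fix concerns a common pitfall when joining sub-predicates: an empty entry can leave a dangling operator in the output. The sketch below shows one way to avoid that by filtering empty entries before joining. It works on plain strings and is not the SDK's CompositePredicate; PredicateJoinSketch.join is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;

final class PredicateJoinSketch {
    /** Joins the non-empty predicates with the operator, skipping null or blank entries. */
    static String join(String operator, List<String> predicates) {
        List<String> nonEmpty = new ArrayList<>();
        for (String p : predicates) {
            if (p != null && !p.trim().isEmpty()) {
                nonEmpty.add(p.trim());
            }
        }
        return String.join(" " + operator + " ", nonEmpty);
    }
}
```

For the inputs "a > 1", "", and "b = 2" with the operator "and", this returns "a > 1 and b = 2" rather than a string with a doubled operator.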

v0.48.9-public

11 Oct 02:28

Changelog

[0.48.9-public] - 2024-10-11

Fixes

  • TableAPI Fixed an issue where ArrayRecord could not correctly invoke toString when using SplitRecordReaderImpl to retrieve results.
  • TableAPI Fixed an issue where a get operation would throw an array index out of bounds exception when the number of Records corresponding to a Split is 0 while using SplitRecordReaderImpl to retrieve results.
  • TableAPI Fixed an issue with composite predicates CompositePredicate that could lead to an additional operator being added when encountering an empty predicate.

v0.50.0-public

09 Oct 07:36

[0.50.0-public] - 2024-10-09

Features

  • Added SchemaMismatchException: This exception will be thrown when using StreamUploadSession if the Record structure uploaded by the user does not match the table structure. This exception will additionally carry the latest schema version to assist users in rebuilding the Session and performing retry operations.
  • Added allowSchemaMismatch method in StreamUploadSession.Builder: This method specifies whether to tolerate mismatches between the user's uploaded Record structure and the table structure without throwing an exception. The default value is true.
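
Because the exception carries the latest schema version, a caller can rebuild the session with that version and retry. The sketch below illustrates the control flow only. SchemaMismatchSketchException and uploadWithRebuild are hypothetical stand-ins, and the way the real SchemaMismatchException exposes the version, or how a StreamUploadSession is rebuilt with it, is not shown in the changelog and is assumed here.

```java
import java.util.function.Function;

/** Hypothetical stand-in for the SDK exception; carries the latest schema version. */
final class SchemaMismatchSketchException extends RuntimeException {
    final String latestSchemaVersion;

    SchemaMismatchSketchException(String latestSchemaVersion) {
        super("record schema does not match table schema; latest version is " + latestSchemaVersion);
        this.latestSchemaVersion = latestSchemaVersion;
    }
}

final class SchemaRetrySketch {
    /**
     * Runs upload with startVersion. On a mismatch, retries with the version
     * reported by the exception, rebuilding at most maxRebuilds times.
     */
    static <T> T uploadWithRebuild(Function<String, T> upload, String startVersion, int maxRebuilds) {
        String version = startVersion;
        for (int rebuilds = 0; ; rebuilds++) {
            try {
                return upload.apply(version); // "upload" stands for: build session for version, then write
            } catch (SchemaMismatchSketchException e) {
                if (rebuilds >= maxRebuilds) {
                    throw e; // give up after the allowed number of rebuilds
                }
                version = e.latestSchemaVersion;
            }
        }
    }
}
```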

Fixes

  • Fixed an issue where specifying tunnelEndpoint in Odps was ineffective when using StreamUploadSession.
  • Fixed a potential NPE issue in TunnelRetryHandler.
