Skip to main content
Blog/Release Notes

Apache Doris 3.0.3 just released

Apache Doris

Dear community members, the Apache Doris 3.0.3 version was officially released on December 02, 2024, this version further enhances the performance and stability of the system.

Quick Download: https://doris.apache.org/download/

GitHub Release: https://github.com/apache/doris/releases

Behavioral Changes

  • Prohibited column updates on MOW tables with synchronous materialized views. #40190
  • Adjusted the default parameters of RoutineLoad to improve import efficiency. #42968
  • When StreamLoad fails, the return value of LoadedRows is adjusted to 0. #41946 #42291
  • Adjusted the default memory limit of Segment cache to 5%. #42308 #42436

New Features

  • Introduced the session variable enable_cooldown_replica_affinity to control the affinity of cold and hot tiered replicas. #42677

  • Added table$partition syntax for querying partition information of Hive tables. #40774

  • Supported creation of Hive tables in Text format. #41860 #42175

Asynchronous Materialized Views

  • Introduced new materialized view attribute use_for_rewrite. When use_for_rewrite is set to false, the materialized view does not participate in transparent rewriting. #40332

Query Optimizer

  • Supported correlated non-aggregate subqueries. #42236

Query Execution

  • Added functions ngram_search, normal_cdf, to_iso8601, from_iso8601_date, SESSION_USER(), last_query_id. #38226 #40695 #41075 #41600 #39575 #40739
  • The aes_encrypt and aes_decrypt functions support GCM mode. #40004
  • Profile outputs the changed session variable values. #41016 #41318

Semi-structured Data Management

  • Added array functions array_match_all and array_match_any. #40605 #43514
  • The array function array_agg supports nesting ARRAY/MAP/STRUCT within ARRAY. #42009
  • Added approximate aggregate statistical functions approx_top_k and approx_top_sum. #44082

Improvements

Storage

  • Supported bitmap_empty as the default value. #40364
  • Introduced the session variable insert_timeout to control the timeout of DELETE statements. #41063
  • Improved some error message prompts. #41048 #39631
  • Improved the priority scheduling of replica repair. #41076
  • Enhanced the robustness of timezone handling when creating tables. #41926 #42389
  • Checked the validity of partition expressions when creating tables. #40158
  • Supported Unicode-encoded column names in DELETE operations. #39381

Compute-Storage Decoupled

  • Supported ARM architecture deployment in storage and compute separation mode. #42467 #43377
  • Optimized the eviction strategy and lock competition of file cache, improving hit rate and high concurrency point query performance. #42451 #43201 #41818 #43401
  • S3 storage vault supported use_path_style, solving the problem of using custom domain names for object storage. #43060 #43343 #43330
  • Optimized storage and compute separation configuration and deployment, preventing misoperations in different modes. #43381 #43522 #43434 #40764 #43891
  • Optimized observability and provided an interface for deleting specified segment file cache. #38489 #42896 #41037 #43412
  • Optimized Meta-service operation and maintenance interface: RPC rate limiting and tablet metadata correction. #42413 #43884 #41782 #43460

Lakehouse

  • Paimon Catalog supported Alibaba Cloud DLF and OSS-HDFS storage. #41247 #42585

  • Supported reading of Hive tables in OpenCSV format. #42257 #42942

  • Optimized the performance of accessing the information_schema.columns table in External Catalog. #41659 #41962

  • Used the new Max Compute open storage API to access Max Compute data sources. #41614

  • Optimized the scheduling policy of the JNI part of Paimon tables, making scan tasks more balanced. #43310

  • Optimized the read performance of small ORC files. #42004 #43467

  • Supported reading of parquet files in brotli compressed format. #42177

  • Added file_cache_statistics table under the information_schema library to view metadata cache statistics. #42160

Query Optimizer

Query Execution

  • Optimized the memory usage of the sort operator. #39306
  • Optimized the performance of computations on ARM. #38888 #38759
  • Optimized the computational performance of a series of functions. #40366 #40821 #40670 #41206 #40162
  • Used SSE instructions to optimize the performance of the match_ipv6_subnet function. #38755
  • Supported automatic creation of new partitions during insert overwrite. #38628 #42645
  • Added the status of each PipelineTask in Profile. #42981
  • IP type supported runtime filter. #39985

Semi-structured Data Management

  • Output the real SQL of prepared statements in audit logs. #43321
  • The filebeat doris output plugin supports fault tolerance and progress reporting. #36355
  • Optimized the performance of inverted index queries. #41547 #41585 #41567 #41577 #42060 #42372
  • The array function array overlaps supports acceleration using inverted indexes. #41571
  • The IP function is_ip_address_in_range supports acceleration using inverted indexes. #41571
  • Optimized the CAST performance of the VARIANT data type. #41775 #42438 #43320
  • Optimized the CPU resource consumption of the Variant data type. #42856 #43062 #43634
  • Optimized the metadata and execution memory resource consumption of the Variant data type. #42448 #43326 #41482 #43093 #43567 #43620

Permissions

  • Added a new configuration item ldap_group_filter in LDAP for custom group filtering. #43292

Other

  • Supported displaying connection count information by user in FE monitoring items. #39200

Bug Fixes

Storage

  • Fixed the issue with using IPv6 hostnames. #40074
  • Fixed the inaccurate display of broker/s3 load progress. #43535
  • Fixed the issue where queries might hang from FE. #41303 #42382
  • Fixed the issue of duplicate auto-increment IDs under exceptional circumstances. #43774 #43983
  • Fixed occasional NPE issues with groupcommit. #43635
  • Fixed the inaccurate calculation of auto bucket. #41675 #41835
  • Fixed the issue where FE might not correctly plan multi-table flows after restart. #41677 #42290

Compute-Storage Decoupled

  • Fixed the issue that MOW primary key tables with large delete bitmaps might cause coredump. #43088 #43457 #43479 #43407 #43297 #43613 #43615 #43854 #43968 #44074 #41793 #42142
  • Fixed the issue that segment files, when being a multiple of 5MB, would fail to upload objects. #43254
  • Fixed the issue that the default retry policy of aws sdk did not take effect. #43575 #43648
  • Fixed the issue that altering storage vault could continue execution even when the wrong type was specified. #43489 #43352 #43495
  • Fixed the issue that tablet_id might be 0 during the delayed commit process of large transactions. #42043 #42905
  • Fixed the issue that constant folding RCP and FE forwarding SQL might not be executed in the expected computation group. #43110 #41819 #41846
  • Fixed the issue that meta-service did not strictly check instance_id upon receiving RPC. #43253 #43832
  • Fixed the issue that FE follower information_schema version did not update in time. #43496
  • Fixed the issue of atomicity in file cache rename and inaccurate metrics. #42869 #43504 #43220

Lakehouse

  • Prohibited implicit conversion predicates from being pushed down to JDBC data sources to avoid inconsistent query results. #42102
  • Fixed some read issues with high-version Hive transactional tables. #42226
  • Fixed the issue that the Export command might cause deadlocks. #43083 #43402
  • Fixed the issue of being unable to query Hive views created by Spark. #43552
  • Fixed the issue that Hive partition paths containing special characters led to incorrect partition pruning. #42906
  • Fixed the issue that Iceberg Catalog could not use AWS Glue. #41084

Asynchronous Materialized Views

  • Fixed the issue that asynchronous materialized views might not refresh after the base table is rebuilt. #41762

Query Optimizer

  • Fixed the issue that partition pruning results might be incorrect when using multi-column range partitioning. #43332
  • Fixed the issue of incorrect calculation results in some limit offset scenarios. #42576

Query Execution

  • Fixed the issue that hash join with array types larger than 4G could cause BE Core. #43861
  • Fixed the issue that is null predicate operations might yield incorrect results in some scenarios. #43619
  • Fixed the issue that bitmap types might produce incorrect output results in hash join. #43718
  • Fixed some issues where function results were calculated incorrectly. #40710 #39358 #40929 #40869 #40285 #39891 #40530 #41948 #43588
  • Fixed some issues with JSON type parsing. #39937
  • Fixed issues with varchar and char types in runtime filter operations. #43758 #43919
  • Fixed some issues with the use of decimal256 in scalar and aggregate functions. #42136 #42356
  • Fixed the issue that arrow flight reported Reach limit of connections errors upon connection. #39127
  • Fixed the issue of incorrect memory usage statistics for BE in k8s environments. #41123

Semi-structured Data Management

  • Adjusted the default values of segment_cache_fd_percentage and inverted_index_fd_number_limit_percent. #42224
  • logstash now supports group_commit. #40450
  • Fixed the issue of coredump when building index. #43246 #43298
  • Fixed issues with variant index. #43375 #43773
  • Fixed potential fd and memory leaks under abnormal compaction circumstances. #42374
  • Inverted index match null now correctly returns null instead of false. #41786
  • Fixed the issue of coredump when ngram bloomfilter index bf_size is set to 65536. #43645
  • Fixed the issue of potential coredump during complex data type JOINs. #40398
  • Fixed the issue of coredump with TVF JSON data. #43187
  • Fixed the precision issue of bloom filter calculations for dates and times. #43612
  • Fixed the issue of coredump with IPv6 type storage. #43251
  • Fixed the issue of coredump when using VARIANT type with light_schema_change disabled. #40908
  • Improved cache performance for high-concurrency point queries. #44077
  • Fixed the issue that bloom filter indexes were not synchronized when columns were deleted. #43378
  • Fixed instability issues with es catalog under special circumstances such as mixed array and scalar data. #40314 #40385 #43399 #40614
  • Fixed coredump issues caused by abnormal regular pattern matching. #43394

Permissions

Other