@cambyzju cambyzju commented Dec 24, 2025

What problem does this PR solve?

Usage Example:

> select * from test_pk@ro where user_id=9820660;
+---------+----------+
| user_id | behavior |
+---------+----------+
| 9820660 | hello    |
+---------+----------+
  • When a Paimon primary key table has overlapping key ranges, the native reader cannot be used and the split falls back to the JNI reader (note paimonNativeReadSplits=0/1 and type=JNI below):
> explain verbose select * from test_pk where user_id=9820660;
+------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                                                        |
+------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                                                        |
|   OUTPUT EXPRS:                                                                                                        |
|     user_id[#0]                                                                                                        |
|     behavior[#1]                                                                                                       |
|   PARTITION: UNPARTITIONED                                                                                             |
|                                                                                                                        |
|   HAS_COLO_PLAN_NODE: false                                                                                            |
|                                                                                                                        |
|   VRESULT SINK                                                                                                         |
|      MYSQL_PROTOCAL                                                                                                    |
|                                                                                                                        |
|   1:VEXCHANGE                                                                                                          |
|      offset: 0                                                                                                         |
|      distribute expr lists:                                                                                            |
|      tuple ids: 0                                                                                                      |
|                                                                                                                        |
| PLAN FRAGMENT 1                                                                                                        |
|                                                                                                                        |
|   PARTITION: RANDOM                                                                                                    |
|                                                                                                                        |
|   HAS_COLO_PLAN_NODE: false                                                                                            |
|                                                                                                                        |
|   STREAM DATA SINK                                                                                                     |
|     EXCHANGE ID: 01                                                                                                    |
|     UNPARTITIONED                                                                                                      |
|                                                                                                                        |
|   0:VPAIMON_SCAN_NODE(79)                                                                                              |
|      table: paimon_on_s3.db_paimon.test_pk                                                                             |
|      predicates: (user_id[#0] = 9820660)                                                                               |
|      inputSplitNum=1, totalFileSize=0, scanRanges=1                                                                    |
|      partition=1/0                                                                                                     |
|      backends:                                                                                                         |
|        1754986988417                                                                                                   |
|          /data-0891318b-4e3b-42b5-b85c-3bc8e9b8cc45-0.parquet start: 0 length: 0                                       |
|      cardinality=1001, numNodes=1                                                                                      |
|      pushdown agg=NONE                                                                                                 |
|      paimonNativeReadSplits=0/1                                                                                        |
|      predicatesFromPaimon:                                                                                             |
|           Equal(user_id, 9820660)                                                                                      |
|      PaimonTable: test_pk                                                                                              |
|      PaimonSplitStats:                                                                                                 |
|        SplitStat [type=JNI, rowCount=1001, mergedRowCount=NONE, rawFileConvertable=false, hasDeletionVector=false]     |
|      tuple ids: 0                                                                                                      |
|                                                                                                                        |
| Tuples:                                                                                                                |
| TupleDescriptor{id=0, tbl=test_pk}                                                                                     |
|   SlotDescriptor{id=0, col=user_id, colUniqueId=0, type=bigint, nullable=true, isAutoIncrement=false, subColPath=null} |
|   SlotDescriptor{id=1, col=behavior, colUniqueId=1, type=text, nullable=true, isAutoIncrement=false, subColPath=null}  |
|                                                                                                                        |
|                                                                                                                        |
|                                                                                                                        |
|                                                                                                                        |
| ========== STATISTICS ==========                                                                                       |
| planed with unknown column statistics                                                                                  |
+------------------------------------------------------------------------------------------------------------------------+
  • If we read the read-optimized (ro) table instead, the native reader can be used, at the cost of some data latency (note paimonNativeReadSplits=1/1 and type=NATIVE below):
> explain verbose select * from test_pk@ro where user_id=9820660;
+--------------------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                                                                      |
+--------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                                                                      |
|   OUTPUT EXPRS:                                                                                                                      |
|     user_id[#0]                                                                                                                      |
|     behavior[#1]                                                                                                                     |
|   PARTITION: UNPARTITIONED                                                                                                           |
|                                                                                                                                      |
|   HAS_COLO_PLAN_NODE: false                                                                                                          |
|                                                                                                                                      |
|   VRESULT SINK                                                                                                                       |
|      MYSQL_PROTOCAL                                                                                                                  |
|                                                                                                                                      |
|   1:VEXCHANGE                                                                                                                        |
|      offset: 0                                                                                                                       |
|      distribute expr lists:                                                                                                          |
|      tuple ids: 0                                                                                                                    |
|                                                                                                                                      |
| PLAN FRAGMENT 1                                                                                                                      |
|                                                                                                                                      |
|   PARTITION: RANDOM                                                                                                                  |
|                                                                                                                                      |
|   HAS_COLO_PLAN_NODE: false                                                                                                          |
|                                                                                                                                      |
|   STREAM DATA SINK                                                                                                                   |
|     EXCHANGE ID: 01                                                                                                                  |
|     UNPARTITIONED                                                                                                                    |
|                                                                                                                                      |
|   0:VPAIMON_SCAN_NODE(79)                                                                                                            |
|      table: paimon_on_s3.db_paimon.test_pk                                                                                           |
|      predicates: (user_id[#0] = 9820660)                                                                                             |
|      inputSplitNum=1, totalFileSize=519490, scanRanges=1                                                                             |
|      partition=1/0                                                                                                                   |
|      backends:                                                                                                                       |
|        1754986988417                                                                                                                 |
|          s3://warehouse/wh/db_paimon.db/test_pk/bucket-0/data-1ae9ee1e-e6aa-4594-9506-75b6b6ad3820-0.parquet start: 0 length: 519490 |
|      cardinality=1001, numNodes=1                                                                                                    |
|      pushdown agg=NONE                                                                                                               |
|      paimonNativeReadSplits=1/1                                                                                                      |
|      predicatesFromPaimon:                                                                                                           |
|           Equal(user_id, 9820660)                                                                                                    |
|      PaimonTable: test_pk$ro                                                                                                         |
|      PaimonSplitStats:                                                                                                               |
|        SplitStat [type=NATIVE, rowCount=1000, mergedRowCount=NONE, rawFileConvertable=true, hasDeletionVector=false]                 |
|      tuple ids: 0                                                                                                                    |
|                                                                                                                                      |
| Tuples:                                                                                                                              |
| TupleDescriptor{id=0, tbl=test_pk}                                                                                                   |
|   SlotDescriptor{id=0, col=user_id, colUniqueId=0, type=bigint, nullable=true, isAutoIncrement=false, subColPath=null}               |
|   SlotDescriptor{id=1, col=behavior, colUniqueId=1, type=text, nullable=true, isAutoIncrement=false, subColPath=null}                |
|                                                                                                                                      |
|                                                                                                                                      |
|                                                                                                                                      |
|                                                                                                                                      |
| ========== STATISTICS ==========                                                                                                     |
| planed with unknown column statistics                                                                                                |
+--------------------------------------------------------------------------------------------------------------------------------------+
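For scripted checks, the difference between the two plans can be read off the `paimonNativeReadSplits=<native>/<total>` line. Below is a minimal sketch of such a check, assuming the line keeps the format shown in the plans above. `NativeSplitCheck` and `allSplitsNative` are hypothetical names used for illustration and are not part of Doris.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Doris): inspects EXPLAIN text for native Paimon reads. */
final class NativeSplitCheck {
    // Matches the "paimonNativeReadSplits=<native>/<total>" line from the plans above.
    private static final Pattern SPLITS =
            Pattern.compile("paimonNativeReadSplits=(\\d+)/(\\d+)");

    /** Returns true only if the plan reports at least one split and all splits are native. */
    static boolean allSplitsNative(String explainText) {
        Matcher m = SPLITS.matcher(explainText);
        if (!m.find()) {
            throw new IllegalArgumentException("no paimonNativeReadSplits line found");
        }
        long nativeSplits = Long.parseLong(m.group(1));
        long totalSplits = Long.parseLong(m.group(2));
        return totalSplits > 0 && nativeSplits == totalSplits;
    }

    public static void main(String[] args) {
        // Line taken from the plan for test_pk (JNI fallback).
        System.out.println(allSplitsNative("|      paimonNativeReadSplits=0/1   |"));
        // Line taken from the plan for test_pk@ro (native reader).
        System.out.println(allSplitsNative("|      paimonNativeReadSplits=1/1   |"));
    }
}
```

Fed the two plans above, the check would report `false` for `test_pk` and `true` for `test_pk@ro`.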

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@cambyzju cambyzju marked this pull request as draft December 24, 2025 12:00
@cambyzju cambyzju force-pushed the paimon-ro-dev-master branch from a1fdae4 to e3d1431 Compare December 25, 2025 13:53
@cambyzju cambyzju marked this pull request as ready for review December 25, 2025 13:54
@cambyzju

run buildall

Copilot AI left a comment

Pull request overview

This PR adds support for reading Paimon read-optimized (RO) system tables, which allows users to query fully compacted data without overlapping key ranges, enabling the use of native readers for better performance.

Key Changes:

  • Introduces @ro syntax for accessing Paimon read-optimized system tables
  • Adds validation to ensure @ro is used without parameters
  • Implements RO table handling in PaimonExternalTable by fetching the "ro" system table variant
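The explain output in the PR description shows the scanned table as `test_pk$ro`, so the `@ro` suffix appears to resolve to Paimon's `$ro` system table name. A minimal sketch of that name mapping follows, assuming an exact, case-sensitive match on `ro`. `RoTableName` and `resolve` are hypothetical names, and this is not the code in `PaimonExternalTable`.

```java
/** Hypothetical sketch (not the PR's code): derives the Paimon table name for "@ro". */
final class RoTableName {
    static final String RO = "ro";

    /**
     * Returns the name to look up in the Paimon catalog.
     * Other scan parameters are out of scope here and leave the name unchanged.
     */
    static String resolve(String baseTable, String scanParamName) {
        if (RO.equals(scanParamName)) {
            // Paimon system tables are addressed as "<table>$<systemTable>".
            return baseTable + "$" + RO;
        }
        return baseTable;
    }

    public static void main(String[] args) {
        System.out.println(resolve("test_pk", "ro"));
        System.out.println(resolve("test_pk", null));
    }
}
```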

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| DorisParser.g4 | Modified grammar to make parentheses optional for scan parameters, enabling @ro syntax without parentheses |
| TableScanParams.java | Added RO constant, validation logic, and isRo() method to support read-optimized table parameter |
| PaimonExternalTable.java | Implemented RO table snapshot retrieval logic following the same pattern as branch handling |
| PaimonScanNode.java | Added table name (including $ro suffix) to explain output for better query visibility |
| test_paimon_ro_read.groovy | Added comprehensive test verifying RO read behavior and error handling for invalid parameter usage |
| test_paimon_ro_read.out | Expected test output showing difference between normal and RO reads |
| run10.sql | Test data setup script creating a Paimon table with compacted and delta data |

// - One is in map form, where the data is stored in `mapParams`.
// such as: @func_name('param1'='value1', 'param2'='value2', 'param3'='value3')
// - The other is in list form, where the data is stored in `listParams`.
// - Another other one is in list form, where the data is stored in `listParams`.
Copilot AI Dec 25, 2025

The phrase "Another other one" is grammatically incorrect. It should be either "Another one" or "The other one" to maintain grammatical clarity.

Suggested change
- // - Another other one is in list form, where the data is stored in `listParams`.
+ // - Another one is in list form, where the data is stored in `listParams`.

throw new IllegalArgumentException("Invalid param type: " + paramType);
}
if (isRo() && (!mapParams.isEmpty() || !listParams.isEmpty())) {
throw new IllegalArgumentException("Please use '@ro' for Paimon read-optimized system table");
Copilot AI Dec 25, 2025

The error message says "Please use '@ro' for Paimon read-optimized system table", but this validation is triggered when RO is used WITH parameters (like @ro('param'='value')). The message could be clearer by specifying that RO should be used without parameters, for example: "The '@ro' parameter for Paimon read-optimized tables should be used without arguments".

Suggested change
- throw new IllegalArgumentException("Please use '@ro' for Paimon read-optimized system table");
+ throw new IllegalArgumentException("The '@ro' parameter for Paimon read-optimized tables must be used without arguments");
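To make the reviewed behavior concrete, here is a minimal, self-contained sketch of the check with the suggested message. `ScanParamsSketch` is a hypothetical stand-in for `TableScanParams`. Only the `isRo()`, `mapParams`, and `listParams` names and the condition come from the diff above; the constructor shape and the case-sensitive match on `ro` are assumptions.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the scan-parameter holder discussed in this thread. */
final class ScanParamsSketch {
    private final String paramType;
    private final Map<String, String> mapParams;
    private final List<String> listParams;

    ScanParamsSketch(String paramType, Map<String, String> mapParams, List<String> listParams) {
        this.paramType = paramType;
        this.mapParams = Map.copyOf(mapParams);
        this.listParams = List.copyOf(listParams);
        validate();
    }

    /** Assumption: "@ro" is recognized by an exact match on the parameter name. */
    boolean isRo() {
        return "ro".equals(paramType);
    }

    // Same condition as in the diff: "@ro" must not carry map or list arguments.
    private void validate() {
        if (isRo() && (!mapParams.isEmpty() || !listParams.isEmpty())) {
            throw new IllegalArgumentException(
                    "The '@ro' parameter for Paimon read-optimized tables must be used without arguments");
        }
    }

    public static void main(String[] args) {
        new ScanParamsSketch("ro", Map.of(), List.of()); // plain @ro is accepted
        try {
            new ScanParamsSketch("ro", Map.of("param", "value"), List.of()); // @ro('param'='value')
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this wording, a user who writes `@ro('param'='value')` is told what to remove rather than being told to use the syntax they already tried.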

@doris-robot
TPC-H: Total hot run time: 35565 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e3d143133d46ea81899bff8cebfac0b09985296c, data reload: false

------ Round 1 ----------------------------------
q1	17652	4208	4051	4051
q2	2023	353	237	237
q3	10178	1363	780	780
q4	10236	944	324	324
q5	7530	2170	2087	2087
q6	192	175	142	142
q7	1023	846	722	722
q8	9380	1474	1212	1212
q9	7188	5395	5365	5365
q10	6846	2422	1993	1993
q11	541	341	314	314
q12	689	753	564	564
q13	17785	3737	3068	3068
q14	289	312	267	267
q15	606	520	509	509
q16	733	686	646	646
q17	722	866	489	489
q18	7514	7105	7122	7105
q19	1102	989	610	610
q20	409	377	248	248
q21	4366	3896	3931	3896
q22	1073	1008	936	936
Total cold run time: 108077 ms
Total hot run time: 35565 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4139	4064	4025	4025
q2	348	395	320	320
q3	2159	2669	2319	2319
q4	1359	1774	1292	1292
q5	4324	4854	4731	4731
q6	231	180	142	142
q7	2117	1960	1912	1912
q8	3005	2632	2586	2586
q9	7835	7704	7384	7384
q10	3115	3325	2863	2863
q11	620	535	511	511
q12	687	755	604	604
q13	3589	3981	3335	3335
q14	301	302	301	301
q15	572	519	505	505
q16	656	702	655	655
q17	1223	1472	1514	1472
q18	8080	7656	7583	7583
q19	925	934	944	934
q20	2015	2055	1922	1922
q21	4941	4654	4260	4260
q22	1076	994	980	980
Total cold run time: 53317 ms
Total hot run time: 50636 ms

@doris-robot
TPC-DS: Total hot run time: 180031 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e3d143133d46ea81899bff8cebfac0b09985296c, data reload: false

query5	4429	606	475	475
query6	366	245	259	245
query7	4229	477	274	274
query8	309	260	257	257
query9	8774	2546	2553	2546
query10	529	394	333	333
query11	15637	14936	14903	14903
query12	177	123	116	116
query13	1276	521	395	395
query14	5769	3054	2788	2788
query14_1	2794	2706	2705	2705
query15	237	202	183	183
query16	848	457	492	457
query17	1178	738	620	620
query18	2446	450	359	359
query19	236	234	215	215
query20	123	120	118	118
query21	227	146	123	123
query22	3905	3845	3948	3845
query23	16825	16191	16086	16086
query23_1	16057	16075	15965	15965
query24	7353	1683	1241	1241
query24_1	1250	1287	1280	1280
query25	597	513	468	468
query26	1272	273	168	168
query27	2754	473	307	307
query28	4488	2131	2106	2106
query29	861	580	478	478
query30	318	248	216	216
query31	811	751	649	649
query32	78	72	69	69
query33	533	340	289	289
query34	912	953	541	541
query35	767	813	714	714
query36	858	909	820	820
query37	138	93	81	81
query38	2981	3065	3037	3037
query39	770	757	736	736
query39_1	701	719	709	709
query40	228	141	125	125
query41	69	67	64	64
query42	114	111	111	111
query43	439	440	402	402
query44	1329	753	736	736
query45	190	190	183	183
query46	891	990	616	616
query47	1663	1724	1633	1633
query48	334	333	250	250
query49	621	441	357	357
query50	660	296	217	217
query51	3827	3828	3786	3786
query52	110	109	100	100
query53	322	358	303	303
query54	287	251	242	242
query55	77	72	73	72
query56	294	296	294	294
query57	1170	1143	1101	1101
query58	276	255	247	247
query59	2397	2484	2356	2356
query60	326	311	296	296
query61	164	162	165	162
query62	771	697	687	687
query63	333	297	313	297
query64	5128	1295	992	992
query65	4006	3946	3994	3946
query66	1512	443	331	331
query67	15415	15018	14907	14907
query68	8409	1000	721	721
query69	516	351	306	306
query70	1072	996	948	948
query71	381	329	285	285
query72	6087	5041	4988	4988
query73	679	598	311	311
query74	8933	8983	8813	8813
query75	3267	3151	2832	2832
query76	3901	1156	758	758
query77	626	394	307	307
query78	9419	9409	8876	8876
query79	1640	871	619	619
query80	725	664	550	550
query81	505	273	233	233
query82	484	127	101	101
query83	297	262	243	243
query84	294	119	102	102
query85	971	508	452	452
query86	344	311	270	270
query87	3180	3218	3088	3088
query88	4174	2295	2293	2293
query89	487	426	392	392
query90	2168	158	160	158
query91	170	169	147	147
query92	81	69	68	68
query93	1443	907	557	557
query94	477	319	295	295
query95	581	386	316	316
query96	600	470	208	208
query97	2314	2320	2225	2225
query98	209	192	196	192
query99	1379	1386	1322	1322
Total cold run time: 262807 ms
Total hot run time: 180031 ms

@doris-robot
ClickBench: Total hot run time: 27.61 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e3d143133d46ea81899bff8cebfac0b09985296c, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.05	0.04
query3	0.26	0.09	0.09
query4	1.60	0.11	0.11
query5	0.28	0.26	0.26
query6	1.18	0.65	0.63
query7	0.03	0.03	0.03
query8	0.05	0.04	0.04
query9	0.56	0.49	0.51
query10	0.55	0.56	0.53
query11	0.16	0.12	0.11
query12	0.14	0.12	0.11
query13	0.60	0.60	0.61
query14	0.99	0.97	0.98
query15	0.81	0.79	0.81
query16	0.39	0.38	0.39
query17	1.05	1.04	1.05
query18	0.23	0.22	0.22
query19	1.87	1.81	1.84
query20	0.02	0.01	0.02
query21	15.48	0.29	0.14
query22	6.04	0.05	0.05
query23	15.94	0.28	0.10
query24	1.52	0.55	0.56
query25	0.11	0.10	0.08
query26	0.14	0.14	0.13
query27	0.08	0.05	0.06
query28	4.90	1.22	1.02
query29	12.65	4.00	3.25
query30	0.29	0.14	0.12
query31	2.82	0.62	0.39
query32	3.23	0.55	0.46
query33	3.05	3.03	3.10
query34	16.84	5.17	4.52
query35	4.57	4.58	4.61
query36	0.67	0.51	0.48
query37	0.12	0.07	0.06
query38	0.07	0.04	0.04
query39	0.05	0.03	0.03
query40	0.19	0.14	0.14
query41	0.08	0.03	0.03
query42	0.05	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 99.86 s
Total hot run time: 27.61 s

@hello-stephen

FE Regression Coverage Report

Increment line coverage 0.00% (0/18) 🎉
Increment coverage report
Complete coverage report
