Fix GROUP BY spacing in semantic expansion#16
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ab83dbd26
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| fn matches_keyword_at(upper: &str, idx: usize, keyword: &str) -> bool { | ||
| if idx + keyword.len() > upper.len() { | ||
| return false; | ||
| } | ||
| if &upper[idx..idx + keyword.len()] != keyword { |
There was a problem hiding this comment.
Avoid UTF-8 slicing panic in GROUP BY scan
has_group_by_anywhere advances idx byte-by-byte over the uppercased SQL and calls matches_keyword_at, which slices the upper string using idx (&upper[idx..idx + keyword.len()]). If the SQL contains any non-ASCII characters (e.g., Unicode identifiers or string literals) before the GROUP BY, idx can land on a non‑UTF‑8 boundary and this slice will panic. This is a regression from the previous contains("GROUP BY") check and can crash semantic expansion for otherwise valid queries that include Unicode content.
Useful? React with 👍 / 👎.
Fix GROUP BY spacing detection in semantic expansion and view processing. Add SQLLogic coverage for GROUP BY spacing in semantic queries. Tests: cd /Users/nico/conductor/workspaces/yardstick/yeosu-v1/yardstick-rs && cargo build --release
mkdir -p build/release
cmake -DEXTENSION_STATIC_BUILD=1 -DDUCKDB_EXTENSION_CONFIGS='/Users/nico/conductor/workspaces/yardstick/yeosu-v1/extension_config.cmake' -DOSX_BUILD_ARCH= -DDUCKDB_EXPLICIT_PLATFORM='' -DCUSTOM_LINKER= -DOVERRIDE_GIT_DESCRIBE="" -DUNITTEST_ROOT_DIRECTORY="/Users/nico/conductor/workspaces/yardstick/yeosu-v1/" -DBENCHMARK_ROOT_DIRECTORY="/Users/nico/conductor/workspaces/yardstick/yeosu-v1/" -DENABLE_UNITTEST_CPP_TESTS=FALSE -DBUILD_EXTENSION_TEST_DEPS=default -DCMAKE_BUILD_TYPE=Release -S "./duckdb/" -B build/release
-- git hash d1dc88f950, version v1.4.3, extension folder v1.4.3
-- Extensions will be deployed to: /Users/nico/conductor/workspaces/yardstick/yeosu-v1/build/release/repository
-- Load extension 'yardstick' from '/Users/nico/conductor/workspaces/yardstick/yeosu-v1' @ 4ab83db
-- Load extension 'core_functions' from '/Users/nico/conductor/workspaces/yardstick/yeosu-v1/duckdb/extensions' @ v1.4.3
-- Load extension 'parquet' from '/Users/nico/conductor/workspaces/yardstick/yeosu-v1/duckdb/extensions' @ v1.4.3
-- Rust_CARGO_TARGET: aarch64-apple-darwin
-- OS_NAME: osx
-- OS_ARCH: arm64
-- OSX_BUILD_ARCH:
-- Rust Toolchain: stable-aarch64-apple-darwin
-- Using Corrosion as a subdirectory
-- Extensions linked into DuckDB: [yardstick, core_functions, parquet]
-- Tests loaded for extensions: [yardstick]
-- Configuring done (1.7s)
-- Generating done (0.8s)
-- Build files have been written to: /Users/nico/conductor/workspaces/yardstick/yeosu-v1/build/release
cmake --build build/release --config Release
[ 8%] Built target duckdb_zstd
[ 8%] Built target duckdb_platform_binary
[ 8%] Built target duckdb_platform
[ 8%] Built target cargo-prebuild_yardstick
Copying byproducts
libyardstick.ato /Users/nico/conductor/workspaces/yardstick/yeosu-v1/build/release/extension/yardstick[ 8%] Built target _cargo-build_yardstick
[ 8%] Built target cargo-build_yardstick
[ 8%] Built target duckdb_fmt
[ 10%] Built target duckdb_pg_query
[ 10%] Built target duckdb_sorting
[ 10%] Built target duckdb_common_types
[ 10%] Built target duckdb_common_types_column
[ 10%] Built target duckdb_common_types_row
[ 10%] Built target duckdb_value_operations
[ 13%] Built target duckdb_vector_operations
[ 13%] Built target duckdb_logging
[ 13%] Built target duckdb_execution
[ 13%] Built target duckdb_expression_executor
[ 14%] Built target duckdb_nested_loop_join
[ 14%] Built target duckdb_operator_aggregate
[ 14%] Built target duckdb_csv_buffer_manager
[ 14%] Built target duckdb_csv_encode
[ 14%] Built target duckdb_csv_scanner
[ 15%] Built target duckdb_csv_sniffer
[ 15%] Built target duckdb_csv_state_machine
[ 15%] Built target duckdb_operator_csv_table_function
[ 15%] Built target duckdb_csv_util
[ 15%] Built target duckdb_operator_filter
[ 16%] Built target duckdb_operator_helper
[ 16%] Built target duckdb_operator_join
[ 16%] Built target duckdb_operator_order
[ 16%] Built target duckdb_operator_persistent
[ 16%] Built target duckdb_operator_projection
[ 17%] Built target duckdb_operator_scan
[ 17%] Built target duckdb_operator_schema
[ 17%] Built target duckdb_operator_set
[ 17%] Built target duckdb_physical_plan
[ 18%] Built target duckdb_execution_index
[ 18%] Built target duckdb_execution_index_art
[ 18%] Built target duckdb_sample
[ 18%] Built target duckdb_main
[ 19%] Built target duckdb_main_capi
[ 19%] Built target duckdb_main_capi_cast
[ 19%] Built target duckdb_generated_extension_loader
[ 20%] Built target duckdb_main_extension
[ 21%] Built target duckdb_common_http
[ 21%] Built target duckdb_main_relation
[ 21%] Built target duckdb_main_secret
[ 22%] Built target duckdb_main_settings
[ 22%] Built target duckdb_main_buffered_data
[ 22%] Built target duckdb_main_chunk_scan_state
[ 22%] Built target duckdb_parallel
[ 22%] Built target duckdb_storage
[ 22%] Built target duckdb_storage_buffer
[ 22%] Built target duckdb_storage_checkpoint
[ 23%] Built target duckdb_storage_compression
[ 23%] Built target duckdb_storage_compression_chimp
[ 23%] Built target duckdb_storage_compression_alp
[ 24%] Built target duckdb_storage_compression_roaring
[ 24%] Built target duckdb_storage_compression_dictionary
[ 24%] Built target duckdb_storage_compression_dict_fsst
[ 24%] Built target duckdb_storage_metadata
[ 24%] Built target duckdb_storage_serialization
[ 24%] Built target duckdb_storage_statistics
[ 25%] Built target duckdb_storage_table
[ 25%] Built target duckdb_transaction
[ 27%] Built target duckdb_verification
[ 28%] Built target duckdb_core_functions_algebraic
[ 28%] Built target duckdb_core_functions_distributive
[ 29%] Built target duckdb_core_functions_holistic
[ 30%] Built target duckdb_core_functions_nested
[ 30%] Built target duckdb_core_functions_regression
[ 30%] Built target duckdb_core_functions_array
[ 30%] Built target duckdb_core_functions_bit
[ 30%] Built target duckdb_core_functions_blob
[ 31%] Built target duckdb_core_functions_date
[ 31%] Built target duckdb_core_functions_debug
[ 31%] Built target duckdb_core_functions_enum
[ 31%] Built target duckdb_core_functions_generic
[ 31%] Built target duckdb_core_functions_list
[ 31%] Built target duckdb_core_functions_map
[ 31%] Built target duckdb_core_functions_math
[ 31%] Built target duckdb_core_functions_operators
[ 31%] Built target duckdb_core_functions_random
[ 31%] Built target duckdb_core_functions_string
[ 32%] Built target duckdb_core_functions_struct
[ 32%] Built target duckdb_core_functions_union
[ 33%] Built target duckdb_parquet_decoders
[ 33%] Built target duckdb_parquet_readers
[ 33%] Built target duckdb_parquet_reader_variant
[ 33%] Built target duckdb_parquet_writers
[ 33%] Built target duckdb_optimizer
[ 34%] Built target duckdb_optimizer_compressed_materialization
[ 34%] Built target duckdb_optimizer_join_order
[ 34%] Built target duckdb_optimizer_matcher
[ 34%] Built target duckdb_optimizer_pullup
[ 34%] Built target duckdb_optimizer_pushdown
[ 35%] Built target duckdb_optimizer_rules
[ 35%] Built target duckdb_optimizer_statistics_expr
[ 35%] Built target duckdb_optimizer_statistics_op
[ 36%] Built target duckdb_planner
[ 36%] Built target duckdb_planner_expression
[ 37%] Built target duckdb_bind_expression
[ 37%] Built target duckdb_bind_query_node
[ 37%] Built target duckdb_bind_statement
[ 37%] Built target duckdb_bind_tableref
[ 37%] Built target duckdb_expression_binders
[ 37%] Built target duckdb_planner_filter
[ 37%] Built target duckdb_planner_operator
[ 38%] Built target duckdb_planner_subquery
[ 39%] Built target duckdb_parser
[ 39%] Built target duckdb_constraints
[ 39%] Built target duckdb_expression
[ 39%] Built target duckdb_parsed_data
[ 40%] Built target duckdb_query_node
[ 41%] Built target duckdb_statement
[ 41%] Built target duckdb_parser_tableref
[ 41%] Built target duckdb_transformer_constraint
[ 41%] Built target duckdb_transformer_expression
[ 42%] Built target duckdb_transformer_helpers
[ 42%] Built target duckdb_transformer_statement
[ 42%] Built target duckdb_transformer_tableref
[ 42%] Built target duckdb_function
[ 42%] Built target duckdb_func_aggr
[ 42%] Built target duckdb_aggr_distr
[ 42%] Built target duckdb_func_cast
[ 42%] Built target duckdb_union_cast
[ 42%] Built target duckdb_variant_cast
[ 43%] Built target duckdb_func_pragma
[ 43%] Built target duckdb_func_scalar
[ 43%] Built target duckdb_func_compressed_materialization
[ 44%] Built target duckdb_func_date
[ 44%] Built target duckdb_func_generic_main
[ 44%] Built target duckdb_func_list_nested
[ 44%] Built target duckdb_function_map
[ 44%] Built target duckdb_func_ops_main
[ 44%] Built target duckdb_func_seq
[ 44%] Built target duckdb_func_string_main
[ 44%] Built target duckdb_func_string_regexp
[ 45%] Built target duckdb_func_struct_main
[ 46%] Built target duckdb_func_variant_main
[ 46%] Built target duckdb_func_system
[ 46%] Built target duckdb_func_table
[ 46%] Built target duckdb_table_func_system
[ 46%] Built target duckdb_func_table_version
[ 46%] Built target duckdb_arrow_conversion
[ 46%] Built target duckdb_func_window
[ 46%] Built target duckdb_catalog
[ 46%] Built target duckdb_catalog_entries
[ 46%] Built target duckdb_catalog_entries_dependency
[ 47%] Built target duckdb_catalog_default_entries
[ 47%] Built target duckdb_common
[ 48%] Built target duckdb_adbc
[ 48%] Built target duckdb_adbc_nanoarrow
[ 49%] Built target duckdb_common_arrow
[ 49%] Built target duckdb_common_arrow_appender
[ 49%] Built target duckdb_common_crypto
[ 49%] Built target duckdb_common_enums
[ 49%] Built target duckdb_common_exception
[ 49%] Built target duckdb_common_multi_file
[ 49%] Built target duckdb_common_operators
[ 49%] Built target duckdb_progress_bar
[ 50%] Built target duckdb_common_tree_renderer
[ 50%] Built target duckdb_row_operations
[ 50%] Built target duckdb_common_serializer
[ 50%] Built target duckdb_sort
[ 55%] Built target duckdb_re2
[ 55%] Built target duckdb_miniz
[ 56%] Built target duckdb_utf8proc
[ 57%] Built target duckdb_hyperloglog
[ 58%] Built target duckdb_skiplistlib
[ 59%] Built target duckdb_fastpforlib
[ 64%] Built target duckdb_mbedtls
[ 65%] Built target duckdb_fsst
[ 65%] Built target duckdb_yyjson
[ 77%] Built target parquet_extension
[ 77%] Built target core_functions_extension
[ 77%] Building CXX object extension/yardstick/CMakeFiles/yardstick_extension.dir/src/yardstick_extension.cpp.o
[ 78%] Building CXX object extension/yardstick/CMakeFiles/yardstick_extension.dir/src/yardstick_parser_ffi.cpp.o
[ 78%] Linking CXX static library libyardstick_extension.a
[ 78%] Built target yardstick_extension
[ 78%] Built target duckdb_static
[ 78%] Linking CXX shared library loadable_extension_optimizer_demo.duckdb_extension
[ 79%] Built target loadable_extension_optimizer_demo_loadable_extension
[ 79%] Building CXX object extension/yardstick/CMakeFiles/yardstick_loadable_extension.dir/src/yardstick_extension.cpp.o
[ 79%] Building CXX object extension/yardstick/CMakeFiles/yardstick_loadable_extension.dir/src/yardstick_parser_ffi.cpp.o
[ 80%] Linking CXX shared library yardstick.duckdb_extension
[ 80%] Built target yardstick_loadable_extension
[ 80%] Linking CXX shared library core_functions.duckdb_extension
[ 81%] Built target core_functions_loadable_extension
[ 82%] Linking CXX shared library parquet.duckdb_extension
[ 93%] Built target parquet_loadable_extension
[ 93%] Linking CXX shared library loadable_extension_demo.duckdb_extension
[ 93%] Built target loadable_extension_demo_loadable_extension
[ 94%] repository
[ 94%] Built target duckdb_local_extension_repo
[ 94%] Linking CXX shared library libduckdb.dylib
[ 94%] Built target duckdb
[ 95%] Built target sqlite3_api_wrapper_sqlite3
[ 96%] Built target sqlite3_udf_api
[ 96%] Built target sqlite3_api_wrapper_static
[ 96%] Linking CXX shared library libsqlite3_api_wrapper.dylib
[ 96%] Built target sqlite3_api_wrapper
[ 96%] Linking CXX executable test_sqlite3_api_wrapper
[ 97%] Built target test_sqlite3_api_wrapper
[ 98%] Built target duckdb_linenoise
[ 99%] Linking CXX executable ../../duckdb
[ 99%] Built target shell
[100%] Built target test_helpers
[100%] Built target test_sqlite
[100%] Linking CXX executable unittest
[100%] Built target unittest
[100%] Built target imdb
./build/release/"/test/unittest" ""test/""
Filters: test/
[0/1] (0%): test/sql/measures.test
[1/1] (100%): test/sql/measures.test
All tests passed (761 assertions in 1 test case)
./build/release/test/unittest
[0/2] (0%): test/sql/measures.test
[1/2] (50%): /Users/nico/conductor/workspaces/yardstick/yeosu-v1/test/sql/mea...
[2/2] (100%): /Users/nico/conductor/workspaces/yardstick/yeosu-v1/test/sql/me...
All tests passed (1522 assertions in 2 test cases).