BigQuery: Parse WITH CONNECTION on CREATE EXTERNAL TABLE#2326
BigQuery: Parse WITH CONNECTION on CREATE EXTERNAL TABLE#232697nitt wants to merge 4 commits intoapache:mainfrom
Conversation
| /// BigQuery `WITH CONNECTION` clause for external tables. | ||
| /// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_external_table_statement> |
There was a problem hiding this comment.
| /// BigQuery `WITH CONNECTION` clause for external tables. | |
| /// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_external_table_statement> | |
| /// `WITH CONNECTION` clause. | |
| /// [BigQuery](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_external_table_statement) |
| self.external_volume = external_volume; | ||
| self | ||
| } | ||
| /// Set the BigQuery `WITH CONNECTION` clause for external tables. |
There was a problem hiding this comment.
| /// Set the BigQuery `WITH CONNECTION` clause for external tables. | |
| /// Set the `WITH CONNECTION` clause. |
| /// BigQuery `WITH CONNECTION` clause for external tables | ||
| /// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_external_table_statement> |
There was a problem hiding this comment.
| /// BigQuery `WITH CONNECTION` clause for external tables | |
| /// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_external_table_statement> | |
| /// `WITH CONNECTION` clause. | |
| /// [BigQuery](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_external_table_statement) |
| // BigQuery external tables support `WITH CONNECTION <name>` and `OPTIONS(...)` | ||
| // clauses after the (optional) column list. |
There was a problem hiding this comment.
| // BigQuery external tables support `WITH CONNECTION <name>` and `OPTIONS(...)` | |
| // clauses after the (optional) column list. |
| // BigQuery uses OPTIONS(...); Hive uses TBLPROPERTIES(...). They are | ||
| // mutually exclusive in practice, and `parse_options` returns an empty | ||
| // vec when the keyword isn't present, so trying OPTIONS first and | ||
| // falling back to TBLPROPERTIES preserves the existing Hive path | ||
| // without accepting both in the same statement. | ||
| let options = self.parse_options(Keyword::OPTIONS)?; |
There was a problem hiding this comment.
not sure I follow the goal here, is OPTIONS part of the WITH CONNECTION clause (if not isn't it already parsed elsewhere for bigquery today)?
There was a problem hiding this comment.
Good catch, reverted. The parser change is now just the new WITH CONNECTION block plus the .with_connection(...) builder line; the fallback chain handles OPTIONS unchanged.
| }; | ||
| assert_eq!(opts.len(), 1); | ||
|
|
||
| // Baseline: no WITH CONNECTION. The new parser branch must not produce |
There was a problem hiding this comment.
I think we can remove these superfluous comments from the tests
| "WITH CONNECTION `projects/proj/locations/us/connections/c` ", | ||
| r#"OPTIONS(format = "ICEBERG", uris = ["gs://b/m.json"])"#, | ||
| ); | ||
| let stmts = bigquery().parse_sql_statements(sql).unwrap(); |
There was a problem hiding this comment.
for the tests is there a reason we don't use verified_stmt? Also I think we can skip the assertions on the ast and have it solely as roundtrip tests (I don't think the ast assertions are adding much given the changes made and preexisting tests)
The pre-existing maybe_parse_options(OPTIONS) fallback already handled BigQuery OPTIONS(...) on external tables; the reorder made it unreachable without changing behavior.
Summary
Parse the
WITH CONNECTION <name>clause on BigQueryCREATE EXTERNAL TABLEstatements. Per the BigQuery DDL reference, external tables (and
external Iceberg tables in particular) carry a connection identifier between
the column list and the
OPTIONS(...)clause:Today the parser rejects this shape because nothing consumes
WITH CONNECTION,and
OPTIONS(...)on external tables only worked via the HiveTBLPROPERTIESfallback path.Changes
CreateTable/CreateTableBuilder: newwith_connection: Option<ObjectName>field, populated by the parser and rendered by
DisplaybetweenCLUSTER BYandOPTIONS(...).parse_create_external_table: consume an optionalWITH CONNECTION <name>,then prefer
OPTIONS(...)overTBLPROPERTIES(...)(mutually exclusive inpractice; preserves the existing Hive path when neither is present).
WITH CONNECTIONwithOPTIONS,WITH CONNECTIONalone (Display normalizes by adding an empty column list),the
(columns) WITH CONNECTION OPTIONS(...)round-trip, and a baselinethat asserts
with_connectionstaysNonewhen the keyword pair is absent.tests/sqlparser_duckdb.rs,tests/sqlparser_mssql.rs,tests/sqlparser_postgres.rs: updated existing struct-literal expectationsfor the new field.
Test plan
cargo testcargo fmt --checkcargo clippy --all-targets --all-features -- -D warnings