feat(source/bigquery): add apiEndpoint for custom API host #3358
hannank-rounds wants to merge 1 commit into
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Code Review
This pull request introduces support for overriding the BigQuery API endpoint via a new apiEndpoint configuration option and BIGQUERY_ENDPOINT environment variable, allowing users to route requests through a proxy or alternate front-end. Feedback from the review highlights an issue in the endpoint normalization logic where stripping the URL scheme and defaulting to port 443 breaks HTTP-only proxies and local emulators. The reviewer suggests preserving the scheme and dynamically assigning the default port (80 for HTTP, 443 for HTTPS/no scheme), along with corresponding updates to the unit tests and documentation.
func normalizeAPIEndpoint(raw string) string {
	s := strings.TrimSpace(raw)
	if s == "" {
		return ""
	}
	host := stripURLScheme(s)
	host = strings.TrimSuffix(host, "/")
	if isDirectAPIEndpoint(host) {
		return ""
	}
	if !strings.Contains(host, ":") {
		host = host + ":443"
	}
	return host
}

func isDirectAPIEndpoint(host string) bool {
	lower := strings.ToLower(host)
	if _, ok := directAPIEndpointAliases[lower]; ok {
		return true
	}
	if i := strings.LastIndex(lower, ":"); i != -1 {
		if _, ok := directAPIEndpointAliases[lower[:i]]; ok {
			return true
		}
	}
	return false
}

func stripURLScheme(s string) string {
	for _, prefix := range []string{"https://", "http://"} {
		if len(s) >= len(prefix) && strings.EqualFold(s[:len(prefix)], prefix) {
			return s[len(prefix):]
		}
	}
	return s
}
The current implementation of normalizeAPIEndpoint strips the URL scheme (e.g., http:// or https://) and appends :443 if no port is specified. This causes issues when a user explicitly configures an HTTP endpoint (such as http://localhost:8080 or http://proxy.example.com for local emulators or HTTP proxies).
When the scheme is stripped, the Google API client library's HTTP transport automatically prepends https:// because the endpoint does not contain ://. This forces the client to use HTTPS/TLS, which will fail against HTTP-only proxies or emulators.
We should preserve the scheme if it is present, and only append the default port (80 for HTTP, 443 for HTTPS or when no scheme is specified) if no port is explicitly provided.
func normalizeAPIEndpoint(raw string) string {
	s := strings.TrimSpace(raw)
	if s == "" {
		return ""
	}
	var scheme string
	host := s
	for _, prefix := range []string{"https://", "http://"} {
		if len(s) >= len(prefix) && strings.EqualFold(s[:len(prefix)], prefix) {
			scheme = prefix
			host = s[len(prefix):]
			break
		}
	}
	host = strings.TrimSuffix(host, "/")
	if isDirectAPIEndpoint(host) {
		return ""
	}
	if !strings.Contains(host, ":") {
		if scheme == "http://" {
			host = host + ":80"
		} else {
			host = host + ":443"
		}
	}
	return scheme + host
}

func isDirectAPIEndpoint(host string) bool {
	lower := strings.ToLower(host)
	if _, ok := directAPIEndpointAliases[lower]; ok {
		return true
	}
	if i := strings.LastIndex(lower, ":"); i != -1 {
		if _, ok := directAPIEndpointAliases[lower[:i]]; ok {
			return true
		}
	}
	return false
}

	{"https://proxy.example.com", "proxy.example.com:443"},
	{"https://proxy.example.com/", "proxy.example.com:443"},
	{"proxy.example.com:8443", "proxy.example.com:8443"},
Update the test cases to align with the new scheme-preserving normalization behavior, and add test cases for HTTP endpoints to ensure they are handled correctly.
	{"https://proxy.example.com", "https://proxy.example.com:443"},
	{"https://proxy.example.com/", "https://proxy.example.com:443"},
	{"http://proxy.example.com", "http://proxy.example.com:80"},
	{"http://proxy.example.com:8080", "http://proxy.example.com:8080"},
	{"proxy.example.com:8443", "proxy.example.com:8443"},

| impersonateServiceAccount | string | false | Service account email to impersonate when making BigQuery and Dataplex API calls. The authenticated principal must have the `roles/iam.serviceAccountTokenCreator` role on the target service account. [Learn More](https://cloud.google.com/iam/docs/service-account-impersonation) |
| maxQueryResultRows | int | false | The maximum number of rows to return from a query. Defaults to 50. |
| maximumBytesBilled | int64 | false | The maximum bytes billed per query. When set, queries that exceed this limit fail before executing. |
| apiEndpoint | string | false | Overrides the BigQuery API endpoint (URL or `host:port`) for proxy or alternate front-ends. Unset, empty, `direct`, `google`, `default`, or `bigquery.googleapis.com` use the default Google endpoint. HTTPS URLs are normalized to `host:port` (default port `443`). Dataplex catalog calls are not affected. |
Update the documentation to reflect that URLs are normalized to include a default port based on their scheme (80 for HTTP, 443 for HTTPS/no scheme) rather than always stripping the scheme and forcing 443.
| apiEndpoint | string | false | Overrides the BigQuery API endpoint (URL or `host:port`) for proxy or alternate front-ends. Unset, empty, `direct`, `google`, `default`, or `bigquery.googleapis.com` use the default Google endpoint. URLs are normalized to include a default port (`80` for HTTP, `443` for HTTPS/no scheme) if none is specified. Dataplex catalog calls are not affected. |
I signed the CLA
Add an optional apiEndpoint (BIGQUERY_ENDPOINT) that overrides the BigQuery API host for proxies, alternate front-ends, and local emulators. The endpoint is applied across all three auth paths (ADC, impersonation, OAuth) and to both the high-level client and the bigquery/v2 REST service.

normalizeAPIEndpoint preserves the URL scheme so http-only proxies and emulators (e.g. http://localhost:9050) keep working, defaults a missing scheme to https, appends a default port when absent (:80 for http, else :443), and strips a trailing slash.

Dataplex and ask_data_insights use different API surfaces and are intentionally out of scope.

Co-authored-by: Cursor <cursoragent@cursor.com>
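
The normalization rules listed in this commit message can be sketched in Go as follows. This is a minimal illustration and not the PR's actual code: `normalizeEndpointSketch` and `directAliases` are hypothetical names, the alias set is assumed from the values named in the `apiEndpoint` docs row, and unbracketed IPv6 literals are not handled.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// directAliases is a hypothetical alias set (assumed from the docs row);
// any of these values means "use the default Google endpoint".
var directAliases = map[string]struct{}{
	"direct": {}, "google": {}, "default": {}, "bigquery.googleapis.com": {},
}

// normalizeEndpointSketch returns "" for the default endpoint, otherwise
// scheme + host:port following the rules in the commit message.
func normalizeEndpointSketch(raw string) string {
	s := strings.TrimSpace(raw)
	if s == "" {
		return ""
	}
	// A missing scheme defaults to https; an explicit one is preserved.
	scheme, host := "https://", s
	for _, prefix := range []string{"https://", "http://"} {
		if len(s) >= len(prefix) && strings.EqualFold(s[:len(prefix)], prefix) {
			scheme, host = prefix, s[len(prefix):]
			break
		}
	}
	host = strings.TrimSuffix(host, "/")

	// Compare against the aliases with any explicit port removed.
	name := strings.ToLower(host)
	if h, _, err := net.SplitHostPort(name); err == nil {
		name = h
	}
	if _, ok := directAliases[name]; ok {
		return ""
	}

	// Append the scheme's default port only when none was given.
	if _, _, err := net.SplitHostPort(host); err != nil {
		if scheme == "http://" {
			host += ":80"
		} else {
			host += ":443"
		}
	}
	return scheme + host
}

func main() {
	for _, in := range []string{"http://localhost:9050", "https://proxy.example.com/", "direct"} {
		fmt.Printf("%q -> %q\n", in, normalizeEndpointSketch(in))
	}
}
```

Under these rules a bare `host:port` input gains an `https://` prefix, which differs from the review suggestion above, where a scheme-less input stays scheme-less.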
9c3cdec to f60a64a
I signed the CLA
The following contributors were found for this pull request: ✅ f60a64a Author: @hannank-rounds <ha****.k@rounds.com> You will need to remove the cursor commit or change the author.
Summary
Adds optional `apiEndpoint` configuration to the BigQuery source so MCP clients can route BigQuery API calls through a proxy or alternate front-end (equivalent to Python `ClientOptions(api_endpoint=...)`).
- `apiEndpoint: "https://my-proxy.example.com"`
- With `--prebuilt bigquery`: `BIGQUERY_ENDPOINT` environment variable (empty, `direct`, or `bigquery.googleapis.com` → default Google API)
- Normalizes URLs to `host:port` for `option.WithEndpoint` (default port `443`)

Motivation
Teams that use a BigQuery API proxy (e.g. for compliance or routing) already configure this in application code (`ClientOptions` / env `BIGQUERY_ENDPOINT`). MCP Toolbox prebuilt BigQuery had no equivalent, so IDE agents could not use the same endpoint as batch generators.

Test plan
- `TestNormalizeAPIEndpoint` unit tests for alias and URL normalization
- `apiEndpoint` field
- `go test ./internal/sources/bigquery/...`

Made with Cursor