fix(ci): use GITHUB_TOKEN in check-vendored-yaml to avoid rate limits#252
Conversation
The check-vendored-yaml script hits the GitHub Contents API without authentication, causing rate-limit failures (403) in CI. The error manifested as a cryptic "contents is not iterable" TypeError because fetchJson didn't check HTTP status codes. Changes: - Use GITHUB_TOKEN env var for authenticated API requests (already passed by CI workflows, raises rate limit from 60 to 5000 req/hr) - Add HTTP status code checking in fetchJson with clear error messages - Validate that Contents API response is an array before iterating Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses and resolves issues related to GitHub API rate limiting in the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
The pull request enhances the check-vendored-yaml.ts script by introducing support for GitHub authentication via GITHUB_TOKEN for API requests, improving error handling for non-200 HTTP responses, and adding validation for the GitHub Contents API response type. A review comment highlights an opportunity to improve type safety by removing unnecessary as any casts and explicit any types in the https.get calls.
|
📄 Preview deployed to https://opentdf-docs-pr-252.surge.sh |
dmihalcik-virtru
left a comment
There was a problem hiding this comment.
Maybe overly paranoid given we control the inputs, but I'd make sure we only attach auth headers to the associated domain
📝 WalkthroughWalkthroughAdded conditional GitHub token authentication to HTTP requests, improved non-200 error reporting to include status and truncated response body, and validated that GitHub Contents API responses are arrays before processing. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Applied both suggestions from @dmihalcik-virtru — the |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/openapi/check-vendored-yaml.ts (2)
71-85:⚠️ Potential issue | 🟡 Minor
fetchTextlacks status code checking.Unlike
fetchJson,fetchTextdoesn't check for non-200 responses. If the request is rate-limited or fails, the error body (HTML/JSON) will be returned as text, causing a confusingyaml.loadfailure inhasApiPaths. Consider adding consistent error handling:🛠️ Proposed fix
function fetchText(url: string): Promise<string> { return new Promise((resolve, reject) => { import('https').then(https => { const headers: Record<string, string> = { 'User-Agent': 'opentdf-docs-check-vendored-yaml' }; if (GITHUB_TOKEN) { headers['Authorization'] = `token ${GITHUB_TOKEN}`; } https.get(url, { headers } as any, (response: any) => { + if (response.statusCode !== 200) { + let body = ''; + response.on('data', (chunk: string) => { body += chunk; }); + response.on('end', () => { + reject(new Error( + `Failed to fetch ${url}\n` + + ` Status: ${response.statusCode}\n` + + ` Response: ${body.slice(0, 200)}` + )); + }); + return; + } let data = ''; response.on('data', (chunk: string) => { data += chunk; }); response.on('end', () => resolve(data)); }).on('error', reject); }).catch(reject); }); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openapi/check-vendored-yaml.ts` around lines 71 - 85, The fetchText function does not check HTTP response status, so non-200 responses (rate limits/errors) return HTML/JSON as text and later cause yaml.load failures; update fetchText (the function using import('https') and https.get) to inspect the response.statusCode and, for non-2xx codes, collect the response body then reject with a descriptive Error that includes the statusCode and response body (similar to fetchJson's behavior), while preserving the User-Agent/GITHUB_TOKEN headers and existing error handlers so callers like hasApiPaths receive a clear failure instead of invalid YAML.
19-37:⚠️ Potential issue | 🟠 Major
downloadFiledoesn't use authentication.This function downloads specs from raw GitHub URLs (used in the vendored file check loop at line 129) but doesn't include the
GITHUB_TOKEN. These requests remain unauthenticated and subject to the 60 req/hr rate limit, which may still cause CI failures when checking multiple specs.🛠️ Proposed fix
function downloadFile(url: string, dest: string): Promise<void> { return new Promise((resolve, reject) => { import('https').then(https => { const file = fs.createWriteStream(dest); - https.get(url, (response: any) => { + const headers: Record<string, string> = { 'User-Agent': 'opentdf-docs-check-vendored-yaml' }; + if (GITHUB_TOKEN && url.startsWith(PLATFORM_RAW_BASE)) { + headers['Authorization'] = `token ${GITHUB_TOKEN}`; + } + https.get(url, { headers } as any, (response: any) => { if (response.statusCode !== 200) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openapi/check-vendored-yaml.ts` around lines 19 - 37, The downloadFile function is making unauthenticated GitHub raw URL requests and should include the GITHUB_TOKEN to avoid low rate limits; update downloadFile to read process.env.GITHUB_TOKEN (if present) and send an Authorization header (e.g. "token <GITHUB_TOKEN>") with the HTTPS request (use https.request or pass options to https.get) when fetching the URL, preserving the existing response/status handling and cleanup (fs.unlink on error, file.close on finish) so the vendored file check loop that calls downloadFile uses authenticated requests when a token is available.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/openapi/check-vendored-yaml.ts`:
- Around line 71-85: The fetchText function does not check HTTP response status,
so non-200 responses (rate limits/errors) return HTML/JSON as text and later
cause yaml.load failures; update fetchText (the function using import('https')
and https.get) to inspect the response.statusCode and, for non-2xx codes,
collect the response body then reject with a descriptive Error that includes the
statusCode and response body (similar to fetchJson's behavior), while preserving
the User-Agent/GITHUB_TOKEN headers and existing error handlers so callers like
hasApiPaths receive a clear failure instead of invalid YAML.
- Around line 19-37: The downloadFile function is making unauthenticated GitHub
raw URL requests and should include the GITHUB_TOKEN to avoid low rate limits;
update downloadFile to read process.env.GITHUB_TOKEN (if present) and send an
Authorization header (e.g. "token <GITHUB_TOKEN>") with the HTTPS request (use
https.request or pass options to https.get) when fetching the URL, preserving
the existing response/status handling and cleanup (fs.unlink on error,
file.close on finish) so the vendored file check loop that calls downloadFile
uses authenticated requests when a token is available.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8c906c0e-38f7-4e6e-afd8-fcb10ecb6d56
📒 Files selected for processing (1)
src/openapi/check-vendored-yaml.ts
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/openapi/check-vendored-yaml.ts (1)
74-82:⚠️ Potential issue | 🟠 MajorAdd HTTP status validation in
fetchTextto prevent error payloads from reachingyaml.load.
fetchTextat lines 71-85 resolves response bodies for all HTTP statuses, including errors. This allows error responses or non-200 status codes to be passed directly toyaml.load()inhasApiPaths(line 117), causing unclear parse failures or false negatives.Both
downloadFile(line 24) andfetchJson(line 48) in the same file already validateresponse.statusCode !== 200before processing the response.fetchTextshould follow the same pattern.Suggested patch
function fetchText(url: string): Promise<string> { return new Promise((resolve, reject) => { import('https').then(https => { const headers: Record<string, string> = { 'User-Agent': 'opentdf-docs-check-vendored-yaml' }; if (GITHUB_TOKEN && url.startsWith(PLATFORM_RAW_BASE)) { headers['Authorization'] = `token ${GITHUB_TOKEN}`; } https.get(url, { headers } as any, (response: any) => { let data = ''; response.on('data', (chunk: string) => { data += chunk; }); - response.on('end', () => resolve(data)); + response.on('end', () => { + if (response.statusCode !== 200) { + reject(new Error( + `GitHub raw request failed: ${url}\n` + + ` Status: ${response.statusCode}\n` + + ` Response: ${data.slice(0, 200)}` + )); + return; + } + resolve(data); + }); }).on('error', reject); }).catch(reject); }); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openapi/check-vendored-yaml.ts` around lines 74 - 82, fetchText currently resolves the response body regardless of HTTP status, allowing error pages to reach yaml.load in hasApiPaths; update fetchText to validate the response.statusCode (like downloadFile and fetchJson) and reject when statusCode !== 200, including the statusCode and a short message (or response body) in the error so callers get a clear failure instead of passing error payloads to yaml.load; locate the logic inside fetchText where https.get callback handles response and add the status check before collecting/resolve.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/openapi/check-vendored-yaml.ts`:
- Around line 74-82: fetchText currently resolves the response body regardless
of HTTP status, allowing error pages to reach yaml.load in hasApiPaths; update
fetchText to validate the response.statusCode (like downloadFile and fetchJson)
and reject when statusCode !== 200, including the statusCode and a short message
(or response body) in the error so callers get a clear failure instead of
passing error payloads to yaml.load; locate the logic inside fetchText where
https.get callback handles response and add the status check before
collecting/resolve.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6aa8a8ea-4b43-4327-b124-a284618656a9
📒 Files selected for processing (1)
src/openapi/check-vendored-yaml.ts
Keep the domain-scoped GITHUB_TOKEN checks from PR #252. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keep the domain-scoped GITHUB_TOKEN checks from PR #252. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
check-vendored-yamlusing theGITHUB_TOKENenv var (already passed by both CI workflows)fetchJson— previously a 403 rate-limit response was silently parsed as JSON and passed to afor...ofloop, causing a crypticTypeError: contents is not iterableRoot cause: The script makes unauthenticated requests to the GitHub Contents API (60 req/hr limit). In CI, multiple concurrent jobs or rapid re-runs exhaust this limit, causing the build step to fail with an unhelpful error.
Fix: Use
GITHUB_TOKENfor authenticated requests (5,000 req/hr), and fail with a clear error message if the API still returns a non-200 status.Test plan
npm run check-vendored-yamlstill works (gracefully handles missing token with clear error)🤖 Generated with Claude Code
Summary by CodeRabbit