Skip to content

fix(ci): use GITHUB_TOKEN in check-vendored-yaml to avoid rate limits#252

Merged
marythought merged 4 commits into
mainfrom
fix/check-vendored-yaml-rate-limit
Mar 23, 2026
Merged

fix(ci): use GITHUB_TOKEN in check-vendored-yaml to avoid rate limits#252
marythought merged 4 commits into
mainfrom
fix/check-vendored-yaml-rate-limit

Conversation

@marythought
Copy link
Copy Markdown
Contributor

@marythought marythought commented Mar 18, 2026

Summary

  • Authenticates GitHub API requests in check-vendored-yaml using the GITHUB_TOKEN env var (already passed by both CI workflows)
  • Adds HTTP status code checking in fetchJson — previously a 403 rate-limit response was silently parsed as JSON and passed to a for...of loop, causing a cryptic TypeError: contents is not iterable
  • Validates that the Contents API response is an array before iterating

Root cause: The script makes unauthenticated requests to the GitHub Contents API (60 req/hr limit). In CI, multiple concurrent jobs or rapid re-runs exhaust this limit, causing the build step to fail with an unhelpful error.

Fix: Use GITHUB_TOKEN for authenticated requests (5,000 req/hr), and fail with a clear error message if the API still returns a non-200 status.

Test plan

  • CI build passes (check-vendored-yaml uses GITHUB_TOKEN)
  • Local npm run check-vendored-yaml still works (gracefully handles missing token with clear error)
  • Verify error message is clear when rate-limited (shows status code and response body)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Improvements
    • Enhanced error reporting for remote API requests: failures now include request URL, HTTP status, and a snippet of the response to aid diagnosis.
    • Optional authenticated GitHub requests when credentials are provided, improving access to protected or rate-limited resources.
    • Stricter validation of GitHub API responses to surface malformed or unexpected data earlier and with clearer errors.

The check-vendored-yaml script hits the GitHub Contents API without
authentication, causing rate-limit failures (403) in CI. The error
manifested as a cryptic "contents is not iterable" TypeError because
fetchJson didn't check HTTP status codes.

Changes:
- Use GITHUB_TOKEN env var for authenticated API requests (already
  passed by CI workflows, raises rate limit from 60 to 5000 req/hr)
- Add HTTP status code checking in fetchJson with clear error messages
- Validate that Contents API response is an array before iterating

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@marythought marythought requested review from a team as code owners March 18, 2026 17:37
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses and resolves issues related to GitHub API rate limiting in the check-vendored-yaml script. By incorporating authenticated requests and robust error handling, it ensures the script's reliability in CI environments and provides more informative feedback when API calls encounter problems, preventing build failures due to exhausted rate limits or malformed responses.

Highlights

  • GitHub API Authentication: Implemented the use of GITHUB_TOKEN environment variable for authenticating GitHub API requests within the check-vendored-yaml script, increasing the rate limit from 60 to 5,000 requests per hour.
  • Improved Error Handling: Added explicit HTTP status code checking in the fetchJson function to catch non-200 responses (like 403 rate limits) and provide clear error messages, preventing cryptic TypeError issues.
  • API Response Validation: Introduced validation to ensure that the GitHub Contents API response is an array before iteration, further enhancing robustness against unexpected API responses.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request enhances the check-vendored-yaml.ts script by introducing support for GitHub authentication via GITHUB_TOKEN for API requests, improving error handling for non-200 HTTP responses, and adding validation for the GitHub Contents API response type. A review comment highlights an opportunity to improve type safety by removing unnecessary as any casts and explicit any types in the https.get calls.

Comment thread src/openapi/check-vendored-yaml.ts
@github-actions
Copy link
Copy Markdown
Contributor

📄 Preview deployed to https://opentdf-docs-pr-252.surge.sh

@marythought marythought enabled auto-merge (squash) March 18, 2026 20:40
Copy link
Copy Markdown
Member

@dmihalcik-virtru dmihalcik-virtru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe overly paranoid given we control the inputs, but I'd make sure we only attach auth headers to the associated domain

Comment thread src/openapi/check-vendored-yaml.ts Outdated
Comment thread src/openapi/check-vendored-yaml.ts Outdated
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 23, 2026

📝 Walkthrough

Walkthrough

Added conditional GitHub token authentication to HTTP requests, improved non-200 error reporting to include status and truncated response body, and validated that GitHub Contents API responses are arrays before processing.

Changes

Cohort / File(s) Summary
GitHub API auth, error handling & validation
src/openapi/check-vendored-yaml.ts
Read GITHUB_TOKEN from environment and attach Authorization for GitHub API/raw-content requests; fetchJson and fetchText conditionally add auth; fetchJson now captures non-200 response body and rejects with URL, status, and first 200 chars; fetchRemoteSpecPaths validates Contents API response is an array and errors with truncated payload on mismatch.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 I nibble tokens, snug and keen,
Headers set where GitHub's been.
Errors trimmed to just the start,
Arrays checked with steady heart.
Vendored YAML hops along—hooray!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: adding GITHUB_TOKEN authentication to avoid API rate limits in the check-vendored-yaml script.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/check-vendored-yaml-rate-limit

Comment @coderabbitai help to get the list of available commands and usage tips.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@marythought
Copy link
Copy Markdown
Contributor Author

Applied both suggestions from @dmihalcik-virtru — the GITHUB_TOKEN is now scoped to PLATFORM_API_BASE in fetchJson and PLATFORM_RAW_BASE in fetchText, so it'll only attach auth headers to known GitHub URLs. See 580ac87.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/openapi/check-vendored-yaml.ts (2)

71-85: ⚠️ Potential issue | 🟡 Minor

fetchText lacks status code checking.

Unlike fetchJson, fetchText doesn't check for non-200 responses. If the request is rate-limited or fails, the error body (HTML/JSON) will be returned as text, causing a confusing yaml.load failure in hasApiPaths. Consider adding consistent error handling:

🛠️ Proposed fix
 function fetchText(url: string): Promise<string> {
   return new Promise((resolve, reject) => {
     import('https').then(https => {
       const headers: Record<string, string> = { 'User-Agent': 'opentdf-docs-check-vendored-yaml' };
       if (GITHUB_TOKEN) {
         headers['Authorization'] = `token ${GITHUB_TOKEN}`;
       }
       https.get(url, { headers } as any, (response: any) => {
+        if (response.statusCode !== 200) {
+          let body = '';
+          response.on('data', (chunk: string) => { body += chunk; });
+          response.on('end', () => {
+            reject(new Error(
+              `Failed to fetch ${url}\n` +
+              `  Status: ${response.statusCode}\n` +
+              `  Response: ${body.slice(0, 200)}`
+            ));
+          });
+          return;
+        }
         let data = '';
         response.on('data', (chunk: string) => { data += chunk; });
         response.on('end', () => resolve(data));
       }).on('error', reject);
     }).catch(reject);
   });
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openapi/check-vendored-yaml.ts` around lines 71 - 85, The fetchText
function does not check HTTP response status, so non-200 responses (rate
limits/errors) return HTML/JSON as text and later cause yaml.load failures;
update fetchText (the function using import('https') and https.get) to inspect
the response.statusCode and, for non-2xx codes, collect the response body then
reject with a descriptive Error that includes the statusCode and response body
(similar to fetchJson's behavior), while preserving the User-Agent/GITHUB_TOKEN
headers and existing error handlers so callers like hasApiPaths receive a clear
failure instead of invalid YAML.

19-37: ⚠️ Potential issue | 🟠 Major

downloadFile doesn't use authentication.

This function downloads specs from raw GitHub URLs (used in the vendored file check loop at line 129) but doesn't include the GITHUB_TOKEN. These requests remain unauthenticated and subject to the 60 req/hr rate limit, which may still cause CI failures when checking multiple specs.

🛠️ Proposed fix
 function downloadFile(url: string, dest: string): Promise<void> {
   return new Promise((resolve, reject) => {
     import('https').then(https => {
       const file = fs.createWriteStream(dest);
-      https.get(url, (response: any) => {
+      const headers: Record<string, string> = { 'User-Agent': 'opentdf-docs-check-vendored-yaml' };
+      if (GITHUB_TOKEN && url.startsWith(PLATFORM_RAW_BASE)) {
+        headers['Authorization'] = `token ${GITHUB_TOKEN}`;
+      }
+      https.get(url, { headers } as any, (response: any) => {
         if (response.statusCode !== 200) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openapi/check-vendored-yaml.ts` around lines 19 - 37, The downloadFile
function is making unauthenticated GitHub raw URL requests and should include
the GITHUB_TOKEN to avoid low rate limits; update downloadFile to read
process.env.GITHUB_TOKEN (if present) and send an Authorization header (e.g.
"token <GITHUB_TOKEN>") with the HTTPS request (use https.request or pass
options to https.get) when fetching the URL, preserving the existing
response/status handling and cleanup (fs.unlink on error, file.close on finish)
so the vendored file check loop that calls downloadFile uses authenticated
requests when a token is available.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/openapi/check-vendored-yaml.ts`:
- Around line 71-85: The fetchText function does not check HTTP response status,
so non-200 responses (rate limits/errors) return HTML/JSON as text and later
cause yaml.load failures; update fetchText (the function using import('https')
and https.get) to inspect the response.statusCode and, for non-2xx codes,
collect the response body then reject with a descriptive Error that includes the
statusCode and response body (similar to fetchJson's behavior), while preserving
the User-Agent/GITHUB_TOKEN headers and existing error handlers so callers like
hasApiPaths receive a clear failure instead of invalid YAML.
- Around line 19-37: The downloadFile function is making unauthenticated GitHub
raw URL requests and should include the GITHUB_TOKEN to avoid low rate limits;
update downloadFile to read process.env.GITHUB_TOKEN (if present) and send an
Authorization header (e.g. "token <GITHUB_TOKEN>") with the HTTPS request (use
https.request or pass options to https.get) when fetching the URL, preserving
the existing response/status handling and cleanup (fs.unlink on error,
file.close on finish) so the vendored file check loop that calls downloadFile
uses authenticated requests when a token is available.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8c906c0e-38f7-4e6e-afd8-fcb10ecb6d56

📥 Commits

Reviewing files that changed from the base of the PR and between 7ff5427 and cac81ff.

📒 Files selected for processing (1)
  • src/openapi/check-vendored-yaml.ts

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openapi/check-vendored-yaml.ts (1)

74-82: ⚠️ Potential issue | 🟠 Major

Add HTTP status validation in fetchText to prevent error payloads from reaching yaml.load.

fetchText at lines 71-85 resolves response bodies for all HTTP statuses, including errors. This allows error responses or non-200 status codes to be passed directly to yaml.load() in hasApiPaths (line 117), causing unclear parse failures or false negatives.

Both downloadFile (line 24) and fetchJson (line 48) in the same file already validate response.statusCode !== 200 before processing the response. fetchText should follow the same pattern.

Suggested patch
 function fetchText(url: string): Promise<string> {
   return new Promise((resolve, reject) => {
     import('https').then(https => {
       const headers: Record<string, string> = { 'User-Agent': 'opentdf-docs-check-vendored-yaml' };
       if (GITHUB_TOKEN && url.startsWith(PLATFORM_RAW_BASE)) {
         headers['Authorization'] = `token ${GITHUB_TOKEN}`;
       }
       https.get(url, { headers } as any, (response: any) => {
         let data = '';
         response.on('data', (chunk: string) => { data += chunk; });
-        response.on('end', () => resolve(data));
+        response.on('end', () => {
+          if (response.statusCode !== 200) {
+            reject(new Error(
+              `GitHub raw request failed: ${url}\n` +
+              `  Status: ${response.statusCode}\n` +
+              `  Response: ${data.slice(0, 200)}`
+            ));
+            return;
+          }
+          resolve(data);
+        });
       }).on('error', reject);
     }).catch(reject);
   });
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openapi/check-vendored-yaml.ts` around lines 74 - 82, fetchText currently
resolves the response body regardless of HTTP status, allowing error pages to
reach yaml.load in hasApiPaths; update fetchText to validate the
response.statusCode (like downloadFile and fetchJson) and reject when statusCode
!== 200, including the statusCode and a short message (or response body) in the
error so callers get a clear failure instead of passing error payloads to
yaml.load; locate the logic inside fetchText where https.get callback handles
response and add the status check before collecting/resolve.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/openapi/check-vendored-yaml.ts`:
- Around line 74-82: fetchText currently resolves the response body regardless
of HTTP status, allowing error pages to reach yaml.load in hasApiPaths; update
fetchText to validate the response.statusCode (like downloadFile and fetchJson)
and reject when statusCode !== 200, including the statusCode and a short message
(or response body) in the error so callers get a clear failure instead of
passing error payloads to yaml.load; locate the logic inside fetchText where
https.get callback handles response and add the status check before
collecting/resolve.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6aa8a8ea-4b43-4327-b124-a284618656a9

📥 Commits

Reviewing files that changed from the base of the PR and between cac81ff and 580ac87.

📒 Files selected for processing (1)
  • src/openapi/check-vendored-yaml.ts

@marythought marythought merged commit af1e438 into main Mar 23, 2026
7 checks passed
@marythought marythought deleted the fix/check-vendored-yaml-rate-limit branch March 23, 2026 16:35
marythought added a commit that referenced this pull request Mar 23, 2026
Keep the domain-scoped GITHUB_TOKEN checks from PR #252.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
marythought added a commit that referenced this pull request Mar 23, 2026
Keep the domain-scoped GITHUB_TOKEN checks from PR #252.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants