Conversation
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughThis PR enhances error handling in the dataset tagging workflow across multiple layers. It introduces a new database exceptions module defining constraint violation exceptions, modifies the database layer to catch and re-raise Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 53 minutes and 50 seconds.Comment |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
tag_dataset, the TypeError message for a non-AsyncConnectionexpdb_dbuses{type(expdb_db)}inside a normal string; convert this to an f-string (or use%s) so the actual type is interpolated instead of the literal placeholder.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `tag_dataset`, the TypeError message for a non-`AsyncConnection` `expdb_db` uses `{type(expdb_db)}` inside a normal string; convert this to an f-string (or use `%s`) so the actual type is interpolated instead of the literal placeholder.
## Individual Comments
### Comment 1
<location path="src/routers/openml/datasets.py" line_range="65-66" />
<code_context>
- assert expdb_db is not None # noqa: S101
- tags = await database.datasets.get_tags_for(data_id, expdb_db)
- if tag.casefold() in [t.casefold() for t in tags]:
+ if not isinstance(expdb_db, AsyncConnection):
+ msg = "`expdb_db` should be an AsyncConnection, is {type(expdb_db)}"
+ raise TypeError(msg)
+
</code_context>
<issue_to_address>
**issue:** The TypeError message isn’t interpolating `type(expdb_db)` as intended.
`msg` is defined as a normal string, so `{type(expdb_db)}` will be rendered literally. Make this an f-string (or use `%s`/`.format`) so the actual type is shown, e.g. `f"`expdb_db` should be an AsyncConnection, is {type(expdb_db)!r}"`.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
🧹 Nitpick comments (2)
docs/contributing/tests.md (1)
31-37: ⚡ Quick winPolish wording: use “Elasticsearch” (not “elastic search”).
The docs say: “Some tests require elastic search indices to be constructed.” Standard term is Elasticsearch; this also avoids style/grammar flags (e.g., hyphenation/compound-word suggestions).
Suggested change
-Some tests require elastic search indices to be constructed. +Some tests require Elasticsearch indices to be constructed.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/contributing/tests.md` around lines 31 - 37, Update wording in the docs to use the proper product name "Elasticsearch" instead of "elastic search" wherever it appears (e.g., the sentence that currently reads "Some tests require elastic search indices to be constructed.") and ensure the variable name INDEX_ES_DURING_STARTUP and reference to `docker/php/.env` remain unchanged; simply replace "elastic search" with "Elasticsearch" throughout the paragraph to correct casing and compound-word usage..github/workflows/tests.yml (1)
33-39: ⚡ Quick winMake the
.envflip more robust (avoid brittle exact-stringsed).Right now the workflow only replaces the exact substring
INDEX_ES_DURING_STARTUP=false. If the.envfile contains whitespace/quotes (e.g.,INDEX_ES_DURING_STARTUP = falseor'false') or the value is alreadytrue, the replacement may silently not occur—leading to Elasticsearch index-dependent tests behaving inconsistently.Suggested change
if [ "${{ matrix.php_api }}" = "true" ]; then - sed -i 's/INDEX_ES_DURING_STARTUP=false/INDEX_ES_DURING_STARTUP=true/' docker/php/.env + # Flip INDEX_ES_DURING_STARTUP in a whitespace/quote-tolerant way; fail if the key isn't found. + if ! sed -i -E 's/^(INDEX_ES_DURING_STARTUP[[:space:]]*=[[:space:]]*)("|'\"''\"'\"'|)?false("|'\"''\"'\"'|)?[[:space:]]*$/\1true/Ig' docker/php/.env; then + echo "Failed to update INDEX_ES_DURING_STARTUP in docker/php/.env" >&2 + exit 1 + fi services="$services php-api" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/tests.yml around lines 33 - 39, The current sed invocation that targets the exact string INDEX_ES_DURING_STARTUP=false is brittle; update the workflow step that builds services (the block setting services="python-api" and the sed invocation) to use a more robust regex-based edit against docker/php/.env so it matches the key regardless of surrounding whitespace or quotes and sets its value to true (and if the key is missing, append INDEX_ES_DURING_STARTUP=true to the file). Ensure the change updates lines like INDEX_ES_DURING_STARTUP = 'false' or "false" and is idempotent if the value is already true.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.github/workflows/tests.yml:
- Around line 33-39: The current sed invocation that targets the exact string
INDEX_ES_DURING_STARTUP=false is brittle; update the workflow step that builds
services (the block setting services="python-api" and the sed invocation) to use
a more robust regex-based edit against docker/php/.env so it matches the key
regardless of surrounding whitespace or quotes and sets its value to true (and
if the key is missing, append INDEX_ES_DURING_STARTUP=true to the file). Ensure
the change updates lines like INDEX_ES_DURING_STARTUP = 'false' or "false" and
is idempotent if the value is already true.
In `@docs/contributing/tests.md`:
- Around line 31-37: Update wording in the docs to use the proper product name
"Elasticsearch" instead of "elastic search" wherever it appears (e.g., the
sentence that currently reads "Some tests require elastic search indices to be
constructed.") and ensure the variable name INDEX_ES_DURING_STARTUP and
reference to `docker/php/.env` remain unchanged; simply replace "elastic search"
with "Elasticsearch" throughout the paragraph to correct casing and
compound-word usage.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 15336861-a4d5-4cac-bb7b-c44320c7156f
📒 Files selected for processing (7)
.github/workflows/tests.ymldocs/contributing/tests.mdsrc/database/datasets.pysrc/routers/openml/datasets.pysrc/routers/types.pytests/constants.pytests/routers/openml/dataset_tag_test.py
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (96.29%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #322 +/- ##
==========================================
+ Coverage 93.66% 93.69% +0.02%
==========================================
Files 67 68 +1
Lines 3109 3154 +45
Branches 221 223 +2
==========================================
+ Hits 2912 2955 +43
- Misses 138 139 +1
- Partials 59 60 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Previously there was a database startup service, which would cause docker compose up to exit with error code 1 (see also the referenced issue). But since this is no longer the case, we can just rely on normal tool exit code.
There is no status that depends on it, and it's not required to make sure that the `docker compose` command exits with code 1 if startup fails.
for more information, see https://pre-commit.ci
Rely on the database to identify if the dataset is absent or the tag already exists.