Skip to content

fix: wrap all production sqlite3.connect() in context managers (closes #17)#23

Merged
wpak-ai merged 3 commits into
masterfrom
fix/sqlite-context-managers-17
May 8, 2026
Merged

fix: wrap all production sqlite3.connect() in context managers (closes #17)#23
wpak-ai merged 3 commits into
masterfrom
fix/sqlite-context-managers-17

Conversation

@timon0305
Copy link
Copy Markdown
Collaborator

@timon0305 timon0305 commented May 7, 2026

Problem

19 production sqlite3.connect() call sites in this repo used a bare local variable with no with block, no try/finally, and no guaranteed .close() on exception. On a long-lived Flask process serving many requests, leaks accumulate before the user notices (file descriptors, SQLite read-locks held longer than needed).

Change

Two patterns, each applied where it fits best:

with closing(sqlite3.connect(...)) as conn: — short-scope (15 sites)

  • api/composers.py ×3
  • utils/cli_chat_reader.py ×2
  • utils/cursor_md_exporter.py ×1
  • api/workspaces.py ×2 (small helpers)
  • api/logs.py ×2
  • api/search.py ×2 (wconn + small conn)
  • api/export_api.py ×1 (wconn)

sqlite3.Connection's own context manager only commits / rolls back; it does not .close(). contextlib.closing(...) is what guarantees release.

try / except / finally: if conn is not None: conn.close() — long-scope (4 sites)

  • api/workspaces.py:get_workspace_tabs (~400 lines using global_db)
  • api/search.py:search ×2 deep blocks (~160-line and ~70-line scopes)
  • api/export_api.py:export_chats (~340 lines)
  • scripts/export.py:_conn (cleaned up the existing duplicate close-on-success and close-on-error pattern)

Same semantics as wrapping the whole function in with closing(...), without the 100-300-line indent shift.

@contextmanager on the helper that yields a connection — 1 site

api/workspaces.py:_open_global_db previously returned (conn, path) to callers, who were responsible for .close(). Converted to a @contextmanager so callers write with _open_global_db(...) as (conn, _):. Both call sites updated.

Test-side sqlite3.connect calls are out of scope (per-test fixture lifecycles, short-lived).

Test plan

  • pytest tests/137 / 137 OK (zero behaviour change in the test paths).
  • AST parses cleanly on all 8 modified files; from contextlib import closing[, contextmanager] correctly added everywhere it's used.
  • Static guard: every sqlite3.connect() call site classified and confirmed protected by either closing(), an outer try/finally block in the same function, or @contextmanager.
  • Live HTTP smoke against the running app:
    • GET / → HTTP 200
    • GET /api/workspaces → HTTP 200 (exercises with closing() + @contextmanager + small with closing() in _build_composer_id_to_workspace_id)
    • GET /api/logs → HTTP 200 (both with closing() paths)
    • GET /api/composers → HTTP 200 (all 3 with closing() sites)
    • GET /api/search?q=test&type=all → HTTP 200 (both try/finally sites + small with closing())
    • No Python tracebacks in the server log.

Closes #17.

Summary by CodeRabbit

  • Chores
    • Improved database connection handling across APIs, export flows, logs, search, workspace listing, and CLI utilities to ensure deterministic cleanup and reduce resource leaks.
  • Documentation
    • Added a "Tests" section to README with commands to run the Python test suite.
  • Tests
    • Updated a test module to run via Python's unittest command.

…#17)

19 production sqlite3.connect() call sites used a bare local variable
with no `with` block, no try/finally, and no guaranteed .close() on
exception. On a long-lived Flask process, leaks accumulate before the
user notices.

Two patterns applied:

`with closing(sqlite3.connect(...)) as conn:` — used at 15 short-scope
sites (composers ×3, cli_chat_reader ×2, cursor_md_exporter ×1,
workspaces ×2 small helpers, logs ×2, search ×2, export_api ×1).
sqlite3.Connection's own context manager only commits/rolls back, not
closes. closing() is what guarantees .close().

`try / except / finally: if conn is not None: conn.close()` — used at
4 long-scope sites (workspaces.get_workspace_tabs ~400 lines,
search.search ×2 deep blocks, export_api.export_chats ~340 lines,
scripts/export.py). Same cleanup guarantee, no 100-300-line indent
shift. Equivalent semantics to a body-wide `with closing(...)` block.

Plus utils/workspaces._open_global_db converted to @contextmanager so
its two callers write `with _open_global_db(...) as (conn, _):` and
the connection is released automatically.

Test-side sqlite3.connect calls are out of scope (per-test fixtures,
short-lived).

Verification:
- All 19 production sites classified and confirmed protected (no bare
  `sqlite3.connect()` outside a closing()/finally guard).
- pytest tests/ → 137 / 137 OK.
- AST clean on all 8 modified files; contextlib imports correctly
  added wherever used.
- Live HTTP smoke against the running app: home, /api/workspaces,
  /api/logs, /api/composers, /api/search all return 200 with no
  Python tracebacks in the log (exercises every wrapping pattern).
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR wraps production SQLite connections with contextlib.closing() and adds a _open_global_db context manager, ensuring deterministic closure of per-workspace and global DB connections across API endpoints, scripts, and utilities.

Changes

SQLite Connection Lifecycle Management

Layer / File(s) Summary
Contextlib Imports
api/composers.py, api/logs.py, api/search.py, api/workspaces.py
Added contextlib.closing/contextmanager imports to enable scoped connection cleanup.
Per-Workspace Connection Wrapping
api/composers.py, api/export_api.py, api/logs.py, api/search.py, api/workspaces.py, utils/cli_chat_reader.py, utils/cursor_md_exporter.py
Replaced manual sqlite3.connect(...)/conn.close() patterns with with closing(sqlite3.connect(...)) as conn: for per-workspace/read-only DB accesses.
Global Database Context Manager
api/workspaces.py
Introduced _open_global_db(workspace_path) as a @contextmanager that yields (conn, path) (or (None, path)) and guarantees closure in a finally block.
Global Database Integration
api/workspaces.py, api/composers.py, api/logs.py, api/search.py
Moved globalStorage/globalCursor reads into the _open_global_db() scoped context and replaced manual open/close usage in listing/tabs/logs/search fallbacks.
Exception-Safe Cleanup
api/export_api.py, api/search.py, scripts/export.py
Added conn = None initializations and finally blocks to ensure global/long-lived connections are closed on all exit paths (returns and exceptions).
Docs / Tests
README.md, tests/test_exclusion_rules.py
Added a "Tests" section to README and updated a test-run command to use python -m unittest for the module.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through code to tuck each conn,

with closing() I stop leaks anon.
Finally blocks snug them tight,
Now databases sleep at night,
Hooray — no file-descriptor brawns!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: wrap all production sqlite3.connect() in context managers (closes #17)' accurately and clearly describes the main change: wrapping SQLite connections with context managers to ensure proper resource cleanup.
Linked Issues check ✅ Passed The PR comprehensively addresses issue #17 by wrapping 19 production sqlite3.connect() sites across multiple files using contextlib.closing() for short-scope connections and explicit try/finally for long-scope operations.
Out of Scope Changes check ✅ Passed The PR includes a documentation update to README.md (adding a Tests section) and a test runner command change in test_exclusion_rules.py, which are minor and supportive changes that don't conflict with the primary resource-management objective.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/sqlite-context-managers-17

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

@timon0305 timon0305 self-assigned this May 7, 2026
@timon0305 timon0305 requested review from bradjin8 and clean6378-max-it and removed request for clean6378-max-it May 7, 2026 22:03
Comment thread api/workspaces.py
Comment thread api/workspaces.py
Comment thread scripts/export.py
…est in README

- Route GET /api/workspaces/<id>/tabs through the existing _open_global_db
  context manager instead of a duplicate sqlite3.connect + outer finally,
  matching list_workspaces and closing issue #17 review follow-ups.
- README: add Tests section with python -m unittest discover tests -v.
- test_exclusion_rules: docstring prefers unittest over pytest.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
api/workspaces.py (1)

1031-1066: ⚡ Quick win

Parse messageRequestContext once.

This block runs the same messageRequestContext:% query twice and JSON-decodes each row twice just to populate two maps. Folding both structures into one pass would remove a full-table scan from a hot endpoint with no behavior change.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/workspaces.py` around lines 1031 - 1066, The code currently queries
"cursorDiskKV" for keys like "messageRequestContext:%" twice and JSON-decodes
each row twice; combine these loops by iterating global_db.execute("SELECT key,
value FROM cursorDiskKV WHERE key LIKE 'messageRequestContext:%'") once,
JSON.parse row["value"] a single time, then populate both
message_request_context_map (append the ctx + "contextId") and
project_layouts_map (extract ctx.get("projectLayouts"), JSON-decode each string
layout and append layout["rootPath"] when valid) inside the same try/except
block; keep the existing checks (parts length, isinstance checks) and existing
error handling (except Exception: pass) but remove the duplicated second loop
and its JSON loads.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/workspaces.py`:
- Around line 163-166: The read-only SQLite URI is being built by interpolating
local_db_path directly into the "file:" string which breaks when the path
contains reserved URI chars; update the sqlite3.connect call to construct a
proper file URI by converting local_db_path to a file URI (use
pathlib.Path(local_db_path).as_uri()) and then append "?mode=ro" before passing
it to sqlite3.connect(..., uri=True) so the with closing(... ) block and
subsequent lconn.execute(...) use a correctly escaped URI.

---

Nitpick comments:
In `@api/workspaces.py`:
- Around line 1031-1066: The code currently queries "cursorDiskKV" for keys like
"messageRequestContext:%" twice and JSON-decodes each row twice; combine these
loops by iterating global_db.execute("SELECT key, value FROM cursorDiskKV WHERE
key LIKE 'messageRequestContext:%'") once, JSON.parse row["value"] a single
time, then populate both message_request_context_map (append the ctx +
"contextId") and project_layouts_map (extract ctx.get("projectLayouts"),
JSON-decode each string layout and append layout["rootPath"] when valid) inside
the same try/except block; keep the existing checks (parts length, isinstance
checks) and existing error handling (except Exception: pass) but remove the
duplicated second loop and its JSON loads.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 692dc39b-ea04-4cc5-8160-a8bb6e3fb5b0

📥 Commits

Reviewing files that changed from the base of the PR and between c72b305 and c4a1059.

📒 Files selected for processing (3)
  • README.md
  • api/workspaces.py
  • tests/test_exclusion_rules.py
✅ Files skipped from review due to trivial changes (2)
  • README.md
  • tests/test_exclusion_rules.py

Comment thread api/workspaces.py Outdated
…n local_db_path contains characters that must be escaped in URIs
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
api/workspaces.py (1)

480-482: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Escape the remaining read-only SQLite URIs too.

Line 481 and Line 512 still build file: URIs by concatenating raw filesystem paths. That leaves _build_composer_id_to_workspace_id() and _open_global_db() exposed to the same ? / # / space path bug you already fixed at Line 166, so workspaces under such paths can still fail to open.

Suggested fix
 def _build_composer_id_to_workspace_id(workspace_path: str, workspace_entries: list) -> dict:
     """Build mapping: composerId -> workspaceId from per-workspace state.vscdb."""
     mapping = {}
     for entry in workspace_entries:
         db_path = os.path.join(workspace_path, entry["name"], "state.vscdb")
         if not os.path.isfile(db_path):
             continue
         try:
             # closing() guarantees .close() on scope exit (issue `#17`).
-            with closing(sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)) as conn:
+            db_uri = Path(db_path).resolve().as_uri() + "?mode=ro"
+            with closing(sqlite3.connect(db_uri, uri=True)) as conn:
                 row = conn.execute(
                     "SELECT value FROM ItemTable WHERE [key] = 'composer.composerData'"
                 ).fetchone()
@@
     if not os.path.isfile(global_db_path):
         yield None, global_db_path
         return
-    conn = sqlite3.connect(f"file:{global_db_path}?mode=ro", uri=True)
+    global_db_uri = Path(global_db_path).resolve().as_uri() + "?mode=ro"
+    conn = sqlite3.connect(global_db_uri, uri=True)
In Python's sqlite3 module, when uri=True is used, should filesystem paths be converted to proper file: URIs (for example via pathlib.Path(...).resolve().as_uri()) before appending ?mode=ro so reserved characters like ? and # are escaped?

Also applies to: 512-512

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/workspaces.py` around lines 480 - 482, Replace the raw string
concatenation that builds SQLite file: URIs (e.g.,
sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)) with a proper file URI
derived from the filesystem path to escape reserved characters: use
pathlib.Path(db_path).resolve().as_uri() to produce the base file://... URI and
then append the query string (e.g., ?mode=ro) before calling
sqlite3.connect(..., uri=True); apply this change in both
_build_composer_id_to_workspace_id and _open_global_db so paths containing ?, #,
spaces, etc. are correctly escaped.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@api/workspaces.py`:
- Around line 480-482: Replace the raw string concatenation that builds SQLite
file: URIs (e.g., sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)) with a
proper file URI derived from the filesystem path to escape reserved characters:
use pathlib.Path(db_path).resolve().as_uri() to produce the base file://... URI
and then append the query string (e.g., ?mode=ro) before calling
sqlite3.connect(..., uri=True); apply this change in both
_build_composer_id_to_workspace_id and _open_global_db so paths containing ?, #,
spaces, etc. are correctly escaped.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7a052282-ce3f-4659-8d41-18a44c09aeab

📥 Commits

Reviewing files that changed from the base of the PR and between c4a1059 and e10099a.

📒 Files selected for processing (1)
  • api/workspaces.py

@bradjin8 bradjin8 requested a review from wpak-ai May 8, 2026 18:38
@wpak-ai wpak-ai merged commit 634fcb8 into master May 8, 2026
1 check passed
@wpak-ai wpak-ai deleted the fix/sqlite-context-managers-17 branch May 8, 2026 18:43
timon0305 added a commit that referenced this pull request May 8, 2026
README.md: combine both adjacent additions to the Quick Start area —
keep our FLASK_DEBUG opt-in note above the new Tests section that
landed on master via PR #23.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Resource leak: wrap all production sqlite3.connect() calls in context managers

3 participants