feat(rag-embedding): add RAG embedding function with chunking strategies#34
feat(rag-embedding): add RAG embedding function with chunking strategies#34theothersideofgod wants to merge 10 commits into
Conversation
Add new serverless function for generating text embeddings: - handler.ts: calls @agentic-kit/ollama to generate embeddings - Supports custom model via param or EMBEDDING_MODEL env var - DRY_RUN mode for CI testing (returns mock 768-dim embedding) - Unit tests with Jest mock for @agentic-kit/ollama - Registered in job-service for job queue processing Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add Kubernetes infrastructure for Ollama: - ollama.yaml: Deployment + Service for Ollama server - ollama-pull-job.yaml: Job to pre-pull nomic-embed-text model - config.yaml: OLLAMA_URL, EMBEDDING_MODEL, TEXT_EMBEDDING_DRY_RUN - skaffold.yaml: Add Ollama to text-embedding and local-simple profiles only (not shared kustomization, to avoid breaking non-AI functions in CI) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add end-to-end test that verifies: - Job insertion into app_jobs.jobs queue - Job service picks up and dispatches to text-embedding - Function processes and completes the job - Custom model parameter handling Uses DRY_RUN mode in CI (no real Ollama needed). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ding - Add rag-embedding function with chunking strategies (fixed, sentence, paragraph) - Fix overlap logic in chunkBySentence and chunkByParagraph - Replace text-embedding with rag-embedding in job service registry - Add RAG_EMBEDDING_DRY_RUN config for testing - Update e2e tests for rag-embedding Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Required for rag-embedding GraphQL query support. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace text-embedding with rag-embedding in skaffold profiles. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
|
Warning Review the following alerts detected in dependencies. According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.
|
- Add mock for @constructive-io/graphql-query to avoid dependency chain issues - Fix test payloads to use correct trigger format (table, schema, id, chunks_table) - Add proper table metadata with relations in mock Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds a new rag-embedding knative job function to chunk record text (fixed/sentence/paragraph) and generate embeddings (via Ollama, with a dry-run mode), and wires it into the local dev + job-service setup along with new E2E coverage.
Changes:
- Introduce
functions/rag-embeddinghandler + unit tests and add job-service registry support for the new function. - Add E2E SDK utilities and a new
rag-embeddingE2E test that provisions a DB/schema and validates chunk creation. - Add local-simple Kubernetes + Skaffold support for running Ollama and the
rag-embeddingfunction locally (plus a newpackages/text-chunkerutility package).
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/e2e/utils/sdk.ts | Adds SDK helpers for auth/provisioning/schema/table/embedding_chunk setup used by E2E tests. |
| tests/e2e/utils/jobs.ts | Adds waitForJobCreated polling helper for trigger-created jobs. |
| tests/e2e/tests/rag-embedding.e2e.test.ts | New end-to-end flow validating rag-embedding job execution and trigger behavior. |
| tests/mocks/@agentic-kit/ollama.ts | Jest mock for Ollama embedding generation. |
| templates/node-sql/package.json | Adds a node-sql template package manifest for generated functions. |
| templates/node-sql/index.ts | Adds node-sql template server wiring (pool + withUserContext). |
| skaffold.yaml | Adds a rag-embedding profile and updates port-forwards for local dev. |
| packages/text-chunker/tsconfig.json | New TS config for the text chunker utility package. |
| packages/text-chunker/src/index.ts | Implements reusable chunking strategies (fixed/sentence/paragraph). |
| packages/text-chunker/package.json | Adds text-chunker workspace package definition/scripts. |
| package.json | Adds @constructive-io/node for E2E SDK usage and minor devDependency reshuffle. |
| k8s/overlays/local-simple/ollama.yaml | Adds Ollama Deployment/Service intended for local-simple cluster usage. |
| k8s/overlays/local-simple/ollama-pull-job.yaml | Adds a Job to pre-pull the embedding model into Ollama. |
| k8s/overlays/local-simple/config.yaml | Adds RAG_EMBEDDING_DRY_RUN + Ollama/model env config entries. |
| job/service/src/types.ts | Extends FunctionName union to include rag-embedding. |
| job/service/src/index.ts | Registers rag-embedding module name + default port in job service. |
| jest.config.ts | Maps @agentic-kit/ollama to the new Jest mock. |
| functions/rag-embedding/handler.ts | New RAG embedding handler: introspects schema, chunks content, generates embeddings, inserts chunk rows. |
| functions/rag-embedding/handler.json | Declares the new function (type/port/deps) for generation/runtime. |
| functions/rag-embedding/tests/handler.test.ts | Adds unit tests for the new handler (currently misaligned with handler’s param contract). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const result = await handler( | ||
| { | ||
| table_name: 'article', | ||
| record_id: 'test-id', | ||
| }, | ||
| context | ||
| ); |
There was a problem hiding this comment.
Unit tests are invoking the handler with legacy param names (e.g. table_name, record_id, content_field) but functions/rag-embedding/handler.ts currently requires the trigger payload shape (table, schema, id, chunks_table, etc.). As written, these tests will fail early with "Missing required params". Update the test inputs (and mocked GraphQL responses) to match the handler’s expected payload, or add a backward-compatible param mapping in the handler.
| } | ||
|
|
||
| function chunkBySentence(text: string, chunkSize: number, chunkOverlap: number): string[] { | ||
| const sentences = text.match(/[^.!?]+[.!?]+/g) || [text]; |
There was a problem hiding this comment.
chunkBySentence uses /[^.!?]+[.!?]+/g, which drops any trailing text that doesn’t end with punctuation (e.g. "Hello. Last sentence" loses "Last sentence"). This causes data loss in chunking. Include the trailing remainder after the last regex match (or switch to a sentence-splitting approach that retains final fragments).
| const sentences = text.match(/[^.!?]+[.!?]+/g) || [text]; | |
| const sentences = text.match(/[^.!?]+(?:[.!?]+|$)/g) || [text]; |
| ORDER BY chunk_index`, | ||
| [articleId] | ||
| ); | ||
|
|
There was a problem hiding this comment.
This E2E test doesn’t assert any post-conditions after querying the chunks table; it only logs the row count. That means it can pass even if no chunks were created for the updated content. Add an expectation (e.g., rows.length > 0 and/or that chunk content includes the updated text) to actually validate the behavior under test.
| expect(chunks.rows.length).toBeGreaterThan(0); | |
| expect(chunks.rows.some((chunk) => chunk.content.includes(updatedContent))).toBe(true); |
| apiVersion: apps/v1 | ||
| kind: Deployment | ||
| metadata: | ||
| name: ollama | ||
| labels: | ||
| app: ollama | ||
| spec: | ||
| replicas: 1 | ||
| selector: | ||
| matchLabels: | ||
| app: ollama | ||
| template: | ||
| metadata: | ||
| labels: | ||
| app: ollama | ||
| spec: | ||
| containers: | ||
| - name: ollama |
There was a problem hiding this comment.
These Ollama resources won’t be applied by k8s/overlays/local-simple unless they’re added to kustomization.yaml (Kustomize doesn’t auto-discover files). If the intent is to run Ollama in the local-simple profile, add ./ollama.yaml (and the pull job, if desired) to the overlay’s resources: list.
| apiVersion: batch/v1 | ||
| kind: Job | ||
| metadata: | ||
| name: ollama-pull-model | ||
| spec: | ||
| ttlSecondsAfterFinished: 300 | ||
| template: | ||
| spec: | ||
| restartPolicy: OnFailure | ||
| initContainers: | ||
| - name: wait-for-ollama | ||
| image: busybox:1.36 | ||
| command: ['sh', '-c', 'until nc -z ollama 11434; do echo "Waiting for Ollama..."; sleep 2; done'] | ||
| containers: |
There was a problem hiding this comment.
This model-pull Job manifest also needs to be referenced from k8s/overlays/local-simple/kustomization.yaml to actually run in the local-simple environment. If it’s optional, consider documenting how/when it should be applied (or keep it out of the overlay entirely and apply manually).
| "clean": "rimraf dist" | ||
| }, | ||
| "devDependencies": { | ||
| "@types/node": "^22.10.4", |
There was a problem hiding this comment.
packages/text-chunker defines "clean": "rimraf dist" but rimraf isn’t listed in this package’s devDependencies and doesn’t appear to be present in the repo lockfile. As-is, pnpm -r run clean will fail. Add rimraf as a (dev) dependency somewhere that’s guaranteed to be installed (package-local or workspace root), or change the clean script to use a tool that’s already available.
| "@types/node": "^22.10.4", | |
| "@types/node": "^22.10.4", | |
| "rimraf": "^6.0.1", |
rag-embedding e2e tests require SDK auth via constructive-server. Add port-forward for svc/constructive-server on port 3002. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let currentChunk = ''; | ||
|
|
||
| for (const paragraph of paragraphs) { | ||
| const trimmed = paragraph.trim(); | ||
| if ((currentChunk + '\n\n' + trimmed).length > chunkSize && currentChunk.length > 0) { | ||
| chunks.push(currentChunk.trim()); | ||
| currentChunk = trimmed; | ||
| } else { | ||
| currentChunk = currentChunk ? currentChunk + '\n\n' + trimmed : trimmed; | ||
| } | ||
| } | ||
|
|
||
| if (currentChunk.trim().length > 0) { | ||
| chunks.push(currentChunk.trim()); |
There was a problem hiding this comment.
chunkByParagraph takes chunkOverlap but never uses it when splitting across paragraph boundaries (it just starts the next chunk at the next paragraph). This means the paragraph strategy does not actually provide overlap except when a chunk is later re-split by chunkByFixed. If overlap is a supported/advertised option for paragraph chunking, add overlap handling similar to the sentence strategy (carry forward prior paragraphs up to the overlap budget).
| let currentChunk = ''; | |
| for (const paragraph of paragraphs) { | |
| const trimmed = paragraph.trim(); | |
| if ((currentChunk + '\n\n' + trimmed).length > chunkSize && currentChunk.length > 0) { | |
| chunks.push(currentChunk.trim()); | |
| currentChunk = trimmed; | |
| } else { | |
| currentChunk = currentChunk ? currentChunk + '\n\n' + trimmed : trimmed; | |
| } | |
| } | |
| if (currentChunk.trim().length > 0) { | |
| chunks.push(currentChunk.trim()); | |
| const separator = '\n\n'; | |
| let currentParagraphs: string[] = []; | |
| const joinParagraphs = (items: string[]): string => items.join(separator); | |
| const getJoinedLength = (items: string[]): number => joinParagraphs(items).length; | |
| for (const paragraph of paragraphs) { | |
| const trimmed = paragraph.trim(); | |
| const candidateParagraphs = [...currentParagraphs, trimmed]; | |
| if (getJoinedLength(candidateParagraphs) > chunkSize && currentParagraphs.length > 0) { | |
| chunks.push(joinParagraphs(currentParagraphs).trim()); | |
| const overlapParagraphs: string[] = []; | |
| let overlapLength = 0; | |
| for (let i = currentParagraphs.length - 1; i >= 0; i--) { | |
| const candidate = currentParagraphs[i]; | |
| const additionalLength = | |
| candidate.length + (overlapParagraphs.length > 0 ? separator.length : 0); | |
| if (overlapLength + additionalLength > chunkOverlap) { | |
| break; | |
| } | |
| overlapParagraphs.unshift(candidate); | |
| overlapLength += additionalLength; | |
| } | |
| let nextParagraphs = [...overlapParagraphs, trimmed]; | |
| while ( | |
| overlapParagraphs.length > 0 && | |
| getJoinedLength(nextParagraphs) > chunkSize && | |
| trimmed.length <= chunkSize | |
| ) { | |
| overlapParagraphs.shift(); | |
| nextParagraphs = [...overlapParagraphs, trimmed]; | |
| } | |
| currentParagraphs = nextParagraphs; | |
| } else { | |
| currentParagraphs = candidateParagraphs; | |
| } | |
| } | |
| if (currentParagraphs.length > 0) { | |
| chunks.push(joinParagraphs(currentParagraphs).trim()); |
| apiVersion: apps/v1 | ||
| kind: Deployment | ||
| metadata: | ||
| name: ollama | ||
| labels: | ||
| app: ollama | ||
| spec: | ||
| replicas: 1 | ||
| selector: | ||
| matchLabels: | ||
| app: ollama | ||
| template: | ||
| metadata: | ||
| labels: | ||
| app: ollama | ||
| spec: | ||
| containers: | ||
| - name: ollama | ||
| image: ollama/ollama:latest | ||
| ports: |
There was a problem hiding this comment.
These Ollama manifests are added under k8s/overlays/local-simple/, but k8s/overlays/local-simple/kustomization.yaml currently doesn't include ollama.yaml (or ollama-pull-job.yaml) in its resources: list. As a result, skaffold run -p ... with the local-simple overlay will not deploy Ollama, despite OLLAMA_URL pointing at it. Add the new resources to the kustomization or document that they must be applied separately.
| afterAll(async () => { | ||
| if (pg) { | ||
| await deleteTestJobs(pg, TEST_PREFIX); | ||
| // Clean up test database to prevent OOM from accumulated schema caches |
There was a problem hiding this comment.
deleteTestJobs(pg, TEST_PREFIX) won't clean up any jobs created by this suite, because the test creates jobs with task_identifier 'rag-embedding' (and trigger-created jobs will also be 'rag-embedding'). This will leave rows in app_jobs.jobs after the test run and can cause cross-test interference. Use a prefix that matches the task identifier(s) created here, or change the added jobs to use the prefix consistently (and update waitForJobCreated accordingly).
| // Chunking strategies | ||
| function chunkText( | ||
| text: string, | ||
| chunkSize: number, | ||
| chunkOverlap: number, | ||
| strategy: string | ||
| ): string[] { | ||
| if (!text || text.trim().length === 0) { | ||
| return []; | ||
| } | ||
|
|
||
| switch (strategy) { | ||
| case 'sentence': | ||
| return chunkBySentence(text, chunkSize, chunkOverlap); | ||
| case 'paragraph': | ||
| return chunkByParagraph(text, chunkSize, chunkOverlap); | ||
| case 'semantic': | ||
| // Fallback to fixed for now - semantic requires more sophisticated handling | ||
| return chunkByFixed(text, chunkSize, chunkOverlap); | ||
| case 'fixed': | ||
| default: | ||
| return chunkByFixed(text, chunkSize, chunkOverlap); | ||
| } | ||
| } |
There was a problem hiding this comment.
Chunking logic is duplicated here even though this PR introduces @constructive-io/text-chunker with the same strategies. Keeping two implementations will likely diverge (and they already differ for sentence trailing handling). Prefer importing and using the shared chunker package from the handler, or remove the unused package if it's not intended to be used yet.
| // 2. Delete existing chunks (for re-chunking) | ||
| try { | ||
| // Query existing chunks using buildSelect | ||
| const getChunksQuery = buildSelect(chunksTable, tables, { | ||
| where: {}, | ||
| fieldSelection: { select: ['id'] } | ||
| }); | ||
| const chunksResult = await client.request<{ | ||
| [key: string]: { nodes: Array<{ id: string }> } | null | ||
| }>(getChunksQuery.toString(), { where: { [parentFkFieldName]: { equalTo: id } } }, schemaHeaders); | ||
|
|
||
| const existingChunks = chunksResult[chunksFieldNamePlural!]?.nodes || []; | ||
|
|
||
| // Delete each chunk by ID using buildPostGraphileDelete | ||
| const deleteMutation = buildPostGraphileDelete(chunksTable, tables); | ||
| for (const chunk of existingChunks) { | ||
| await client.request(deleteMutation.toString(), { input: { id: chunk.id } }, schemaHeaders); | ||
| } | ||
|
|
||
| if (existingChunks.length > 0) { | ||
| log.info('[generate-chunks] Deleted existing chunks', { count: existingChunks.length, parentId: id }); | ||
| } | ||
| } catch (err) { | ||
| // Ignore if no chunks exist or delete mutation doesn't exist | ||
| log.info('[generate-chunks] No existing chunks to delete or delete not available', { error: String(err) }); | ||
| } |
There was a problem hiding this comment.
The delete/re-chunk step catches all errors and proceeds, which can silently leave stale chunks and then insert additional rows (duplicates) if deletion fails for reasons other than "no chunks" (e.g., auth/permission or GraphQL errors). Consider only treating the specific "not found"/"mutation missing" cases as non-fatal, and failing the job for other deletion errors so the system doesn't accumulate incorrect chunk state.
023d140 to
86e812a
Compare
|
Heads-up: repository history was rewritten on 2026-05-12 to scrub leaked secrets (Postgres/pgAdmin default passwords, an AWS access key ID, generated This PR shows "DIRTY" / merge-conflict status because git fetch --all --prune
git checkout <this-branch>
git reset --hard origin/<this-branch> # your local branch tip moved too — pick up the rewritten version
git rebase origin/main # rebase onto rewritten main
# resolve conflicts (usually trivial — mostly secret-placeholder + the 4 deleted interweb-*.yaml files)
git push --force-with-leaseOr merge instead of rebase if the branch is shared with others: git merge origin/main
git pushNotes:
|
Summary
rag-embeddingfunction to replacetext-embeddingwith improved chunking supportfixed,sentence,paragraphwith configurable overlapchunkBySentenceandchunkByParagraphrag-embeddinginstead oftext-embeddingRAG_EMBEDDING_DRY_RUNconfig for testing without actual embedding callsTest plan
pnpm test:unitpnpm dev && pnpm test:e2e🤖 Generated with Claude Code