Skip to content

feat(rag-embedding): add RAG embedding function with chunking strategies#34

Open
theothersideofgod wants to merge 10 commits into
mainfrom
feat/rag-embedding
Open

feat(rag-embedding): add RAG embedding function with chunking strategies#34
theothersideofgod wants to merge 10 commits into
mainfrom
feat/rag-embedding

Conversation

@theothersideofgod
Copy link
Copy Markdown
Contributor

Summary

  • Add rag-embedding function to replace text-embedding with improved chunking support
  • Support three chunking strategies: fixed, sentence, paragraph with configurable overlap
  • Fix overlap logic bugs in chunkBySentence and chunkByParagraph
  • Update job service registry to use rag-embedding instead of text-embedding
  • Add RAG_EMBEDDING_DRY_RUN config for testing without actual embedding calls

Test plan

  • Run unit tests: pnpm test:unit
  • Run e2e tests with local environment: pnpm dev && pnpm test:e2e
  • Verify chunking strategies work correctly with different text inputs
  • Verify overlap logic produces expected results

🤖 Generated with Claude Code

theothersideofgod and others added 8 commits April 24, 2026 11:59
Add new serverless function for generating text embeddings:

- handler.ts: calls @agentic-kit/ollama to generate embeddings
- Supports custom model via param or EMBEDDING_MODEL env var
- DRY_RUN mode for CI testing (returns mock 768-dim embedding)
- Unit tests with Jest mock for @agentic-kit/ollama
- Registered in job-service for job queue processing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add Kubernetes infrastructure for Ollama:

- ollama.yaml: Deployment + Service for Ollama server
- ollama-pull-job.yaml: Job to pre-pull nomic-embed-text model
- config.yaml: OLLAMA_URL, EMBEDDING_MODEL, TEXT_EMBEDDING_DRY_RUN
- skaffold.yaml: Add Ollama to text-embedding and local-simple profiles only
  (not shared kustomization, to avoid breaking non-AI functions in CI)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add end-to-end test that verifies:

- Job insertion into app_jobs.jobs queue
- Job service picks up and dispatches to text-embedding
- Function processes and completes the job
- Custom model parameter handling

Uses DRY_RUN mode in CI (no real Ollama needed).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ding

- Add rag-embedding function with chunking strategies (fixed, sentence, paragraph)
- Fix overlap logic in chunkBySentence and chunkByParagraph
- Replace text-embedding with rag-embedding in job service registry
- Add RAG_EMBEDDING_DRY_RUN config for testing
- Update e2e tests for rag-embedding

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Required for rag-embedding GraphQL query support.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace text-embedding with rag-embedding in skaffold profiles.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 28, 2026 02:34
@socket-security
Copy link
Copy Markdown

socket-security Bot commented Apr 28, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addednpm/​@​pgpmjs/​logger@​2.6.07310010095100
Addednpm/​@​agentic-kit/​ollama@​1.0.37310010088100
Addednpm/​@​pgpmjs/​env@​2.18.0741009796100
Addednpm/​@​constructive-io/​node@​0.10.107410010098100
Addednpm/​@​constructive-io/​graphql-query@​3.14.87710010098100

View full report

@socket-security
Copy link
Copy Markdown

socket-security Bot commented Apr 28, 2026

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

Action Severity Alert  (click "▶" to expand/collapse)
Warn High
Obfuscated code: npm markdown-it is 91.0% likely obfuscated

Confidence: 0.91

Location: Package overview

From: pnpm-lock.yamlnpm/@constructive-io/graphql-query@3.14.8npm/markdown-it@14.1.1

ℹ Read more on: This package | This alert | What is obfuscated code?

Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support@socket.dev.

Suggestion: Packages should not obfuscate their code. Consider not using packages with obfuscated code.

Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment @SocketSecurity ignore npm/markdown-it@14.1.1. You can also ignore all packages with @SocketSecurity ignore-all. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

View full report

- Add mock for @constructive-io/graphql-query to avoid dependency chain issues
- Fix test payloads to use correct trigger format (table, schema, id, chunks_table)
- Add proper table metadata with relations in mock

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new rag-embedding knative job function to chunk record text (fixed/sentence/paragraph) and generate embeddings (via Ollama, with a dry-run mode), and wires it into the local dev + job-service setup along with new E2E coverage.

Changes:

  • Introduce functions/rag-embedding handler + unit tests and add job-service registry support for the new function.
  • Add E2E SDK utilities and a new rag-embedding E2E test that provisions a DB/schema and validates chunk creation.
  • Add local-simple Kubernetes + Skaffold support for running Ollama and the rag-embedding function locally (plus a new packages/text-chunker utility package).

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/e2e/utils/sdk.ts Adds SDK helpers for auth/provisioning/schema/table/embedding_chunk setup used by E2E tests.
tests/e2e/utils/jobs.ts Adds waitForJobCreated polling helper for trigger-created jobs.
tests/e2e/tests/rag-embedding.e2e.test.ts New end-to-end flow validating rag-embedding job execution and trigger behavior.
tests/mocks/@agentic-kit/ollama.ts Jest mock for Ollama embedding generation.
templates/node-sql/package.json Adds a node-sql template package manifest for generated functions.
templates/node-sql/index.ts Adds node-sql template server wiring (pool + withUserContext).
skaffold.yaml Adds a rag-embedding profile and updates port-forwards for local dev.
packages/text-chunker/tsconfig.json New TS config for the text chunker utility package.
packages/text-chunker/src/index.ts Implements reusable chunking strategies (fixed/sentence/paragraph).
packages/text-chunker/package.json Adds text-chunker workspace package definition/scripts.
package.json Adds @constructive-io/node for E2E SDK usage and minor devDependency reshuffle.
k8s/overlays/local-simple/ollama.yaml Adds Ollama Deployment/Service intended for local-simple cluster usage.
k8s/overlays/local-simple/ollama-pull-job.yaml Adds a Job to pre-pull the embedding model into Ollama.
k8s/overlays/local-simple/config.yaml Adds RAG_EMBEDDING_DRY_RUN + Ollama/model env config entries.
job/service/src/types.ts Extends FunctionName union to include rag-embedding.
job/service/src/index.ts Registers rag-embedding module name + default port in job service.
jest.config.ts Maps @agentic-kit/ollama to the new Jest mock.
functions/rag-embedding/handler.ts New RAG embedding handler: introspects schema, chunks content, generates embeddings, inserts chunk rows.
functions/rag-embedding/handler.json Declares the new function (type/port/deps) for generation/runtime.
functions/rag-embedding/tests/handler.test.ts Adds unit tests for the new handler (currently misaligned with handler’s param contract).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +59 to +65
const result = await handler(
{
table_name: 'article',
record_id: 'test-id',
},
context
);
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit tests are invoking the handler with legacy param names (e.g. table_name, record_id, content_field) but functions/rag-embedding/handler.ts currently requires the trigger payload shape (table, schema, id, chunks_table, etc.). As written, these tests will fail early with "Missing required params". Update the test inputs (and mocked GraphQL responses) to match the handler’s expected payload, or add a backward-compatible param mapping in the handler.

Copilot uses AI. Check for mistakes.
}

function chunkBySentence(text: string, chunkSize: number, chunkOverlap: number): string[] {
const sentences = text.match(/[^.!?]+[.!?]+/g) || [text];
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunkBySentence uses /[^.!?]+[.!?]+/g, which drops any trailing text that doesn’t end with punctuation (e.g. "Hello. Last sentence" loses "Last sentence"). This causes data loss in chunking. Include the trailing remainder after the last regex match (or switch to a sentence-splitting approach that retains final fragments).

Suggested change
const sentences = text.match(/[^.!?]+[.!?]+/g) || [text];
const sentences = text.match(/[^.!?]+(?:[.!?]+|$)/g) || [text];

Copilot uses AI. Check for mistakes.
ORDER BY chunk_index`,
[articleId]
);

Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This E2E test doesn’t assert any post-conditions after querying the chunks table; it only logs the row count. That means it can pass even if no chunks were created for the updated content. Add an expectation (e.g., rows.length > 0 and/or that chunk content includes the updated text) to actually validate the behavior under test.

Suggested change
expect(chunks.rows.length).toBeGreaterThan(0);
expect(chunks.rows.some((chunk) => chunk.content.includes(updatedContent))).toBe(true);

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +18
apiVersion: apps/v1
kind: Deployment
metadata:
name: ollama
labels:
app: ollama
spec:
replicas: 1
selector:
matchLabels:
app: ollama
template:
metadata:
labels:
app: ollama
spec:
containers:
- name: ollama
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These Ollama resources won’t be applied by k8s/overlays/local-simple unless they’re added to kustomization.yaml (Kustomize doesn’t auto-discover files). If the intent is to run Ollama in the local-simple profile, add ./ollama.yaml (and the pull job, if desired) to the overlay’s resources: list.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +14
apiVersion: batch/v1
kind: Job
metadata:
name: ollama-pull-model
spec:
ttlSecondsAfterFinished: 300
template:
spec:
restartPolicy: OnFailure
initContainers:
- name: wait-for-ollama
image: busybox:1.36
command: ['sh', '-c', 'until nc -z ollama 11434; do echo "Waiting for Ollama..."; sleep 2; done']
containers:
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This model-pull Job manifest also needs to be referenced from k8s/overlays/local-simple/kustomization.yaml to actually run in the local-simple environment. If it’s optional, consider documenting how/when it should be applied (or keep it out of the overlay entirely and apply manually).

Copilot uses AI. Check for mistakes.
"clean": "rimraf dist"
},
"devDependencies": {
"@types/node": "^22.10.4",
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

packages/text-chunker defines "clean": "rimraf dist" but rimraf isn’t listed in this package’s devDependencies and doesn’t appear to be present in the repo lockfile. As-is, pnpm -r run clean will fail. Add rimraf as a (dev) dependency somewhere that’s guaranteed to be installed (package-local or workspace root), or change the clean script to use a tool that’s already available.

Suggested change
"@types/node": "^22.10.4",
"@types/node": "^22.10.4",
"rimraf": "^6.0.1",

Copilot uses AI. Check for mistakes.
rag-embedding e2e tests require SDK auth via constructive-server.
Add port-forward for svc/constructive-server on port 3002.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 28, 2026 03:38
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +99 to +112
let currentChunk = '';

for (const paragraph of paragraphs) {
const trimmed = paragraph.trim();
if ((currentChunk + '\n\n' + trimmed).length > chunkSize && currentChunk.length > 0) {
chunks.push(currentChunk.trim());
currentChunk = trimmed;
} else {
currentChunk = currentChunk ? currentChunk + '\n\n' + trimmed : trimmed;
}
}

if (currentChunk.trim().length > 0) {
chunks.push(currentChunk.trim());
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunkByParagraph takes chunkOverlap but never uses it when splitting across paragraph boundaries (it just starts the next chunk at the next paragraph). This means the paragraph strategy does not actually provide overlap except when a chunk is later re-split by chunkByFixed. If overlap is a supported/advertised option for paragraph chunking, add overlap handling similar to the sentence strategy (carry forward prior paragraphs up to the overlap budget).

Suggested change
let currentChunk = '';
for (const paragraph of paragraphs) {
const trimmed = paragraph.trim();
if ((currentChunk + '\n\n' + trimmed).length > chunkSize && currentChunk.length > 0) {
chunks.push(currentChunk.trim());
currentChunk = trimmed;
} else {
currentChunk = currentChunk ? currentChunk + '\n\n' + trimmed : trimmed;
}
}
if (currentChunk.trim().length > 0) {
chunks.push(currentChunk.trim());
const separator = '\n\n';
let currentParagraphs: string[] = [];
const joinParagraphs = (items: string[]): string => items.join(separator);
const getJoinedLength = (items: string[]): number => joinParagraphs(items).length;
for (const paragraph of paragraphs) {
const trimmed = paragraph.trim();
const candidateParagraphs = [...currentParagraphs, trimmed];
if (getJoinedLength(candidateParagraphs) > chunkSize && currentParagraphs.length > 0) {
chunks.push(joinParagraphs(currentParagraphs).trim());
const overlapParagraphs: string[] = [];
let overlapLength = 0;
for (let i = currentParagraphs.length - 1; i >= 0; i--) {
const candidate = currentParagraphs[i];
const additionalLength =
candidate.length + (overlapParagraphs.length > 0 ? separator.length : 0);
if (overlapLength + additionalLength > chunkOverlap) {
break;
}
overlapParagraphs.unshift(candidate);
overlapLength += additionalLength;
}
let nextParagraphs = [...overlapParagraphs, trimmed];
while (
overlapParagraphs.length > 0 &&
getJoinedLength(nextParagraphs) > chunkSize &&
trimmed.length <= chunkSize
) {
overlapParagraphs.shift();
nextParagraphs = [...overlapParagraphs, trimmed];
}
currentParagraphs = nextParagraphs;
} else {
currentParagraphs = candidateParagraphs;
}
}
if (currentParagraphs.length > 0) {
chunks.push(joinParagraphs(currentParagraphs).trim());

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +20
apiVersion: apps/v1
kind: Deployment
metadata:
name: ollama
labels:
app: ollama
spec:
replicas: 1
selector:
matchLabels:
app: ollama
template:
metadata:
labels:
app: ollama
spec:
containers:
- name: ollama
image: ollama/ollama:latest
ports:
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These Ollama manifests are added under k8s/overlays/local-simple/, but k8s/overlays/local-simple/kustomization.yaml currently doesn't include ollama.yaml (or ollama-pull-job.yaml) in its resources: list. As a result, skaffold run -p ... with the local-simple overlay will not deploy Ollama, despite OLLAMA_URL pointing at it. Add the new resources to the kustomization or document that they must be applied separately.

Copilot uses AI. Check for mistakes.
Comment on lines +146 to +149
afterAll(async () => {
if (pg) {
await deleteTestJobs(pg, TEST_PREFIX);
// Clean up test database to prevent OOM from accumulated schema caches
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleteTestJobs(pg, TEST_PREFIX) won't clean up any jobs created by this suite, because the test creates jobs with task_identifier 'rag-embedding' (and trigger-created jobs will also be 'rag-embedding'). This will leave rows in app_jobs.jobs after the test run and can cause cross-test interference. Use a prefix that matches the task identifier(s) created here, or change the added jobs to use the prefix consistently (and update waitForJobCreated accordingly).

Copilot uses AI. Check for mistakes.
Comment on lines +23 to +46
// Chunking strategies
function chunkText(
text: string,
chunkSize: number,
chunkOverlap: number,
strategy: string
): string[] {
if (!text || text.trim().length === 0) {
return [];
}

switch (strategy) {
case 'sentence':
return chunkBySentence(text, chunkSize, chunkOverlap);
case 'paragraph':
return chunkByParagraph(text, chunkSize, chunkOverlap);
case 'semantic':
// Fallback to fixed for now - semantic requires more sophisticated handling
return chunkByFixed(text, chunkSize, chunkOverlap);
case 'fixed':
default:
return chunkByFixed(text, chunkSize, chunkOverlap);
}
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chunking logic is duplicated here even though this PR introduces @constructive-io/text-chunker with the same strategies. Keeping two implementations will likely diverge (and they already differ for sentence trailing handling). Prefer importing and using the shared chunker package from the handler, or remove the unused package if it's not intended to be used yet.

Copilot uses AI. Check for mistakes.
Comment on lines +243 to +268
// 2. Delete existing chunks (for re-chunking)
try {
// Query existing chunks using buildSelect
const getChunksQuery = buildSelect(chunksTable, tables, {
where: {},
fieldSelection: { select: ['id'] }
});
const chunksResult = await client.request<{
[key: string]: { nodes: Array<{ id: string }> } | null
}>(getChunksQuery.toString(), { where: { [parentFkFieldName]: { equalTo: id } } }, schemaHeaders);

const existingChunks = chunksResult[chunksFieldNamePlural!]?.nodes || [];

// Delete each chunk by ID using buildPostGraphileDelete
const deleteMutation = buildPostGraphileDelete(chunksTable, tables);
for (const chunk of existingChunks) {
await client.request(deleteMutation.toString(), { input: { id: chunk.id } }, schemaHeaders);
}

if (existingChunks.length > 0) {
log.info('[generate-chunks] Deleted existing chunks', { count: existingChunks.length, parentId: id });
}
} catch (err) {
// Ignore if no chunks exist or delete mutation doesn't exist
log.info('[generate-chunks] No existing chunks to delete or delete not available', { error: String(err) });
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The delete/re-chunk step catches all errors and proceeds, which can silently leave stale chunks and then insert additional rows (duplicates) if deletion fails for reasons other than "no chunks" (e.g., auth/permission or GraphQL errors). Consider only treating the specific "not found"/"mutation missing" cases as non-fatal, and failing the job for other deletion errors so the system doesn't accumulate incorrect chunk state.

Copilot uses AI. Check for mistakes.
@Anmol1696
Copy link
Copy Markdown
Contributor

Heads-up: repository history was rewritten on 2026-05-12 to scrub leaked secrets (Postgres/pgAdmin default passwords, an AWS access key ID, generated k8s/manifests/interweb-*.yaml files) from every commit. Every branch on origin was force-pushed with new commit SHAs.

This PR shows "DIRTY" / merge-conflict status because main has a cleanup commit and this branch is based on pre-rewrite main. To clear it:

git fetch --all --prune
git checkout <this-branch>
git reset --hard origin/<this-branch>    # your local branch tip moved too — pick up the rewritten version
git rebase origin/main                    # rebase onto rewritten main
# resolve conflicts (usually trivial — mostly secret-placeholder + the 4 deleted interweb-*.yaml files)
git push --force-with-lease

Or merge instead of rebase if the branch is shared with others:

git merge origin/main
git push

Notes:

  • The secrets that leaked were all either dead (rotated AWS key) or were defaults that have since been rotated/replaced; no active credential is exposed.
  • Old commit SHAs are still accessible by direct URL on GitHub for ~90 days (can be expedited via GitHub Support if needed).
  • See k8s/SECRET-EXPOSURE-AUDIT.md on main for the full incident audit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants