Launch a production-ready RAG server in minutes
Larkup RAG is an open-source toolkit that eliminates the complexity of manual infrastructure setup. Configure vector stores, chunking strategies, and embedding models, then connect your AI agents right away.
Plug into every model & vector store
Everything in the box
A complete retrieval engine, not just a wrapper
Larkup RAG ships the primitives you actually need to take RAG from prototype to production: typed, composable, and observable end to end.
Type-safe SDK
Connect your AI agents to your RAG server in seconds using our official SDKs for TypeScript and Python.
import { LarkupRAGClient } from "@larkup/rag-sdk";
const client = new LarkupRAGClient({
  baseUrl: "http://localhost:8080",
  apiKey: "your-api-key",
});
const results = await client.query("What is Larkup RAG?", 5);
Median retrieval latency (p50), fully cached
Hybrid search
Dense + sparse fusion with built-in reranking for higher recall.
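To make "dense + sparse fusion" concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a vector-search ranking with a keyword-search ranking. The page does not say which fusion method Larkup RAG uses, so treat the function name, the `k = 60` constant, and the sample document IDs as assumptions for illustration. The reranking step is left out.

```typescript
// Reciprocal rank fusion (RRF): a generic fusion technique, shown as an
// illustration only. It is not confirmed to be Larkup RAG's algorithm.

type RankedList = string[]; // document IDs, best match first

function reciprocalRankFusion(
  lists: RankedList[],
  k = 60, // damping constant; 60 is a commonly used default (assumption)
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      // index is 0-based, so the 1-based rank is index + 1
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical rankings from a dense (embedding) and a sparse (keyword) retriever
const dense: RankedList = ["doc-a", "doc-b", "doc-c"];
const sparse: RankedList = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([dense, sparse]);
console.log(fused.map((r) => r.id)); // [ 'doc-b', 'doc-a', 'doc-d', 'doc-c' ]
```

Documents that appear near the top of both lists rise to the top of the fused list, which is why this kind of fusion tends to help recall when the two retrievers disagree.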
10+ vector stores
Pinecone, pgvector, Qdrant, LanceDB, Weaviate, Chroma and more — swap with one line.
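One way a "swap with one line" design can work is an adapter interface that every backend implements. The sketch below assumes a hypothetical `VectorStore` interface and two toy backends. None of these names come from the Larkup RAG SDK, and the real adapter API may look different.

```typescript
// Hypothetical adapter interface, for illustration only.
interface VectorStore {
  upsert(id: string, vector: number[]): void;
  search(query: number[], topK: number): string[];
}

// Toy backend: keeps vectors in memory and ranks by dot product.
class InMemoryStore implements VectorStore {
  private items = new Map<string, number[]>();

  upsert(id: string, vector: number[]): void {
    this.items.set(id, vector);
  }

  search(query: number[], topK: number): string[] {
    const dot = (a: number[], b: number[]) =>
      a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
    return Array.from(this.items.entries())
      .map(([id, vector]) => ({ id, score: dot(query, vector) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK)
      .map((r) => r.id);
  }
}

// Second toy backend: same interface, records each call before delegating.
class LoggingStore implements VectorStore {
  readonly calls: string[] = [];
  private readonly inner: VectorStore;

  constructor(inner: VectorStore) {
    this.inner = inner;
  }

  upsert(id: string, vector: number[]): void {
    this.calls.push(`upsert:${id}`);
    this.inner.upsert(id, vector);
  }

  search(query: number[], topK: number): string[] {
    this.calls.push(`search:${topK}`);
    return this.inner.search(query, topK);
  }
}

type StoreName = "memory" | "logged";
const createStore: Record<StoreName, () => VectorStore> = {
  memory: () => new InMemoryStore(),
  logged: () => new LoggingStore(new InMemoryStore()),
};

const backend: StoreName = "memory"; // the single line to change when swapping
const store = createStore[backend]();
store.upsert("a", [1, 0]);
store.upsert("b", [0, 1]);
console.log(store.search([0.9, 0.1], 1)); // [ 'a' ]
```

Because the calling code only depends on the interface, changing `backend` is the only edit needed to move between stores.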
Deploy anywhere
Ship a standalone Node server from the CLI — includes Dockerfile, docker-compose, and vercel.json.
MIT Licensed
Fully open source. Self-host on your own infra. No paywalls, no feature limits.
How it works
From raw documents to grounded answers
Four typed stages, one pipeline. Pick a step to see what Larkup RAG does under the hood.
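As a rough picture of what "four typed stages, one pipeline" can look like, the sketch below composes four functions whose input and output types must line up at compile time. The page does not name the stages, so the chunk, embed, retrieve, and ground steps here are assumptions, and the letter-frequency "embedding" is a toy stand-in for a real model.

```typescript
// A stage is a typed function from one representation to the next.
type Stage<I, O> = (input: I) => O;

interface Embedded { text: string; vector: number[] }

// Compose exactly four stages; mismatched neighbours fail to type-check.
function pipeline<A, B, C, D, E>(
  s1: Stage<A, B>, s2: Stage<B, C>, s3: Stage<C, D>, s4: Stage<D, E>,
): Stage<A, E> {
  return (input) => s4(s3(s2(s1(input))));
}

// Toy embedding: 26-dimensional letter-frequency vector (not a real model).
function letterVector(text: string): number[] {
  const vec = new Array<number>(26).fill(0);
  const lower = text.toLowerCase();
  for (let i = 0; i < lower.length; i++) {
    const code = lower.charCodeAt(i) - 97; // 'a' maps to 0
    if (code >= 0 && code < 26) vec[code] = (vec[code] ?? 0) + 1;
  }
  return vec;
}

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Stage 1 (hypothetical): split a raw document into sentence-sized chunks.
const chunk: Stage<string, string[]> = (doc) =>
  doc.split(".").map((s) => s.trim()).filter((s) => s.length > 0);

// Stage 2 (hypothetical): embed each chunk.
const embed: Stage<string[], Embedded[]> = (chunks) =>
  chunks.map((text) => ({ text, vector: letterVector(text) }));

// Stage 3 (hypothetical): keep the topK chunks most similar to the query.
const retrieve = (query: string, topK: number): Stage<Embedded[], string[]> =>
  (items) => {
    const q = letterVector(query);
    return items
      .map((item) => ({ text: item.text, score: cosine(q, item.vector) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK)
      .map((item) => item.text);
  };

// Stage 4 (hypothetical): assemble a grounded prompt for a language model.
const ground = (query: string): Stage<string[], string> => (context) =>
  `Context:\n${context.map((c) => `- ${c}`).join("\n")}\n\nQuestion: ${query}`;

const question = "dogs";
const run = pipeline(chunk, embed, retrieve(question, 1), ground(question));
console.log(run("Cats purr. Dogs bark. Fish swim."));
```

With this input the retrieved context is the single chunk "Dogs bark", so the final prompt contains only that sentence followed by the question.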
FAQ
Frequently asked questions
Everything you need to know about Larkup RAG. Can't find an answer? Talk to our team.
Ready to build your RAG app?
Whether you're wiring up a custom knowledge-base bot, an agentic research assistant, or a fully observable production pipeline, Larkup RAG gets you there, fast.