Open Source RAG Framework

Launch a production-ready RAG server in minutes

Larkup RAG is an open source toolkit that eliminates the complexities of manual infrastructure setup. Easily configure vector stores, chunking strategies, and embedding models, and immediately connect your AI agents.

Get Started

Plug into every model & vector store

AWS
Azure
Chroma
Cohere
DeepSeek
Digital Ocean
Docker
GCP
Gemini
GitHub
Google
Groq
Hetzner
Jina
LanceDB
Milvus
Mistral
Nomic
OpenAI
PGVector
Pinecone
Qdrant
Qwen
Supabase
Vercel
Voyage
Weaviate
xAI

Everything in the box

A complete retrieval engine, not just a wrapper

Larkup RAG ships the primitives you actually need to take RAG from prototype to production: typed, composable, and observable end to end.

Type-safe SDK

Connect your AI agents to your RAG server in seconds using our official SDKs for TypeScript and Python.

index.ts
import { LarkupRAGClient } from "@larkup/rag-sdk";

const client = new LarkupRAGClient({
   baseUrl: "http://localhost:8080",
   apiKey: "your-api-key",
});

const results = await client.query("What is Larkup RAG?", 5);
42ms

median (p50) retrieval latency, fully cached

Hybrid search

Dense + sparse fusion with built-in reranking for higher recall.
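
As a rough illustration of what dense + sparse fusion can look like, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge two ranked lists. This page does not say which fusion method Larkup RAG uses, so treat the technique, the `reciprocalRankFusion` helper, the constant `k = 60`, and the document ids as assumptions made for the example. Reranking is not shown.

```typescript
// Reciprocal rank fusion (RRF): an illustrative way to merge a dense
// (embedding) ranking with a sparse (keyword) ranking.
// Assumption: this is NOT confirmed to be Larkup RAG's fusion method.

type Ranking = string[]; // document ids, best match first

interface FusedResult {
  id: string;
  score: number;
}

function reciprocalRankFusion(rankings: Ranking[], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // index is 0-based, so the 1-based rank is index + 1.
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical rankings from a dense and a sparse retriever.
const denseRanking: Ranking = ["doc-a", "doc-b", "doc-c"];
const sparseRanking: Ranking = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([denseRanking, sparseRanking]);
console.log(fused.map((r) => r.id));
```

Documents that appear near the top of both lists accumulate the largest scores, which is why fusion tends to surface results that either retriever alone might rank lower.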

10+ vector stores

Pinecone, pgvector, Qdrant, LanceDB, Weaviate, Chroma and more — swap with one line.

Deploy anywhere

Ship a standalone Node server from the CLI — includes Dockerfile, docker-compose, and vercel.json.

MIT Licensed

Fully open source, self-host on your own infra. No paywalls, no feature limits.

How it works

From raw documents to grounded answers

Four typed stages, one pipeline. Pick a step to see what Larkup RAG does under the hood.

Configure: Embedding & Vector Store
Embedding model: OpenAI (1536 dims · 8k tokens)
Vector store: Pinecone (Connected)
Search modes: Lexical, Semantic, Hybrid
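
The configure step above pairs an embedding model with a vector store and a search mode. A typed sketch of that pairing might look like the following. Every type and function name here (`RetrievalConfig`, `validateConfig`, and so on) is hypothetical and is not the Larkup RAG configuration schema; only the values (OpenAI, 1536 dimensions, 8k tokens, Pinecone, hybrid) come from the panel.

```typescript
// Hypothetical config shapes for illustration only; not the Larkup RAG schema.

type SearchMode = "lexical" | "semantic" | "hybrid";

interface EmbeddingConfig {
  provider: string;
  dimensions: number;
  maxInputTokens: number;
}

interface VectorStoreConfig {
  provider: string;
  dimensions: number; // index dimension, which must match the embedding model
}

interface RetrievalConfig {
  embedding: EmbeddingConfig;
  vectorStore: VectorStoreConfig;
  searchMode: SearchMode;
}

// Returns a list of human-readable problems; an empty list means the config looks consistent.
function validateConfig(config: RetrievalConfig): string[] {
  const problems: string[] = [];
  const { embedding, vectorStore } = config;
  if (!Number.isInteger(embedding.dimensions) || embedding.dimensions <= 0) {
    problems.push("embedding.dimensions must be a positive integer");
  }
  if (embedding.dimensions !== vectorStore.dimensions) {
    problems.push(
      `dimension mismatch: embedding=${embedding.dimensions}, vector store=${vectorStore.dimensions}`,
    );
  }
  return problems;
}

// Values taken from the panel above; "8k tokens" is written as 8000 here.
const exampleConfig: RetrievalConfig = {
  embedding: { provider: "openai", dimensions: 1536, maxInputTokens: 8000 },
  vectorStore: { provider: "pinecone", dimensions: 1536 },
  searchMode: "hybrid",
};

console.log(validateConfig(exampleConfig));
```

Checking the dimension match up front is a cheap guard: a vector index created for one embedding size generally cannot accept vectors of another.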

FAQ

Frequently asked questions

Everything you need to know about Larkup RAG. Can't find an answer? Talk to our team.

Ready to build your RAG app?

Whether you're wiring up a custom knowledge-base bot, an agentic research assistant, or a fully observable production pipeline, Larkup RAG gets you there fast.