Launch a production-ready RAG server in minutes

Larkup RAG is an open source toolkit that eliminates the complexities of manual infrastructure setup. Easily configure vector stores, chunking strategies, and embedding models, and immediately connect your AI agents.

Get Started

Plug into every model & vector store

AWS

Azure

Chroma

Cohere

DeepSeek

Digital Ocean

Docker

GCP

Gemini

GitHub

Google

Groq

Hetzner

Jina

LanceDB

Milvus

Mistral

Nomic

OpenAI

PGVector

Pinecone

Qdrant

Qwen

Supabase

Vercel

Voyage

Weaviate

xAI

Everything in the box

A complete retrieval engine, not just a wrapper

Larkup RAG ships the primitives you actually need to take RAG from prototype to production: typed, composable, and observable end to end.

Type-safe SDK

Connect your AI agents to your RAG server in seconds using our official SDKs for TypeScript and Python.

index.ts

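// Official TypeScript SDK client
import { LarkupRAGClient } from "@larkup/rag-sdk";

// Point the client at a running Larkup RAG server.
const client = new LarkupRAGClient({
  baseUrl: "http://localhost:8080",
  apiKey: "your-api-key", // replace with your own key
});

// Top-level await requires an ES module context.
const results = await client.query("What is Larkup RAG?", 5);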

42ms

median (p50) retrieval latency, fully cached

Hybrid search

Dense + sparse fusion with built-in reranking for higher recall.
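
To make "dense + sparse fusion" concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a semantic ranking with a keyword ranking. This page does not say which fusion method Larkup RAG uses, so treat this as a generic illustration: the function name, the constant k = 60, and the document IDs are all assumptions, and reranking is not shown.

```typescript
// Reciprocal rank fusion (RRF): a generic technique, shown for illustration only.
// It is NOT confirmed to be the method Larkup RAG uses internally.

// Each ranking is a list of document IDs, best match first.
type Ranking = string[];

function reciprocalRankFusion(rankings: Ranking[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Ranks are 1-based: each list contributes 1 / (k + rank) to a document.
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical results from a dense (embedding) and a sparse (keyword) retriever.
const denseRanking: Ranking = ["doc-a", "doc-b", "doc-c"];
const sparseRanking: Ranking = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([denseRanking, sparseRanking]);
console.log(fused);
```

Documents that appear in both lists accumulate score from each, which is why they tend to rise above documents found by only one retriever.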

10+ vector stores

Pinecone, pgvector, Qdrant, LanceDB, Weaviate, Chroma and more — swap with one line.
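
One way a "swap with one line" design can work is to code against a small store interface and change only the line that constructs the backend. The sketch below is an assumption about the general pattern, not Larkup RAG's actual API: `VectorStore`, `InMemoryStore`, and their methods are hypothetical names, and a real store adapter would be asynchronous.

```typescript
// Hypothetical interface for illustration; not taken from the Larkup RAG SDK.
interface SearchHit {
  id: string;
  score: number;
}

interface VectorStore {
  upsert(id: string, vector: number[]): void;
  search(query: number[], topK: number): SearchHit[];
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Brute-force in-memory backend, standing in for a real vector database.
class InMemoryStore implements VectorStore {
  private vectors = new Map<string, number[]>();

  upsert(id: string, vector: number[]): void {
    this.vectors.set(id, vector);
  }

  search(query: number[], topK: number): SearchHit[] {
    return [...this.vectors.entries()]
      .map(([id, vector]) => ({ id, score: cosine(query, vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, topK);
  }
}

// The only line that would change when swapping backends.
const store: VectorStore = new InMemoryStore();

store.upsert("a", [1, 0]);
store.upsert("b", [0, 1]);
store.upsert("c", [1, 1]);

const hits = store.search([1, 0.1], 2);
console.log(hits.map((hit) => hit.id));
```

Because the calling code depends only on the interface, replacing `InMemoryStore` with another class that implements `VectorStore` leaves the rest of the program untouched.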

Deploy anywhere

Ship a standalone Node server from the CLI — includes Dockerfile, docker-compose, and vercel.json.

MIT Licensed

Fully open source, self-host on your own infra. No paywalls, no feature limits.

How it works

From raw documents to grounded answers

Four typed stages, one pipeline. Pick a step to see what Larkup RAG does under the hood.

Configure — Embedding & Vector Store

Embedding model

OpenAI · 1536 dims · 8k tokens

Vector store

Pinecone · Connected

Index type

Lexical

Semantic

Hybrid
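
The Configure step above could be modeled as a typed object so that an invalid index type fails at compile time. This is a sketch under stated assumptions, not Larkup RAG's real configuration schema: `PipelineConfig`, its field names, and `validateConfig` are hypothetical, and "8k tokens" is written as 8000 only as a placeholder.

```typescript
// Hypothetical config shape for illustration; the real Larkup RAG schema may differ.
type IndexType = "lexical" | "semantic" | "hybrid";

interface PipelineConfig {
  embedding: { provider: string; dimensions: number; maxTokens: number };
  vectorStore: { provider: string };
  indexType: IndexType;
}

// Runtime checks for values the type system cannot constrain.
function validateConfig(config: PipelineConfig): string[] {
  const problems: string[] = [];
  const { dimensions, maxTokens } = config.embedding;
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    problems.push("embedding.dimensions must be a positive integer");
  }
  if (!Number.isInteger(maxTokens) || maxTokens <= 0) {
    problems.push("embedding.maxTokens must be a positive integer");
  }
  if (config.vectorStore.provider.trim() === "") {
    problems.push("vectorStore.provider must not be empty");
  }
  return problems;
}

// Mirrors the values shown in the Configure step.
const pipelineConfig: PipelineConfig = {
  embedding: { provider: "openai", dimensions: 1536, maxTokens: 8000 }, // 8000 is a placeholder for "8k"
  vectorStore: { provider: "pinecone" },
  indexType: "hybrid",
};

const configProblems = validateConfig(pipelineConfig);
console.log(configProblems.length === 0 ? "config ok" : configProblems);
```

The union type rejects a misspelled index type before the code runs, while the validator catches numeric mistakes that only show up in the values.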

FAQ

Frequently asked questions

Everything you need to know about Larkup RAG. Can't find an answer? Talk to our team.

Ready to build your RAG app?

Whether you're wiring up a custom knowledge-base bot, an agentic research assistant, or a fully observable production pipeline, Larkup RAG gets you there fast.

Get started free