Self-hosted RAG

Ask your documents anything.

PIKA is a self-hosted document intelligence platform. Upload internal documents, ask questions in plain language, and get answers grounded in your data — with citations. All inference runs locally via Ollama.

Features

Document intelligence, on your terms.

RAG-powered Q&A that runs entirely within your infrastructure. No data leaves your network. No cloud dependency.

RAG with citations

Answers are grounded in your documents with source citations. See exactly which passages informed each response.

Retrieval-augmented Source citations

Local embeddings

Documents are chunked and embedded using Sentence Transformers, stored in ChromaDB. No external API calls — everything stays local.

ChromaDB Sentence Transformers
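The chunking half of this step can be sketched in a few lines of plain Python. This is an illustrative fixed-size word-window chunker with overlap — the chunk size, overlap, and tokenisation here are assumptions, not PIKA's actual implementation.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word-window chunks.

    Illustrative only -- PIKA's real chunker, sizes, and tokenisation
    may differ. The overlap preserves context across chunk boundaries
    so a sentence straddling two chunks is still retrievable.
    """
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

# Each chunk would then be embedded (e.g. with a Sentence Transformers
# model) and stored in ChromaDB alongside its source metadata.
```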

Streaming responses

Answers stream token-by-token via Ollama. Choose your model — from lightweight 3B models to larger, more capable ones.

Ollama Any GGUF model

Multi-user access control

User accounts managed through the ai.doo Hub. Admin and user roles with centralised authentication. Session management and CSRF protection built in.

Answer feedback

Users can rate responses with thumbs up/down. Feedback is logged for quality monitoring and helps identify documents that need updating.

Privacy by architecture.

PIKA doesn't send data to external APIs. Embeddings are computed locally, inference runs on your GPU via Ollama, and documents are stored in your own ChromaDB instance. There's no telemetry, no phone-home, no cloud fallback.

Workflow

Upload. Index. Ask. Cite.

From raw documents to cited answers in four steps.

01

Upload

Upload PDF, DOCX, TXT, or Markdown files through the web interface or API.

02

Index

Documents are chunked, embedded with Sentence Transformers, and stored in ChromaDB for fast retrieval.

03

Ask

Ask questions in plain language. PIKA retrieves the most relevant chunks and sends them to Ollama with your query.
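The retrieve-then-ask step amounts to stitching the retrieved chunks into the prompt sent to the model. A minimal sketch, assuming each chunk carries `source` and `text` fields — the template itself is hypothetical, not PIKA's actual prompt:

```python
def build_rag_prompt(question: str, chunks: list[dict]) -> str:
    """Assemble a grounded prompt from retrieved chunks.

    Chunk keys ('source', 'text') and the template wording are
    assumptions for illustration, not PIKA's real prompt format.
    """
    context = "\n\n".join(
        f"[{i}] ({c['source']}) {c['text']}" for i, c in enumerate(chunks, 1)
    )
    return (
        "Answer using only the context below. "
        "Cite supporting passages as [n].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting string would then go to the local Ollama instance with streaming enabled.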

04

Cite

Get streaming answers with inline citations pointing back to the source documents and passages.
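Resolving inline markers back to their sources can be sketched as a small post-processing step. The `[n]` marker syntax and chunk keys are assumptions for illustration:

```python
import re

def extract_citations(answer: str, chunks: list[dict]) -> list[dict]:
    """Map inline [n] markers in a model answer back to source chunks.

    Marker syntax and the 'source' chunk key are assumptions --
    shown only to illustrate how cited passages can be resolved.
    """
    cited, seen = [], set()
    for m in re.finditer(r"\[(\d+)\]", answer):
        n = int(m.group(1))
        # Keep each citation once, in order of first appearance
        if 1 <= n <= len(chunks) and n not in seen:
            seen.add(n)
            cited.append({"marker": n, "source": chunks[n - 1]["source"]})
    return cited
```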

Architecture

Built for self-hosted deployment.

A single Docker Compose file gets you running. Connects to the shared Ollama instance for inference.

FastAPI

Async Python backend with streaming response support and Prometheus metrics.

ChromaDB

Vector store for document embeddings. Persistent storage, fast similarity search.
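The similarity search a vector store performs can be illustrated brute-force: rank stored embeddings by cosine similarity to the query embedding. ChromaDB uses an approximate-nearest-neighbour index rather than a linear scan; this sketch only shows the idea.

```python
import math

def top_k(query: list[float], docs: list[list[float]], k: int = 3) -> list[int]:
    """Return indices of the k stored vectors most similar to the query.

    Brute-force cosine similarity for illustration only; a vector
    store like ChromaDB uses an approximate-nearest-neighbour index.
    """
    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    scores = [(cosine(query, d), i) for i, d in enumerate(docs)]
    return [i for _, i in sorted(scores, reverse=True)[:k]]
```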

Ollama

Local LLM inference. GPU-accelerated. Use any GGUF-compatible model.

Hub auth

Centralised user management, licensing, and audit logging via ai.doo Hub.
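The components above might be wired together in a Compose file along these lines. Service names, images, ports, and paths here are illustrative assumptions, not PIKA's actual configuration:

```yaml
# Hypothetical layout -- real service names, images, and ports may differ.
services:
  pika:
    image: pika:latest                # FastAPI backend
    ports:
      - "8000:8000"
    environment:
      # Points at the shared Ollama instance for inference
      OLLAMA_URL: http://ollama.internal:11434
      CHROMA_PATH: /data/chroma       # persistent ChromaDB storage
    volumes:
      - pika-data:/data
volumes:
  pika-data:
```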

Pricing

Licensed per seat. Self-hosted.

PIKA is licensed through the ai.doo Hub. No per-query fees, no cloud costs. You own your infrastructure and your data.

Professional

Per-seat annual license

Full PIKA deployment with RAG, citations, streaming, multi-user access, and ongoing updates.

  • Unlimited documents and queries
  • RAG with inline citations
  • Multi-user with Hub auth
  • Streaming responses
  • Priority email support

Enterprise

Custom deployment

Bespoke document pipelines, custom integrations, dedicated support, and on-site deployment assistance.

  • Everything in Professional
  • Custom document pipelines
  • Dedicated support channel
  • Deployment assistance
  • SLA available

Ready to query your documents privately?

Email hello@aidoo.biz to discuss PIKA for your organisation.