Docify - Local-First AI Second Brain
Your personal research assistant that remembers everything you've ever read.
Docify is an open-source, local-first AI application that lets you upload any resource (PDFs, URLs, documents, images, code), ask questions about them, and receive cited, grounded answers, all while keeping your data completely private.
✨ Key Features
- 🔒 Privacy-First: All processing happens locally (embeddings, LLM, storage)
- 🧠 Smart Deduplication: Content-based fingerprinting prevents duplicate processing
- 📄 Multi-Format Support: PDF, URL, Word, Excel, Markdown, images (OCR), code, and more
- 💬 Cited Answers: Every response includes citations to source documents
- 🔍 Hybrid Search: Combines semantic (vector) and keyword (BM25) search
- 🤖 Local LLM: Runs Mistral 7B via Ollama (optional cloud LLM support)
- 👥 Workspace Model: Personal, team, or hybrid collaboration
- 🚀 One-Command Setup: Docker Compose orchestration
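The deduplication feature boils down to content fingerprinting: hash the uploaded bytes and skip anything whose digest has already been ingested. A minimal sketch (illustrative only; `fingerprint` and `DedupIndex` are hypothetical names, not Docify's API):

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Content-based fingerprint: identical bytes always yield the same digest."""
    return hashlib.sha256(content).hexdigest()


class DedupIndex:
    """Tracks fingerprints already seen; re-uploads of the same content are skipped."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def is_duplicate(self, content: bytes) -> bool:
        digest = fingerprint(content)
        if digest in self._seen:
            return True
        self._seen.add(digest)
        return False


index = DedupIndex()
index.is_duplicate(b"paper v1")  # False: first upload, gets processed
index.is_duplicate(b"paper v1")  # True: identical content, skipped
```

In a real deployment the fingerprints would be persisted (e.g. in Postgres alongside the resource rows) so deduplication survives restarts.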
🏗️ Architecture Overview
Docify's RAG pipeline integrates 11 core services:
- Resource Ingestion - Upload, parse, deduplicate
- Chunking - Semantic boundary preservation
- Embeddings (Async) - Vector generation via Celery
- Query Expansion - Better recall with variants
- Hybrid Search - Semantic + keyword (BM25)
- Re-Ranking - 5-factor scoring + conflict detection
- Context Assembly - Token budget management
- Prompt Engineering - Anti-hallucination prompts
- LLM Service - Ollama/OpenAI/Anthropic support
- Citation Verification - Verify claims against sources
- Message Generation - Full pipeline orchestration
See ARCHITECTURE.md for complete technical details.
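Two of these stages can be illustrated in a few lines. Reciprocal rank fusion is one common way to merge a semantic ranking with a keyword (BM25) ranking, and context assembly greedily fills a token budget from the re-ranked list. This is a sketch under assumed names (`rrf_merge`, `assemble_context`, words standing in for tokens); Docify's actual 5-factor scoring is described in ARCHITECTURE.md:

```python
def rrf_merge(semantic: list[str], keyword: list[str], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score(c) = sum over rankings of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in (semantic, keyword):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def assemble_context(ranked_chunks: list[str], token_budget: int) -> list[str]:
    """Greedily take top-ranked chunks until the budget is spent
    (word count approximates token count here, for illustration)."""
    picked: list[str] = []
    used = 0
    for chunk in ranked_chunks:
        cost = len(chunk.split())
        if used + cost > token_budget:
            break
        picked.append(chunk)
        used += cost
    return picked


merged = rrf_merge(["c3", "c1", "c7"], ["c1", "c9", "c3"])
# chunks found by both retrievers ("c1", "c3") outrank single-list hits
```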
🚀 Quick Start
Prerequisites
- Docker & Docker Compose
- 8GB RAM minimum (16GB recommended)
- 20GB disk space (for models and data)
Docker Setup (Recommended)
```bash
# Clone the repository
git clone https://github.com/yourusername/docify.git
cd docify

# Copy environment configuration
cp .env.example .env

# Start all services
docker-compose up -d --build

# Wait for services to be healthy (~2-3 minutes)
docker-compose ps

# Initialize database (one-time setup)
docker-compose exec postgres psql -U docify -d docify -c "CREATE EXTENSION IF NOT EXISTS vector"
docker-compose exec backend alembic upgrade head

# Download optimized models (one-time, ~2GB total)
docker-compose exec ollama ollama pull mistral:7b-instruct-q4_0
docker-compose exec ollama ollama pull all-minilm:22m

# Restart services with models loaded
docker-compose restart backend celery-worker
```
Verify Setup
```bash
# Check if all containers are running
docker-compose ps

# Test API health
curl http://localhost:8000/api/health

# Monitor system resources
docker stats docify-ollama docify-backend

# View logs
docker-compose logs -f backend
docker-compose logs -f celery-worker
```
Access
- Frontend: http://localhost:3000
- API Docs & Testing: http://localhost:8000/docs
- Health Endpoint: http://localhost:8000/api/health
🛠️ Local Development
Backend (Python/FastAPI)
```bash
cd backend

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start development server (requires running docker-compose services)
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
Frontend (React/TypeScript)
```bash
cd frontend

# Install dependencies
npm install

# Start development server
npm run dev
```
📦 Tech Stack
Backend
- FastAPI (Python 3.10+)
- PostgreSQL 15+ with pgvector
- Celery + Redis (async tasks)
- Ollama (local LLM: mistral:7b-instruct-q4_0, all-minilm:22m)
- sentence-transformers (local embeddings); optional OpenAI/Anthropic support
Frontend
- React 18+ with TypeScript
- Vite, Tailwind CSS
- React Query, Zustand
Infrastructure
- Docker & Docker Compose
- Alembic (database migrations)
📚 API Usage
Upload a Resource
```bash
curl -X POST "http://localhost:8000/api/resources/upload" \
  -F "file=@research_paper.pdf" \
  -F "workspace_id=<your-workspace-id>"
```
Search
```bash
curl -X POST "http://localhost:8000/api/search" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is RAG?", "workspace_id": "<id>"}'
```
Ask Questions
```bash
curl -X POST "http://localhost:8000/api/conversations/<id>/messages" \
  -H "Content-Type: application/json" \
  -d '{"content": "Explain the main findings", "role": "user"}'
```
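The same endpoints can be scripted from Python. A minimal stdlib sketch that builds the search and message requests (endpoint paths and JSON bodies are taken from the curl examples above; `search_request` and `message_request` are illustrative helper names, and the ids are placeholders you must supply):

```python
import json
import urllib.request

BASE = "http://localhost:8000/api"


def search_request(query: str, workspace_id: str) -> urllib.request.Request:
    """Build the hybrid-search POST; send it with urllib.request.urlopen()."""
    return urllib.request.Request(
        f"{BASE}/search",
        data=json.dumps({"query": query, "workspace_id": workspace_id}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def message_request(conversation_id: str, content: str) -> urllib.request.Request:
    """Build the POST that appends a user message to a conversation."""
    return urllib.request.Request(
        f"{BASE}/conversations/{conversation_id}/messages",
        data=json.dumps({"content": content, "role": "user"}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# With the Quick Start stack running, send a question like this:
# answer = json.load(urllib.request.urlopen(message_request("<id>", "Explain the main findings")))
```

File upload is multipart/form-data, which is awkward with bare urllib; use curl as shown above or a client library such as requests for that endpoint.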
🐳 Docker & Troubleshooting
Common Commands
```bash
# Start all services
docker-compose up -d

# View logs (all services)
docker-compose logs -f

# View logs for specific service
docker-compose logs -f backend
docker-compose logs -f celery-worker

# Stop all services
docker-compose down

# Stop and remove data (WARNING: deletes all data)
docker-compose down -v

# Restart specific service
docker-compose restart backend
```
Port Conflicts
If you get "port already in use" errors:
```bash
# PostgreSQL: Docify uses 5433 (standard is 5432)
# Redis: Docify uses 6380 (standard is 6379)
# Backend: Docify uses 8000
# Frontend: Docify uses 3000
# Ollama: Docify uses 11434

# Check what's using a port (macOS/Linux)
lsof -i :8000

# Kill process (if needed)
kill -9 <PID>
```
Manual API Testing
Use the built-in API documentation:
- Open http://localhost:8000/docs in your browser
- Try requests directly in Swagger UI
- All endpoints are documented with request/response schemas
Alternatively, use curl:
```bash
# Health check
curl http://localhost:8000/api/health

# List workspaces
curl http://localhost:8000/api/workspaces

# Create workspace
curl -X POST http://localhost:8000/api/workspaces \
  -H "Content-Type: application/json" \
  -d '{"name":"My Workspace","workspace_type":"personal"}'
```
📄 License
MIT License - see LICENSE file for details
🙏 Acknowledgments
- Built with FastAPI
- Powered by Ollama
- Vector search by pgvector
- Embeddings by sentence-transformers
Made with ❤️ for researchers, students, and knowledge workers