Show HN: Ragctl – document ingestion CLI for RAG (OCR, chunking, Qdrant)
github.com · 3d

🚀 RAG Studio

Production-ready document processing CLI for RAG applications

Process documents, extract text with advanced OCR, chunk intelligently, and prepare data for RAG systems - all from the command line with ragctl.


🎯 What is RAG Studio?

RAG Studio (ragctl) is a command-line tool for processing documents into chunks ready for Retrieval-Augmented Generation (RAG) systems. It handles the dirty work of document ingestion, OCR, and intelligent chunking so you can focus on building your RAG application.

Key capabilities:

  • 📄 Universal document loading (PDF, DOCX, images, HTML, Markdown, etc.)
  • 🔍 Advanced OCR with automatic fallback (EasyOCR → PaddleOCR → pytesseract)
  • ✂️ Intelligent semantic chunking using LangChain
  • 📦 Production-ready batch p…
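
To make the "automatic fallback" idea in the OCR bullet concrete, here is a minimal sketch of a fallback chain. It is not ragctl's actual implementation: the function name `ocr_with_fallback`, the `min_chars` threshold, and the stand-in engines are all hypothetical, and the real tool may decide when to fall back quite differently. The only thing taken from the list above is the engine order.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical engine signature: takes an image path, returns extracted text.
OcrEngine = Callable[[str], str]


def ocr_with_fallback(
    image_path: str,
    engines: Sequence[Tuple[str, OcrEngine]],
    min_chars: int = 1,
) -> Tuple[Optional[str], str]:
    """Try each engine in order; return (engine_name, text) for the first usable result."""
    for name, engine in engines:
        try:
            text = engine(image_path).strip()
        except Exception:
            # Engine missing, crashed, or rejected the input: try the next one.
            continue
        if len(text) >= min_chars:
            return name, text
    # No engine produced usable text.
    return None, ""


# Stand-in engines for illustration only; real wrappers would call the OCR libraries.
def broken_engine(path: str) -> str:
    raise RuntimeError("engine not installed")


def empty_engine(path: str) -> str:
    return "   "


def working_engine(path: str) -> str:
    return "Invoice 2024"


chain = [
    ("easyocr", broken_engine),
    ("paddleocr", empty_engine),
    ("pytesseract", working_engine),
]
engine_name, text = ocr_with_fallback("scan.png", chain)
print(engine_name, repr(text))
```

Passing the engines in as plain callables keeps the ordering explicit and lets each wrapper be swapped or tested without the underlying OCR library installed.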
