Overview
I built a Retrieval-Augmented Generation (RAG) system that answers physics questions by retrieving relevant passages from an AP Physics textbook and generating responses with an LLM. The application processes 500 pages of content from the Electricity chapters, creates vector embeddings, stores them in a FAISS index, and serves answers through a Streamlit interface deployed on Hugging Face Spaces.
Tech Stack:
- Python 3.13
- FAISS for vector similarity search
- Sentence Transformers (all-MiniLM-L6-v2) for embeddings
- Groq API with Llama 3.1 for text generation
- Streamlit for the web interface
- Docker for containerization
Architecture
The system follows a standard RAG pipeline:
User Query → Embedding Model → Vector Search (FAISS) → Retrieved Chunks → LLM with Context → Generated Answer
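To make the flow concrete, here is a minimal sketch of the same stages in plain Python. It is not the code from this project: a bag-of-words counter stands in for all-MiniLM-L6-v2, a sorted cosine-similarity scan stands in for the FAISS index, and the LLM call is left out, so the sketch stops at the prompt that would be sent to the model. The function names (`embed`, `retrieve`, `build_prompt`) and the sample chunks are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for the embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Stand-in for the vector search: rank every chunk by similarity to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved chunks and the question into one LLM prompt."""
    joined = "\n\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical sample chunks, not taken from the textbook.
CHUNKS = [
    "Ohm's law states that voltage equals current times resistance.",
    "Capacitors store energy in an electric field between two plates.",
    "Coulomb's law gives the force between two point charges.",
]

query = "What does Ohm's law say about voltage and current?"
top_chunks = retrieve(query, CHUNKS, k=1)
prompt = build_prompt(query, top_chunks)
print(prompt)
```

In a real pipeline the only parts that change are the internals of `embed` and `retrieve`; the prompt-building step stays the same, and its output is what gets passed to the LLM.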
Implementation
1. Document Processing and Chunking
First, I extracted text from the PDF and split it into overlapping chunks: