Recently we launched a new File Search Tool, a fully managed RAG system built directly into the Gemini API that abstracts away the retrieval pipeline so you can focus on building. For all of the details, check out the blog post, or read on for a tutorial from Mark McDonald.
Imagine effortlessly recalling specific details from your favorite podcasts or getting a quick recap when returning to a long series’ storyline. An AI conversation, equipped with the full context of your podcasts, makes these easy.
In this tutorial, we’ll build a tool for this exact case. We’ll create a Python app that ingests a podcast RSS feed, transcribes the episodes, and indexes them using the File Search Tool. This allows us to ask natural language questions and get answers based on the actual content of the podcasts, complete with citations pointing back to the specific episodes.
Overview of the Solution
The application consists of two main parts:
- Ingestion (ingest.py): Downloads episodes, transcribes them, and uploads the transcripts to a File Search Store.
- Querying (query.py): Takes a user question, searches the File Search Store, and generates an answer using Gemini.
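Before any transcription can happen, ingestion has to pull episode titles and audio URLs out of the RSS feed. The repo's ingest.py appears to use feedparser (it reads ep.tags and feed_info.title later on), but as a minimal stdlib-only sketch, the same extraction can be done with xml.etree:

```python
import xml.etree.ElementTree as ET

def parse_feed(rss_xml: str) -> list[dict]:
    """Extract episode titles and audio enclosure URLs from an RSS feed string."""
    root = ET.fromstring(rss_xml)
    episodes = []
    for item in root.iter('item'):
        title = item.findtext('title', default='Untitled')
        enclosure = item.find('enclosure')
        if enclosure is not None:
            episodes.append({
                'title': title,
                'audio_url': enclosure.get('url'),
            })
    return episodes

# Tiny hand-written feed for illustration.
sample = """<rss version="2.0"><channel><title>Demo Pod</title>
<item><title>Episode 1</title>
<enclosure url="https://example.com/ep1.mp3" type="audio/mpeg"/></item>
</channel></rss>"""

episodes = parse_feed(sample)
print(episodes)
```

A real feed has more structure (publish dates, GUIDs, iTunes extensions), which is why a dedicated parser like feedparser is the better choice in practice.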
Step 1: Create a file search store
A File Search Store is the container for scoping your documents. In this example, we are using a single store for all of our podcasts so we can search across them all at once.
First, get set up with the Python SDK.
from google import genai
from google.genai import types
client = genai.Client()
To create a new store, we use client.file_search_stores.create. We’ll use the optional display name to identify our podcast index.
store = client.file_search_stores.create(
    config={'display_name': 'My Podcast Store'}
)
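You typically only want one store across runs; creating a fresh one on every ingest would fragment the index. Assuming the SDK's file_search_stores also exposes a list() method, a small duck-typed helper (illustrative, not part of the SDK) can find an existing store by display name:

```python
from types import SimpleNamespace

def find_store(stores, display_name):
    """Return the first store whose display_name matches, else None.
    `stores` is any iterable of objects with a `display_name` attribute,
    e.g. the result of client.file_search_stores.list()."""
    for store in stores:
        if getattr(store, 'display_name', None) == display_name:
            return store
    return None

# Demo with stand-in store objects (real ones come from the SDK).
stores = [SimpleNamespace(name='fileSearchStores/abc',
                          display_name='My Podcast Store')]
match = find_store(stores, 'My Podcast Store')
print(match.name)
```

The find-or-create pattern then becomes: look the store up by display name, and only call create when the lookup returns None.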
Step 2: Transcribe episodes
To index the content, we need to turn the audio into text. We download the audio file and then use the Gemini 2.5 Flash-Lite model to transcribe it. We use Flash-Lite as it’s extremely fast and cost-effective for this task.
In ingest.py, the transcribe_audio function handles this, and you can add any prompt direction to Gemini to help manage the quality of your transcripts, e.g. to skip the intro or label the speakers.
response = client.models.generate_content(
    model='gemini-2.5-flash-lite',
    contents=[
        types.Part.from_uri(
            file_uri=audio_file.uri,
            mime_type=audio_file.mime_type
        ),
        "Transcribe this audio. Output only the transcription. Label the speakers. Do not include any obvious ad-reads or promotional segments in the transcription (if unsure, leave them in)."
    ]
)
Step 3: Upload transcripts with metadata
Once we have the transcript, we can upload it to our store. A powerful feature of the File Search tool is that you can provide custom metadata that can be used to filter at generation-time, allowing us to restrict the source data to a specific podcast or date range.
To upload a file, we use client.file_search_stores.upload_to_file_search_store. This handles uploading the file content and attaches the custom metadata in the same call.
Here’s a sample of how we prepare the metadata and upload the file in ingest.py. The full code adds a number of other fields.
metadata = [
    {'key': 'title', 'string_value': ep.title},
    {'key': 'podcast', 'string_value': feed_info.title},
]

# Bring in any tags from the feed itself
if 'tags' in ep:
    for tag in ep.tags:
        metadata.append({'key': 'tag', 'string_value': tag.term})

op = client.file_search_stores.upload_to_file_search_store(
    file_search_store_name=store_name,
    file=transcript_filename,
    config={
        'custom_metadata': metadata,
        'display_name': ep.title
    }
)
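Note that upload_to_file_search_store returns a long-running operation (op above): the file still needs to be chunked and indexed server-side before it is searchable. A minimal polling sketch, written duck-typed so the refresh step is pluggable (with the real SDK, get_latest would be something like lambda o: client.operations.get(o)):

```python
import time

def wait_for(get_latest, op, poll_seconds=2.0, timeout=300.0):
    """Poll a long-running operation until it reports done.

    `get_latest` refreshes the operation object each iteration.
    Raises TimeoutError if the operation outlives `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while not op.done:
        if time.monotonic() > deadline:
            raise TimeoutError('operation did not finish in time')
        time.sleep(poll_seconds)
        op = get_latest(op)
    return op

# Demo with a stand-in operation that finishes on the second refresh.
class FakeOp:
    def __init__(self, done):
        self.done = done

states = iter([FakeOp(False), FakeOp(True)])
result = wait_for(lambda o: next(states), FakeOp(False), poll_seconds=0.01)
print(result.done)
```

In a batch ingest you would kick off all uploads first and then wait on the operations together, rather than blocking after each episode.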
Step 4: Query the store
Now for the fun part: asking questions!
To enable file search in a generation request, we pass the FileSearch tool that defines the file store to search, along with any filtering we need.
From query.py:
metadata_filter = None
if args.podcast:
    metadata_filter = f'podcast = "{args.podcast}"'

file_search = types.FileSearch(
    file_search_store_names=[store.name],
    metadata_filter=metadata_filter  # Optional filter
)
tool = types.Tool(file_search=file_search)

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=question,
    config=types.GenerateContentConfig(
        tools=[tool]
    )
)
When we call client.models.generate_content with this tool, Gemini will automatically search our store for relevant information to answer the user’s question.
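As the query code gains more optional filters (podcast, tag, date range), building the filter string inline gets unwieldy. A small helper, hypothetical and not part of the repo, assuming a list-filter style syntax where string values are double-quoted and conditions combine with AND:

```python
def build_metadata_filter(podcast=None, tag=None):
    """Combine optional metadata conditions into a single filter string,
    or return None when no filters apply."""
    clauses = []
    if podcast:
        clauses.append(f'podcast = "{podcast}"')
    if tag:
        clauses.append(f'tag = "{tag}"')
    return ' AND '.join(clauses) if clauses else None

print(build_metadata_filter(podcast='My Show', tag='apples'))
```

Returning None (rather than an empty string) keeps the "no filter" case explicit, matching the optional metadata_filter parameter on FileSearch.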
Step 5: Display results and citations
The response from Gemini includes not just the answer, but also citations showing exactly which parts of the uploaded files were used.
print("\nAnswer:")
print(response.text)

print("\nCitations:")
for i, chunk in enumerate(response.candidates[0].grounding_metadata.grounding_chunks):
    if chunk.retrieved_context:
        title = chunk.retrieved_context.title or "Unknown Episode"
        print(f"\nCitation {i+1}:")
        print(f"Episode: {title}")
        print(f"Text: {chunk.retrieved_context.text}")
This allows users to verify the answer and explore the source material further.
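When several chunks come from the same episode, a flat citation list gets repetitive. One way to tighten the output is to group snippets per episode first; a sketch using stand-in objects that mirror the shape of the grounding chunks above (.retrieved_context with .title and .text):

```python
from types import SimpleNamespace

def group_citations(chunks):
    """Collect cited text snippets per episode title, preserving order."""
    grouped = {}
    for chunk in chunks:
        ctx = getattr(chunk, 'retrieved_context', None)
        if ctx is None:
            continue
        title = ctx.title or 'Unknown Episode'
        grouped.setdefault(title, []).append(ctx.text)
    return grouped

# Demo with fake chunks; real ones come from
# response.candidates[0].grounding_metadata.grounding_chunks.
chunks = [
    SimpleNamespace(retrieved_context=SimpleNamespace(title='Ep 1', text='first')),
    SimpleNamespace(retrieved_context=SimpleNamespace(title='Ep 1', text='second')),
    SimpleNamespace(retrieved_context=None),
]
print(group_citations(chunks))
```

Printing one "Episode:" header per group, followed by its snippets, reads better than one citation block per chunk.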
Run the application
- Ingest a podcast:
python ingest.py "https://feeds.example.com/podcast.rss" --limit 5
This will download the last 5 episodes, transcribe them, and upload them to the “My Podcast Store” File Search Store.
- Ask a question:
python query.py "Why are red delicious apples so bad?" --podcast="..."
Gemini will retrieve relevant chunks from the indexed transcripts, pass them in the input context with the query, and provide an answer with citations.
What next?
By using the Gemini File Search API, we’ve turned a collection of audio files into a rich, searchable knowledge base. We didn’t have to worry about chunking, embedding, or setting up a vector database: the API handled it all. With the addition of metadata, we built a powerful search tool with minimal code.
For more content, check out:
- Official Documentation: Dive deeper into the File Search API capabilities.
- Demo Applet: Try out a live demo of a similar application in Google AI Studio (or vibe-code your own!).
This article was brought to you by Mark McDonald. Grab the code here:
Podcast Search with Gemini File Search API
This project is a simple command-line tool that demonstrates how to build a searchable podcast knowledge base using the Gemini File Search API.
It consists of two main scripts:
- ingest.py: Ingests a podcast RSS feed, downloads audio, transcribes episodes using the fast Gemini 2.5 Flash-Lite model, and uploads them to a File Search Store.
- query.py: Allows you to ask natural language questions about the ingested podcast content and receive grounded answers with citations.
Prerequisites
- Python 3.10+
- A Gemini API key (get one from Google AI Studio).
Setup
Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
Install dependencies:
pip install -r requirements.txt
Create a .env file and add your API key:
echo "GOOGLE_API_KEY=your_api_key_here" > .env
Usage
1. Ingest a Podcast
Run ingest.py with the RSS feed URL of the podcast you want to index…