Grok Collections API

Today, we’re excited to announce the Collections API. With Collections, you can upload and search through entire datasets: PDFs, Excel sheets, and even whole codebases go into a knowledge base that supports fast, precise search. Developers can build RAG applications without the headache of managing indexing and retrieval infrastructure.

To help you get started, we’re making file indexing and storage free for the first week*, with retrieval priced at a flat rate of $2.50 per 1,000 searches.
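As a quick illustration of the flat retrieval rate, the sketch below estimates the cost of a given number of searches. The function name is hypothetical, and it assumes usage is pro-rated linearly rather than billed in blocks of 1,000, which this post does not specify.

```python
from decimal import Decimal

# Flat retrieval rate quoted in this post: $2.50 per 1,000 searches.
PRICE_PER_1000_SEARCHES = Decimal("2.50")


def estimate_retrieval_cost(num_searches: int) -> Decimal:
    """Return the estimated retrieval cost in US dollars.

    Assumes linear pro-rating (an assumption, not a stated billing rule).
    """
    if num_searches < 0:
        raise ValueError("num_searches must be non-negative")
    return PRICE_PER_1000_SEARCHES * num_searches / 1000


# Example: estimated cost of 40,000 searches.
print(estimate_retrieval_cost(40_000))
```

Decimal arithmetic is used so that currency amounts are not subject to binary floating-point rounding.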

Indexing

Powerful document understanding: We use OCR and layout-aware parsing to extract text while preserving structure, such as the layout of a PDF, the hierarchy of an Excel table, or the syntax of code.
Smart file management: Easily upload, update, and download files. When a file changes, our system efficiently reindexes it so your collection is never stale.
Broad format support: Collections supports a wide range of file types (see full list).
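This post does not describe how changed files are detected. One common approach, sketched here purely as an illustration and not as xAI's implementation, is to compare content hashes and reindex only files that are new or whose hash differs. The function name and the `indexed_hashes` bookkeeping structure are hypothetical.

```python
import hashlib


def files_needing_reindex(current_files, indexed_hashes):
    """Return sorted names of files that are new or whose content changed.

    current_files: dict mapping file name -> file content as bytes.
    indexed_hashes: dict mapping file name -> SHA-256 hex digest recorded
    at the last indexing run (hypothetical bookkeeping).
    """
    stale = []
    for name, data in current_files.items():
        digest = hashlib.sha256(data).hexdigest()
        # A missing entry (new file) or a different digest means stale.
        if indexed_hashes.get(name) != digest:
            stale.append(name)
    return sorted(stale)


current = {"a.txt": b"hello", "b.txt": b"changed", "c.txt": b"new"}
indexed = {
    "a.txt": hashlib.sha256(b"hello").hexdigest(),
    "b.txt": hashlib.sha256(b"original").hexdigest(),
}
print(files_needing_reindex(current, indexed))
```

Hashing content rather than comparing timestamps avoids reindexing files that were re-uploaded without any actual change.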

Retrieval

Choose the retrieval method that best fits your use case:

Semantic search: Matches on the meaning and intent behind a query.
Keyword search: Matches exact terms, for precise lookups.
Hybrid search: Combines keyword and semantic search for the highest accuracy. We support both a dedicated reranker model and reciprocal rank fusion.
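To make reciprocal rank fusion concrete, here is a minimal sketch of the general technique. It is not the Collections implementation; the function name, the sample file names, and the constant `k=60` (a commonly used default) are assumptions for illustration. Each document receives a score of 1 / (k + rank) from every ranked list it appears in, and the scores are summed.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears
    in, with rank starting at 1. k=60 is an assumed, commonly used value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a semantic search.
keyword_hits = ["10-K.pdf", "forecast.txt", "memo.txt"]
semantic_hits = ["forecast.txt", "plan.txt", "10-K.pdf"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Because the fusion uses only ranks, it needs no calibration between keyword scores and embedding similarities, which live on different scales.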

Example query: What is our financial forecast for Q1 2026?

Retrieved passage (Financial_plan_2026.txt): Our company’s annual financial projections indicate a robust growth trajectory for the upcoming fiscal year, with expected revenue increases driven by expanded market share in emerging sectors. Analysts predict a 15% rise in Q1 2026, bolstered by strategic investments in technology and supply chain optimization. Key metrics such as EBITDA and net profit margins are forecasted to improve.

Benchmark Results

Our Collections API delivers state-of-the-art retrieval performance, matching or outperforming leading models in real-world RAG tasks across finance, legal, and coding domains.

These fields are especially challenging due to their long, dense documents. To avoid hallucinations and deliver reliable answers, models must retrieve the exact passages and reason over them accurately.

Accuracy*

(Higher is better)

Task	xAI Grok 4.1 Fast	Google Gemini Pro 3	OpenAI GPT 5.1
Finance	93.0	85.9	84.7
Legal (LegalBench)	73.9	74.5	71.2
Coding (DeepCodeBench)	86	85	81

*Internal source.

Financial Analysis

Extracting tabular and numerical data from files can be challenging with semantic search alone. Hybrid search enables you to accurately retrieve this data from documents such as SEC filings*, allowing the model to precisely reference information.

[Chart: Retrieval Score for xAI Collections & Grok 4.1, Google File Search & Gemini Pro 3**, and OpenAI VS & GPT 5.1]

*Based on an internal dataset.

**Gemini does not expose the actual retrieved files, so this metric measures the files cited by Gemini rather than the raw retrieved files. We set the default top k for Gemini to 20 passages.
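This post does not define how the Retrieval Score is computed. As a point of reference only, one common file-level retrieval metric is recall at k: the fraction of relevant files that appear among the top k retrieved (or cited) files. The sketch below assumes that definition. The function name and file names are hypothetical, and `k=20` simply mirrors the top k mentioned in the footnote.

```python
def recall_at_k(retrieved_files, relevant_files, k=20):
    """Fraction of relevant files found in the top-k retrieved files."""
    relevant = set(relevant_files)
    if not relevant:
        raise ValueError("relevant_files must not be empty")
    top_k = set(retrieved_files[:k])
    hits = len(relevant & top_k)
    return hits / len(relevant)


# Hypothetical example: two relevant files, one of them retrieved.
score = recall_at_k(["10-K.pdf", "memo.txt", "plan.txt"], ["10-K.pdf", "8-K.pdf"])
print(score)
```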

Legal Analysis (LegalBench)

The LegalBench dataset tests retrieval and reasoning over nuanced legal language and complex cross-references. It consists of 128 challenging question-answer pairs drawn from an extensive corpus of authentic commercial contracts across multiple datasets.

[Chart: Retrieval Score for xAI Collections & Grok 4.1, Google File Search & Gemini Pro 3*, and OpenAI VS & GPT 5.1]

*Gemini does not expose the actual retrieved files, so this metric measures the files cited by Gemini rather than the raw retrieved files. We set the default top k for Gemini to 20 passages.

Codebase (DeepCodeBench)

Code understanding is crucial for applications such as code summarization and generation. We use the DeepCodeBench dataset to benchmark this capability. It features a diverse set of tasks drawn from real-world open-source repositories, API usage, and complex algorithmic problems.

End to End Answer Performance

[Chart: Accuracy Score* for Grok 4.1, Gemini Pro 3, and GPT 5.1]

*We evaluated the code understanding capability of agentic search on 232 code Q&A datapoints from DeepCodeBench across 8 repositories containing 8,000 files.

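Start Building

Creating and Searching Collections

from xai_sdk import Client

client = Client()

# Create a collection to hold the documents.
collection = client.collections.create(
    name="Research Papers",
    model_name="grok-embedding-small"
)

# Upload a document (as raw bytes) into the collection.
document = client.collections.upload_document(
    collection.collection_id,
    name="ml-fundamentals.txt",
    data=b"...",
)

# Search the collection using hybrid (keyword + semantic) retrieval.
results = client.collections.search(
    query="What is machine learning?",
    collection_ids=[collection.collection_id],
    retrieval_mode="hybrid",
)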

Using Collections in Chat

from xai_sdk import Client
from xai_sdk.chat import user, system
from xai_sdk.tools import collections_search

client = Client()
chat = client.chat.create(
    model="grok-4-1-fast",
    messages=[
        system("You are a personal tutor teaching the Fundamentals of AI and ML, check our notes."),
        user("Explain to me the core concepts."),
    ],
    tools=[
        collections_search(
            collection_ids=["collection_27ce08b9-87fd-4d1a-a9e7-8a199107e54f"],
            retrieval_mode="hybrid",
        ),
    ],
)

Direct API Usage

curl -X POST https://api.x.ai/v1/documents/search \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Fundamentals of AI and ML.",
    "source": {
      "collection_ids": [
        "collection_27ce08b9-87fd-4d1a-a9e7-8a199107e54f"
      ]
    },
    "retrieval_mode": { "type": "hybrid" }
  }'
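For readers who prefer Python without the SDK, the same request can be sketched with the standard library. The endpoint, headers, and JSON body are taken from the curl example above; the helper name `build_search_request` and the placeholder key are hypothetical. The sketch only builds the request, since the response format is not covered in this post.

```python
import json
import os
import urllib.request

SEARCH_URL = "https://api.x.ai/v1/documents/search"


def build_search_request(api_key, query, collection_ids, retrieval_mode="hybrid"):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "query": query,
        "source": {"collection_ids": list(collection_ids)},
        "retrieval_mode": {"type": retrieval_mode},
    }
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_search_request(
    api_key=os.environ.get("XAI_API_KEY", "YOUR_API_KEY"),
    query="Fundamentals of AI and ML.",
    collection_ids=["collection_27ce08b9-87fd-4d1a-a9e7-8a199107e54f"],
)
# To send it, pass the request to urllib.request.urlopen(request).
```

Separating request construction from sending keeps the payload easy to inspect and test before any network call is made.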

*You may be charged after the free trial period. We will follow up with more information.
