Posted September 16, 2025 by
We’ve added a new search API to help you find the best models. This API is currently in beta, but it’s already available to all users in our TypeScript and Python SDKs, and our MCP servers.
Here’s an example of how to use it with cURL:
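The request below is a minimal sketch: the post names the new `GET /v1/search` endpoint, but the `query` parameter name is an assumption, so check the API reference for the final request shape. The guard makes the sketch safe to run even without a token set.

```shell
# Minimal sketch: query the beta search endpoint (GET /v1/search).
# The `query` parameter name is an assumption; see the API reference
# at replicate.com/docs/reference/http#search for the final shape.
URL="https://api.replicate.com/v1/search?query=upscale+video"

if [ -n "$REPLICATE_API_TOKEN" ]; then
  curl -s "$URL" -H "Authorization: Bearer $REPLICATE_API_TOKEN"
else
  echo "Set REPLICATE_API_TOKEN to run this request." >&2
fi
```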
Here’s a video of the search API in action using our MCP server with Claude Desktop:
The new search API returns results for models, collections, and documentation pages that match your query.
For model results, the API returns a model object with all the data you would normally expect, like url, description, and run_count, but it also returns a new metadata object for each model that includes properties like a longer and more detailed generated_description, tags, and more.
Here’s an example using cURL and jq to get specific data for Google’s 🍌 Nano Banana 🍌 model:
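A hypothetical sketch of that request: the response shape (a top-level models array with url, run_count, and metadata.generated_description fields) is assumed from the description above, and the query string is an illustration.

```shell
# Hypothetical sketch: search for the model and extract a few fields
# with jq. The response shape (a `models` array with `url`, `run_count`,
# and `metadata.generated_description`) is assumed from the post above.
FILTER='.models[0] | {url, run_count, generated_description: .metadata.generated_description}'

if [ -n "$REPLICATE_API_TOKEN" ]; then
  curl -s "https://api.replicate.com/v1/search?query=nano+banana" \
    -H "Authorization: Bearer $REPLICATE_API_TOKEN" | jq "$FILTER"
fi
```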
And here’s the response:
Search with MCP
The new search API is already supported in our remote and local MCP servers, so you can discover and explore the best models and collections using your favorite tools like Claude Desktop, Claude Code, VS Code, Cursor, OpenAI Codex CLI, and Google’s Gemini CLI.
Our MCP server does sophisticated API response filtering to keep your LLM’s context window from getting overloaded by large response objects. It dynamically constructs a jq filter query based on each API operation’s response schema, and only returns the most relevant parts of the response. Here’s an example of the kind of filter query it uses for the search API:
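As an illustration (not the exact filter the server generates), a schema-derived jq filter might keep only a handful of fields from each result and drop everything else; the field names here are assumptions:

```shell
# Hypothetical example of the kind of schema-derived jq filter the MCP
# server might build for the search response (field names assumed):
FILTER='{models: [.models[] | {name, url, description, run_count}]}'

# Apply it to a sample response to see the trimming effect:
echo '{"models":[{"name":"a","url":"u","description":"d","run_count":5,"cog_version":"big"}],"debug":"noise"}' \
  | jq -c "$FILTER"
```

Everything outside the whitelisted fields (here, cog_version and the top-level debug key) is stripped before the response reaches the model's context window.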
To get started using Replicate’s MCP server, read our MCP announcement blog post or head over to mcp.replicate.com.
TypeScript SDK support
The new search API is available in the latest alpha release of our TypeScript SDK as replicate.search().
To use it, install the latest alpha from npm:
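One way to do that (this assumes the alpha is published under an alpha dist-tag on npm):

```shell
# Install the alpha release of the TypeScript SDK
# (assumes an `alpha` dist-tag on npm):
npm install replicate@alpha
```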
Then use it like this:
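A sketch of what a call might look like: the post confirms the method name replicate.search(), but the argument shape and the fields on the response are assumptions, so treat this as illustrative rather than the final API.

```typescript
import Replicate from "replicate";

// Reads REPLICATE_API_TOKEN from the environment.
const replicate = new Replicate();

// Hypothetical call shape for the beta search API; the real
// signature may differ.
const results = await replicate.search("upscale video");

// The `models` array and its fields are assumptions based on the
// response shape described earlier in this post.
for (const model of results.models ?? []) {
  console.log(model.url, model.metadata?.generated_description);
}
```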
The TypeScript SDK also includes type hints for the search API, so your editor can guide you through writing the method signature and handling the response schema:
Python SDK support
The new search API is available in the latest alpha release of our Python SDK as replicate.search().
To use it, install the latest pre-release from PyPI:
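For example, using pip's --pre flag, which allows alpha and beta versions to be installed:

```shell
# Install the pre-release of the Python SDK
# (--pre lets pip pick up alpha/beta versions):
pip install --pre replicate
```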
Then, you can use it like this:
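A sketch of a call: the method name replicate.search() comes from the post, but the argument and the attributes read off the result are assumptions about the beta response shape.

```python
import replicate

# Hypothetical call shape for the beta search API; the real
# signature may differ. Requires REPLICATE_API_TOKEN in your
# environment.
results = replicate.search("upscale video")

# The `models` list and its fields are assumptions based on the
# response shape described earlier in this post.
for model in results.models:
    print(model.url, model.metadata.generated_description)
```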
🐍 Check out the Google Colab notebook for a Python example you can run right in your browser.
API documentation
You can find docs for the new search API on our HTTP API reference page at replicate.com/docs/reference/http#search.
If you prefer structured reference documentation, you can also find the docs in our OpenAPI schema at api.replicate.com/openapi.json.
Backwards compatibility
The old QUERY /v1/models search endpoint is still active and will continue to work, but we recommend using the new GET /v1/search endpoint instead for better search results.
You can still use the old search endpoint via the HTTP API directly, or with our Python or TypeScript SDKs.
The old search endpoint is now disabled in our MCP server, in favor of the new search endpoint.
Feedback
This new search API is still in beta, so we’d love to hear your feedback.
If you encounter any issues or unexpected search results, please let us know.