Building a Streaming AI Agent with LangChain, MistralAI and Next.js with Tool calling | Part 2 | Building Personal Assistant
Create an intelligent assistant with tool calling capabilities using Mistral AI, LangChain’s createAgent, and Next.js streaming APIs

TL;DR
In this blog, I'm sharing the core backend logic that I used for building my personal AI Agent. We use LangChain.js's modern tool() utility for a robust, scalable tool structure, and createAgent (powered by LangGraph) with Mistral's devstral-medium-latest model to handle reasoning and execution. By the end of this post, you will understand how to build an AI agent that can intelligently use tools (a weatherTool in our case) and stream its responses back to the frontend.
Introduction
Welcome back to the Building a Personal AI Assistant series! If you missed the previous article (Part 1.2), we set up a slick frontend with a streaming Markdown display using streamdown.ai. Now, we build an agent capable of handling tools.
Here is the architecture we are building today:
- Tool Design: defining tools with the tool() function (from @langchain/core/tools) and zod for schemas.
- The Agent Core: Using LangChain’s powerful createAgent for reliable tool-use orchestration.
- LLM Selection: Powering the intelligence with Mistral AI.
My goal here was to create a modular structure that allows me to drop in new tools (like Google Calendar, Drive, or a Web Search) without refactoring the agent core.
Tool Design
What are tools?
Tools extend an AI's capabilities beyond its training dataset. They are functions that an AI (with tool-calling abilities) can call to interact with the world beyond its training data: fetching weather, querying databases, calling custom APIs, or running calculations.
To ensure future scalability, I decided to place all my custom tools inside a dedicated src/lib/tools directory. This makes it clean to import them into the main agent file.
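To make that modularity concrete, here is a minimal sketch of a hypothetical barrel file (src/lib/tools/index.ts, my own addition rather than part of the project) that collects every tool in one place so the agent core never changes when a new tool is added:

```typescript
// src/lib/tools/index.ts (hypothetical barrel file, not part of the original code)
import { weatherTool } from './weather-tool';
// Future tools simply get imported and appended here, e.g.:
// import { calendarTool } from './calendar-tool';
// import { webSearchTool } from './web-search-tool';

// The agent core can consume this array without knowing about individual tools
export const allTools = [weatherTool];
```

With something like this in place, the agent file could pass tools: allTools instead of listing each tool individually.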
For details on how models handle tool calls, see Tool calling. Here’s the code for the weather tool that I have created.
```typescript
// src/lib/tools/weather-tool.ts
import { tool } from '@langchain/core/tools';
import { z } from 'zod';

// ... (WEATHER_CODES lookup table omitted for readability)

export const weatherTool = tool(
  async ({ location }) => {
    // Geocoding API call (Open-Meteo)
    const geoRes = await fetch(
      `https://geocoding-api.open-meteo.com/v1/search?name=${encodeURIComponent(
        location
      )}&count=1&format=json`
    );
    const geo = await geoRes.json();

    if (!geo.results?.[0])
      return `Could not find location "${location}". Please try being more specific...`;

    const { latitude, longitude, name, country, admin1 } = geo.results[0];

    // Weather forecast API call
    const weatherRes = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current=temperature_2m,relative_humidity_2m,apparent_temperature,weather_code,wind_speed_10m&temperature_unit=celsius&wind_speed_unit=mph`
    );
    const { current } = await weatherRes.json();

    // Return a structured JSON result for the model to reason over
    return JSON.stringify(
      {
        location: admin1 ? `${name}, ${admin1}, ${country}` : `${name}, ${country}`,
        condition: WEATHER_CODES[current.weather_code] || 'Unknown',
        temp: `${current.temperature_2m}°C`,
        feels_like: `${current.apparent_temperature}°C`,
        humidity: `${current.relative_humidity_2m}%`,
        wind: `${current.wind_speed_10m} mph`,
      },
      null,
      2
    );
  },
  {
    name: 'get_weather',
    description: 'Get current weather for a location',
    schema: z.object({
      location: z.string().describe('City and state, e.g. San Francisco, CA'),
    }),
  }
);
```

We're using Open-Meteo because it's free, open source, and requires no API key, which makes it perfect for development and production without billing concerns or rate-limit problems.
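Before wiring the tool into an agent, it can be worth sanity-checking it on its own. Tools created with tool() are Runnables, so you can call .invoke() with input matching the zod schema. A minimal sketch (the script path and sample location are my own assumptions):

```typescript
// scripts/test-weather-tool.ts (hypothetical quick check, not part of the app)
import { weatherTool } from '../src/lib/tools/weather-tool';

async function main() {
  // The input must match the zod schema: { location: string }
  const result = await weatherTool.invoke({ location: 'Mumbai' });
  console.log(result); // JSON string with location, condition, temp, etc.
}

main().catch(console.error);
```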
The Agent Core
Inside src/lib/agent.ts, we first install the necessary packages and set up the model along with its fundamental identity as a personal agent. I am using two critical components from the LangChain ecosystem: the LLM wrapper (ChatMistralAI) and the agent builder (createAgent).
We need to take our specialized tool (like weatherTool) and integrate it with a powerful LLM (Mistral) to create an autonomous system capable of Reasoning and Acting. This concept is known as the ReAct pattern.
Installing ChatMistralAI
npm i @langchain/mistralai
Import necessary packages
```typescript
// src/lib/agent.ts
import { createAgent } from 'langchain';
import { MemorySaver } from '@langchain/langgraph';
import { ChatMistralAI } from '@langchain/mistralai';

// Import tools
import { weatherTool } from './tools/weather-tool';

// Import utilities
import { getCurrentDateTime } from './utils';
```

The Agent's Identity / The System Prompt
We need to define its role (a helpful personal assistant), its tone, and explicit instructions on tool usage. By telling it to always use get_weather for weather queries, we maximize its ability to perform its core function. Injecting the current date and time via getCurrentDateTime() is a great pattern for keeping the agent contextually aware.
```typescript
// src/lib/agent.ts
const SYSTEM_PROMPT = `You are 'Bhai Saab', a helpful AI assistant with a friendly Indian personality.
You're efficient, witty, and sometimes use Hindi phrases like "acha bhai", "okay bhai", "theek hai bhai" when appropriate.

Current date and time: ${getCurrentDateTime()}

When using tools:
- Use get_weather when asked about weather conditions
- Always provide clear, conversational responses
- If you can't find information, be honest about it

You have access to tools for real-time information (weather, etc.). Use them when needed, but integrate results naturally into conversation—don't announce tool usage robotically.

Keep responses concise and helpful. Be witty and a capable friend who gets things done.

Examples:
- "Acha bhai, let me check that for you..." [uses tool] "It's 72°F and sunny!"
- "Theek hai bhai, I'll help you with that right away."`;
```

The Checkpointer (Memory)
The checkpointer (currently MemorySaver) ensures that state, including the full history of messages and tool-call results, is maintained. For a production app, you would swap MemorySaver for a database-backed checkpointer (like PostgreSQL or Redis) to give the agent long-term memory across multiple user sessions, as sketched after the code below.
```typescript
// src/lib/agent.ts
const checkpointer = new MemorySaver();
```
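As a sketch of that production swap, assuming the @langchain/langgraph-checkpoint-postgres package and a DATABASE_URL environment variable (neither is part of this project yet):

```typescript
// src/lib/agent.ts (hypothetical production variant, not the code used in this post)
import { PostgresSaver } from '@langchain/langgraph-checkpoint-postgres';

// Persist agent state in Postgres instead of in-process memory
const checkpointer = PostgresSaver.fromConnString(process.env.DATABASE_URL!);

// Creates the required tables on first run
await checkpointer.setup();
```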
Set Up the Model and Agent
I chose Mistral's devstral-medium-latest model, as it has native function/tool-calling capabilities with free input/output tokens. This means the model is explicitly trained to output structured JSON instructing the agent on which tool to run and with what arguments, making the tool-use process highly reliable.

```typescript
// src/lib/agent.ts
// Create the LLM instance
export const createLLM = () => {
  const apiKey = process.env.MISTRAL_API_KEY;

  return new ChatMistralAI({
    apiKey,
    model: 'devstral-medium-latest', // Excellent for tool calling
    temperature: 0.7, // Allows for a friendly, conversational tone
  });
};
```

Create a MISTRAL_API_KEY in Mistral AI Studio and add it to .env.local (as MISTRAL_API_KEY=<your-key>).
Orchestrating the ReAct Loop
The createAgent function is the highest-level abstraction for building agents in LangChain.js. It's the utility that transforms our simple list of tools and our LLM into a stateful, durable system. It leverages LangGraph under the hood, which handles the complex, stateful workflow needed for tool use.
The core function is createAgentExecutor
```typescript
// src/lib/agent.ts
export async function createAgentExecutor() {
  const llm = createLLM();

  const agent = createAgent({
    model: llm, // Mistral AI model
    tools: [weatherTool], // can add more tools
    checkpointer, // enables memory across runs
    systemPrompt: SYSTEM_PROMPT,
  });

  return agent;
}
```
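Before building the streaming route, you can sanity-check the agent with a plain, non-streaming call. A hypothetical one-off script (the path and the question are my own assumptions; it requires MISTRAL_API_KEY to be set):

```typescript
// scripts/test-agent.ts (hypothetical, not part of the app)
import { HumanMessage } from 'langchain';
import { createAgentExecutor } from '../src/lib/agent';

async function main() {
  const agent = await createAgentExecutor();

  // thread_id is required because the agent is compiled with a checkpointer
  const result = await agent.invoke(
    { messages: [new HumanMessage('What is the weather in Mumbai right now?')] },
    { configurable: { thread_id: 'sanity-check' } }
  );

  // The last message in the returned state is the assistant's final answer
  console.log(result.messages.at(-1)?.content);
}

main().catch(console.error);
```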
LangChain ReAct Agent Streaming Route
We are going to create a Next.js App Router route handler at src/app/api/agent/route.ts that uses Server-Sent Events (SSE) and the native Streams API to stream execution events from the LangChain agent directly to the frontend.
We have already implemented streaming logic in Part 1.1, but there's a slight difference when streaming agents. Since agents are equipped with tools (like our weather tool), for a better UX the tool activity should be streamed too.
Imports and Setup
```typescript
// src/app/api/agent/route.ts
import { createAgentExecutor } from '@/lib/agent';
import { HumanMessage } from 'langchain';
import { NextResponse } from 'next/server';
```

POST Route Handler
A simple try/catch block handles any errors we encounter while calling the agent.
```typescript
// src/app/api/agent/route.ts
export async function POST(req: Request) {
  try {
    // ... (function body)
  } catch (error) {
    console.error('Agent execution error:', error);
    return NextResponse.json({ error: 'Internal Server Error' }, { status: 500 });
  }
}
```

Request Parsing and Validation
The route expects a JSON body containing the user’s message and an optional threadId for conversation continuity. Basic validation ensures the message is present and of the correct type.
```typescript
// src/app/api/agent/route.ts
export async function POST(req: Request) {
  try {
    // 1. Parse the request body
    const { message, threadId } = await req.json();

    if (!message || typeof message !== 'string') {
      return NextResponse.json({ error: 'Message is required' }, { status: 400 });
    }

    // Use threadId for conversation continuity, or generate a new one
    const conversationId =
      threadId || `thread_${Date.now()}_${Math.random().toString(36).substring(7)}`;
  } catch (error) {
    console.error('Agent execution error:', error);
    return NextResponse.json({ error: 'Internal Server Error' }, { status: 500 });
  }
}
```

Agent and Stream Pipeline Initialization
Let's initialize our agent using createAgentExecutor() and a stream pipeline using TransformStream(), which acts as a two-ended pipe: a writable side where we push data and a readable side that becomes the actual HTTP response. A TextEncoder converts our text into bytes so it can be streamed.
```typescript
// src/app/api/agent/route.ts
export async function POST(req: Request) {
  try {
    // 1. Parse the request body
    const { message, threadId } = await req.json();

    if (!message || typeof message !== 'string') {
      return NextResponse.json({ error: 'Message is required' }, { status: 400 });
    }

    // Use threadId for conversation continuity, or generate a new one
    const conversationId =
      threadId || `thread_${Date.now()}_${Math.random().toString(36).substring(7)}`;

    // Initialize the Agent
    const agent = await createAgentExecutor();

    // Create a TransformStream for streaming
    const encoder = new TextEncoder();
    const stream = new TransformStream();
    const writer = stream.writable.getWriter();

    // Immediately Invoked Function Expression (the streaming loop goes here)
    (async () => {
      // ...
    })();
  } catch (error) {
    console.error('Agent execution error:', error);
    return NextResponse.json({ error: 'Internal Server Error' }, { status: 500 });
  }
}
```

Running the Loop
Server-Sent Events (SSE) follow a particular format to stream data: `data: <json>\n\n`. We need a utility function that encapsulates the logic for formatting and sending an SSE message.
- It takes a writer (the stream's writer), an encoder (to convert text to bytes), and a payload (the data object).
- The payload is stringified and prefixed with the standard SSE prefix: data: .
- It is then terminated by two newline characters (\n\n), which signals to the client's EventSource object that the event is complete and ready to be processed.
```typescript
// src/app/api/agent/route.ts
// Local helper (route files should only export HTTP method handlers)
async function sendSSE(
  writer: WritableStreamDefaultWriter<Uint8Array>,
  encoder: TextEncoder,
  payload: Record<string, unknown>
) {
  const data = `data: ${JSON.stringify(payload)}\n\n`;
  await writer.write(encoder.encode(data));
}
```

Immediately Invoked Function Expression (IIFE)
To start the execution loop, we invoke the agent.streamEvents() function, which takes the user message and a config object specifying the v2 event schema and the thread_id (our conversationId).
```typescript
// src/app/api/agent/route.ts
(async () => {
  try {
    const eventStream = await agent.streamEvents(
      {
        messages: [new HumanMessage(message)],
      },
      {
        version: 'v2',
        configurable: {
          thread_id: conversationId,
        },
      }
    );
  } catch (error) {
    console.error('Agent execution error:', error);
    await sendSSE(writer, encoder, {
      type: 'error',
      content: error instanceof Error ? error.message : 'Unknown error',
    });
  } finally {
    await writer.close();
  }
})();
```

The code block below is what translates LangChain's internal activity into structured SSE messages that the client can understand.
```typescript
// src/app/api/agent/route.ts
(async () => {
  try {
    const eventStream = await agent.streamEvents(
      {
        messages: [new HumanMessage(message)],
      },
      {
        version: 'v2',
        configurable: {
          thread_id: conversationId,
        },
      }
    );

    let finalOutput = '';

    for await (const event of eventStream) {
      // Handle different event types
      if (event.event === 'on_chat_model_stream') {
        // Stream LLM tokens
        const content = event.data?.chunk?.content;
        if (content) {
          finalOutput += content;
          await sendSSE(writer, encoder, { type: 'token', content });
        }
      } else if (event.event === 'on_tool_start') {
        // Tool invocation started
        await sendSSE(writer, encoder, {
          type: 'tool_start',
          tool: event.name,
          input: event.data?.input,
        });
      } else if (event.event === 'on_tool_end') {
        // Tool invocation completed
        await sendSSE(writer, encoder, {
          type: 'tool_end',
          tool: event.name,
          output: event.data?.output,
        });
      }
    }

    // Send final metadata
    await sendSSE(writer, encoder, { type: 'done', threadId: conversationId });
  } catch (error) {
    console.error('Agent execution error:', error);
    await sendSSE(writer, encoder, {
      type: 'error',
      content: error instanceof Error ? error.message : 'Unknown error',
    });
  } finally {
    await writer.close();
  }
})();
```

There are different types of events that we handle (a sketch of the resulting payload shapes follows the list below):
- on_chat_model_stream: the most common event type; this is what gives us the "typing" effect in the chat window.
- on_tool_start: since ReAct agents can call tools when necessary, we let the user know when and why this is happening.
- on_tool_end: once the tool call or computation is complete, the agent receives the result, which it uses to inform its next reasoning step.
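For reference, here is a rough sketch of the payload shapes these events produce on the wire. The type name is my own; it simply mirrors the sendSSE calls above and will come in handy when we parse the events on the client in Part 2.1.

```typescript
// Hypothetical shape of the SSE payloads emitted by the route above
type AgentStreamEvent =
  | { type: 'token'; content: string }                     // a streamed LLM text chunk
  | { type: 'tool_start'; tool: string; input?: unknown }  // a tool invocation began
  | { type: 'tool_end'; tool: string; output?: unknown }   // the tool returned a result
  | { type: 'done'; threadId: string }                     // the agent run finished
  | { type: 'error'; content: string };                    // something went wrong
```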
Finally, once the for await...of loop completes (meaning the agent has finished its entire execution path and output all of its tokens), we send one last, definitive event.
```typescript
await sendSSE(writer, encoder, { type: 'done', threadId: conversationId });
```

Return the Stream
```typescript
// src/app/api/agent/route.ts
return new Response(stream.readable, {
  headers: {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  },
});
```

Complete Code for route.ts
```typescript
// src/app/api/agent/route.ts
import { createAgentExecutor } from '@/lib/agent'; // Adjust path as needed
import { HumanMessage } from 'langchain';
import { NextResponse } from 'next/server';

// Local helper (route files should only export HTTP method handlers)
async function sendSSE(
  writer: WritableStreamDefaultWriter<Uint8Array>,
  encoder: TextEncoder,
  payload: Record<string, unknown>
) {
  const data = `data: ${JSON.stringify(payload)}\n\n`;
  await writer.write(encoder.encode(data));
}

export async function POST(req: Request) {
  try {
    // 1. Parse the request body
    const { message, threadId } = await req.json();

    if (!message || typeof message !== 'string') {
      return NextResponse.json({ error: 'Message is required' }, { status: 400 });
    }

    // Use threadId for conversation continuity, or generate a new one
    const conversationId =
      threadId || `thread_${Date.now()}_${Math.random().toString(36).substring(7)}`;

    // Initialize the Agent
    const agent = await createAgentExecutor();

    // Create a TransformStream for streaming
    const encoder = new TextEncoder();
    const stream = new TransformStream();
    const writer = stream.writable.getWriter();

    (async () => {
      try {
        const eventStream = await agent.streamEvents(
          {
            messages: [new HumanMessage(message)],
          },
          {
            version: 'v2',
            configurable: {
              thread_id: conversationId,
            },
          }
        );

        let finalOutput = '';

        for await (const event of eventStream) {
          // Handle different event types
          if (event.event === 'on_chat_model_stream') {
            // Stream LLM tokens
            const content = event.data?.chunk?.content;
            if (content) {
              finalOutput += content;
              await sendSSE(writer, encoder, { type: 'token', content });
            }
          } else if (event.event === 'on_tool_start') {
            // Tool invocation started
            await sendSSE(writer, encoder, {
              type: 'tool_start',
              tool: event.name,
              input: event.data?.input,
            });
          } else if (event.event === 'on_tool_end') {
            // Tool invocation completed
            await sendSSE(writer, encoder, {
              type: 'tool_end',
              tool: event.name,
              output: event.data?.output,
            });
          }
        }

        // Send final metadata
        await sendSSE(writer, encoder, { type: 'done', threadId: conversationId });
      } catch (error) {
        console.error('Agent execution error:', error);
        await sendSSE(writer, encoder, {
          type: 'error',
          content: error instanceof Error ? error.message : 'Unknown error',
        });
      } finally {
        await writer.close();
      }
    })();

    // Return the stream
    return new Response(stream.readable, {
      headers: {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
        Connection: 'keep-alive',
      },
    });
  } catch (error) {
    console.error('Agent execution error:', error);
    return NextResponse.json({ error: 'Internal Server Error' }, { status: 500 });
  }
}
```
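To verify the endpoint end to end before the React hook exists, here is a minimal sketch of a test client (the script path and message are my own assumptions) that POSTs a message and prints the raw SSE events as they arrive. Run it with the dev server on localhost:3000:

```typescript
// scripts/test-agent-route.ts (hypothetical, not part of the app)
async function main() {
  const res = await fetch('http://localhost:3000/api/agent', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message: 'What is the weather in Mumbai?' }),
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each chunk contains one or more "data: {...}\n\n" events
    process.stdout.write(decoder.decode(value));
  }
}

main().catch(console.error);
```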
What's Next in the Series? 🧠
The server is configured and streaming beautifully, but now we need the client to listen in real time! The immediate next step, in Part 2.1, is building the frontend to handle the stream events coming from the backend over SSE.
Part 2.1: Building the useAgent Hook
We will create a custom React hook from scratch. This hook will be responsible for connecting to our SSE endpoint, parsing the structured tool_start, tool_end, and token events, and managing the conversation state in real-time. This is where we bring the agent's thought process to life on the screen!
💻 Complete Codebase
You can find the entire codebase for this project on GitHub. Each part of this series has its own dedicated branch (e.g., blog/1.2-stream-agent-events) so you can follow along step-by-step or jump straight to a specific implementation.
Feel free to star the repo if you’re building along with me — it helps keep the motivation high! See you in Part 2.1 as we start writing our client-side hook!