Building a useAgent Hook to Stream Agent Thought, Tools, and Output | Part 3 | Building Personal AI Assistant
Streaming LLM Tokens is NOT THE SAME as streaming agent events.

TL;DR
After writing the backend to stream AI agent events, the next challenge was handling those events on the frontend. In this post, I build a useAgent hook from scratch to consume streamed agent events, update the UI in real time, and make the agent’s behavior visible — what it’s thinking, which tools it’s using, and ultimately the final response.
Introduction
Welcome back to the Building a Personal AI Assistant series! In the previous part, we initialized a LangChain agent and invoked it using .streamEvents(), allowing the backend to emit structured events instead of a single final response.
Streaming events is key to understanding how an agent arrives at its final output. Unlike raw token streaming, agent events tell us what stage the agent is in, which tool it decided to use, and why the output looks the way it does. They also make it much easier to reason about and filter meaningful information from the continuous stream of data the agent produces.
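To make that concrete, the events the frontend consumes in this post roughly follow the shape below. This is a sketch; the field names are assumptions based on my backend from Part 2 and may differ in yours.

// A sketch of the agent event payloads this post consumes.
// Field names are assumptions based on the backend from Part 2.
type AgentEvent =
  | { type: 'token'; content: string }                    // a chunk of the final answer
  | { type: 'tool_start'; tool: string; input: unknown }  // agent decided to call a tool
  | { type: 'tool_end'; output: unknown }                 // tool returned a result
  | { type: 'done'; threadId: string }                    // turn complete
  | { type: 'error'; error: string };                     // something went wrong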

At this point, the backend was doing the right thing — but the frontend had no idea how to interpret these events. I decided to build a small abstraction that understands agent streaming natively. So I built a custom React hook called useAgent. Let's dive in.
Building the useAgent Hook
To keep the streaming logic separate from the Chat component, it was crucial to introduce a custom hook. This hook would own the conversation state, handle loading and streaming status, and expose a single function to start the agent execution.
Since the agent can both respond with text and invoke tools, I needed a message shape that could represent more than just plain content.
// src/hooks/useAgent.ts
import { useState, useCallback, useRef } from 'react';

export interface AgentMessage {
  role: 'user' | 'assistant';
  content: string;
  toolCalls?: Array<{ tool: string; input: any; output?: any }>;
}

Each message represents a single turn in the conversation. While most messages only contain text, assistant messages can optionally include toolCalls, which allows the UI to surface which tools were used and what data flowed through them:
// src/hooks/useAgent.ts
import { useState, useCallback, useRef } from 'react';

export interface AgentMessage {
  role: 'user' | 'assistant';
  content: string;
  toolCalls?: Array<{ tool: string; input: any; output?: any }>;
}

export const useAgent = () => {
  const [messages, setMessages] = useState<AgentMessage[]>([
    { role: 'assistant', content: 'Hello Bhai! How can I assist you today?' },
  ]);
  const [isLoading, setIsLoading] = useState(false);
  const [currentResponse, setCurrentResponse] = useState('');
  const threadIdRef = useRef<string | null>(null);

  const sendMessage = useCallback(async (message: string) => {}, []);

  const clearMessages = useCallback(() => {
    setMessages([]);
    setCurrentResponse('');
    threadIdRef.current = null;
  }, []);

  return {
    messages,
    isLoading,
    currentResponse,
    sendMessage,
    clearMessages,
    threadId: threadIdRef.current,
  };
};

The hook owns all state related to the agent lifecycle. messages stores the finalized conversation, while currentResponse is used to progressively build the assistant’s output as events stream in. This separation avoids unnecessary re-renders and keeps the message list stable until the agent finishes its turn.
I also use a ref to store the threadId, since it needs to persist across renders without triggering re-renders. This allows the agent to maintain conversational context while keeping the UI simple.
At this stage, sendMessage is intentionally left empty. Before dealing with streaming events, I wanted to clearly define what the hook owns and what it exposes.
Implementing sendMessage: Streaming Agent Events on the Frontend
// src/hooks/useAgent.ts
const sendMessage = useCallback(async (message: string) => {
  setIsLoading(true);
  setCurrentResponse('');

  // Add user message
  setMessages((prev) => [...prev, { role: 'user', content: message }]);

  try {
    // streaming logic..
  } catch (error) {
    console.error('Error sending message:', error);
    setMessages((prev) => [
      ...prev,
      {
        role: 'assistant',
        content: `Error: ${error instanceof Error ? error.message : 'Unknown error'}`,
      },
    ]);
  } finally {
    setIsLoading(false);
    setCurrentResponse('');
  }
}, []);

With the state and hook structure in place, the next step is implementing sendMessage — this is where the agent stream begins and where frontend event handling actually happens.
Sending a message is also the moment an agent “turn” begins. As soon as the user submits input, I mark the agent as loading and immediately append the user message to the conversation. This keeps the UI responsive and makes the interaction feel synchronous, even though the agent response is streamed asynchronously.
// src/hooks/useAgent.ts
try {
  // trigger the agent api
  const response = await fetch('/api/agent', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      message,
      threadId: threadIdRef.current,
    }),
  });

  // handle error
  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }
} catch (error) {
  // ...
}

Instead of treating each request as stateless, the hook sends the current threadId along with the message. This allows the backend agent to maintain context across turns without the UI needing to manage that complexity.
// src/hooks/useAgent.ts
const reader = response.body?.getReader();
const decoder = new TextDecoder();

if (!reader) {
  throw new Error('No reader available');
}

At this point, we are not waiting for a single response; the body is a stream of agent events continuously sent by the LangChain agent. A reader and a decoder are initialized to read and handle these events in a continuous loop.
// src/hooks/useAgent.ts
let assistantMessage = '';
const toolCalls: Array<{ tool: string; input: any; output?: any }> = [];
let currentToolIndex = -1;

While streaming, I maintain a local assistantMessage buffer to accumulate tokens, and a toolCalls array to track tool usage. This allows the UI to reflect both the agent’s reasoning process and its final response once the stream completes.
At the core of this is a continuous read loop that listens for agent events as they arrive.
// src/hooks/useAgent.ts
while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  const chunk = decoder.decode(value);
  const lines = chunk.split('\n');

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      try {
        const parsed = JSON.parse(data);

        if (parsed.type === 'token') {
          // Stream individual tokens
        } else if (parsed.type === 'tool_start') {
          // Tool started
        } else if (parsed.type === 'tool_end') {
          // Tool completed
        } else if (parsed.type === 'done') {
          // Save threadId for conversation continuity
        } else if (parsed.type === 'error') {
          // Handle error
        }
      } catch (e) {
        console.error('Failed to parse SSE data:', e);
      }
    }
  }
}

Since agent execution is event-driven, the frontend stays in a read loop for the duration of the stream. Each decoded chunk may contain multiple events, which are parsed and handled independently. This design allows the UI to react immediately to tokens, tool usage, and lifecycle events without waiting for the agent to finish its entire execution.
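The branch bodies above are where the earlier buffers get filled. Here is a minimal sketch of how they can be wired up, assuming the backend payloads carry content, tool, input, output, threadId, and error fields under exactly those names:

// Sketch of the event branches (payload field names are assumptions).
if (parsed.type === 'token') {
  // Accumulate tokens and surface partial output immediately
  assistantMessage += parsed.content;
  setCurrentResponse(assistantMessage);
} else if (parsed.type === 'tool_start') {
  // Record the tool invocation and remember its slot for the matching tool_end
  toolCalls.push({ tool: parsed.tool, input: parsed.input });
  currentToolIndex = toolCalls.length - 1;
} else if (parsed.type === 'tool_end') {
  // Attach the tool output to the invocation opened by tool_start
  if (currentToolIndex >= 0) {
    toolCalls[currentToolIndex].output = parsed.output;
  }
} else if (parsed.type === 'done') {
  // Persist the threadId so the next turn continues the same conversation
  threadIdRef.current = parsed.threadId;
} else if (parsed.type === 'error') {
  // Surface the failure in the transcript instead of throwing,
  // since a throw here would be swallowed by the JSON.parse catch
  assistantMessage += `\n\nError: ${parsed.error}`;
}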
// src/hooks/useAgent.ts
// Add assistant message
setMessages((prev) => [
  ...prev,
  {
    role: 'assistant',
    content: assistantMessage,
    toolCalls: toolCalls.length > 0 ? toolCalls : undefined,
  },
]);

Once the agent finishes streaming, the accumulated tokens stored in the local assistantMessage buffer are committed to the conversation state. At this point, the assistant’s response is considered final and is appended to the message list along with any tool calls that occurred during execution.
This separation between streamed output and finalized messages keeps the UI responsive while preserving a clean, stable conversation history.
Using useAgent in the Chat UI
// src/components/chat/chat-box.tsx
import { useState, type FormEvent } from 'react';

// Hook imports
import { useAgent } from '@/hooks/useAgent';

export default function ChatBox() {
  const [userInput, setUserInput] = useState<string>('');
  const { messages, sendMessage, isLoading, currentResponse } = useAgent();

  function handleSubmit(e: FormEvent) {
    e.preventDefault();
    if (!userInput.trim()) return; // Prevent sending empty messages
    sendMessage(userInput);
    setUserInput('');
  }

  return (
    <div className="max-w-5xl h-[90vh] mx-auto flex flex-col">
      ...
    </div>
  );
}

With the hook in place, the Chat component becomes intentionally minimal. It doesn’t know about streaming, agent events, or tools — it simply consumes state and triggers sendMessage when the user submits input.
// src/components/chat/chat-box.tsx
<ConversationList
  conversation={messages}
  currentResponse={currentResponse}
  isLoading={isLoading}
/>

The conversation list receives both finalized messages and the currently streaming response. This allows the UI to render partial output in real time without knowing anything about how that output is produced.
Displaying Tool Calls
// src/components/chat/chat-box.tsx
{conversation.map((msg, index) => {
  // Only the last message should show as streaming
  const isLastMessage = index === conversation.length - 1;
  const isStreaming = isLoading && isLastMessage;

  return (
    <div key={index}>
      <MessageContent
        message={msg}
        isStreaming={isStreaming}
        currentResponse={currentResponse}
      />

      {/* Tool Usage Indicator */}
      {msg.toolCalls && msg.toolCalls.length > 0 && (
        <div className="mt-2 ml-4">
          {msg.toolCalls.map((tool, idx) => (
            <div
              key={idx}
              className="inline-flex items-center gap-2 text-xs text-gray-500 dark:text-gray-400 bg-gray-100 dark:bg-gray-700 rounded-full px-3 py-1 mr-2"
            >
              <Wrench className="w-3 h-3" />
              <span>Used {tool.tool}</span>
            </div>
          ))}
        </div>
      )}
    </div>
  );
})}

Because tool calls are stored alongside messages inside useAgent, the UI can surface tool usage without any additional logic. The conversation list simply checks for the presence of toolCalls and renders an indicator.
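The MessageContent component itself isn’t shown above. A minimal sketch, assuming it simply swaps the committed content for the live buffer while a turn is streaming, might look like this (the file path and prop names are hypothetical):

// src/components/chat/message-content.tsx (hypothetical sketch)
import { Streamdown } from 'streamdown';
import type { AgentMessage } from '@/hooks/useAgent';

interface MessageContentProps {
  message: AgentMessage;
  isStreaming: boolean;
  currentResponse: string;
}

export function MessageContent({ message, isStreaming, currentResponse }: MessageContentProps) {
  // While the turn is still streaming, render the live buffer;
  // otherwise render the finalized message content.
  const text = isStreaming && currentResponse ? currentResponse : message.content;

  return (
    <Streamdown isAnimating={isStreaming} parseIncompleteMarkdown={true}>
      {text}
    </Streamdown>
  );
}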
// src/components/chat/conversation-list.tsx
{/* Current Response (Streaming) */}
{isLoading && currentResponse && (
  <div className={cn('px-6 py-4 rounded-lg transition-colors', 'border border-gray-200')}>
    <Streamdown isAnimating={true} parseIncompleteMarkdown={true}>
      {currentResponse}
    </Streamdown>
  </div>
)}

While the agent is still executing, the UI renders currentResponse separately from the finalized conversation. This allows partial tokens to be displayed in real time without committing them as a message until the agent finishes its turn.
Since responses are streamed token-by-token, the UI renders markdown incrementally using a streaming-aware renderer, allowing partial formatting without breaking the layout.

Wrapping Up
In this part, the focus was not on building a flashy chat UI, but on designing a frontend that actually understands how an AI agent behaves. By introducing a useAgent hook, the UI becomes a passive observer of agent execution rather than an active participant in streaming logic.
The hook owns the entire agent lifecycle — sending user input, streaming tokens, tracking tool usage, and finalizing responses — while the UI simply reacts to state changes. This separation makes the system easier to reason about, easier to extend, and significantly easier to debug as the agent grows more complex.
Instead of treating the AI as a black box that returns text, the frontend now treats it as a process: one that thinks, uses tools, and arrives at an answer over time.
What’s Next
With streaming and frontend orchestration in place, the next step is giving the agent more real-world capabilities by expanding its toolset.
In the upcoming parts, I’ll focus on building new tools for the AI agent, starting with integrations into external systems like Google Workspace. This includes things like:
- reading and summarizing calendar events
- checking availability
- interacting with emails or documents
- performing actions on the user’s behalf through authenticated APIs
These tools will plug directly into the same agent event pipeline you’ve seen here, which means the frontend won’t need to change — tool usage will automatically surface through streamed events and UI indicators.
Codebase
The entire codebase for this series is open-source and available on GitHub. It includes the backend agent implementation, streaming infrastructure, and the complete frontend built around the useAgent hook.
You can find the repository here: https://github.com/dakShh/bhai-saab
As always, feel free to explore, fork, or adapt parts of it for your own projects.