Introduction
Ever wish there was a smarter way to search for tech jobs? I built an AI agent that understands natural language queries like “Find me 5 latest Flutter jobs” and returns relevant postings from public job feeds.
This isn’t a tutorial on how to build agents from scratch. Instead, I’m walking you through how this specific project works—the decisions I made, the gotchas I hit, and real code examples you can learn from.
The agent workflow is simple but practical:
- User asks: “Show me remote backend roles”
- Agent extracts keywords: ['backend', 'remote']
- Agent calls a tool that searches cached RSS feeds
- Tool returns ranked, deduplicated jobs
- Agent formats results for the user
You’ll need Node.js 18+, an OpenAI API key, and comfort reading TypeScript. We’re not reinventing the wheel here—just building on top of some solid libraries.
What You’ll Need
Before diving in, have these ready:
- Node.js 18+ and npm
- An OpenAI API key (get one at https://platform.openai.com/)
- Basic TypeScript (don’t need to be expert—just comfortable with types)
If you’ve never used Mastra before, that’s fine. I’ll explain the key pieces as we go.
The Big Picture: How It All Fits Together
Imagine you’re building a small app with AI. You don’t just ask the AI a question and hope for the best. You:
- Give it a tool it can use (in our case: “search these job feeds”)
- Give it strict instructions (in our case: “always use the tool, don’t make stuff up”)
- Cache external data so you’re not hammering RSS feeds every second
- Validate everything with schemas so the code doesn’t break
Here’s the folder structure I’m working with:
src/mastra/
├── agents/
│   └── jobs-agent.ts        ← The agent + its instructions
├── tools/
│   └── rss-tool.ts          ← The "search jobs" tool
├── workflows/
│   └── jobs-workflow.ts     ← Compose steps that use the agent
├── utils/
│   ├── keyword-extractor.ts ← Parse "Find 5 Flutter jobs"
│   ├── feed-cache.ts        ← Cache RSS data locally
│   └── feed-scheduler.ts    ← Auto-refresh feeds every 30 min
├── scorers/
│   └── jobs-scorer.ts       ← Grade agent responses
└── index.ts                 ← Wire it all together
Don’t worry about memorizing this. You’ll see it in action.
Walking Through the Code: Step by Step
Step 1: The Agent Asks a Question
When a user types a query, it goes to the jobsAgent. Here’s what that agent looks like:
export const jobsAgent = new Agent({
  name: 'Jobs Agent',
  description: 'Fetches recent remote and tech-related job listings from public RSS feeds',
  instructions: `
    You are a jobs search assistant. ALWAYS use the fetch-jobs tool to search for jobs.
    Never make up job listings. If the tool returns 0 jobs, clearly state that no matches were found.
    Format results with: Title, Company, Location, Description, Posted Date.
  `,
  model: 'openai/gpt-4o-mini',
  tools: { rssTool },
  memory: new Memory({
    storage: new LibSQLStore({
      url: 'file:../mastra.db',
    }),
  }),
});
What’s happening here:
- We’re creating an agent that uses GPT-4o-mini (a fast, cheap OpenAI model)
- We give it strict instructions to always use the fetch-jobs tool (this prevents hallucination)
- It can access a memory store to remember past conversations
The key insight: the instructions are very explicit. We don’t say “maybe search for jobs.” We say “ALWAYS use the fetch-jobs tool.” This keeps the agent honest.
Step 2: Parse the User’s Query
Before calling the tool, we need to extract what the user actually wants. If someone says “Find me 5 latest Flutter jobs,” we extract:
- Keywords: ['flutter']
- Limit: 5 (see the sketch at the end of this step)
- Location: maybe 'remote'
This happens in keyword-extractor.ts:
export function extractKeywords(input: string): string[] {
  const stopwords = new Set([
    "find", "show", "latest", "remote", "job", "jobs", "for", "me", "the",
    // ... 40+ more common words
  ]);
  const stemmer = natural.PorterStemmer;

  const words = input
    .toLowerCase()
    .replace(/[^\w\s+.#]/g, "") // keep tech symbols like + and #
    .split(/\s+/)
    .filter(word => word.length > 2 && !stopwords.has(word))
    .map(word => stemmer.stem(word)); // "jobs" → "job", "running" → "run"

  return [...new Set(words)]; // remove duplicates
}
Breaking this down:
- We remove “noise words” like “find”, “show”, “the” (these don’t help find jobs)
- We lowercase everything and run a stemmer to reduce words to their root (so "Flutter" and "FLUTTER" both become "flutter", and "running" becomes "run")
- We return only the meaningful keywords
Example:
Input: "Find 5 latest Flutter jobs"
Output: ['flutter'] // (filtered out: find, 5, latest, jobs)
This is surprisingly effective because job titles usually contain the tech you’re searching for.
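The limit and location aren't handled by extractKeywords; those come from separate lightweight parsing. Here's a minimal sketch of the limit side (extractLimit is an illustrative name, not necessarily what the project calls it):

// Hypothetical helper: treat the first standalone number in the query as the result limit.
export function extractLimit(input: string, fallback = 10): number {
  const match = input.match(/\b(\d{1,3})\b/);
  return match ? parseInt(match[1], 10) : fallback;
}

// extractLimit("Find 5 latest Flutter jobs")   → 5
// extractLimit("Show me remote backend roles") → 10 (the fallback)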
Step 3: The Tool Does the Heavy Lifting
The rssTool is where the magic happens. It’s a Mastra tool, which means it has:
- An input schema (what data it accepts)
- An output schema (what it returns)
- An execute function (what it actually does)
export const rssTool = createTool({
  id: 'fetch-jobs',
  inputSchema: z.object({
    query: z.string().describe('e.g., "Flutter developer"'),
    limit: z.number().default(10),
  }),
  outputSchema: z.object({
    jobs: z.array(jobListingSchema),
    total: z.number(),
    query: z.string(),
  }),
  execute: async ({ context }) => {
    const { query, limit } = context;
    const keywords = extractKeywords(query);

    // Load all cached jobs from RSS feeds
    let allJobs = [];
    for (const feedUrl of rssFeeds) {
      const feedJobs = await fetchFeedWithCache(feedUrl);
      allJobs = allJobs.concat(feedJobs);
    }

    // Remove duplicates (same job posted to multiple feeds)
    const uniqueJobs = deduplicateJobs(allJobs);

    // Filter: only keep jobs that match keywords
    const matchedJobs = uniqueJobs.filter(job => {
      const title = job.title.toLowerCase();
      const desc = job.description.toLowerCase();
      return keywords.some(kw => title.includes(kw) || desc.includes(kw));
    });

    // Sort by relevance (title matches beat description matches), then by date
    const relevance = (job: { title: string; description: string }) =>
      keywords.some(kw => job.title.toLowerCase().includes(kw)) ? 2
      : keywords.some(kw => job.description.toLowerCase().includes(kw)) ? 1
      : 0;
    const sorted = matchedJobs
      .sort((a, b) =>
        (relevance(b) - relevance(a)) ||
        ((Date.parse(b.pubDate ?? '') || 0) - (Date.parse(a.pubDate ?? '') || 0))
      )
      .slice(0, limit);

    return {
      jobs: sorted,
      total: sorted.length,
      query,
    };
  },
});
What this does:
- Takes your keywords and limit
- Loads all cached job listings (from .cache/jobs, refreshed every 30 minutes)
- Removes duplicates (a job might be posted to multiple feeds; see the sketch just after this list)
- Filters to only jobs matching your keywords
- Ranks by relevance (title matches are more important than description matches)
- Returns the top N results
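The deduplicateJobs helper isn't shown above. A minimal sketch of one reasonable implementation, keyed on the job link (the real project may deduplicate differently):

// Sketch: treat the job link as the identity; keep the first occurrence of each.
function deduplicateJobs(jobs: CachedJob[]): CachedJob[] {
  const seen = new Set<string>();
  return jobs.filter(job => {
    const key = job.link || job.title; // fall back to the title if the link is empty
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}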
Notice: we don’t fetch live RSS here. We use a cache. This is important because:
- RSS feeds can be slow (5-10 second timeouts)
- We’d hit rate limits if we fetched on every query
- Users expect fast responses
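One more note: rssFeeds, used in the loop above, is just an array of feed URLs (the Next Steps section points at src/mastra/data/rss-feeds.ts). Something like this, with placeholder URLs standing in for the real feeds:

// src/mastra/data/rss-feeds.ts (illustrative; swap in real public job feeds)
export const rssFeeds: string[] = [
  'https://example.com/remote-jobs.rss',
  'https://example.com/tech-jobs.rss',
];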
Step 4: Feed Caching
The cache lives in .cache/jobs and stores raw job listings for ~4 hours. Here’s how it works:
export async function fetchFeedWithCache(feedUrl: string): Promise<CachedJob[]> {
  // Try cache first
  const cached = await loadFromCache(feedUrl);
  if (cached !== null) {
    return cached; // Cache hit! Return instantly
  }

  // Cache miss: fetch live, then save
  try {
    const response = await axios.get(feedUrl, { timeout: 8000 });
    const feed = await parseFeed(response.data);
    const jobs = feed.items.map(item => ({
      title: item.title || 'No title',
      link: item.url || '',
      description: item.description?.substring(0, 500) || 'No description',
      pubDate: item.published?.toISOString(),
      source: feedUrl,
    }));

    // Save to cache for next time
    await saveToCache(feedUrl, jobs);
    return jobs;
  } catch (error) {
    console.error(`Failed to fetch ${feedUrl}:`, (error as Error).message);
    return []; // Return empty list, don't crash
  }
}
Why this matters:
- First request for a feed? We fetch and parse the RSS (takes 2-3 seconds)
- Follow-up requests within 4 hours? We serve from cache (instant)
- After 4 hours? We refresh from live feeds again
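loadFromCache and saveToCache aren't shown above either. Here's a minimal file-based sketch with the 4-hour TTL, assuming one JSON file per feed named by a hash of its URL (the project's actual storage layout may differ):

import { createHash } from 'node:crypto';
import { mkdir, readFile, writeFile } from 'node:fs/promises';
import path from 'node:path';

const CACHE_DIR = '.cache/jobs';
const TTL_MS = 4 * 60 * 60 * 1000; // 4 hours

function cachePath(feedUrl: string): string {
  const key = createHash('sha256').update(feedUrl).digest('hex');
  return path.join(CACHE_DIR, `${key}.json`);
}

export async function loadFromCache(feedUrl: string): Promise<CachedJob[] | null> {
  try {
    const raw = await readFile(cachePath(feedUrl), 'utf8');
    const { savedAt, jobs } = JSON.parse(raw);
    if (Date.now() - savedAt > TTL_MS) return null; // expired → treat as a miss
    return jobs;
  } catch {
    return null; // missing or unreadable file → cache miss
  }
}

export async function saveToCache(feedUrl: string, jobs: CachedJob[]): Promise<void> {
  await mkdir(CACHE_DIR, { recursive: true });
  await writeFile(cachePath(feedUrl), JSON.stringify({ savedAt: Date.now(), jobs }));
}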
Step 5: Auto-Refresh with a Scheduler
We don’t want stale data. So every 30 minutes, we automatically refresh all feeds:
export function startFeedScheduler(intervalMinutes: number = 30): void {
  if (isSchedulerActive) return;

  const cronExpression = `*/${intervalMinutes} * * * *`; // */30 = every 30 min
  cron.schedule(cronExpression, async () => {
    const result = await refreshAllFeeds();
    console.log(`Refreshed: ${result.refreshed} feeds, Failed: ${result.failed}`);
  });

  isSchedulerActive = true;
}
How it works:
- Uses node-cron to schedule a background job
- Every 30 minutes, it re-fetches the feeds and updates the cache, so entries never sit longer than the 4-hour TTL
- If a feed fails, we catch the error and continue (don’t crash the app)
- This happens in the background—users don’t wait for it
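refreshAllFeeds isn't shown above. A sketch of what it could look like, assuming a clearCache(feedUrl) helper that evicts a feed's cache entry (both the helper and its name are my assumption):

// Sketch: force-refresh every feed and tally successes/failures.
// clearCache is a hypothetical helper that drops a feed's cache entry.
export async function refreshAllFeeds(): Promise<{ refreshed: number; failed: number }> {
  let refreshed = 0;
  let failed = 0;
  for (const feedUrl of rssFeeds) {
    try {
      await clearCache(feedUrl);         // drop the stale entry
      await fetchFeedWithCache(feedUrl); // re-fetch and re-cache
      refreshed++;
    } catch {
      failed++; // keep going; one bad feed shouldn't stop the rest
    }
  }
  return { refreshed, failed };
}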
Step 6: Validation with Zod
Every tool and workflow uses Zod schemas. This is important because TypeScript types disappear at runtime, but Zod validates at runtime:
const jobSchema = z.object({
  title: z.string(),
  link: z.string(),
  description: z.string(),
  pubDate: z.string().optional(),
  source: z.string(),
});

const jobSearchResult = z.object({
  jobs: z.array(jobSchema),
  total: z.number(),
  query: z.string(),
});
Why this matters:
- If something returns bad data, Zod will catch it
- The agent can’t accidentally pass malformed data downstream
- You get helpful error messages instead of mysterious crashes later
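To make that concrete, here's what catching bad data looks like with safeParse (standard Zod; the malformed payload is invented for illustration):

const result = jobSearchResult.safeParse({
  jobs: [{ title: 'Flutter Dev', link: 'https://example.com/job/1', description: 'Build apps' }],
  total: '1', // wrong type: should be a number
  query: 'flutter',
});

if (!result.success) {
  console.error(result.error.issues);
  // → one issue for the missing `source` on jobs[0],
  //   one for `total` being a string instead of a number
}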
Actually Running This Thing
Let’s get it working locally first:
npm install
npm run dev
This starts a dev server on http://localhost:4111/ with a web playground. Go there, select jobsAgent from the dropdown, and try typing:
Find 5 latest Flutter jobs
You’ll see the agent:
- Extract “flutter” as the keyword
- Search the cached jobs
- Return formatted results with links
If you want to test the API directly:
curl -X POST http://localhost:4111/api/agents/jobsAgent/stream \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Show me remote backend roles"}
    ]
  }'
For production, build it:
npm run build
This creates .mastra/output with all the bundled code, and tells you how to start it:
node --import=./.mastra/output/instrumentation.mjs .mastra/output/index.mjs
Environment Setup
Create a .env file in the project root:
OPENAI_API_KEY=sk-your-key-here
That’s it. Everything else has sensible defaults.
How to Extend This Project
Add a New Tool
Say you want a tool that looks up company data. Here’s the pattern:
// src/mastra/tools/company-lookup.ts
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
export const companyLookupTool = createTool({
  id: 'lookup-company',
  description: 'Get details about a company',
  inputSchema: z.object({
    companyName: z.string(),
  }),
  outputSchema: z.object({
    name: z.string(),
    foundingYear: z.number(),
    description: z.string(),
  }),
  execute: async ({ context }) => {
    const { companyName } = context;
    // Call your API or database
    return {
      name: companyName,
      foundingYear: 2020,
      description: 'A cool company',
    };
  },
});
Then register it in src/mastra/index.ts:
import { companyLookupTool } from './tools/company-lookup.js';
// In the mastra config:
agents: {
  jobsAgent: new Agent({
    // ...
    tools: { rssTool, companyLookupTool }, // ← Add here
  }),
},
Add a New Agent
Same idea—create the agent, give it strict instructions, register it:
// src/mastra/agents/recruiter-agent.ts
export const recruiterAgent = new Agent({
  name: 'Recruiter Agent',
  instructions: `You help recruiters find candidates. Use the lookup-company and fetch-jobs tools.`,
  model: 'openai/gpt-4o-mini',
  tools: { companyLookupTool, rssTool },
  // ... memory, scorers, etc.
});
Then in index.ts:
agents: {
  jobsAgent,
  recruiterAgent, // ← Add here
},
Add Observability
The scorers are already set up to grade agent responses. But you can log what happened:
const response = await jobsAgent.generate("Find Flutter jobs");

console.log(`Agent response:`, response.text);
console.log(`Tool calls:`, response.toolResults?.length);
console.log(`Artifacts:`, response.artifacts);
What I Learned
1. Explicit instructions matter more than model size. A cheap model with crystal-clear instructions (“ALWAYS use this tool, NEVER make up jobs”) beats a fancy model with vague instructions every time.
2. Cache external data aggressively. RSS feeds are slow and unreliable. Caching with a 4-hour TTL means users get instant responses after the first request.
3. Zod schemas reduce integration bugs. Catching bad data at the tool boundary prevents cascading failures downstream.
4. Schedulers are better than webhooks. Periodic refresh via cron is simpler than trying to monitor feed changes.
5. Testing is hard with LLMs. You can’t easily unit test an agent’s text generation. Use scorers and manual testing with curl instead.
Next Steps
Now that you understand how this works, try:
- Add more feeds to src/mastra/data/rss-feeds.ts
- Improve keyword extraction using embeddings or NER instead of keyword matching (sketch below)
- Add company metadata by looking up each job’s company in a database
- Deploy to production using the build output
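For the embeddings idea, here's a sketch of that direction using OpenAI's embeddings endpoint (the model choice and scoring are my own; the project currently does none of this):

import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Score a job title against the query via cosine similarity of embeddings.
async function semanticScore(query: string, jobTitle: string): Promise<number> {
  const { data } = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: [query, jobTitle],
  });
  const [q, t] = data.map(d => d.embedding);
  const dot = q.reduce((sum, v, i) => sum + v * t[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(q) * norm(t));
}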
References
- Mastra docs: https://docs.mastra.ai
- @rowanmanning/feed-parser: https://www.npmjs.com/package/@rowanmanning/feed-parser
- natural (PorterStemmer): https://www.npmjs.com/package/natural
- Zod validation: https://zod.dev
- node-cron: https://www.npmjs.com/package/node-cron