How to Build Travel AI Agents Using Phidata and Qdrant


I have seen the travel industry undergoing a massive shift in the last few years and AI now sits at the center of that transformation. Nearly every company is racing to build the next personal travel concierge: a system that understands your preferences, budget, and travel style, and plans the perfect trip with a single prompt.

Yet the results often disappoint. Instead of curated, high-quality itineraries, users often receive:

Hotels that don’t exist
Restaurants that closed years ago
Generic “copy-paste” trip plans
Recommendations that don’t match their budget or interests

The core issue is that most of these systems rely purely on the language model without connecting it to real-world travel data.

When you converse with a language model, it’s astonishingly good at grasping what you intend: it can reason, summarize, and even infer what you mean from just a few words. But there is one enormous limitation: when the model has no actual, fact-checked information to draw on, it begins to guess. That is where misinformation and half-correct recommendations start creeping in.

That is exactly why RAG (Retrieval Augmented Generation) changes the game. Instead of letting the model rely only on what it “remembers,” RAG makes it pause, understand your request, and then actually look things up — searching through reliable databases, documents, or sources. The model doesn’t imagine answers; it grounds them in real data before presenting anything to you.
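To make the retrieve-then-generate idea concrete, here is a minimal sketch in plain Python. The tiny `KNOWLEDGE_BASE`, the keyword-overlap `retrieve` function, and the prompt wording are all illustrative assumptions; the system built later in this guide uses Qdrant for retrieval instead.

```python
# Toy retrieval-augmented prompt builder. All records and names are hypothetical.
KNOWLEDGE_BASE = [
    {"name": "Baga Beach", "text": "popular beach with water sports and nightlife"},
    {"name": "Old Goa churches", "text": "historic churches and quiet heritage walks"},
    {"name": "Anjuna flea market", "text": "weekly market with crafts, food and music"},
]

def retrieve(query, records, top_k=2):
    """Rank records by how many query words appear in their text."""
    query_words = set(query.lower().split())
    scored = []
    for record in records:
        text_words = set(record["text"].lower().replace(",", "").split())
        overlap = len(query_words & text_words)
        if overlap > 0:
            scored.append((overlap, record))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:top_k]]

def build_grounded_prompt(query, records):
    """Put the retrieved facts into the prompt so the model answers from them."""
    context = "\n".join(f"- {r['name']}: {r['text']}" for r in records)
    return (
        "Answer using only the facts below. If they are insufficient, say so.\n"
        f"Facts:\n{context}\n\nQuestion: {query}"
    )

hits = retrieve("beach nightlife", KNOWLEDGE_BASE)
prompt = build_grounded_prompt("Where can I find beach nightlife?", hits)
print(prompt)
```

The point is the ordering: look things up first, then hand only the retrieved facts to the model.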

But here’s the thing:

RAG alone still won’t give you a great trip.

A memorable journey isn’t a bunch of search results stitched together. You want pacing that fits your energy, timing that respects your style, and trade-offs that make sense for you. Maybe you are a sunrise person. Maybe you like slow mornings and lively nights. Maybe food matters more than museums for you. A plain RAG system won’t understand that.

That is where multi-agent AI systems completely change the experience.

Think of it like having your own personal travel team.

One agent gets to know you: your vibe, budget, preferences, what kind of moments you like.

Another becomes your hotel hunter, filtering out noise and only bringing the stays that match your mood to you.

A third one becomes your explorer, diving into attractions, food spots, hidden gems, nightlife, and even offbeat places you didn’t know existed.

Then there’s the orchestrator agent, the planner with the big map who takes everything the others found and puts it into a day-by-day itinerary that feels like it was tailored just for you.

The bottom line?

Not a robotic list, nor a hollow summary. But a trip that reflects you: your pace, your preferences, your travel personality. It feels less like using a tool… and more like having a personal travel concierge that knows you better with each conversation.

And the best part is that you can build a system like this and even outperform many commercial AI planners using Phidata + Qdrant, with no large engineering team or complex infrastructure.

To demonstrate the full workflow, this guide walks you through a working prototype built for Goa, India, featuring:

A hotel agent (dense + sparse search) using Goibibo datasets
A discovery agent that surfaces must-visit attractions using real traveler review datasets
A planner agent that combines both and generates a personalized itinerary

Architecture

Source: Author

Our system currently consists of three specialized agents, each responsible for a different part of the travel-planning pipeline. Instead of one model doing everything, each agent performs a focused task and all of this orchestration is made possible using the Phidata (Phi) agentic framework.

Phidata allows us to register tools, define agents, and orchestrate how they talk to each other. Instead of manually routing queries, it handles intent-detection, tool-calling, inter-agent communication, and final response synthesis behind the scenes.

Hotel Agent

It uses hotel_search_tool, which retrieves hotel information from the Goibibo dataset stored in Qdrant by combining dense MiniLM embeddings for semantic search with sparse TF-IDF vectors for accurate keyword matching.
It applies filters such as minimum rating, star category, and so on.

Discovery Agent

It uses discover_places_tool to recommend places to visit in Goa based on descriptions from real travelers.
It leverages information from the Indian Places to Visit Reviews Dataset stored in Qdrant.
The tool performs hybrid search, retrieves the most relevant places, and then reranks them using a CrossEncoder to improve the accuracy of the matches.

Planner Agent

It is the “CEO” of our agent system, combining hotel recommendations, places, and sequencing into a proper itinerary.
It reads user intent and decides:
Do they need a hotel? If yes, it calls hotel_search_tool.
Do they want sightseeing? If yes, it calls discover_places_tool.
Do they want a full itinerary? If yes, it calls both, then analyses the results and drafts a final response.
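The decision logic above can be sketched as a simple keyword router. This is only an illustration: in the actual system the LLM behind the planner makes this choice, and the word lists and the `choose_tools` function are my own hypothetical stand-ins.

```python
# Hypothetical keyword router; in the real system the LLM makes this decision.
HOTEL_WORDS = {"hotel", "hotels", "stay", "stays", "resort"}
PLACES_WORDS = {"sightseeing", "places", "attractions", "nightlife", "beaches"}
ITINERARY_WORDS = {"itinerary", "plan", "trip"}

def choose_tools(user_request):
    """Return the tool names a planner would call for this request."""
    cleaned = user_request.lower().replace(",", " ").replace(".", " ").replace("?", " ")
    words = set(cleaned.split())
    if words & ITINERARY_WORDS:
        # A full itinerary needs both hotels and places.
        return ["hotel_search_tool", "discover_places_tool"]
    tools = []
    if words & HOTEL_WORDS:
        tools.append("hotel_search_tool")
    if words & PLACES_WORDS:
        tools.append("discover_places_tool")
    return tools

print(choose_tools("Find me a 4-star hotel in Candolim"))
print(choose_tools("What nightlife is there in Baga?"))
print(choose_tools("Plan a 4-day Goa trip"))
```

An LLM handles paraphrases and mixed requests far better than fixed word lists, which is why the planner delegates this step to the model.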

Why Is This Architecture So Strong?

Modular: Each agent is responsible for one domain, thereby reducing hallucinations.
Powered by RAG: Both hotels and places rely on real datasets stored in Qdrant.
Context-Aware: The planner agent uses tool outputs + reasoning, producing coherent itineraries.
Extendable: New agents, such as a food agent, a budget optimizer agent, or a flight search agent, can be added to the current system without changing the existing ones.

Building the Travel Knowledge Base

Dataset Collection

We use two Kaggle datasets: the Goibibo Hotels Dataset for hotel listings and the Indian Places to Visit Reviews Dataset for attraction-level reviews and places to visit.

Since Kaggle requires authentication, I’ll begin by configuring the API inside our Colab environment:

    # Create the .kaggle directory
    !mkdir -p ~/.kaggle
    # Copy kaggle.json from its upload location in Colab
    !cp "/content/kaggle.json" ~/.kaggle/
    # Restrict permissions on the credentials file
    !chmod 600 ~/.kaggle/kaggle.json

Once authenticated, we download and unzip both datasets.

Goibibo Hotels Dataset (for hotel recommendation agent): This dataset provides hotel names, amenities, ratings, room types, and locality data used to build the hybrid semantic–keyword search for the hotel agent.

    # Download the Goibibo hotels dataset from Kaggle
    !kaggle datasets download -d PromptCloudHQ/hotels-on-goibibo
    # Extract into folder "goibibo_hotels"
    !unzip hotels-on-goibibo.zip -d goibibo_hotels

Indian Places to Visit Reviews Dataset (for discovery agent): This dataset contains traveler reviews and place descriptions from across India. We filter all records belonging to Goa and create dense + sparse embeddings for the discovery agent.

    # Download the reviews-based tourism dataset from Kaggle
    !kaggle datasets download -d ritvik1909/indian-places-to-visit-reviews-data
    # Extract into folder "place_reviews"
    !unzip indian-places-to-visit-reviews-data.zip -d place_reviews

Data Analysis

Before building the retrieval engine and AI agents, the first step is to prepare our travel datasets.

Load & clean the Goibibo Hotel Dataset

    import pandas as pd

    df = pd.read_csv("goibibo_hotels/goibibo_com-travel_sample.csv")

    # Normalize column names
    df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

    # Filter rows where state = Goa
    goa_hotel_df = df[df["state"].str.lower() == "goa"]
    goa_hotel_df.head(3)

Next, we keep only the hotel attributes relevant for retrieval and filtering.

    columns_to_keep = [
        "property_id", "property_name", "hotel_facilities", "address",
        "locality", "city", "state", "hotel_star_rating",
        "site_review_rating", "site_review_count", "room_type",
    ]
    final_cols = [col for col in columns_to_keep if col in goa_hotel_df.columns]
    goa_hotels = goa_hotel_df[final_cols].copy()

    # Remove rows with missing values
    goa_hotels = goa_hotels.dropna()
    goa_hotels.head(2)

This cleaned dataset becomes the knowledge base for the hotel search agent.

Load & clean the Tourist Places Review Dataset

    import pandas as pd

    df = pd.read_csv("place_reviews/Review_db.csv")
    # Normalize city names so they match the lowercase list below
    df["City"] = df["City"].astype(str).str.strip().str.lower()

Define the list of Goa cities and localities present in the dataset.

    goa_places = [
        "agonda", "alto-porvorim", "amboli", "anjuna", "assagao", "baga", "bardez",
        "benaulim", "calangute", "canacona", "candolim", "chapora", "divar island",
        "dona paula", "margao", "marmagao", "mapusa", "nuvem", "old goa",
        "panjim", "porvorim", "quepem", "saligao", "sangolda", "sanguem",
        "sanquelim", "vasco da gama", "varca", "velsao", "verna", "vagator",
    ]

Filter the dataset to only include entries from these regions:

    # From here on, goa_places is a DataFrame (it replaces the list of city names)
    goa_places = df[df["City"].isin(goa_places)].copy()

    # Keep only one row per city to prevent duplicates
    goa_places = goa_places.drop_duplicates(subset=["City"], keep="first").reset_index(drop=True)
    goa_places.head(2)

This dataset feeds into the place discovery agent, helping it retrieve tourist attractions, food spots, beaches, and nightlife areas from across Goa.

Embedding Models

To power semantic search and hybrid retrieval, we’ll prepare two embedding models:

Dense Embedding Model (SentenceTransformer)
Sparse Embedding Model (TF-IDF)

Dense Embeddings

    from sentence_transformers import SentenceTransformer

    dense_model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

This model (MiniLM-L6-v2) creates 384-dimensional semantic vectors, allowing the system to understand meaning such as:

“beach resort” ≈ “seaside hotel”
“nightlife places” ≈ “clubs”, “party spots”

Dense vectors retrieve hotels or places based on meaning, even if the exact keywords are not present.
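To show what "close in meaning" looks like numerically, here is a small sketch of cosine similarity, the distance measure used for the dense vectors in this guide. The three-dimensional vectors are made-up stand-ins; real MiniLM embeddings have 384 dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings", chosen by hand for illustration.
beach_resort = [0.9, 0.1, 0.0]
seaside_hotel = [0.8, 0.2, 0.1]
wax_museum = [0.0, 0.2, 0.9]

sim_close = cosine_similarity(beach_resort, seaside_hotel)
sim_far = cosine_similarity(beach_resort, wax_museum)
print(round(sim_close, 3), round(sim_far, 3))
```

Phrases with similar meaning point in similar directions and score close to 1, while unrelated ones score near 0.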

Sparse Embeddings

    from sklearn.feature_extraction.text import TfidfVectorizer

    tfidf = TfidfVectorizer()

TF-IDF creates keyword-based sparse vectors.

Advantages:

Captures exact keyword matches (“spa”, “pool”, “breakfast included”)
Helps retrieve items that match specific phrases
Crucial for precision filtering

Using both dense + sparse vectors gives a hybrid search that is typically more accurate than either alone.
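One detail matters a lot for the sparse side: documents and queries must be mapped through the same vocabulary, otherwise their indices do not line up. The sketch below is a toy TF-IDF in plain Python, with simplified weighting rather than scikit-learn's exact output. The functions `fit_vocabulary` and `to_sparse` and the sample room types are hypothetical.

```python
import math
from collections import Counter

def fit_vocabulary(corpus):
    """Build one shared token->index map and IDF weights from the whole corpus."""
    doc_tokens = [set(doc.lower().split()) for doc in corpus]
    vocab = {tok: i for i, tok in enumerate(sorted(set().union(*doc_tokens)))}
    n_docs = len(corpus)
    idf = {
        tok: math.log((1 + n_docs) / (1 + sum(tok in d for d in doc_tokens))) + 1
        for tok in vocab
    }
    return vocab, idf

def to_sparse(text, vocab, idf):
    """Return (indices, values); tokens outside the vocabulary are ignored."""
    counts = Counter(tok for tok in text.lower().split() if tok in vocab)
    index_to_tok = {i: t for t, i in vocab.items()}
    indices = sorted(vocab[tok] for tok in counts)
    values = [counts[index_to_tok[i]] * idf[index_to_tok[i]] for i in indices]
    return indices, values

corpus = ["deluxe room pool view", "standard room", "suite with private pool"]
vocab, idf = fit_vocabulary(corpus)  # fit once, on every document
doc_indices, _ = to_sparse(corpus[0], vocab, idf)
query_indices, query_values = to_sparse("pool suite", vocab, idf)
print(vocab)
print(query_indices, query_values)
```

Because the query and the documents share one vocabulary, the index for "pool" in the query is the same index stored for every document that mentions a pool. That shared index is what lets a sparse search match them.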

Connecting to Qdrant Vector Database

    from qdrant_client import QdrantClient

    Q_client = QdrantClient(
        url="https://2ddc31f6-2586-4530-a046-770d105c0382.europe-west3-0.gcp.cloud.qdrant.io:6333",
        api_key="YOUR API KEY"
    )

We initialize the Qdrant client that connects to a cloud-hosted instance.

Why Qdrant?

For me, Qdrant is not just another vector database; it’s the missing puzzle piece that has made modern AI search both practical and powerful at the same time. I wanted something that captured those small, vital details people actually type in.

Qdrant does exactly that.

With hybrid search allowing semantic intelligence and keyword precision to live together in the same collection, it doesn’t force me to choose between “smart” and “accurate”. And with HNSW under its hood, I get lightning-fast search even when the dataset grows to millions of travel listings, hotels, or attractions.

The thing I like most is the way Qdrant handles metadata. Payload filters make it easy to narrow results by price, rating, city, vibe, or even a personal preference like ‘quiet neighborhoods.’ That flexibility lets my travel planning system behave almost like a human researcher, aware of real-world constraints and able to adapt to them.
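As a rough picture of what such a metadata filter does, the sketch below applies the same kind of conditions in plain Python. Qdrant evaluates filters inside the database during the vector search; this only illustrates the logic. The three hotel payloads and the `matches` function are hypothetical.

```python
# Hypothetical payloads shaped like the hotel records used later in this guide.
hotels = [
    {"property_name": "Hotel A", "city": "Goa", "hotel_star_rating": 5, "site_review_rating": 4.4},
    {"property_name": "Hotel B", "city": "Goa", "hotel_star_rating": 3, "site_review_rating": 4.6},
    {"property_name": "Hotel C", "city": "Goa", "hotel_star_rating": 4, "site_review_rating": 3.2},
]

def matches(payload, city, min_stars, min_rating):
    """Every condition has to hold, like the `must` clause of a filter."""
    return (
        payload["city"] == city
        and payload["hotel_star_rating"] >= min_stars
        and payload["site_review_rating"] >= min_rating
    )

selected = [h["property_name"] for h in hotels if matches(h, "Goa", 4, 4.0)]
print(selected)
```

Only records that pass every condition remain candidates for the similarity ranking.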

Create Qdrant Collections (Hotels + Places)

Two separate collections store hotels and tourist attractions.

Hotels Collection

    from qdrant_client.models import VectorParams, Distance, SparseVectorParams

    Q_client.recreate_collection(
        collection_name="goa_hotels",
        # Named dense vector: 384 dimensions to match all-MiniLM-L6-v2
        vectors_config={
            "dense": VectorParams(size=384, distance=Distance.COSINE),
        },
        # Named sparse vector for the TF-IDF representation
        sparse_vectors_config={
            "sparse": SparseVectorParams()
        }
    )
    print("Created Qdrant collection: goa_hotels")

Places Collection

    Q_client.recreate_collection(
        collection_name="goa_places",
        vectors_config={
            "dense": VectorParams(size=384, distance=Distance.COSINE)
        },
        sparse_vectors_config={
            "sparse": SparseVectorParams()
        }
    )
    print("Created Qdrant collection: goa_places")

Uploading Vector Embeddings to Qdrant (Hotels + Places)

Hotels Embedding Upload

This step creates dense vectors from the hotel_facilities column and sparse TF-IDF vectors from the room_type column.

    from qdrant_client.models import SparseVector, PointStruct

    # Fit TF-IDF once on every room-type string so all hotels share one vocabulary
    # (fitting inside the loop would give each row its own, incompatible indices)
    tfidf.fit(goa_hotels["room_type"].astype(str))

    points = []
    for idx, row in goa_hotels.iterrows():
        # Dense
        dense_vec = dense_model.encode(str(row["hotel_facilities"])).tolist()

        # Sparse TF-IDF
        sparse_row = tfidf.transform([str(row["room_type"])])
        sparse_vec = SparseVector(
            indices=sparse_row.indices.tolist(),
            values=sparse_row.data.tolist()
        )

        # Build point
        points.append(
            PointStruct(
                id=int(idx),
                vector={
                    "dense": dense_vec,
                    "sparse": sparse_vec
                },
                payload=row.to_dict()
            )
        )

    Q_client.upsert(collection_name="goa_hotels", points=points)
    print(f"Inserted {len(points)} hybrid hotel vectors")

Next, we add payload indexes for city, hotel_star_rating, and site_review_rating so that filters on these fields stay fast.

    from qdrant_client.models import PayloadSchemaType

    Q_client.create_payload_index(
        collection_name="goa_hotels",
        field_name="city",
        field_schema=PayloadSchemaType.KEYWORD
    )
    Q_client.create_payload_index(
        collection_name="goa_hotels",
        field_name="hotel_star_rating",
        field_schema=PayloadSchemaType.INTEGER
    )
    Q_client.create_payload_index(
        collection_name="goa_hotels",
        field_name="site_review_rating",
        field_schema=PayloadSchemaType.FLOAT
    )
    print("Indexes created successfully")

Tourist Places Embedding Upload

Creates a combined text block using Place + Review + City.

    goa_places["text"] = (
        goa_places["Place"].astype(str) + " - " +
        goa_places["Review"].astype(str) + " - Located in " +
        goa_places["City"].astype(str)
    )

Generates both dense + sparse vectors from the text column.

    from tqdm import tqdm
    from qdrant_client.models import PointStruct, SparseVector

    # Fit TF-IDF once on all place texts so every place shares one vocabulary.
    # Note: this refits the shared `tfidf` and replaces any vocabulary fitted earlier;
    # use a separate TfidfVectorizer per collection if both are queried in one session.
    tfidf.fit(goa_places["text"])

    points = []
    for i, row in tqdm(goa_places.iterrows(), total=len(goa_places)):
        text = row["text"]

        # Dense
        dense_vec = dense_model.encode(text).tolist()

        # Sparse TF-IDF
        sparse_tf = tfidf.transform([text])
        sparse_vec = SparseVector(
            indices=sparse_tf.indices.tolist(),
            values=sparse_tf.data.tolist()
        )

        # Build point
        points.append(
            PointStruct(
                id=int(i),
                vector={
                    "dense": dense_vec,
                    "sparse": sparse_vec
                },
                payload=row.to_dict()
            )
        )

    Q_client.upsert(
        collection_name="goa_places",
        points=points
    )
    print(f"Inserted {len(points)} hybrid tourist goa places vectors")

Reranker Model

After Qdrant finds the initial matches, the CrossEncoder reranker helps decide which ones actually fit the user’s intent. It reads the query and each result together, picks up subtle meaning and context, and assigns each pair a single relevance score. Sorting by that score makes the final results more relevant to the request.
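The reranking step itself is just "score every (query, result) pair, then sort". The sketch below shows that shape with a deliberately naive word-overlap scorer standing in for the CrossEncoder. The `overlap_score` and `rerank_documents` functions and the sample texts are hypothetical.

```python
def overlap_score(query, document):
    """Stand-in scorer: fraction of query words present in the document."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words)

def rerank_documents(query, documents, score_fn=overlap_score):
    """Score every (query, document) pair, then sort best-first."""
    scored = [(score_fn(query, doc), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]

candidates = [
    "budget hostel near the bus stand",
    "beach resort with pool and spa",
    "resort with pool",
]
ranked = rerank_documents("resort with pool and spa", candidates)
print(ranked)
```

A real cross-encoder replaces `overlap_score` with a model that reads both texts jointly, so it can reward matches in meaning and not only in wording.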

Load a Reranker Model

    from sentence_transformers import CrossEncoder

    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

Building the Hotel Search Tool

The Hotel Search Tool is responsible for retrieving, filtering, and ranking hotels from the Goa Hotels vector database stored in Qdrant.

Retrieving Information About Hotels from Qdrant

    from qdrant_client.models import (
        Filter, FieldCondition, MatchValue, Range,
        SparseVector, NamedVector, NamedSparseVector,
    )

    def retrieve_candidates(query, min_stars=0, min_rating=0, top_k=10):
        # Dense query vector
        q_dense = dense_model.encode(query).tolist()

        # Sparse TF-IDF query vector (tfidf must hold the vocabulary fitted on the hotel corpus)
        sparse_q = tfidf.transform([query])
        sparse_vec = SparseVector(
            indices=sparse_q.indices.tolist(),
            values=sparse_q.data.tolist()
        )

        # Filters: every condition must hold
        hotel_filter = Filter(
            must=[
                FieldCondition(key="city", match=MatchValue(value="Goa")),
                FieldCondition(key="hotel_star_rating", range=Range(gte=min_stars)),
                FieldCondition(key="site_review_rating", range=Range(gte=min_rating))
            ]
        )

        # Dense search
        dense_results = Q_client.search(
            collection_name="goa_hotels",
            query_vector=NamedVector(name="dense", vector=q_dense),
            query_filter=hotel_filter,
            limit=top_k
        )

        # Sparse search
        sparse_results = Q_client.search(
            collection_name="goa_hotels",
            query_vector=NamedSparseVector(name="sparse", vector=sparse_vec),
            query_filter=hotel_filter,
            limit=top_k
        )

        # Combine IDs from both searches, dropping duplicates
        unique_ids = list({r.id for r in dense_results + sparse_results})
        if not unique_ids:
            return []

        # Fetch all matching records in a single call
        candidates = []
        for rec in Q_client.retrieve("goa_hotels", ids=unique_ids):
            candidates.append({
                "id": rec.id,
                "hotel": rec.payload.get("property_name"),
                "locality": rec.payload.get("locality"),
                "stars": rec.payload.get("hotel_star_rating"),
                "rating": rec.payload.get("site_review_rating"),
                "facilities": rec.payload.get("hotel_facilities")
            })
        return candidates
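The retrieval function above merges dense and sparse hits by taking the union of their IDs, which discards the order in which each search ranked them and leaves the ordering to the reranker. A common alternative, not used in this guide, is reciprocal rank fusion (RRF), which Qdrant's Query API can also apply server-side. The sketch below shows the idea in plain Python with hypothetical point IDs; the constant `k=60` is a conventional default, not a value taken from this project.

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse ranked ID lists; IDs ranked high in several lists score highest."""
    scores = {}
    for results in result_lists:
        for rank, point_id in enumerate(results, start=1):
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: scores[pid], reverse=True)

dense_ids = [12, 7, 3]    # hypothetical IDs, best dense match first
sparse_ids = [7, 99, 12]  # hypothetical IDs, best sparse match first
fused = reciprocal_rank_fusion([dense_ids, sparse_ids])
print(fused)
```

Items that appear near the top of both lists rise to the front, which gives a sensible ordering even before any cross-encoder is applied.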

Reranking Results

    def rerank(query, candidates):
        # Describe each hotel in one sentence for the cross-encoder
        sentences = [
            f"{c['hotel']} {c['locality']} {c['facilities']}"
            for c in candidates
        ]
        # Score every (query, hotel description) pair
        scores = reranker.predict([(query, s) for s in sentences])
        for i, score in enumerate(scores):
            candidates[i]["rerank_score"] = float(score)
        # Highest score first
        return sorted(candidates, key=lambda x: x["rerank_score"], reverse=True)

Search Pipeline

    def search_hotels(query, min_stars=0, min_rating=0, top_k=10):
        # Retrieve dense + sparse candidates
        candidates = retrieve_candidates(query, min_stars, min_rating, top_k)
        if not candidates:
            # Return a pair so callers can always unpack (results, candidates)
            return [], []

        # Rerank retrieved candidates
        ranked = rerank(query, candidates)
        results = [
            {
                "hotel": c["hotel"],
                "stars": float(c["stars"]),
                "rating": float(c["rating"]),
                "locality": c["locality"]
            }
            for c in ranked[:top_k]
        ]
        return results, candidates

Phidata Tool Wrapper

This is the function the Hotel Expert Agent calls.

    def hotel_search_tool(query: str, min_stars: float = 0, min_rating: float = 0):
        """Search Goa hotels matching the query, with optional minimum stars and rating."""
        results, _ = search_hotels(
            query=query,
            min_stars=min_stars,
            min_rating=min_rating,
            top_k=5
        )
        if not results:
            return "No hotels found matching criteria."

        # One line per hotel, easy for the LLM to read back
        formatted = "\n".join([
            f"{r['hotel']} | Stars: {r['stars']} | Rating: {r['rating']} | Locality: {r['locality']}"
            for r in results
        ])
        return "Here are the best matches:\n\n" + formatted

Building the Discover Places Tool

The Discover Places Tool is responsible for retrieving and ranking tourist places from the Goa Places vector database stored in Qdrant.

Retrieving Information About Tourist Places from Qdrant

    from qdrant_client.models import SparseVector, NamedVector, NamedSparseVector

    def retrieve_places(query, top_k=10):
        # Dense query vector
        q_dense = dense_model.encode(query).tolist()

        # Sparse TF-IDF query vector (tfidf must hold the vocabulary fitted on the places corpus)
        sparse_q = tfidf.transform([query])
        sparse_vec = SparseVector(
            indices=sparse_q.indices.tolist(),
            values=sparse_q.data.tolist()
        )

        dense_results = Q_client.search(
            collection_name="goa_places",
            query_vector=NamedVector(name="dense", vector=q_dense),
            limit=top_k
        )
        sparse_results = Q_client.search(
            collection_name="goa_places",
            query_vector=NamedSparseVector(name="sparse", vector=sparse_vec),
            limit=top_k
        )

        # Combine IDs from both searches, dropping duplicates
        ids = list({r.id for r in dense_results + sparse_results})
        if not ids:
            return []

        # Retrieve full records in a single call
        candidates = []
        for rec in Q_client.retrieve("goa_places", ids=ids):
            candidates.append({
                "id": rec.id,
                "text": rec.payload.get("text"),
            })
        return candidates

Search Pipeline

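    def rerank_places(query, candidates):
        # Assumed helper, mirroring rerank() for hotels: score each place text against the query
        scores = reranker.predict([(query, str(c["text"])) for c in candidates])
        for c, score in zip(candidates, scores):
            c["rerank_score"] = float(score)
        return sorted(candidates, key=lambda x: x["rerank_score"], reverse=True)

    def search_places(query, top_k=10):
        candidates = retrieve_places(query, top_k)
        if not candidates:
            return []
        ranked = rerank_places(query, candidates)
        return ranked[:top_k]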

Phidata Tool Wrapper

This is the function the Goa Discovery Agent calls.

    def discover_places_tool(query: str):
        """Suggest places to visit in Goa that match the query."""
        results = search_places(query, top_k=5)
        if not results:
            return "No relevant places found."

        formatted = "\n".join([
            f"{r['text']}..."
            for r in results
        ])
        return "Suggested places:\n\n" + formatted

Building Travel Agents With Phidata

After configuring the Qdrant hybrid search pipeline and preparing both datasets (Goibibo hotels + Indian tourist review dataset), the final step is assembling the multi-agent travel planning system using Phidata.

Define Hotel Agent

    import os

    from phi.agent import Agent
    from phi.model.openai import OpenAIChat

    # Replace with your own key
    os.environ["OPENAI_API_KEY"] = "Your API KEY"

    hotel_agent = Agent(
        name="Hotel Expert",
        model=OpenAIChat(id="gpt-4o-mini"),
        description="Recommends hotels in Goa.",
        tools=[hotel_search_tool],
        instructions=[
            "If user asks for hotels or stays, call hotel_search_tool.",
        ],
        markdown=True,
        show_tool_calls=True
    )

Define Discovery Agent

    discovery_agent = Agent(
        name="Goa Discovery Agent",
        model=OpenAIChat(id="gpt-4o-mini"),
        description="Suggests attractions, nightlife, beaches and food spots in Goa.",
        tools=[discover_places_tool],
        instructions=[
            "If user asks about things to do, places to visit, or nightlife, call discover_places_tool.",
        ],
        markdown=True,
        show_tool_calls=True
    )

Define Orchestrator

    planner = Agent(
        name="Goa Planner",
        model=OpenAIChat(id="gpt-4o"),
        description="Creates travel itineraries: hotels + places + pacing.",
        instructions=[
            "1. Detect whether user needs hotels, sightseeing, or both.",
            "2. Call hotel_search_tool if hotel preferences present.",
            "3. Call discover_places_tool for attractions.",
            "4. Combine results into a final formatted itinerary with day-by-day plan.",
            "Do not show raw function output - rewrite into a travel-friendly summary."
        ],
        # The planner calls both tools directly
        tools=[hotel_search_tool, discover_places_tool],
        show_tool_calls=True,
        markdown=True
    )

How the Agents Work Together

Source: Author

You can send a prompt like:

“Plan a 4-day luxury Goa trip. Prefer Candolim or Baga. Want pool-facing hotel and nightlife. Min Stars=4.”

The Planner Agent analyzes the request and decides:

The user needs hotels (location + stars + pool-facing)
The user needs nightlife
The user needs a structured itinerary

Phidata allows this agent to automatically call tools based on the intent.

The Planner triggers the tool calls and then holds a curated list of luxury pool-facing hotels, a list of attractions and nightlife spots, and the user’s preferences (location, luxury style).

The Planner then:

Structures everything into a 4-day plan
Balances mornings, afternoons, nightlife
Places hotels strategically
Matches nearby attractions
Formats it into a clean travel-friendly itinerary
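To give a feel for the structuring step, here is a deterministic sketch that spreads a ranked list of places across days and time slots. In the actual system the LLM does this with far more nuance. The `build_day_plan` function, the slot names, and the sample places are hypothetical.

```python
def build_day_plan(places, num_days, slots=("Morning", "Afternoon", "Evening")):
    """Assign places to day/slot pairs in order; leftover slots stay free."""
    plan = {day: {} for day in range(1, num_days + 1)}
    place_iter = iter(places)
    for day in plan:
        for slot in slots:
            plan[day][slot] = next(place_iter, "Free time")
    return plan

# Hypothetical tool output: place names already ranked by relevance.
places = ["Baga Beach", "Houses of Goa Museum", "Night market", "Fort Aguada"]
plan = build_day_plan(places, num_days=2)
for day, schedule in plan.items():
    print(f"Day {day}: {schedule}")
```

The LLM-based planner goes further than this fixed ordering: it can weigh distance, pacing, and the traveler's stated preferences when deciding what goes where.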

The final output is not raw tool output. It is rewritten using the LLM’s reasoning capability.

Agent Output

    response = planner.run(
        "Plan a 4-day luxury Goa trip. Prefer Candolim or Baga. Want pool-facing hotel and nightlife. Min_Stars=4"
    )
    print(response.content)

    Running:
     - hotel_search_tool(query=Goa Candolim or Baga pool-facing luxury, min_stars=4)
     - discover_places_tool(query=Goa Candolim Baga nightlife attractions)

    ### 4-Day Luxury Goa Itinerary

    #### Day 1: Arrive in Goa and Settle In
    - **Hotel Check-in**: Choose from these recommended luxury hotels:
      - **Cygnett Inn Celestiial** (4 stars, Candolim)
      - **Treehouse The Palatium** (4 stars, Candolim)
      - **White Pearl Suites** (5 stars, Baga)
      - **Vivanta by Taj - Fort Aguada** (5 stars, Candolim)
      - **Cidade De Goa** (5 stars, Panjim)
    - **Evening**: Explore the nearby beach area. Relax by the pool and enjoy a fine dining experience at your hotel.

    #### Day 2: Beach Day and Nightlife
    - **Morning**: Visit **Baga Beach**. Enjoy the sun, sand, and sea. Great for family bonding, couple trips, and making new friends.
    - **Afternoon**: Return to the hotel for a refreshing dip in the pool and grab a delicious lunch.
    - **Nightlife**: Dive into Goa's vibrant nightlife scene at clubs in Baga and Candolim. Enjoy cocktails and music at the local hotspots.

    #### Day 3: Cultural Exploration
    - **Morning**: Explore the **Houses of Goa Museum** in Porvorim. Discover its unique architecture and fascinating exhibits.
    - **Afternoon**: Visit the **Our Lady of Assumption Church** in Velsao for some quiet reflection and to witness the local culture.
    - **Evening**: Dinner at one of the eateries in Candolim or Baga, followed by a leisurely beachside stroll.

    #### Day 4: Relaxation and Departure
    - **Morning**: Head to the **Benz Celebrity Wax Museum** in Calangute and marvel at the fine statues and dine at the adjoining restaurant.
    - **Afternoon**: Sign up for one of the **Best Shore Trips** in Marmagao to explore secret beaches and enjoy the ocean.
    - **Evening**: Arrange for a late check-out, enjoy the hotel amenities one last time, and prepare for your departure.

    This itinerary combines luxury accommodation with the excitement of Goan nightlife and the tranquility of its cultural landmarks. Enjoy your trip!

The above output shows how the system reduces hallucinations by grounding its answers in hotel and attraction data stored in Qdrant rather than in model guesses.

The hotel and discovery tools return only records from those collections, retrieved with hybrid dense + sparse search plus CrossEncoder reranking. The Planner Agent then combines these results into a coherent, personalized itinerary in which the named hotels and attractions come from the retrieved records.

Summary

In this project, I built a travel-planning system that finally behaves the way I want AI tools to: really grounded, dependable, and (almost) devoid of hallucinations. Because each suggestion is pulled from Qdrant, and not out of the model’s imagination, the planner remains fact-based and personalized. The result sounds less like an AI making guesses and more like a real assistant setting up a trip for the traveler. I hope you found my tutorial helpful.