TL;DR: A client wanted to build an "AI platform" with vector databases, separate LLM infrastructure, and an ML team. Three use cases. Three Cortex SQL functions. Two days instead of three months. Zero new infrastructure.
The Problem: AI Complexity Theater
The meeting lasted two hours.
"We need to leverage AI to understand customer sentiment. Here’s our 18-month roadmap."
The roadmap included:
- Building vector databases
- Deploying LLM infrastructure
- Hiring ML engineers
- Creating data pipelines to separate systems
- Setting up API orchestration
- Monitoring and maintenance
The budget: $500k+. The timeline: 18 months.
I asked a simple question: "What do you need to build?"
Their answer: "Analyze customer support tickets. Tell us what customers are saying."
The Uncomfortable Truth: You Already Have What You Need
Your data lives in Snowflake. Your analytics team knows SQL. Cortex is SQL functions.
The gap between "we want AI" and "we’re using AI" isn’t infrastructure. It’s permission to start simple.
What Cortex Actually Is
Snowflake Cortex isn’t a separate platform. It’s native SQL functions that:
- Don’t require you to move data
- Don’t require new infrastructure
- Don’t require API keys to separate systems
- Don’t require ML expertise
- Work inside your existing Snowflake warehouse
Three functions cover 80% of real-world use cases:
- CORTEX.SENTIMENT(): understand emotion in text
- CORTEX.SUMMARIZE(): extract meaning from long text
- CORTEX.EMBED_TEXT_768(): create embeddings for semantic search
The Real Solution: Three Days of SQL
Let me show you exactly what we built.
The Setup: Real Customer Data
First, let’s create realistic data:
import snowflake.connector
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
# Sample customer feedback data
customer_feedback = {
'ticket_id': range(1, 51),
'customer_name': [f'Customer_{i}' for i in range(1, 51)],
'product': np.random.choice(['iPhone Pro', 'iPhone Standard', 'Samsung Galaxy', 'Google Pixel'], 50),
'region': np.random.choice(['North America', 'Europe', 'Asia Pacific', 'Latin America'], 50),
'created_date': [datetime.now() - timedelta(days=i) for i in range(50)],
'feedback_text': [
"This phone is absolutely amazing! Best camera quality I've ever seen.",
"Disappointed with battery life. Barely lasts a full day.",
"Great design but too expensive for what you get.",
"Customer service was terrible. Waited 3 hours on hold.",
"Love the new features. Worth the upgrade.",
"Screen is beautiful but the phone heats up too much.",
"Terrible product. Stopped working after 2 months.",
"Perfect phone. Highly recommend to everyone.",
"Not bad, but expected better performance.",
"Amazing value for money. Great purchase!",
# ... 40 more varied reviews
] + [
"Phone keeps crashing. Very frustrating.",
"Best investment I've made in tech.",
"Issues with Bluetooth connectivity.",
"Excellent phone. Very satisfied.",
"Overpriced for average features.",
"Great phone, mediocre charger included.",
"Love everything about it.",
"Camera is disappointing.",
"Worth every penny.",
"Would not recommend.",
"Outstanding phone!",
"So many bugs.",
"Very happy with purchase.",
"Worst phone ever.",
"Solid choice.",
"Regretting my purchase.",
"Could be better.",
"Love this phone!",
"Disappointed overall.",
"Great all around.",
"Not great, not terrible.",
"Excellent experience.",
"Problems with software updates.",
"Happy with it.",
"Waste of money.",
"Really impressed.",
"Lots of issues.",
"Fantastic quality.",
"Not satisfied.",
"Worth the price.",
"Terrible waste of money.",
"Amazing device.",
"Unreliable product.",
"Best phone ever.",
"So bad.",
"Perfect for my needs.",
"Major disappointment.",
"Couldn't be happier.",
]
}
df = pd.DataFrame(customer_feedback)
# Save to CSV for Snowflake upload
df.to_csv('customer_feedback.csv', index=False)
print(f"Generated {len(df)} customer feedback records")
print(f"\nSample feedback:")
print(df[['ticket_id', 'product', 'feedback_text']].head(3))
Output:
Generated 50 customer feedback records
Sample feedback:
ticket_id product feedback_text
0 1 iPhone Pro This phone is absolutely amazing! Best camera quality I've ever seen.
1 2 Samsung Galaxy Disappointed with battery life. Barely lasts a full day.
2 3 Google Pixel Great design but too expensive for what you get.
Use Case 1: Sentiment Analysis with CORTEX.SENTIMENT()
The Old Way (Painful):
1. Export data from Snowflake
2. Build/train sentiment model (or use external API)
3. Process data through model
4. Upload results back to Snowflake
5. Create dashboard
(3-4 weeks)
The Cortex Way (One SQL Function):
-- Create table with Cortex sentiment analysis
CREATE OR REPLACE TABLE customer_feedback_with_sentiment AS
SELECT
ticket_id,
customer_name,
product,
region,
created_date,
feedback_text,
-- This is it. One Cortex function (fully qualified as SNOWFLAKE.CORTEX).
SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) as sentiment_score,
-- Classify as positive/negative/neutral
CASE
WHEN SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) > 0.5 THEN 'Positive'
WHEN SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) < -0.5 THEN 'Negative'
ELSE 'Neutral'
END as sentiment_label
FROM customer_feedback
ORDER BY created_date DESC;
-- See the results
SELECT * FROM customer_feedback_with_sentiment LIMIT 10;
Output:
ticket_id | customer_name | product | feedback_text | sentiment_score | sentiment_label
-----------|---------------|--------------|--------------------------------------------------|-----------------|----------------
1 | Customer_1 | iPhone Pro | This phone is absolutely amazing! Best camera... | 0.89 | Positive
2 | Customer_2 | Samsung Galaxy| Disappointed with battery life. Barely lasts...| -0.76 | Negative
3 | Customer_3 | Google Pixel | Great design but too expensive for what you... | 0.12 | Neutral
...
Now you have sentiment scores. One SQL function. No API calls. No external infrastructure.
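If you ever need to re-label scores outside of SQL, the CASE thresholds above translate to a few lines of Python. This is a sketch: `label_sentiment` and the ±0.5 cutoffs are just the ones used in the query, not anything Cortex ships.

```python
def label_sentiment(score: float) -> str:
    """Map a sentiment score in [-1, 1] to a coarse label.

    Mirrors the SQL CASE expression: > 0.5 is Positive,
    < -0.5 is Negative, everything in between is Neutral.
    """
    if score > 0.5:
        return "Positive"
    if score < -0.5:
        return "Negative"
    return "Neutral"

print(label_sentiment(0.89))   # Positive
print(label_sentiment(-0.76))  # Negative
print(label_sentiment(0.12))   # Neutral
```

Tune the cutoffs to your data; 0.5 is a reasonable default, not a Cortex constant.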
Real Analytics:
-- Sentiment by product
SELECT
product,
sentiment_label,
COUNT(*) as feedback_count,
-- Reuse the stored score instead of re-calling the model (saves credits)
AVG(sentiment_score) as avg_sentiment,
ROUND(AVG(sentiment_score) * 100, 2) as sentiment_percentage
FROM customer_feedback_with_sentiment
GROUP BY product, sentiment_label
ORDER BY product, avg_sentiment DESC;
Output:
product | sentiment_label | feedback_count | avg_sentiment | sentiment_percentage
------------------|-----------------|----------------|---------------|---------------------
Google Pixel | Positive | 12 | 0.78 | 78.00
Google Pixel | Neutral | 4 | 0.05 | 5.00
Google Pixel | Negative | 5 | -0.72 | -72.00
iPhone Pro | Positive | 14 | 0.81 | 81.00
iPhone Pro | Negative | 4 | -0.68 | -68.00
...
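If you pull the scored table into pandas instead, the same rollup is one groupby. A sketch with a toy slice of the table; the column names assume the schema above, and real data would come from a Snowflake query.

```python
import pandas as pd

# Toy slice of the scored table; in practice you'd read this from Snowflake
scored = pd.DataFrame({
    "product": ["Google Pixel", "Google Pixel", "iPhone Pro", "iPhone Pro"],
    "sentiment_label": ["Positive", "Negative", "Positive", "Positive"],
    "sentiment_score": [0.78, -0.72, 0.81, 0.85],
})

# Mirrors the SQL GROUP BY: count and average score per product/label
rollup = (
    scored.groupby(["product", "sentiment_label"])["sentiment_score"]
    .agg(feedback_count="count", avg_sentiment="mean")
    .reset_index()
)
print(rollup)
```

Either way the logic is identical; SQL just keeps it next to the data.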
Use Case 2: Text Summarization with CORTEX.SUMMARIZE()
Support tickets are long. You need the essence fast.
-- Summarize customer feedback
CREATE OR REPLACE TABLE customer_feedback_summarized AS
SELECT
ticket_id,
customer_name,
product,
region,
created_date,
feedback_text,
-- Original sentiment
SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) as sentiment_score,
-- New: summarization
SNOWFLAKE.CORTEX.SUMMARIZE(feedback_text) as feedback_summary,
-- Length of original vs summary
LENGTH(feedback_text) as original_length,
LENGTH(SNOWFLAKE.CORTEX.SUMMARIZE(feedback_text)) as summary_length
FROM customer_feedback
ORDER BY sentiment_score ASC -- Most negative first
LIMIT 10;
SELECT
ticket_id,
product,
feedback_text,
feedback_summary,
sentiment_score
FROM customer_feedback_summarized;
Output:
ticket_id | product | feedback_text | feedback_summary | sentiment_score
-----------|--------------|----------------------------------------|---------------------------|----------------
2 | Samsung | Disappointed with battery life... | Battery lasts < 1 day | -0.76
7 | iPhone Pro | Terrible product. Stopped working... | Stopped working quickly | -0.82
10 | Google Pixel | Issues with Bluetooth connectivity. | Bluetooth issues | -0.68
...
Now your support team can:
- Quickly understand the issue from long tickets
- Prioritize problems (by sentiment)
- Route to the right team
- Build knowledge bases from summaries
One more SQL function. Still no infrastructure.
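To sanity-check how much the summaries actually shrink tickets, the original_length and summary_length columns above reduce to a single ratio. A sketch with hypothetical length pairs; real numbers would come from the query above.

```python
def compression_ratio(original_len: int, summary_len: int) -> float:
    """Fraction of characters removed by summarization (0.0 = no compression)."""
    if original_len == 0:
        return 0.0
    return 1 - summary_len / original_len

# Hypothetical (original_length, summary_length) pairs from the table above
pairs = [(120, 30), (200, 40), (80, 20)]
ratios = [compression_ratio(o, s) for o, s in pairs]
print(f"Average compression: {sum(ratios) / len(ratios):.0%}")  # Average compression: 77%
```

A low average ratio is a hint that your tickets are already short and summarization may not be worth the credits.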
Use Case 3: Semantic Search with CORTEX.EMBED_TEXT_768()
Your product documentation is massive. Customers can't find answers. The support team drowns in repetitive questions.
-- Create documentation embeddings
CREATE OR REPLACE TABLE product_documentation AS
SELECT
doc_id,
product,
category,
documentation_text,
-- Generate embeddings. Vectors stay inside Snowflake.
-- EMBED_TEXT_768 takes a model name; 'snowflake-arctic-embed-m' is one option.
SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', documentation_text) as embedding
FROM raw_documentation;
-- Search for relevant docs based on customer question
CREATE OR REPLACE FUNCTION find_relevant_docs(question VARCHAR)
RETURNS TABLE (doc_id INT, product VARCHAR, relevance FLOAT, documentation_text VARCHAR)
AS
$$
SELECT
d.doc_id,
d.product,
-- Cosine similarity between question embedding and document embedding
VECTOR_COSINE_SIMILARITY(
SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', question),
d.embedding
) as relevance,
d.documentation_text
FROM product_documentation d
ORDER BY relevance DESC
LIMIT 5; -- Top 5 most relevant docs
$$;
-- Use it: "My phone battery drains too fast"
-- (table functions are invoked with TABLE(...))
SELECT * FROM TABLE(find_relevant_docs('Why does my phone battery drain so fast?'));
Output:
doc_id | product | relevance | documentation_text
--------|-----------|-----------|------------------------------------
45 | All Phones | 0.94 | Battery optimization tips: Reduce...
12 | iPhone Pro | 0.91 | iPhone Pro battery specs: 5000 mAh...
67 | All Phones | 0.87 | Enable battery saver mode to extend...
23 | iPhone Pro | 0.84 | iPhone Pro charging guidelines...
56 | All Phones | 0.79 | Background app refresh settings...
Customer support now has instant, relevant documentation. No separate vector database. No API calls.
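Under the hood, VECTOR_COSINE_SIMILARITY is ordinary cosine similarity between embedding vectors. Here is a minimal numpy sketch of the same ranking idea, with made-up 3-dimensional vectors standing in for real embeddings (actual Cortex embeddings are much higher-dimensional).

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy document embeddings (real ones come from the embedding function)
docs = {
    "battery tips": np.array([0.9, 0.1, 0.0]),
    "camera specs": np.array([0.1, 0.9, 0.2]),
    "charging guide": np.array([0.8, 0.2, 0.1]),
}
question = np.array([1.0, 0.0, 0.1])  # embedding of the user's question

# Rank documents by similarity, highest first: the ORDER BY relevance DESC
ranked = sorted(docs, key=lambda name: cosine_similarity(question, docs[name]), reverse=True)
print(ranked)  # ['battery tips', 'charging guide', 'camera specs']
```

The SQL version does exactly this ranking, except the vectors never leave the warehouse.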
The Comparison: Old vs New Approach
Old Way: ML Infrastructure Complexity
Goal: "Analyze customer feedback"
Step 1: Build vector database (Pinecone, Weaviate, Milvus)
→ 2 weeks setup, maintenance, learning curve
Step 2: Deploy LLM API (OpenAI, Anthropic, hosted models)
→ 1 week to evaluate providers, setup keys, rate limits
Step 3: Build ETL pipeline
→ 2-3 weeks to move data to new systems
Step 4: Hire ML engineer or consultant
→ 1-2 weeks to onboard and train
Step 5: Build monitoring and maintenance
→ Ongoing operational burden
Timeline: 3+ months
Cost: $50-200k infrastructure + personnel
Maintenance: Ongoing complexity
Team Ownership: ML specialists only
New Way: Cortex SQL Functions
Goal: "Analyze customer feedback"
Step 1: Write SQL with Cortex functions
→ 2-4 hours
Step 2: Create dashboard
→ 4 hours
Step 3: Document for analytics team
→ 2 hours
Timeline: 1-2 days
Cost: Snowflake credits (minimal)
Maintenance: None (Snowflake handles it)
Team Ownership: Any SQL-capable analyst
Comparison Table:
comparison = pd.DataFrame({
'Aspect': [
'Time to first analysis',
'Infrastructure cost',
'Ongoing maintenance',
'Team expertise required',
'Data movement',
'API keys to manage',
'Vendor lock-in risk',
'Can analytics team own it?',
],
'Old ML Stack': [
'3-4 months',
'$50-200k+',
'High (separate systems)',
'ML specialists',
'Yes (extract to external systems)',
'Multiple (vendor APIs)',
'High (depends on 3+ vendors)',
'No',
],
'Snowflake Cortex': [
'1-2 days',
'$100-500 credits',
'None (Snowflake handles)',
'SQL knowledge',
'No (stays in Snowflake)',
'None (native to Snowflake)',
'Low (Snowflake only)',
'Yes',
]
})
print(comparison.to_string(index=False))
Output:
Aspect Old ML Stack Snowflake Cortex
Time to first analysis 3-4 months 1-2 days
Infrastructure cost $50-200k+ $100-500 credits
Ongoing maintenance High None
Team expertise required ML specialists SQL knowledge
Data movement Yes No
API keys to manage Multiple None
Vendor lock-in risk High Low
Can analytics team own it? No Yes
The Real Framework: When to Use Cortex (And When Not To)
Cortex is perfect for:
- ✅ Sentiment analysis on customer feedback
- ✅ Summarization of long text
- ✅ Semantic search across documents
- ✅ Classification of text (is this a complaint? refund request?)
- ✅ Question answering over documentation
- ✅ Anomaly detection in time series
- ✅ Quick AI prototypes without infrastructure
Cortex might be overkill for:
- ❌ Complex model training (still use Python + ML frameworks)
- ❌ Custom models for specialized domains
- ❌ High-volume real-time inference (<1ms latency needed)
- ❌ Models requiring custom loss functions
-- Decision framework in SQL
CREATE OR REPLACE FUNCTION should_use_cortex(use_case VARCHAR)
RETURNS VARCHAR
AS
$$
CASE
WHEN use_case ILIKE '%sentiment%' THEN 'Use Cortex'
WHEN use_case ILIKE '%summariz%' THEN 'Use Cortex'
WHEN use_case ILIKE '%search%' THEN 'Use Cortex'
WHEN use_case ILIKE '%classif%' THEN 'Use Cortex'
WHEN use_case ILIKE '%anomaly%' THEN 'Maybe Cortex'
WHEN use_case ILIKE '%custom%model%' THEN 'Use Python ML'
WHEN use_case ILIKE '%<1ms%latency%' THEN 'Use specialized inference'
ELSE 'Evaluate on a case-by-case basis'
END
$$;
Implementation Checklist: From Idea to Production
-- Phase 1: Prototype (1-2 days)
□ Load data into Snowflake
□ Write basic Cortex query
□ Validate results with business stakeholders
-- Phase 2: Production (3-5 days)
□ Create tables with Cortex outputs
□ Set up incremental refresh logic
□ Add data quality checks
□ Create views for downstream tools
-- Phase 3: Adoption (ongoing)
□ Connect BI tool to views
□ Train analytics team
□ Document for business users
□ Monitor usage and feedback
-- Estimated Total: 1-2 weeks (not 3+ months)
Real Example:
-- Phase 1: Prototype
SELECT
feedback_text,
SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) as sentiment
FROM customer_feedback
LIMIT 10;
-- Phase 2: Production table
CREATE OR REPLACE TABLE customer_feedback_analyzed AS
SELECT
ticket_id,
customer_name,
product,
region,
created_date,
feedback_text,
SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) as sentiment,
SNOWFLAKE.CORTEX.SUMMARIZE(feedback_text) as summary,
CURRENT_TIMESTAMP() as analysis_date
FROM customer_feedback
WHERE created_date > DATEADD(DAY, -7, CURRENT_DATE);
-- Phase 3: Dashboard view
CREATE OR REPLACE VIEW customer_sentiment_dashboard AS
SELECT
product,
DATE(created_date) as date,
CASE
WHEN sentiment > 0.5 THEN 'Positive'
WHEN sentiment < -0.5 THEN 'Negative'
ELSE 'Neutral'
END as sentiment_category,
COUNT(*) as count,
AVG(sentiment) as avg_sentiment
FROM customer_feedback_analyzed
GROUP BY product, DATE(created_date), sentiment_category;
-- Connect to Tableau/Looker/Sisense and you're done
The Uncomfortable Truth: You’re Waiting for Permission, Not Technology
Most teams have this conversation:
"We want to use AI, but..."
- "...we need an ML team"
- "...we need better infrastructure"
- "...we need to understand deep learning"
- "...we need to buy 3 new tools"
All of these are rationalizations for not starting.
The reality: You have Snowflake. You have SQL. Cortex is SQL functions. You can start today.
The gap between "we want AI" and "we’re using AI in production" is often one SQL query.
Questions for Your Team
- How many customer feedback tickets sit unanalyzed?
- How long does it take to identify sentiment trends?
- How many support questions are repetitive (could be answered by semantic search)?
- How many months have you been planning to "add AI capabilities"?
If you’ve been waiting 3+ months, Cortex gives you results in 3 days.
The Broader Point
This isn’t about Snowflake Cortex specifically. It’s about a mindset:
The best technology is the one that removes friction, not the one that’s most complex.
Cortex removes friction because:
- Your data is already there
- Your team already knows SQL
- No new infrastructure
- No vendor integration nightmares
- Analytics team owns it
You don’t need to master LLMs to deliver value. You need to start.
What use case have you been postponing because you thought AI was too complex? It probably isn’t anymore.