- 22 Dec, 2025 *
This year I got pretty into AI. Not only was I using Claude Code to help complete stalled projects, but I also dove headfirst into self-hosting LLMs, playing with different open models, and building agents. This gave me great appreciation for the sheer amount of compute needed to run the latest available public models and how they pale in comparison to the knowledge and speed of frontier models. I wanted to help others understand what is being discussed when we talk about different layers of AI systems and how those interact.
Key Terms to Know
While there are a plethora of terms to be familiar with, I just want to touch on two: frontier models and open models.
Frontier Models
When I refer to an AI model as "Frontier," I’m referring to closed models that are on the cutting edge of what’s available from the large AI companies such as Anthropic’s Claude, Google’s Gemini, and OpenAI’s ChatGPT, among others. These models are closed source in that their weights and biases are not publicly known, nor are the models available to download for someone to run locally or privately.
Open Models
Open models are freely available to anyone who wants to download them and has the hardware to run them. Model size, parameterization, and quantization all have an impact on performance and knowledge. Some open models are specialized with agentic capabilities, while others excel in different areas such as coding or writing (just like specialized frontier models).
Parameters: A model’s parameters are the numerical weights and biases learned during training. Generally, the more parameters a model has, the more capable it is—but that capability comes at the expense of significantly greater resources needed to run it.
Quantization: Quantization lets users run larger-parameter models by reducing the precision of the weights—a slight loss of accuracy in exchange for increased speed and efficiency and lower RAM requirements. The lower the quantization, the less precise the weights, but the easier the model is to run.
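To make the parameters-versus-quantization trade-off concrete, here's a rough back-of-the-envelope calculation (my own simplification—it counts only the weights themselves and ignores KV cache and runtime overhead):

```python
def estimate_model_memory_gb(num_params_billion: float, bits_per_weight: int) -> float:
    """Rough memory needed just to hold the weights.

    Ignores KV cache, activations, and runtime overhead, so treat this
    as a lower bound, not an exact requirement.
    """
    bytes_per_weight = bits_per_weight / 8
    return num_params_billion * 1e9 * bytes_per_weight / 1e9

# An 8B-parameter model at full 16-bit precision vs. 4-bit quantization:
print(estimate_model_memory_gb(8, 16))  # 16.0 (GB)
print(estimate_model_memory_gb(8, 4))   # 4.0 (GB)
```

That 4x reduction is why a quantized 8B model fits comfortably on a consumer GPU or laptop while the full-precision version may not.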
To understand the lingo being used when discussing AI systems, let’s think of the components of a restaurant.
Layer 1: AI Chatbot Web Apps
Dining at a Full-Service Restaurant
Web application AI chatbots are hosted on remote servers; a user typically just logs into an account, asks questions, and gets back answers—quick and easy, like dining at a full-service restaurant. There are no setup or technical requirements to use these systems, which typically offer a free tier or a paid monthly subscription. This is how the majority of people use AI, and it’s what comes to mind first when AI is discussed. It works well for answering questions, assisting with creative writing, research, and other general tasks.
Examples:
- ChatGPT
- Gemini
- Claude.ai
Layer 2: AI Browsers
Tableside Service
AI browsers are a new trend where companies have made browsers available that have deep AI integration. This makes the experience of navigating the web and looking up information much less about search engine results and more about conversational AI assistance. It is akin to going to a restaurant and getting tableside service where a waiter helps you throughout the course of the meal.
Examples:
- ChatGPT Atlas
- Perplexity Comet
Layer 3: Local AI
Cooking at Home
Similar to cooking at home in your own kitchen, locally hosted AI models let you run AI completely offline, without sending data to someone else’s server. Aside from hardware constraints, you are free to use any open model privately, with no subscription cost or token limits. It does take some technical setup and powerful hardware, but tools like Ollama have made selecting, deploying, and running local models extremely accessible. Users interested in offline access, enthusiasts, tinkerers, privacy-conscious individuals, or those with strict data privacy requirements are most likely to leverage local models.
Examples:
- Ollama
- LM Studio
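By default, Ollama exposes a local HTTP API on port 11434, which is how other tools on your machine talk to it. Here's a sketch of calling its /api/generate endpoint from Python (the model name "llama3.2" is just an example—use whichever model you've pulled):

```python
import json
import urllib.request

def build_ollama_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a request for Ollama's local /api/generate endpoint.

    "stream": False asks for one complete JSON response instead of a
    stream of partial tokens.
    """
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_ollama_request("llama3.2", "Why is the sky blue?")
# With an Ollama server running locally, uncomment to actually send it:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

Nothing leaves your machine—the request goes to localhost, which is the whole point of this layer.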
Layer 4: Developer Tools
Professional Kitchen Equipment
Developer tools are for building applications or using AI directly in the terminal. These are your professional kitchen tools, meant for chefs making dishes. This layer is typically associated with dev workflows or with tools that allow AI models to read and write files on the user’s machine. A key control at this layer is the human-in-the-loop pattern, where the end user reviews the AI’s plan and actions and explicitly permits it to act.
Examples:
- Hugging Face
- Cursor
- Claude Code (CLI)
- Cline (VS Code extension)
Layer 4.1: API Services & Pay-Per-Use
Wholesale Food Supplier
APIs are the wholesale food supplier, where ingredients are purchased in bulk and you build your own menu. They’re a great choice for more advanced users who want frontier model access but don’t use it frequently enough to justify a fixed subscription cost. APIs also make it easy to tie frontier models into automated workflows and connect systems together. Services like OpenRouter give users access to dozens of models through a single account with unified pay-as-you-go billing, consolidating multiple services into one convenient payment point instead of juggling separate subscriptions.
Examples:
- OpenRouter
- Anthropic API
- OpenAI API
- Grok API
- t3.chat
- Amazon Bedrock
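Most of these services speak the same OpenAI-style chat completions format, which is what makes consolidation possible. Here's a sketch of building such a request for OpenRouter (the API key and model names are placeholders—check the provider's docs for current model identifiers):

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, user_message: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for OpenRouter."""
    body = json.dumps({
        "model": model,  # e.g. "openai/gpt-4o" -- an example identifier
        "messages": [{"role": "user", "content": user_message}],
    }).encode()
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # placeholder key below
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("sk-or-...", "openai/gpt-4o", "Hello!")
# With a real key, uncomment to send:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Swapping models is just a matter of changing the "model" string—the request shape stays the same.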
Cross-Layer Concepts
AI Agents
Your Personal Chef
Agents are the personal chefs of the restaurant world. Agents perform tasks, make decisions, and leverage tools all to serve the dish the customer wants. Unlike traditional chatbots that answer simple questions, agents can execute multi-step workflows, use various tools, and iterate on results to refine them for the best possible outcome.
Examples:
- Claude Code (agentic coding in the terminal)
- Cursor with Agent Mode
- Devin (autonomous software engineer)
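The core of an agent is a loop: the model decides on an action, a tool executes it, the result is fed back, and the cycle repeats until the task is done. Here's a minimal sketch of that shape—the "model" is a hard-coded stub standing in for a real LLM call, but the plan/act/observe loop is the genuine pattern:

```python
def calculator(expression: str) -> str:
    """A demo tool. Never eval untrusted input in real code."""
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def stub_model(task: str, observations: list) -> dict:
    """Stand-in for an LLM call: decide the next action or finish."""
    if not observations:
        return {"action": "calculator", "input": task}
    return {"action": "finish", "answer": observations[-1]}

def run_agent(task: str, max_steps: int = 5) -> str:
    """Loop: ask the model for an action, run the tool, feed back the result."""
    observations = []
    for _ in range(max_steps):
        decision = stub_model(task, observations)
        if decision["action"] == "finish":
            return decision["answer"]
        observations.append(TOOLS[decision["action"]](decision["input"]))
    return "gave up"

print(run_agent("2 + 2"))  # 4
```

Real agent frameworks replace the stub with an actual model call and add many more tools, but the iterate-until-done loop is what separates an agent from a one-shot chatbot.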
AI Skills
Recipe Cards & Technique Guides
A skill is a reusable set of instructions that teaches an AI agent how to perform specialized tasks, giving it capabilities on demand. When given a task, the AI agent (such as Claude) can pull from a skill file containing a skill name, description, and instructions—and possibly additional references such as scripts, templates, or other defined assets. The agent checks a skills folder and reads the description of each skill to determine its relevance to the task at hand; only if a skill is needed are the full file and its associated assets loaded for execution. AI skills are like handing your personal chef a specialized recipe card or technique guide to reference when needed.
Example Skills:
- Code Review Skill – Contains a checklist and outlines best practices
- Bug Triage Skill – Defined workflow or process to categorize and prioritize bugs
- Data Analysis Skill – Reusable workflow for cleaning and analyzing datasets
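The important design detail is that the agent scans only the short descriptions first and loads full instructions on demand, keeping context small. Here's a toy sketch of that selection step (the skills and the word-overlap matching are my own illustration—real agents let the model judge relevance):

```python
# Toy skill registry: in practice these live as files in a skills folder.
SKILLS = {
    "code-review": {
        "description": "review code changes for bugs and style issues",
        "instructions": "1. Read the diff. 2. Apply the style checklist. ...",
    },
    "bug-triage": {
        "description": "categorize and prioritize reported bugs",
        "instructions": "1. Reproduce. 2. Assign severity. 3. Route to owner. ...",
    },
}

def select_skill(task: str):
    """Pick the skill whose description shares the most words with the task.

    A crude stand-in for the model's own relevance judgment.
    """
    task_words = set(task.lower().split())
    best, best_overlap = None, 0
    for name, skill in SKILLS.items():
        overlap = len(task_words & set(skill["description"].split()))
        if overlap > best_overlap:
            best, best_overlap = name, overlap
    return best

print(select_skill("please prioritize these new bugs"))  # bug-triage
```

Only after selection would the matching skill's full instructions (and any scripts or templates) be pulled into the agent's context.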
RAG (Retrieval-Augmented Generation)
Restaurant with Specialty Ingredients & Reference Library
Like a restaurant with specialty ingredients and a reference library of cookbooks, RAG (Retrieval-Augmented Generation) allows an AI to access a custom knowledge base. Documents submitted by the end user are processed and stored in a searchable format called a vector database. When the user asks a question, the relevant chunks are retrieved and used to generate accurate, contextualized answers about the submitted content.
The idea behind using RAG is to give AI information that it may not have been trained on, reduce hallucinations, and get answers about your specific documentation or data. This data could be internal company documents, legal documents, research papers, or even personal notes.
Examples:
- Claude Projects
- NotebookLM (Google)
- PrivateGPT (local)
- ChatGPT with file uploads
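The retrieve-then-generate flow can be sketched in a few lines. Real systems use learned embeddings and a vector database; the word-overlap scoring below is a deliberately crude stand-in for vector similarity, and the documents are made up for illustration:

```python
import re

DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The warranty covers manufacturing defects for one year.",
    "Shipping is free on orders over 50 dollars.",
]

def tokenize(text: str):
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, docs: list, k: int = 1) -> list:
    """Return the k chunks most relevant to the question (stand-in for
    nearest-neighbor search over a vector database)."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

context = retrieve("What is the refund policy?", DOCUMENTS)[0]
# In a real pipeline, the question plus this retrieved context is sent to the LLM:
print(context)  # Our refund policy allows returns within 30 days of purchase.
```

The LLM never needs the whole document set in its context—only the retrieved chunks—which is what makes RAG scale to large knowledge bases.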
MCP (Model Context Protocol)
Supply Chain and Delivery Network
To allow agents to perform actions across systems and layers, a standard was developed to provide connective tissue between AI models, tools, and external data sources. MCP is like a restaurant’s supply chain—it connects the kitchen (AI) to various suppliers (databases, APIs, file systems, cloud storage). The result? AI agents are able to access real-time information and take actions beyond their training data.
Examples:
- GitHub MCP - Interact with repositories
- Google Drive MCP - Access to cloud files
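Under the hood, MCP messages are JSON-RPC 2.0. Here's a sketch of the tools/call request a client might send to an MCP server—the tool name "search_repositories" and its arguments are hypothetical, chosen to suggest what a GitHub server might expose:

```python
import json

def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical tool name and arguments, for illustration only:
msg = build_tool_call(1, "search_repositories", {"query": "language:rust stars:>1000"})
print(msg)
```

Because every server speaks this same message shape, one AI client can talk to a GitHub server, a Google Drive server, or a local filesystem server without custom integration code for each.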
Wrapping Up
Understanding these layers helps demystify the AI landscape. Whether you’re a casual user sticking with web apps, a privacy-focused individual running models locally, or a developer building custom AI-powered applications, there’s a layer that fits your needs. The restaurant metaphor reminds us that each layer offers different levels of convenience, control, and capability—and just like dining options, the right choice depends on what you’re trying to accomplish.
Now you may say, "Frank, that sounds great and all, but where do I start?" I’d recommend starting with Layer 1 web apps—they’re the easiest entry point and require zero technical setup. Once you’re comfortable there and understand what AI can do, you can explore the other layers based on your specific needs. Whether that’s diving into local models for privacy, exploring APIs for automation, or building agents for complex workflows, the foundation starts with understanding how these layers work together.
Resources:
- https://www.anthropic.com
- https://ollama.com
- https://www.cursor.com
- https://openrouter.ai