What We Built Before LLMs — and Why Enterprise AI Deployments Are Failing Without It
Context Management at Scale: The Foundation Enterprise AI Deployments Need To Succeed
The Foundation That Still Matters
In 2012, we built what you’d now call a RAG system for 25,000 users across 170+ countries. We just didn’t have LLMs to do the heavy lifting.
The United Nations Development Programme (the largest UN development aid agency) had a problem that should sound familiar: institutional knowledge distributed across a vast global footprint, with no systematic way to connect it. Their Global Staff Survey captured the frustration — only 61 percent of UNDP staff were satisfied with how knowledge, experience, and expertise were accessible when needed. The information existed. The systems existed.
Documents flowed in from government ministries in Buenos Aires, project offices in Bangladesh, headquarters in New York — paper, email, shared drives, legacy databases. Everyone was generating and saving valuable information. Finding it when you needed it was another matter entirely. This wasn’t a storage problem or a search problem. It was a business problem: no one had designed a system around how people actually worked. When a registry clerk needed to route correspondence, when a program officer needed to track deliverables, when Finance needed to process an invoice — the context required to make those decisions wasn’t systematically available.
Fast forward to today, and I watch enterprise AI deployments struggle with the exact same challenges. RAG systems that retrieve irrelevant or sensitive documents. AI copilots that users abandon after week two. ROI that nobody can demonstrate. The technology stack changed dramatically. The fundamental problem didn’t.
We were doing context engineering before it had that name — building taxonomies, metadata schemas, and classification systems to make relevant information findable, usable, and trustworthy. But the key wasn’t the tools we used. It was understanding what context people needed to do their actual work, then building the foundation to provide it systematically. That systematic approach to context is what separates successful enterprise AI implementations from failed pilots. Let me show you what we learned.
Why This Was Hard (And Why It Matters for AI Today)
The UNDP implementation was hard in ways that had nothing to do with technology. The stack was enterprise-proven — SharePoint for document management, Nintex for workflow automation, MFCs for capture, Web Services for interoperability. What made this project brutal was the combination of scale, technical debt, and organizational politics. Sound familiar?
The Scale: 170+ Countries, One System
UNDP operates in over 170 countries, from New York headquarters to country offices with varying infrastructure capabilities. We took a phased approach: pilot with two HQ units, one regional bureau, and one country office, then expand globally. This wasn’t just “deploy to more servers.” Each country office had different connectivity, different local requirements, different relationships with the government ministries and NGOs sending them documents.
The e-Registry application was already the most-requested intranet feature across all country offices — everyone knew they needed it. But delivering a system that would work for a registry clerk in a well-connected HQ unit and a country office with intermittent connectivity required solving very different problems under one architecture.
The Technical Mess: “Electronic Document Chaos”
The project documentation literally used the phrase “Electronic Document Chaos” to describe what we found. I still have the diagram showing the existing document routing — it was a perfect illustration of what happens when information accumulates for decades without governance.

Documents arrived through multiple uncoordinated channels — paper, fax, email, internal systems. From there, they naturally dispersed into what we called “unstructured, volatile storage locations.” That meant local file shares (no centralized backup guarantees), individual desktops (inaccessible when someone transferred or left), email inboxes (unsearchable by colleagues), physical filing cabinets (requiring institutional memory to navigate), and various local e-Registry applications that individual country offices had built to meet their immediate needs.
Meanwhile, the “official” centralized systems for enterprise resource planning and project repositories existed in parallel, but weren’t connected to each other or to the local solutions. Information lived in multiple places, managed by capable staff doing their best with available tools. But there was no systematic way to classify it, route it, or retrieve it when someone in a different office or country actually needed it for a decision.
Organizations never lack places to put documents. The problem was missing context infrastructure — no global taxonomy, no consistent metadata standards, no classification framework that worked across organizational boundaries.
The Political Reality: Governance at UN Scale
Then there was the governance layer — where enterprise implementations typically die. Confidentiality policies, retention schedules, transparency requirements, and security controls all had to be designed into the architecture from day one. Transparency requirements meant some documents had to be published openly, while others needed strict access controls. Only a small group of users could access content across organizational boundaries.
Getting alignment on these requirements required workshops across program offices, country offices, legal, administrative services, and regional bureaus. Each stakeholder group had different needs. We spent enormous energy on this because we’d learned the hard lesson: you cannot retrofit governance after deployment. (More on how we solved this in the Governance section.)
The Real Problem: No Unified Context Foundation
As I said, the technology existed. Individual components were working — repositories storing documents, workflows routing them, scanners digitizing paper, web services connecting the systems. Country offices and teams had developed their own organizational approaches. What was missing was the layer that would tie it all together: a unified, systematic approach to context that would make these pieces useful at scale.
Without a global taxonomy, the same document type got filed differently in every office — not from lack of effort, but because each office had developed conventions that made sense locally. Without consistent metadata standards, search returned noise alongside signal. Without a shared classification framework, documents couldn’t flow intelligently across organizational boundaries — every routing decision required human judgment because the system had no intelligence layer to draw on.
We needed something beyond keyword search — semantic understanding of document content to enable automatic classification at scale.
This probably sounds familiar if you’re deploying AI in enterprises today. Scattered data sources. No consistent structure. Governance committees that can’t agree on data access policies. The technology stack looks different — vector databases instead of SharePoint libraries, embeddings instead of managed metadata — but the fundamental challenge hasn’t changed.
The question is still: how do you build the context foundation that makes intelligent systems actually work?
Here’s how we approached it.
Building the Context Management Foundation
We needed two things: a platform layer that provided common services, and applications that would prove it worked. You can’t build intelligent applications on chaos. This is what I now call Context Management: the systematic approach to making information findable, usable, and trustworthy at scale.
The approach we took matters: we didn’t start with technology or taxonomy design. We started by understanding the business problem, then identified what contextual information the solution would need to systematically provide for those specific decisions. Only then did we build the foundation. Not generic “context management” — but the specific context required to solve the actual business problem.
Platform + Wedge: The Two-Component Architecture
We designed the solution as two distinct components: the ECM Services Platform and the e-Registry Application.
The platform provided shared infrastructure: taxonomy, metadata, classification, workflows, centralized stores. It was designed from day one to support multiple document-centric applications, not just one. The project documentation used a library analogy that still holds: the platform was the bookshelves (organized by a filing system), the card catalog (indexes for finding things), and the policies governing how books move through the system.
The e-Registry was our wedge application. We didn’t try to solve every document management problem at once. Instead, we picked the single application with the clearest demand signal: the electronic registry had been the most-requested intranet feature across all country offices for years. Clear need, defined scope, measurable success criteria.
This wasn’t speculative platform-building — it was validated platform thinking: the e-Registry proved every platform capability with real users before expansion, and when Finance approached us with invoice processing requirements, the foundation worked without modification. The platform earned its existence by enabling production value, not by anticipating hypothetical future needs.
This was deliberate product strategy. Build the platform. Prove it works with one high-value application. Then expand. The architecture diagram showed future applications waiting in the wings — Project Workspaces, Project Documents Centre, country office intranets, a KM Centre — but none of those would work until the foundation was solid.

Information Architecture: Designing for Global Scale
The information architecture had to work across 170 countries with different local requirements. We couldn’t design in a vacuum — we had to ground the taxonomy in how work actually flowed.
We started with UNDP’s Programme and Operations Policies and Procedures (POPP) — the operational framework that defined what documents every project was required to produce. Not arbitrary categories we invented, but the actual work patterns staff already followed. If policy says every project needs specific deliverables, that’s your taxonomy starting point. Build classification around actual work patterns, not abstract categories.
The Global Filing List became our foundation — designed to expand as needs evolved. We took a functional approach: the file plan reflected activities and transactions rather than organizational structure. The reason was practical: department names change constantly, but underlying business functions remain stable. A file plan based on org charts would be obsolete within a year.
We needed multiple hierarchical taxonomies working together — geographic (countries and offices), functional (document categories and types), and thematic (focus areas). The multi-hierarchy approach mattered because a single classification system couldn’t capture the different ways people needed to find documents: by location, by business function, or by topic. Each taxonomy connected to a global controlled vocabulary, which meant a document tagged in Buenos Aires used the same terms as one tagged in Bangladesh.
Country offices could extend the taxonomy locally — adding categories that made sense for their specific context — but within the global structure. Extensibility within governance, not instead of it.
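To make the multi-hierarchy idea concrete, here’s a minimal sketch of how the parallel taxonomies and controlled vocabulary could be modeled. The node labels and term IDs are illustrative, not entries from the actual Global Filing List.

```python
from dataclasses import dataclass, field

@dataclass
class TaxonomyNode:
    """One term in the controlled vocabulary, identified by a stable ID rather than its label."""
    term_id: str
    label: str
    children: list["TaxonomyNode"] = field(default_factory=list)

# Parallel hierarchies: every document gets one tag from each.
GEOGRAPHIC = TaxonomyNode("geo", "Geography", [
    TaxonomyNode("geo.lac", "Latin America and the Caribbean", [
        TaxonomyNode("geo.lac.arg", "Argentina"),
    ]),
])
FUNCTIONAL = TaxonomyNode("fn", "Function", [
    TaxonomyNode("fn.corr", "Correspondence", [TaxonomyNode("fn.corr.memo", "Memorandum")]),
    TaxonomyNode("fn.fin", "Finance", [TaxonomyNode("fn.fin.inv", "Invoice")]),
])
THEMATIC = TaxonomyNode("th", "Focus Area", [TaxonomyNode("th.gov", "Governance")])

def extend_locally(parent: TaxonomyNode, term_id: str, label: str) -> TaxonomyNode:
    """Country offices add terms *under* an existing global node: extension within
    governance, never a parallel local taxonomy."""
    node = TaxonomyNode(term_id, label)
    parent.children.append(node)
    return node

# A document tagged in Buenos Aires and one tagged in Bangladesh draw from the same term IDs.
doc_tags = {"geography": "geo.lac.arg", "function": "fn.corr.memo", "theme": "th.gov"}
```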
The Classification Engine: Semantic Understanding at Scale
Here’s where it got interesting. At this scale, across multiple languages, manual classification would never be consistent, and keyword matching wasn’t enough. We needed the system to understand what documents meant, not just what words they contained.
After evaluating options, we selected Concept Searching, a solution that specialized in semantic metadata generation and auto-classification. Their approach analyzed document content semantically, applied automatic metadata tags based on meaning and context, and scored classification confidence.
The configuration shows how this worked in practice. Each taxonomy used weighted signals — “clue scores” — that indicated how confident the system was about a classification. Different signals carried different weights based on reliability: filename patterns were more reliable than generic header fields, so they scored higher. The system accumulated evidence from multiple signals to reach a classification decision. Multilingual support meant “Asunto:” (Spanish for Subject) triggered the same classification logic as “Subject:”.
High confidence scores meant automatic classification. Medium confidence queued documents for quick human review. Low confidence required manual tagging. The system learned which clue patterns worked, and we tuned the scores based on actual classification accuracy.
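Here’s a small sketch of how clue scoring worked in principle. The patterns, weights, and thresholds are representative of the approach, not the production configuration.

```python
import re

# Hypothetical clue weights: filename patterns are more reliable than generic
# header fields, so they carry more points. Thresholds mirror the three tiers
# described later (values are illustrative).
CLUES = [
    (r"(?i)memo|memorando",     "filename", 20),  # strong signal
    (r"(?i)^subject:|^asunto:", "body",     10),  # language variants score the same
    (r"(?i)^from:|^de:",        "body",      5),
]
AUTO_THRESHOLD, REVIEW_THRESHOLD = 30, 10

def score_document(filename: str, body: str) -> tuple[int, str]:
    """Accumulate evidence from weighted clues, then map the total to a tier."""
    score = 0
    for pattern, where, points in CLUES:
        text = filename if where == "filename" else body
        if re.search(pattern, text, flags=re.MULTILINE):
            score += points
    if score >= AUTO_THRESHOLD:
        return score, "auto-classify"
    if score >= REVIEW_THRESHOLD:
        return score, "queue-for-review"
    return score, "manual-classification"

print(score_document("2012-03_memo_buenos_aires.pdf",
                     "Asunto: Informe trimestral\nDe: Ministerio"))  # -> (35, 'auto-classify')
```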
This was 2012. We built it with rules-based semantic analysis and carefully designed taxonomies. You’d solve the same requirement today with embeddings and LLMs — the implementation tools changed, but the architectural requirement for semantic understanding at scale didn’t.
Governance: Designed In, Not Bolted On
The project documentation captured a lesson we learned early: “Experience taught that disposal and retention requirements must be considered alongside file plan requirements during the planning phase.” You cannot retrofit governance after deployment.
We built the governance framework into the architecture from day one. Retention policies varied by document type and legal requirements. Disposal schedules had to balance compliance obligations with operational needs. IATI compliance requirements meant certain project documents had to be published openly while others needed strict access controls.
The security model reflected this complexity. Only 30 users out of 25,000 could access content across organizational boundaries. Everyone else saw only their specific network’s documents. The system had three security layers: security levels applied by administrators, security caveats for specific restrictions (like budget information within an otherwise-open finance classification), and access controls that document creators could apply. Getting this architecture right meant the difference between a system people trusted and a system people worked around.
This wasn’t bureaucracy for its own sake. It was the framework that made the system trustworthy. Staff would only use a system they trusted to protect sensitive information and comply with legal requirements.
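As a rough illustration, the three layers compose into a single access check something like this. The level names, caveats, and network identifiers are simplified assumptions, not the actual security schema.

```python
from dataclasses import dataclass, field

@dataclass
class DocumentSecurity:
    level: str                                          # applied by administrators, e.g. "internal"
    caveats: set[str] = field(default_factory=set)      # specific restrictions, e.g. {"budget"}
    creator_acl: set[str] = field(default_factory=set)  # extra users granted by the document creator

@dataclass
class User:
    user_id: str
    network: str                   # organizational boundary
    cross_boundary: bool = False   # true only for the small group of global users
    clearances: set[str] = field(default_factory=set)

def can_read(user: User, doc_network: str, sec: DocumentSecurity) -> bool:
    # Layer 1: organizational boundary -- most users see only their own network,
    # unless the creator explicitly granted them access.
    same_network = doc_network == user.network or user.cross_boundary
    if not same_network and user.user_id not in sec.creator_acl:
        return False
    # Layer 2: caveats must all be cleared, even inside an otherwise-open classification.
    if not sec.caveats <= user.clearances:
        return False
    # Layer 3: the admin-applied level (kept deliberately simple here).
    return sec.level != "restricted" or "restricted" in user.clearances

clerk = User("ana", network="RBLAC")
doc = DocumentSecurity(level="internal", caveats={"budget"})
print(can_read(clerk, "RBLAC", doc))  # False: missing the "budget" caveat clearance
```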
This was the Context Management foundation: platform architecture, information architecture, classification engine, governance framework. Each piece required the others. The taxonomy was useless without classification to apply it. Classification was useless without governance to make it trustworthy. Governance was useless without a platform to enforce it. They had to be designed together, not bolted together.
With this foundation in place, we could build applications that actually worked at scale. Like the intelligent document processing system that handled everything from scanned government correspondence to email attachments — routing documents automatically based on content understanding, not just file names.
e-Registry & Intelligent Document Processing
The e-Registry was what users saw: document intake, routing, tracking, and retrieval. Behind it was the intelligent document processing system — OCR, classification, and workflow automation tied together by the Context Management foundation. None of it would have functioned without the platform layer we built first.
How the System Worked: From Paper to Intelligence
Documents arrived through multiple channels. Physical paper from government ministries and vendors went through scanning stations in each operating unit. Emails with attachments were captured automatically through dedicated e-Registry mailboxes — each country office had a standardized address that routed to their “Drop-off Library”. Internally generated documents — Word files, Excel spreadsheets, PDFs — entered the system directly.
The architecture had to accommodate vastly different connectivity scenarios, from well-connected headquarters to remote offices with intermittent bandwidth. We built multiple ingestion paths: email-based scanning for simple setups, direct HTTPS upload for connected offices, and local branch servers with background sync for high-volume remote locations. One architecture had to serve every one of those operating realities.
Once captured, documents flowed to the Capture Centre for OCR processing. The system converted scanned images to machine-readable text, attempted to recognize document types against trained samples, and extracted metadata fields — sender, recipient, date, subject, reference numbers. For emails, the system automatically parsed the header fields. For scanned paper, it applied zone recognition and barcode detection where available.
The processed document then landed in the unit’s Drop-off Library, where the real intelligence kicked in. The system executed an automated registration sequence: applying default values, auto-classifying based on content, generating unique IDs, and routing to the appropriate library based on document type and security requirements.
The document reached the right person without manual intervention — classified, tagged, and ready for action.
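Condensed into a sketch, the registration sequence looked roughly like this. The ocr, classify, and route callables stand in for the capture, classification, and workflow services; the field names are illustrative.

```python
import uuid

def register_document(raw_bytes: bytes, source: str, unit: str, ocr, classify, route) -> dict:
    """Run the automated registration sequence for one incoming document."""
    text, extracted = ocr(raw_bytes)                  # machine-readable text + header fields
    record = {
        "doc_id": f"{unit}-{uuid.uuid4().hex[:8]}",   # unique ID
        "source": source,                             # scan, e-Registry mailbox, direct upload
        "unit": unit,
        **extracted,                                  # sender, recipient, date, subject, ...
    }
    record["classification"], record["confidence"] = classify(text, record)
    record["destination"] = route(record)             # library chosen by type + security
    return record

# Toy run with stubbed services, just to show the shape of the output record.
registered = register_document(
    b"%PDF-...", source="e-registry-mailbox", unit="ARG",
    ocr=lambda raw: ("Asunto: Informe trimestral",
                     {"sender": "Ministerio", "subject": "Informe trimestral"}),
    classify=lambda text, record: ("fn.corr.memo", 35),
    route=lambda record: "ARG/Correspondence/Incoming",
)
```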
Quality Control: The Classification Confidence Workflow
Here’s the operational reality: you can’t have 100% automation when inputs vary this much. Crisp scans from headquarters copiers look nothing like degraded faxes from remote offices. Handwritten notes on invoices don’t OCR cleanly. Unusual document formats don’t match trained samples.
But you also can’t have 100% manual review when you’re processing thousands of documents daily across 170 countries. Registry staff would drown.
We built a three-tier system based on classification confidence:
High confidence (30+ points): Auto-classify and route. Standard memo formats with clear indicators — recognizable document type markers in filenames, sender and subject fields matching expected patterns — got processed without human intervention. The document reached its destination automatically.
Medium confidence (10–29 points): Queue for validation. The system applied a tentative classification but flagged the document for quick review. Registry staff could confirm with one click or correct if needed. This was the majority of edge cases: documents that looked mostly right but had some ambiguity.
Low confidence (below 10 points): Manual classification required. Unusual formats, poor quality scans, documents in unexpected languages. These went to a queue for full human review and tagging.
The key was routing human attention where it mattered. High-confidence documents flowed through automatically. Staff spent their time on the genuinely ambiguous cases, not reviewing every single document.
Manual corrections fed back into the classification rules. When staff consistently reclassified a certain pattern, we adjusted the clue scores. The system improved over time — but we never tried to eliminate human judgment entirely. Some documents genuinely require interpretation.
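The feedback loop itself is simple to sketch; the override threshold here is invented, since retuning was a judgment call per taxonomy.

```python
from collections import Counter

# When registry staff keep overriding the same tentative classification,
# that pattern is flagged so its clue weights get retuned.
corrections: Counter = Counter()

def record_correction(auto_label: str, human_label: str) -> None:
    corrections[(auto_label, human_label)] += 1

def patterns_to_retune(min_overrides: int = 25) -> list[tuple[str, str]]:
    """Return (auto, human) label pairs overridden often enough to justify a rule change."""
    return [pair for pair, count in corrections.items() if count >= min_overrides]
```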

Multi-Language Reality
Operating across 170 countries meant dealing with dozens of languages. English, French, Spanish, and Arabic were dominant, but country offices also handled documents in local languages.
OCR accuracy varied significantly by language and font quality. Arabic right-to-left text required different processing than European languages. Spanish documents used different formatting conventions — “Asunto:” instead of “Subject:”, “De:” instead of “From:” — so classification rules needed multilingual pattern matching.
The clue scoring system handled this by including language variants. A document with “MEMORANDO” triggered the same classification logic as one with “MEMO.” This wasn’t just translation; it was recognizing that the same document type appears differently across linguistic contexts.
Modern AI-powered OCR faces the same challenges: language support, accuracy variation across scripts, and formatting differences by locale. The tools improved; the requirement didn’t change.
Platform Validation: Finance Came Calling
Before the e-Registry was even fully deployed, Finance approached us with a problem.
They had paper invoices arriving from vendors worldwide. Multiple languages and varying formats, all needing digitization, classification, routing to approvers, and a complete audit trail. Could the platform handle it?
The answer was yes — because we’d built the foundation right. The taxonomy could extend to financial document types: invoices, purchase orders, payment requests, direct debits. The classification engine could route based on vendor name, amount thresholds, or department codes. Our workflows already supported approval chains with escalation rules. The governance framework already covered retention requirements and audit compliance for financial records.
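Extending the routing engine to invoices was mostly configuration. Here’s a hedged sketch of what a rule table for the new document family might look like; the thresholds and role names are invented for illustration.

```python
# Same engine, different document type: routing is driven by configuration, not new code.
INVOICE_ROUTING_RULES = [
    {"when": lambda inv: inv["amount"] > 100_000, "route_to": "senior-approver"},
    {"when": lambda inv: inv["amount"] > 10_000,  "route_to": "unit-approver"},
    {"when": lambda inv: True,                    "route_to": "finance-clerk"},
]

def route_invoice(invoice: dict) -> str:
    """First matching rule wins."""
    for rule in INVOICE_ROUTING_RULES:
        if rule["when"](invoice):
            return rule["route_to"]
    return "manual-review"

print(route_invoice({"vendor": "Acme S.A.", "amount": 42_000}))  # unit-approver
```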
The project documentation even included an invoicing workflow that handled the complete lifecycle: from incoming invoice through approval chains to payment processing and case closure. The workflow handled partial payments, questions requiring clarification, and exception cases.
This was platform thinking working exactly as designed. The e-Registry proved the foundation worked for correspondence. Finance proved it could extend to entirely different document types without rebuilding the platform. Different application, same foundation.
The technical architecture worked. The classification engine worked. The platform strategy validated itself when another department showed up asking to use it.
But deployment success wasn’t about the technology. The hard parts — the parts that actually determined whether this succeeded — were organizational, not technical. Let me show you what that looked like.
The Hard Parts — What Actually Mattered
The platform worked. Classification accuracy improved with every batch of documents. The OCR handled multiple languages. The workflows routed correctly. None of that mattered if people kept saving documents to their local drives instead of the system.
Enterprise implementations of this scale require more than technical architecture. We had executive sponsorship willing to back calculated risks, forward-thinking leadership that understood foundational work comes before quick wins, and a dedicated team across project management, technical delivery, and regional support who made deployment real. My role was designing the context management foundation and coordinating delivery across this organizational complexity — but the success belonged to everyone who understood what we were building.
Designing for Adoption
We were asking people to change how they worked. Stop using local file shares and desktop storage where they had complete control. Stop using the individual e-Registry applications that country offices had built themselves over years. Start using a centralized system with a global taxonomy that felt foreign at first.
We were asking people to give up local control for global consistency. That’s a hard sell without proof it works.
So we designed adoption into the deployment from day one. The project timeline made this explicit: training materials, communications plan, and support model all had to be completed before global rollout — not after. These weren’t afterthoughts bolted on at the end.
We ran business process workshops with staff in each business unit to map their actual work — functions, activities, transactions. This did two things: it grounded the file plan in actual work patterns, and it gave staff a sense of ownership. They’d seen the system before go-live. They understood the logic because they’d helped shape it.
We established a Power User role in every business area — staff who received additional training before implementation and provided local support during the early rollout. When someone couldn’t figure out where to file a document, they had a colleague down the hall who could help. The support model wasn’t just a help desk ticket; it was distributed knowledge in every office.
Pilot First, Scale After
We didn’t deploy globally on day one. The project took a phased approach:
Phase 1 (Pilot): Pilot rollout to two HQ units (Executive Office and Administrative Services) plus Argentina country office. This gave us a mix of headquarters sophistication and country office reality — varied connectivity, contrasting document volumes, and distinct user expectations.
Phase 2 (Adjust): Adjust solution based on pilot. The project plan said this explicitly: learn from the pilot before scaling. This phase also finalized training materials, completed the communications plan, and established the support model. Nothing went global until these were ready.
Phase 3 (Technical Deployment): Technical deployment for global availability. Only after pilot validation.
Phase 4 (Systematic Expansion): Knowledge transfer, ongoing workflow configuration, rollout to additional units. Not a big bang — a systematic expansion with continuous learning.
Phase 5 (Completion): Project completed.
The pilot phase caught things that testing never would. Workflow tweaks that made sense in headquarters didn’t always work in country offices. Classification rules needed adjustment based on actual document patterns. Training materials that worked for experienced registry staff needed simplification for occasional users. We fixed these issues in the pilot phase, before global rollout.

The Pattern That Still Holds
I see organizations make the same mistakes with enterprise AI deployments today. They treat them as technology projects focused on efficiency gains rather than organizational transformation. Ship the model, deploy the interface, wonder why adoption stalls.
A RAG system goes live without understanding what problems it should solve or the context to solve them. An AI copilot rolls out globally before anyone proves it helps one team. An AI system deploys without a governance framework and exposes sensitive data across organizational boundaries.
The technology stack changed. The organizational requirements didn’t.
And I’ve seen this pattern enough times to know what works.
Formalizing the Patterns
The systematic approach we used at UNDP wasn’t accidental. Over twenty years of implementations — enterprise content management systems, ERPs, CRMs, cloud platforms, now AI — I’ve seen these patterns repeat across every technology stack.
I’ve formalized them into two frameworks I use today: Customer Impact Framework (translating technical capabilities into measurable business outcomes) and CLEAR Tracks (coordinating delivery across complex stakeholder environments). The principles haven’t changed since 2012. Just the tools.
This was 2012. We built it with SharePoint and rules-based classification. You’d build it differently today — with vector databases and LLMs. But you’d still need the same systematic rollout, the same attention to adoption, the same coordination across teams. The implementation tools changed. The deployment patterns didn’t.
Let me show you what’s different now and what isn’t.
Context Management — Same Foundation, Different Tools
The UNDP project was 2012. Generative AI wasn’t on anyone’s enterprise roadmap. LLMs were research curiosities. Vector databases didn’t exist in the form we know them today. The tools we used — SharePoint, rules-based classification, managed metadata — reflected what was available and proven at enterprise scale.
The tools have evolved dramatically. The foundational requirements haven’t.
What Changed: The Implementation Layer
We built classification with Concept Searching: rules-based semantic analysis, clue scoring, manually curated taxonomies. You’d build it today with vector embeddings, semantic search, LLM-based classification. The modern approach understands nuance and context without explicit rules. But you still need quality training data. You still need taxonomies for structure. You still need confidence scoring for edge cases. The classifier got smarter. The requirement for semantic understanding at scale didn’t change.
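To show what that looks like in practice, here’s a sketch of LLM-based classification that still leans on a controlled taxonomy and the same confidence gate. call_llm is a placeholder for whichever model API you use; the labels and thresholds are illustrative.

```python
TAXONOMY_LABELS = ["fn.corr.memo", "fn.fin.invoice", "fn.proj.report"]

def classify_with_llm(text: str, call_llm) -> tuple[str, float, str]:
    """Ask the model for a label from the controlled taxonomy, then gate on confidence."""
    prompt = (
        "Classify the document into exactly one of these labels: "
        f"{', '.join(TAXONOMY_LABELS)}.\n"
        "Answer as label|confidence, with confidence between 0 and 1.\n\n"
        f"Document:\n{text[:4000]}"
    )
    label, confidence = (part.strip() for part in call_llm(prompt).split("|"))
    confidence = float(confidence)
    if label not in TAXONOMY_LABELS:     # the controlled taxonomy still constrains the model
        return label, confidence, "manual-classification"
    if confidence >= 0.9:
        return label, confidence, "auto-classify"
    if confidence >= 0.6:
        return label, confidence, "queue-for-review"
    return label, confidence, "manual-classification"
```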
We built metadata with Microsoft’s managed metadata service: hierarchical taxonomies, controlled vocabularies. You’d build it today with a hybrid approach — structured metadata plus vector representations, knowledge graphs, dynamic taxonomies that evolve with use. The modern approach handles unstructured metadata and relationships between concepts better. But you still need governance. You still need consistent structure for business-critical fields. Structure still matters. You can’t govern what you can’t organize.
We built search with keyword matching, taxonomy navigation, and Concept Searching’s semantic layer. You’d build it today with semantic search, RAG pipelines, vector similarity. The modern approach handles natural language queries and contextual understanding natively. But you still need relevance ranking. You still need quality control. You still need to measure retrieval accuracy. Finding documents is easier. Ensuring you found the right documents requires the same rigor.
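Measuring retrieval accuracy is still the unglamorous part. A minimal recall@k harness over a hand-labeled evaluation set might look like this, with search standing in for whatever retrieval pipeline you run.

```python
def recall_at_k(eval_set: list[dict], search, k: int = 5) -> float:
    """eval_set items look like {'query': str, 'relevant_ids': set[str]};
    search(query, k) returns (doc_id, score) pairs."""
    hits = 0
    for item in eval_set:
        retrieved_ids = {doc_id for doc_id, _score in search(item["query"], k)}
        if retrieved_ids & item["relevant_ids"]:
            hits += 1
    return hits / len(eval_set)
```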
Workflows went from Nintex automation and business rules to agent-based routing and LLM-powered decision logic. More complex logic, better context adaptation. But you still need defined processes, escalation paths, and human oversight for critical decisions.
What Hasn’t Changed: The Foundation
Context Management is still the foundation. Governance must still be designed upfront — retrofitting retention policies, access controls, and compliance doesn’t work. Organizational design is still harder than technology — training, support, and adoption still determine whether the system gets used. Pilot before scale still works — you can’t skip the validation phase, even with better technology. Quality control is still essential — confidence scoring, human-in-loop for ambiguous cases, continuous improvement.
LLMs made implementation easier. They didn’t eliminate the foundational requirements — or the organizational ones.
How I’d Build This Today
If I were building the UNDP system today, the architecture would look different. Here’s the high-level approach — not a detailed implementation guide, but the strategic layers that would replace what we built in 2012.
Context Management Layer: Vector database for document embeddings. Knowledge graph for structured relationships — taxonomy, organizational structure, document connections. Hybrid metadata: structured fields for governance plus vector representations for semantic search. You need both structure and flexibility. Pure vector search isn’t enough for compliance and governance.
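A sketch of what “both structure and flexibility” means at query time: governance filters run against structured metadata before vector similarity ranks whatever remains. This is a minimal illustration with assumed field names, not a production pattern; embed stands in for your embedding model.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_search(query: str, user_network: str, docs: list[dict], embed, k: int = 5) -> list[dict]:
    """Filter on governed metadata first, then rank the survivors by semantic similarity."""
    allowed = [d for d in docs
               if d["network"] == user_network            # access control before relevance
               and d["retention_status"] != "disposed"]   # governance filter, not a ranking signal
    query_vec = embed(query)
    ranked = sorted(allowed, key=lambda d: cosine(query_vec, d["embedding"]), reverse=True)
    return ranked[:k]
```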
Intelligence Layer: LLM-based classification with confidence scoring. RAG pipeline for document retrieval. Multi-language embedding models that handle the UNDP’s language requirements better than 2012-era OCR ever could. Semantic understanding out of the box — but you still need confidence thresholds and human review for edge cases.
Agentic workflow orchestration: The document processing workflows we built with Nintex in 2012 — intake, OCR, classification, routing, approval, notification — would today be handled by autonomous AI agents. Instead of rigid rule trees that break on edge cases, agents make context-aware decisions at each step. An agent receives a document, invokes the appropriate tools (OCR service, classification API, approval system), interprets the results, and routes based on content understanding rather than pattern matching. When something doesn’t fit the expected pattern — poor quality scan, ambiguous document type, unusual approval chain — the agent adapts rather than failing into a queue. This is workflow automation with intelligence built in.
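Here’s a deliberately simplified control loop to illustrate the shape. A real agent would let an LLM decide the next tool call; the point is the behavior at the end, escalating to a human instead of silently failing. Tool names and the confidence threshold are assumptions.

```python
def process_document(doc: bytes, tools: dict, max_steps: int = 5) -> dict:
    """Run intake -> OCR -> classification -> routing, escalating anything ambiguous."""
    state = {"raw": doc, "status": "received"}
    for _ in range(max_steps):
        if "text" not in state:
            state["text"], state["fields"] = tools["ocr"](state["raw"])
        elif "label" not in state:
            state["label"], state["confidence"] = tools["classify"](state["text"])
        elif state["confidence"] < 0.6:
            state["status"] = "escalated-to-human"   # adapt instead of silently failing
            return state
        else:
            state["destination"] = tools["route"](state)
            state["status"] = "routed"
            return state
    state["status"] = "escalated-to-human"           # couldn't converge within the step budget
    return state
```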
API-first architecture for connecting existing systems. Modern AI-powered document processing — Mistral’s OCR capabilities, Azure Document Intelligence, AWS Textract — for extraction and classification. Better tools for the same job: extracting structured data from unstructured documents.

What’s easier now: Multi-language support works natively. Classification accuracy improved with context-aware LLMs. Natural language queries reduce user training requirements.
What’s still hard: Governance framework still requires upfront design. Organizational change still means asking people to change workflows. Quality control at scale still needs confidence scoring and human review. Integration with legacy systems still requires connecting to existing infrastructure.
The technology stack is dramatically better. The organizational requirements are identical.
I’m working on a detailed technical deep-dive on modern Context Management architecture — specific tools, integration patterns, deployment approaches. But the high-level lesson is clear: better tools make implementation faster. They don’t eliminate the need for systematic thinking about context, governance, and adoption.
This pattern — better tools, same foundational requirements — is why so many enterprise AI implementations struggle today. Organizations focus on the technology layer and skip the Context Management foundation. Let me show you what that looks like.
Why Your Enterprise AI Implementations Are Struggling
I watch many enterprise AI implementations struggle with the same patterns I saw in content management, SharePoint deployments, and cloud migrations. The challenge is uniquely acute in enterprises: multiple departments with different functions, each requiring specific context to solve their particular problems. Finance needs different information to process invoices than HR needs for performance reviews. Project teams need different context than Legal needs for contract analysis.
The common failure pattern: organizations focus on what the AI can do — the capability layer — without understanding what it needs to actually work. They deploy RAG systems, AI copilots, and intelligent assistants without first building the foundation that makes those capabilities useful: the systematic infrastructure to provide the right context for each function’s specific problems.
Here’s what I see repeatedly.
“Our RAG System Isn’t Retrieving Relevant Documents”
Users ask questions and get irrelevant results. Engineers keep tweaking the vector similarity threshold. Retrieval accuracy stays low despite good embeddings.
The diagnosis: you skipped understanding what context people need to make decisions. You built taxonomy around document types, not around the information people actually need for their work. No structured metadata grounded in real business problems. You’re treating it as a pure vector search problem when it’s a business context problem.
We developed the taxonomy iteratively with users — testing classifications against actual documents, refining based on what worked. You can’t retrieve what you haven’t organized. Vector search is powerful — but it still needs structure to work at enterprise scale.
“Our AI Tool Has Low Adoption Rates”
Great demo, poor daily usage. Users try it once, then go back to old workflows. You can’t prove ROI because nobody uses it consistently.
The diagnosis: you deployed globally without pilot validation. No systematic approach to understanding how different teams actually work or what would make them adopt new tools. You treated this as a technology project, not organizational change.
We piloted for two months before going global. Trained Power Users in every office. Built support into the deployment from day one. And still had adoption challenges. AI tools that skip this step don’t have a chance — the technology being better doesn’t make organizational change easier.
“Governance Is Blocking Our AI Deployment”
Legal won’t approve. Data retention policies are unclear. Access control requirements aren’t addressed. You’re stuck in approval cycles.
The diagnosis: you tried to deploy first, govern later. Security and compliance weren’t designed into the architecture. You’re treating governance as an afterthought instead of a foundation.
We designed retention policies, access controls, and compliance requirements as part of the technical architecture — not as a separate phase, and definitely not after deployment. You can’t retrofit trust.
“We Can’t Scale Beyond the Pilot Team”
The pilot succeeded. Broader rollout is failing. Each new team requires custom configuration. Your customer success team is overwhelmed.
The diagnosis: you didn’t systematically capture pilot learnings. No templates or playbooks. You treated pilot as proof of technology, not validation of deployment approach.
Pilot isn’t just “does the tech work?” It’s “what does successful deployment require?” We captured pilot insights as we went — what worked, what didn’t — and built rollout templates from actual deployment experience. Otherwise, you’re starting from scratch with every new team.
These failures aren’t about having the wrong AI model or insufficient training data. They’re about focusing on AI capabilities without understanding what those capabilities need to succeed in complex enterprise environments where different functions require different context to solve different problems.
The organizations that succeed understand this. They start by understanding what context different functions need to solve their specific problems. They build the foundation — taxonomy, governance, quality control, organizational design — before deploying AI capabilities. They focus on what the AI needs, not just what it can do.
What This Represents
This case study shows what foundational infrastructure looks like at scale — from 2012 technology to today’s enterprise AI deployments. The tools evolved dramatically. The foundational requirements for making information findable, usable, and trustworthy didn’t. Twenty years of pattern recognition across these implementations tells the same story: understand the problems people need to solve, build the foundation that provides the context to solve them, then deploy the capabilities that make it intelligent.
The pattern that made this work: understand the business problem first. What decisions do people need to make? What context do they need to make those decisions? Then build the foundation that systematically provides that specific context — not generic infrastructure, but the exact contextual information the solution needs to solve real business problems. That’s what’s missing in most enterprise AI deployments.
What I’m Working On Now
My current work spans a few areas: building production RAG systems with vector databases and LLMs (not just proofs-of-concept), prototyping AI tools from scratch, advising companies on enterprise AI transformation and adoption strategies, and helping sales and solutions teams prepare for consultative conversations with their buyers.
That last piece matters. Enterprise AI sales conversations fail when they focus on model capabilities instead of the business problems those capabilities solve in the customer’s actual context. I help teams bridge that gap — whether through hands-on prototyping, deployment consulting, or sales enablement.
I’m also formalizing the frameworks I mentioned — Customer Impact Framework and CLEAR Tracks — into methodologies that teams can actually use to qualify opportunities, scope deployments, and prove value.
Let’s Connect
If you’re working on enterprise AI deployments, building Enterprise Context Management architectures, or helping teams have better conversations with buyers about AI value — let’s connect.
https://www.linkedin.com/in/brianlachman/
Context Management isn’t a buzzword. It’s how you make intelligent systems actually deliver enterprise value.