tamaulipas's Feed

Bifrost: Hybrid TEE-FHE Inference for Privacy-Preserving Transformer and LLM Serving

Cloud-hosted transformer and large language model (LLM) inference creates a direct confidentiality problem: user prompts may contain sensitive code, business data, personal information, or regulated documents, yet remote serving exposes intermediate state to the cloud software stack and accelerator runtime. Fully homomorphic encryption (FHE) keeps accelerator-side execution ciphertext-only, but end-to-end LLM inference remains expensive because ...

🗄️Databases sumanthpoola.medium.com

What Is a Vector Database? Why Traditional Databases Aren’t Enough for AI — Part 15

When I first heard about vector databases, I assumed they were just another database trend. Then I realized that many modern AI…

🔌API Design medium.com

REST vs GraphQL vs gRPC vs MCP vs A2A: The 5 Protocols Your AI Stack Actually Speaks


📚RAG medium.com

RAG (Retrieval-augmented generation)

What is retrieval-augmented generation?

🏗️System Design medium.com

Why Real-Time AI Is Harder Than Most People Think

AI Latency, Streaming Inference, Distributed Systems, Real-Time AI, Infrastructure Engineering

📡Observability medium.com

Microsoft Foundry Observability: Tracing, Evaluating, and Proving ROI for AI Agents on Any…

Microsoft Foundry Observability lets you trace, evaluate, monitor, and optimize AI agents on any framework, then measure their real…

⚙️AI Engineering medium.com

Building a PDF Question-Answering Chatbot with Spring AI: From PDF Upload to RAG-Powered Answers

A practical guide to building a Retrieval-Augmented Generation (RAG) application using Spring AI, Gemini, Ollama, PostgreSQL, and PGVector.

🛠️MLOps mayursurani.medium.com

MLflow 101: Why MLOps Matters and How MLflow Solves the Model Deployment Crisis

The Business Problem

✍️Prompt Engineering medium.com

From Machine Learning to Agentic AI: The Complete Story Every Practitioner Should Know

A practical guide to machine learning, neural networks, NLP, large language models, prompt engineering, and agentic AI — and how they…

🗄️Databases medium.com

Vector Databases Are Overhyped: Here’s What Actually Matters

A vector database won’t fix a broken retrieval system. But a great retrieval system can make an average AI application exceptional.

🔗LLM Orchestration medium.com

NeuroCore: The Agent Framework That Actually Works for Math Proofs

Why YAML-based worker agents beat LangGraph for theorem proving and complex research workflows

🚀High Performance arXiv

The EVerest Dataset for Secure Software Engineering

End-to-end security verification, from requirements through architecture to code, requires datasets that span all three artifact types with fine-grained security labels. No existing dataset provides this combination. We present the EVerest dataset, a multi-artifact resource based on EVerest, an industry-driven open-source software stack for electric vehicle charging stations. The dataset includes 84 manually elicited security requirements anno...

🧠LLMs medium.com

Everyone’s Using AI Built on Transformers. Most People Can’t Explain What That Means.

You don’t need a PhD to understand the architecture behind GPT-4, Claude, and Gemini. You just need someone to stop making it complicated.

🌐Distributed Systems arXiv

A Composable CRDT Layer for Byzantine-Resilient Deterministic Reconstruction

Conflict-free Replicated Data Types (CRDTs) ensure Strong Eventual Consistency without coordination, but typically assume benign participants and rely on validation or exclusion to handle Byzantine behavior. We address this problem through deterministic state reconstruction: rather than deciding which updates are admissible, all accepted updates are incorporated, while only a subset contributes to the reconstructed state. We instantiate this app...

📐CS Fundamentals arXiv

Learning-Augmented Algorithms for Online Vertex Cover

This paper studies learning-augmented online weighted vertex cover with advice and a parameter $\lambda \in (0,1)$. We consider two graph cases: bipartite graphs and general graphs. In both settings, the online algorithm must maintain a feasible vertex cover under irrevocable decisions. We show that these problems admit the same robustness--consistency tradeoffs as learning-augmented ski rental. For the bipartite graph model, we give a randomi...