tamaulipas's Feed

My FastAPI Learning Journey: From Confused to Creating REST APIs for My AI Chatbot

Why I Started Learning FastAPI (The Real Story)

Why RAG Systems Fail Even When Everything Looks Correct

You built the pipeline. You chunked the documents. You picked a solid embedding model. You stood up a vector database. You tested a few…

⚙️AI Engineering mosthofa-imran.medium.com

The Boring Layers AI Still Cannot Fake

Infrastructure, evaluation, and operations: the layers where production AI quietly lives or dies.

🛠️MLOps medium.com

Building & Deploying an Employee Attrition Prediction Model

In my previous blogs, we explored AI fundamentals, the end-to-end ML + MLOps lifecycle, and why most models never make it to production…

🔗LLM Orchestration medium.com

Building an AI Content Strategy Assistant with LangGraph

Welcome to the third article in this series.

📚RAG medium.com

From Keywords to Meaning: How Vector Search Changed Search Forever

Why modern AI systems don’t search for words anymore; they search for meaning.

📐CS Fundamentals arXiv

Formalizing Task-Space Complexity for Zero-Shot Generalization

Policies must operate across diverse conditions, yet a single policy is often conservative while fully adaptive schemes can be complex. We study zero-shot generalization in contextual dynamical systems and introduce a performance-centric, directional task dissimilarity--the signed divergence--that upper bounds the generalization gap from a source context to a target context. The signed divergence induces $\varepsilon$-tolerance sets that certify...

🏗️System Design arXiv

AoiZora: Topology-Aware Auto-Parallel Optimization for Inference of Diffusion Transformers

Video diffusion has quickly grown into a key generative serving workload, yet producing each clip demands many denoising iterations over large spatio-temporal latents, which puts low-latency inference out of reach on a single device. A denoising step is therefore typically distributed across multiple accelerators, and TPU sub-slices have become an attractive and practical fabric for doing so. Current auto-parallel systems, however, search almost...

🧠LLMs medium.com

Fictional Framing Part 3: Does the Fix Generalize, or Did I Just Patch One Sentence?

This is the third piece in a series on a prompt injection vector that leaked a system-prompt secret from GPT-4o using nothing but a…

📡Observability medium.com

Microsoft Foundry Observability: Tracing, Evaluating, and Proving ROI for AI Agents on Any…

Microsoft Foundry Observability lets you trace, evaluate, monitor, and optimize AI agents on any framework, then measure their real…

✍️Prompt Engineering medium.com

Deploying SIE on GPU: Embeddings, and Zero-Shot Extraction


🔌API Design medium.com

Part-1 WhatsApp Business Platform Explained: APIs, Architecture, and How to Send Your First…

WhatsApp is no longer just a messaging app for personal conversations. With over 2 billion users worldwide, it has become one of the most…

⚙️AI Engineering medium.com

Prompt Engineering: The Skill That Separates Average AI Users from Expert Practitioners

A practical, production-tested guide for engineers, technologists, and business leaders on Prompt Engineering, Context Engineering, RAG…

🗄️Databases medium.com

Closing the Reflection Gap: How to Train AI Agents to Trust Environment Feedback

When an LLM operates as a standalone agent (writing SQL queries, invoking APIs, or running terminal commands), it relies heavily on the…

🌐Distributed Systems arXiv

When the Next Step Is Not One Step: Distribution-Aware Execution Modeling for Concurrent Go Programs

Training a model to predict the next step in a concurrent program is harder than it looks: two runs of the same program from the same trace prefix can produce different next events, both valid, because the scheduler is nondeterministic. A model trained against a single label is learning to guess one outcome of a random process. We turn this around and use the nondeterminism as a training signal. We run each program many times, aggregate the obse...

🛠️MLOps amanpathakdevops.medium.com

Day 09 of MLOps: From Localhost to Production-Ready ML Deployment on AWS

Introduction

🚀High Performance arXiv

Platooning Connected, Autonomous, and Human-Driven Vehicles: A Deep Reinforcement Learning-based Approach

Conventionally, existing vehicle platooning approaches are designed for connected vehicles, typically including connected autonomous vehicles and connected human-driven vehicles. Non-connected vehicles, such as non-connected autonomous or human-driven vehicles, are not incorporated. As a result, these platooning approaches may not properly reflect real-world mixed traffic conditions at the current stage. To address this limitation, this study ...

📚RAG sadiqueali.medium.com

Laravel Vector Search: Semantic Search in Your App Without a PhD in Machine Learning

whereVectorSimilarTo(), embeddings, pgvector, cosine similarity, chunking strategies, and the difference between keyword search, full-text…

📐CS Fundamentals arXiv

Breaking chains with trees: Deep learning with $\mathcal{O}(\log N)$ parallel time complexity

Modern deep neural network architectures are trained via backpropagation, which requires errors to be sequentially propagated through all layers before parameters can be updated. This introduces two limitations: locking, where layer-wise updates are strictly interdependent and cannot proceed in parallel, and the weight transport problem, which requires symmetric forward and backward pathways for exact gradient computation. These constraints re...

🧠LLMs medium.com

I Benchmarked Llama 3.2 3B on a Snapdragon X Plus and Beat Qualcomm’s Published Numbers

Qualcomm published figures for the X Elite, not the X Plus, regarding the Llama 3.2