amy_yunduo's Feed

🗄️Vector Databases HubSpot Product Blog (Live)·

Building the AI Retrieval Infrastructure Behind 20 Billion+ Vectors at HubSpot

Discover how HubSpot built a scalable AI retrieval infrastructure, managing over 20 billion vectors with Qdrant, to enhance semantic search and support diverse applications. Read more ›

Covers Qdrant - Vector Database

🔭Observability Dynatrace news·

Building in the open: How Dynatrace invests in open source to move the industry forward

At Dynatrace, we believe the future of observability and cloud-native operations is open. Not “open” as a slide-deck buzzword, but open as in showing up every day to write code, review PRs, chair working groups, and build tools the community can use, extend, and make their own. We’re proud to be an active contributor to […] Read more ›

✍️Prompt Engineering Silicon Opera·

Why Longer System Prompts Usually Make LLMs Worse

There’s a pattern that shows up constantly in LLM deployments: something isn’t working quite right, so someone adds more instructions to the system prompt. The model ignores a constraint, so you restate it more forcefully. It produces the wrong tone, so you add a tone guide. Repeat until the prompt is 2,000 words long and the model is somehow worse than when you started. This isn’t a fringe experience. It’s close to a law of LLM prompt engineering. Here’s why it keeps happening. 1. LLMs Don’t... Read more ›

🔄MLOps Flexiana·

Clojure Meets Production MLOps: How chachaml Delivers AI‑Native Workflows (Part 1)

chachaml is a Clojure-native MLOps library developed within the Flexiana ecosystem. It's built for teams that want to run machine learning systems in production without moving their workflows to another language or stack. Read more ›

Covers The state of AI in 2025: Agents, innovation, and transformation

📚RAG medium.com

AI Explained Simply: Understanding Embeddings, Vector Databases, and RAG with Everyday Indian…

🚀 Everyone is talking about AI, RAG, Embeddings, and Vector Databases. Read more ›

⚙️Backend Engineering I Programmer·

Redis Iris - Real Time Context Engine For AI Agents

Read more ›

💳Fintech PYMNTS·

Green Dot Bank-FinTech Split Moves Closer to the Finish Line

Green Dot has moved one step closer to separating its banking and FinTech operations. The company announced Tuesday (June 23) that its shareholders had approved the sale of Green Dot Bank to CommerceOne. That sale is part of a larger process that involves CommerceOne forming a new publicly traded bank holding company that owns CommerceOne Bank and Green Dot […] Read more ›

📊LLM Evaluation arXiv·

MINCE: Shrinking LLM Evaluation Datasets via Few-Model Monte Carlo Calibration

Evaluating LLMs across many model variants -- quantized, fine-tuned, or deployment-specific -- requires running large benchmarks repeatedly, a process that can take tens of hours per model on edge hardware such as NPUs. Existing subset selection methods reduce this cost but depend on large calibration pools or learned prediction layers. We introduce MINCE (Monte Carlo Informed N-sizing for Compact Evaluation), which uses Monte Carlo simulation o... Read more ›

🤖AI Agents TechRadar

Know your agent: building the foundation of autonomous commerce

As AI agents become autonomous, establishing cryptographic trust and verifying identity is crucial for business security. Read more ›

🔌MCP Microsoft Tech Community

MCP Server Authorization with Azure API Management: From Simple to Advanced

Why put API Management in front of your MCP servers? The Model Context Protocol (MCP) has quickly become the standard way for AI agents, such as GitHub Copilot in VS Code, to reach external tools and data. As soon as an MCP server does anything meaningful, the same questions that govern any API resurface: who is allowed to call it, what are they allowed to do, and how do you enforce that consistently across many servers without rewriting each one. Azure API Management (APIM) answers those ques... Read more ›

🧠LLMs The AI Frontier·

Open models don't need to be OpenAI

Why smart enough, fast enough, and cheap enough is good enough Read more ›

Discussed on Substack

🔗APIs API Evangelist·

Tyk and the Quiet Superpower of Extending OpenAPI

Extending the OpenAPI specification is a widely used, but seldom talked about superpower of the specification. People who aren’t in the know hit the wall with what the specification can’t do, and they move on and create a new specification — where those in the know understand the specification has become the lingua franca of API operations over the last 16 years, and craft their own extensions for the specification to make it do what they need it to do. Read more ›

🗄️Vector Databases Nazar Boyko·

Vector Databases Compared: pgvector, Qdrant, Pinecone, Weaviate

There's a moment in almost every RAG project where someone asks the question that decides your next two years of ops work: "Do we actually need a vector database, or can Postgres just do this?" It's a better question than it sounds, because the honest answer isn't "use Pinecone" or "use Postgres." It's "it depends on numbers you probably haven't measured yet": how many vectors, how aggressively you filter, how much you care about the absolute ceiling of queries per second. Most teams pick bas... Read more ›

Discussed on DEV

🔭Observability medium.com

Day 177 — Scaling of Collector in OpenTelemetry

25th June 2026, Netherlands — OpenTelemetry has become one of the most important standards in modern observability. It provides a… Read more ›

✍️Prompt Engineering Nature·

clickBrick prompt engineering: optimizing large language model performance in clinical psychiatry

Prompt engineering has the potential to enhance large language models’ (LLM) ability to solve tasks through improved in-context learning. In clinical research, the use of LLMs has shown expert-level performance for a variety of tasks ranging from pathology slide classification to identifying suicidality. We introduce clickBrick, a modular prompt-engineering framework, and rigorously test its effectiveness. Here, we explore the effects of increasingly structuring prompts with the clickBrick fr... Read more ›

Covers 3 stories

🔄MLOps medium.com

RocoMart: Building an End-to-End MLOps Pipeline Orchestration for E-Commerce

Architect a robust MLOps pipeline from scratch using Python, Prefect, MLflow, and Flask to power real-time e-commerce tech. Read more ›

📚RAG medium.com

Beyond RAG: The Evolution of Knowledge Augmentation (CAG vs. RAG vs. CRAG)

The Knowledge Augmentation Spectrum: CAG vs RAG vs CRAG. For the past year, the industry has been obsessed with RAG (Retrieval-Augmented Generation). It was the “gold standard” for giving LLMs access to enterprise data. But as our production requirements shift toward lower latency, higher accuracy, and better reliability, we are seeing the emergence of new paradigms. If you are building AI applications today, you need to understand the architectural trade-offs between RAG, CAG (Cache-A... Read more ›

⚙️Backend Engineering GitHub·

Building a small self-hosted SQL database with live updates — too niche?

KalamDB — a lightweight, real-time, storage-efficient SQL database. Designed for per-user data isolation and scalable performance — ideal for the AI era. - kalamdb/KalamDB Read more ›

Discussed on r/selfhosted

💳Fintech PYMNTS·

Treasury Prime Taps Green Dot to Enable Cash Deposits to Digital Accounts

Treasury Prime now enables its FinTech partners to let their customers add cash to their digital accounts at more than 90,000 participating Green Dot Network retail locations. This offering is enabled by Treasury Prime’s new Prime Cash solution, which is powered by Green Dot’s embedded finance platform, Arc, and money processing network, the companies said […] Read more ›

Covers Show HN: A full on end to end Payment System

🤖AI Agents InfoWorld·

The missing layer in enterprise agentic AI

In the past year, the enterprise AI ecosystem has gained enormous capability and zero consensus. Developers now have a remarkable set of tools for building AI agents: OpenAI’s frameworks, Anthropic’s Claude tooling, LangChain, LangGraph, CrewAI, Microsoft AutoGen, and a growing list of alternatives. Each promises to coordinate reasoning loops, manage multi-step task execution, and connect agents to tools and APIs. For experimentation, the progress has been substantial. Teams can now assemble ... Read more ›

Covers Gartner Predicts over 40% of Agentic AI Projects Will Be Canceled by End of 2027

Covered by OODAloop