ggalmeida's Feed

RAG Observability with Langfuse, vLLM, and FAISS

Table of contents: Introduction to Production-Grade RAG and LLM Observability; RAG Observability Architecture with Langfuse, vLLM, and FAISS; Project Setup; Building a Langfuse-Traced Retriever with FAISS; Building a Traced LLM Wrapper for vLLM… Read more ›

📝Plain Text editxr.org

A terminal Markdown editor that links like Obsidian

Obsidian-style links in a terminal Markdown editor: [[wikilinks]], quick-open, and following links between notes without leaving your shell. Read more ›

Discussed on Hacker News

🐧Programming, DevOps and Open Source Software medium.com

I Learned Data Structures Better From Video Games Than From Any Textbook (Here’s What 47 Hours of…

Hi everyone, I am Trends 24/7 and in this blog I want to talk about something that took me an embarrassingly long time to figure out: data… Read more ›

🔧Prompt Optimization arxiv.org

FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines

Multi-step LLM pipelines fail through interactions among retrieval, reasoning, and formatting steps, so prompt-only optimization can miss bottlenecks in the chain. We present FAPO (Fully Autonomous Prompt Optimization), a framework that lets Claude Code optimize an LLM pipeline inside a standardized codebase. FAPO evaluates a pipeline, inspects intermediate steps, diagnoses failures, proposes scoped changes, and validates variants repeatedly t... Read more ›

💾Data Science and Databases biorxiv.org

Seed variation impacts clustering stability in Single-Cell RNA-Seq and can be mitigated by StAbility-BasEd-Reassignment (SABER)

Single-cell RNA-seq clustering is commonly treated as reproducible once a random seed is fixed, yet the choice of seed itself may alter cell assignments and downstream interpretation. We systematically quantified seed-induced clustering variability by running Louvain and Leiden clustering across 100 seeds in Seurat and Scanpy on 28 single-cell RNA-seq datasets from the Human Cell Atlas and IMMUcan. Using Element-Centric Consistency, we found that seed choice affected a substantial fraction of... Read more ›

🔗Entity Resolution moderndata101.substack.com

The Identity Crisis: Why Entity Resolution Is the Missing Foundation of Every Data Product Stack (10 minute read)

Identity resolution and warehouse-native MDM are core infrastructure for trusted data products, AI, and compliance. At enterprise scale, local checks fall short, creating duplicate customers, phantom entities, and model-poisoning risk. The pattern combines blocking, rule-based and ML matching, graph clustering, and human review inside the warehouse. Read more ›

Discussed on Substack

🔄Dataset Augmentation medium.com

AI Model Fine-Tuning Data Guide: Quality, Formats & Flywheel.

The fine-tuning data guide engineers need: how much data, 4 sources, 3 formats, model collapse risk from synthetic data, and the data… Read more ›

🎯Retrieval Systems medium.com

From Toy RAG to Production RAG: Hybrid Search, Reranking and Observability

Most Retrieval-Augmented Generation (RAG) tutorials stop too early. Read more ›

💾Data Science and Databases GitHub

Show HN: FERNme – agent memory that updates with ~zero LLM calls

A lightweight memory engine for AI agents using fuzzy graphs, Hebbian updates, and optional LLM gating. - mirkofr/FERNme Read more ›

Discussed on Hacker News

📝Plain Text Sébastien Dubois

TypedMark

TypedMark is an open specification for typed Markdown note systems. It adds explicit structure (schemas, field definitions, property sets, note-type inheritance, and validation) while keeping notes as plain Markdown files with YAML frontmatter. Authored by Sébastien Dubois under the MIT license (202 Read more ›

Covers Obsidian

🐧Programming, DevOps and Open Source Software medium.com

Understanding Data Structures by Building a Contact Book in Python

Data structures sound scary. They are not. Let this simple project show you exactly what they are and why they matter. Read more ›

🔧Prompt Optimization leanpub.com

Free eBook: Building Pragmatic AI Agents That Use Tools and APIs

Practical guide to building AI agents that use tools and APIs with DSPy, Pydantic AI, Claude, OpenAI, and Google ADK focused on real-world production systems. Read more ›

Discussed on Hacker News

🔄Dataset Augmentation arxiv.org

Phantoms and Disclosures: a Causal Framework for Auditing Synthetic Data

The rapid adoption of generative AI and Large Language Models (LLMs) has spurred interest in synthetic data as a privacy-preserving alternative to sensitive real-world datasets. However, generating high-utility synthetic data often carries the risk of memorizing and regurgitating private information from the training corpus. In this work, we present a customizable empirical auditing framework designed to detect and explain such data disclosures.... Read more ›

🔗Entity Resolution GitHub

chore(deadcode): share levenshtein distance helper

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞 - chore(deadcode): share levenshtein distance helper · openclaw/openclaw@b574da5 Read more ›

🎯Retrieval Systems medium.com

RAG (Retrieval-augmented generation)

What is retrieval-augmented generation? Read more ›

💾Data Science and Databases kaggle.com

LoRA: I Trained <1% of a 1.5B Model and Matched a Full Fine-Tune

Series: Fine-Tuning, Smallest to Largest. This part: LoRA (1.5B). Earlier in the series, I fully fine-tuned a 270M model, updating every weight. That's fine for a tiny model. It gets painful as models grow, because full fine-tuning needs gradients and optimizer state for every parameter (~4× the model size in memory). So what do you do when the model is too big to comfortably fine-tune in full? The idea behind LoRA (Low-Rank Adaptation) rests on one observation: the change fine-tuning makes to a weight... Read more ›

Discussed on DEV

🐧Programming, DevOps and Open Source Software iwtlp.com

Build your own Redis from scratch, and talk to it with the real redis-cli

Redis has a reputation for being a serious piece of infrastructure, and it is. But the core of it, the part that makes it Redis, is astonishingly small. Small enough that you can rebuild it in about 80 lines of Python, point the real redis-cli at your version, and have it just work. Same commands, same wire protocol, same behavior. That is the fun of it. By the end you will run redis-cli -p 6399 set foo bar, and the OK that comes back is from a server you wrote. This is the written companion ... Read more ›

Covered by DEV Community

Discussed on DEV

🔧Prompt Optimization perceptiontheory.bearblog.dev

Improving a data pipeline with DSPy

As a developer, I have plenty of experience building full-stack apps, backend services, cloud infrastructure, and increasingly over the past three years, AI ... Read more ›

📝Plain Text MacStories

The Bear Team Releases Public Beta of Lettera, a New Mac Markdown Editor

Today, the team behind note-taking app Bear announced the public beta of Lettera, a new Mac text editor, based on Panda, an earlier beta that was used to work out the Bear 2.0 text editing engine. That immediately caught my eye because I’ve been using Panda for months. In fact, it’s the default way I […] Read more ›

Covers 3 stories including Obsidian

🔄Dataset Augmentation arxiv.org

Improving Code-Switching ASR with Code-Mixing Guided Synthetic Speech

Code-switch (CS) Automatic Speech Recognition (ASR) remains challenging due to limited availability of high quality CS text-speech pairs for training. Although synthetic data augmentation via Text-to-speech (TTS) has been explored, existing CS TTS approaches primarily optimise reconstruction fidelity and do not explicitly enforce language-boundary consistency, thereby limiting their effectiveness for CS ASR augmentation. This paper proposes a ... Read more ›