Evaluating LLMs with LangSmith: A Comprehensive Guide
analyticsvidhya.com·3d
🔍Refinement Types
Understanding Tokenization in Large Language Models
pub.towardsai.net·10h
Tokenizer Benchmarks
Best tool for measuring lots of source code
shape-of-code.com·1d
🏺Code Archeology
Reflections on Trusting Trust (1984)
web.archive.org·1d·
Discuss: Hacker News
🏷️Memory Tagging
Using eBPF to attribute packet drops to netfilter rules
developers.redhat.com·1d
🪤Trap Handlers
How Perplexity Built an AI Google
blog.bytebytego.com·18h
🔄Subinterpreters
When Five Dumb AIs Beat One Smart AI: The Case for Multi-Agent Systems
ksramalakshmi.medium.com·2d·
Discuss: r/LocalLLaMA
🔢Algebraic Datatypes
Quietly intelligent app features with OpenAI Agent Builder
ashryan.io·3h
🎭Program Synthesis
Testing Unnatural Prompt Engineering Across Five Large Language Models
blog.codeminer42.com·3d
🔍ML Language
Asking Our Documents the Right Questions — Locally
manas.tech·1d
🎮Language Ergonomics
This is a description of a test for markov chain program in a book I'm reading...
reddit.com·2d
🎲Property Testing
AI Models Write Code with Security Flaws 18–50% of the Time, New Study Finds
medium.com·16h·
Discuss: Hacker News
🎭Program Synthesis
Text-guided Fine-Grained Video Anomaly Detection
arxiv.org·5h
🌱Minimal ML
ZoFia: Zero-Shot Fake News Detection with Entity-Guided Retrieval and Multi-LLM Interaction
arxiv.org·5h
⚖️Weighted Automata
Optimizing Native Sparse Attention with Latent Attention and Local Global Alternating Strategies
arxiv.org·5h
🪜Recursive Descent
REMI: PostgreSQL as Agentic Core in Tiger Cloud (Agentic Postgres Challenge by Auth0)
dev.to·1d·
Discuss: DEV
📋Tablegen
Towards Automated Petrography
arxiv.org·5h
📈Earley Parsing