💬 NLP - hop1.ng.1357 · Scour

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

✨LLMs Code

github.com··Hacker News

A system programmer’s guide to LLM inference

🤖AI Blog

blog.xiangpeng.systems··Hacker News

Attention Expansion: Enhancing Keyphrase Extraction from Long Documents with Attention-Augmented Contextualized Embeddings

🔍Information Retrieval Academic

Mini Shai-Hulud, Miasma, and Hades Worms Target Bioinformatics and MCP Developers via Malicious PyPI Wheels

🔧Agent Tooling Blog

socket.dev··Hacker News

Large companies can add a local LLM filter layer to considerably reduce their AI costs

umrashrf.github.io··Hacker News

ashp15205/guardian-runtime: A zero-latency, local-first runtime firewall for LLMs. Intercept every prompt and response locally to stop data leaks and runaway token costs.

👨‍💻AI Coding Code

github.com··Hacker News

I built a sentiment analyzer for Hacker News (as an MCP server)

🔧Agent Tooling

mcpize.com··Hacker News

Agentic AI frameworks compared: LangChain, LangGraph, AutoGen

🔧Agent Tooling Blog

Optimality of FSQ Tokens for Continuous Diffusion for Categorical Data with Application to Text-to-Speech

ℹ️Information Theory Academic

Vibe Diaries: Training Nanochat

🧠Machine Learning

vibediary.dev··Hacker News

StereoTales: Multilingual Open-Ended Stereotype Discovery in LLMs

🎭Claude Blog

research.giskard.ai··Hacker News

defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes

🤖AI Code

github.com··Hacker News

Auditing Training Data in Domain-adapted LLMs: LoRA-MINT

✨LLMs Academic

What Are Tokens in LLMs?

🤖LLM Blog

bearisland.dev··Hacker News

How LLMs Actually Work: A Friendly Map for Humans • oreoro

🔧Agent Tooling

oreoro.github.io··Hacker News

Causal Semantic Alignment for LLM-based Time Series Forecasting

✨LLMs Academic

How to Measure Time To First Token (TTFT) in AI Systems

qainsights.com··Hacker News

Phase transition in large language models and the criticality of natural languages

🧠Machine Learning Academic

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

deemwar-products.github.io··Hacker News

EVA-Bench Data 2.0: 3 Domains, 121 Tools, 213 Scenarios

🧠Machine Learning Blog

huggingface.co·
