LLMs


defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes

鈱笍CLI ToolsContent type: Code
github.comHacker News

Show HN: LLM memory without context bleed; 100% precision vs. <10% vector search

Prompt Engineering
Less-relevant results

[AINews] FrontierCode: Benchmarking for Code Quality over Slop

Prompt Engineering · Content type: News
latent.space

The Order Matters: Sequential Fine-Tuning of LLaMA for Coherent Automated Essay Scoring

馃LLMContent type: Academic
arxiv.org

Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech

馃LLMContent type: Blog
huggingface.co

See, Act, Correct: three levers for working with a code agent

Reinforcement Learning · Content type: Blog

Apples to Apples: MLX vs. Llama.cpp for Gemma 4 12B on an M1 16GB

鈱笍CLI ToolsContent type: Blog
ziraph.comHacker News

Tokenminning: Because Tokenmaxxing Is a Bad Idea

Prompt Engineering

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

鈱笍CLI Tools

Testing MiniMax M3 on real tasks: repo refactor, screenshot debugging, and Spotify recommendations

馃RustContent type: Blog
andlukyane.comHacker News

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

馃AIContent type: Academic
arxiv.org

Stack Overflow didn't just help AI learn to code

馃AI

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

馃AIContent type: NewsContent type: Blog
blog.googleHacker News

SafeRun: Enabling Determinism in LLM Planning for Running

AI Reasoning · Content type: Academic
arxiv.org

[AINews] not much happened today

馃AIContent type: News
latent.space

KaiFelixBennett/gemma4-turboquant-rdna4: Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.

Retro Computing · Content type: Code
github.com · Hacker News

Analyzing the Correlation Between Hallucinations and Knowledge Conflicts in Large Language Models

馃LLMContent type: Academic
arxiv.org

Florian Brand, Prime Intellect research engineer, adopts Gemma 4 E4B 6-bit quantized as his primary local Mac LLM

馃AIContent type: News
digg.comHacker News

A retrieval conditioned rebinding circuit for dynamic entity tracking in large language models

馃TransformersContent type: Academic
arxiv.org
