MLOps


Google Shrank Gemma 4 by 72% and Unsloth Fixed the 4-Bit Bug Nobody Else Caught on One 4090, and 4-Bit Shouldn’t Be This Good

LLMs · Content type: Blog
towardsai.net

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

LLMs

Introducing Granite Libraries and Project Granite Switch

LLMs · Content type: Blog
research.ibm.com · Hacker News

OpenPCC: Open and Confidential LLM Serving on Commodity TEEs

LLMs · Content type: Academic
arxiv.org

mirkolenz/llmhop: Tiny, stateless Go router that dispatches OpenAI-compatible requests to single-model vLLM and sglang backends with zero external dependencies

LLMs · Content type: Code
github.com · Hacker News

Florian Brand, Prime Intellect research engineer, adopts Gemma 4 E4B 6-bit quantized as his primary local Mac LLM

LLMs · Content type: News
digg.com · Hacker News

Google's new open model DiffusionGemma generates text from noise instead of word by word

LLMs
the-decoder.com

Where to Host Your Open-Source Model (Under 10B Parameters)

LLMs
digitalocean.com

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

LLMs · Content type: News, Blog
blog.google · Hacker News

Youssof Altoukhi (@Youssofal_)

LLMs
xcancel.com · r/LocalLLaMA

Breaking free of a single datacenter: Practical geo-distributed AI operations with the k0smos platforms

LLMs · Content type: Blog
cncf.io

MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better

LLMs · Content type: News, Blog

heterodoxin/graphkv: Graph-guided KV cache compression for memory-efficient LLM inference.

LLMs · Content type: Code
github.com · r/LocalLLaMA

#068 - Apple runs Siri on Google's Gemini, OpenAI files a secret IPO at $852B, Xiaomi clocks 1,000 tps

🌱 Startups
indiehacker.news

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

LLMs · Content type: Academic
arxiv.org

[eCHO News] Episode #104: mTLS for Cilium. Lisp for eBPF

LLMs

Build a local voice agent with Red Hat OpenShift AI

LLMs
developers.redhat.com

Using local LLMs for agentic coding

LLMs · Content type: Blog
blog.alexewerlof.com

Build a Medical Report Analyzer on Dedicated Inference with Python

LLMs
digitalocean.com

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

LLMs · Content type: Code
github.com · Hacker News