LLM Inference


Unsloth Gemma 4 QAT

 🔬Deep Learning
unsloth.ai

Token4Token — pay-per-token inference on Gnosis + Swarm

 🧠LLMs
t4t.eth.link · Hacker News

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

 🤖Data science

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

 🧠LLMs

An LLM that reviews your code, challenges your decisions, but never writes code for you

 🤖Data science  Content type: Blog
blog.adafruit.com

fix(ollama): use provider thinking default in SDK session factory (#9… · openclaw/openclaw@4f3c2cd

 🎯Fine-tuning  Content type: Code
github.com

TileFuse: A Fused Mixed-Precision Kernel Library for Efficient Quantized LLM Inference on AMD NPUs

 🔬Deep Learning  Content type: Academic
arxiv.org

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

 🪟Context Windows

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support

 CUDA

MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better

 🤖Data science  Content type: News, Blog

Google's new open-weights model brings image-generation tricks to AI text generation

 🤖Data science  Content type: News
theregister.com

From GPU to Token: The 8-Layer Observability Stack for AI Infrastructure

 🔬Deep Learning  Content type: Blog
jimmysong.io

Fixing a stuck Ollama runner and building a GPU watchdog

 🏠Self-hosting

DiffusionGemma 26B A4B results on my 5090

 🧠LLMs

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?

 🧠LLMs

Self-hosted remote access for Ollama without complicated setup

 🏠Self-hosting

NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI

 CUDA  Content type: Blog
blogs.nvidia.com

VIA-SD: Verification via Intra-Model Routing for Speculative Decoding

 🧠LLMs  Content type: Academic
arxiv.org

Tales of an Ollama Honeypot (Part 3): More Traffic, More Findings

 📡RSS
posts.inthecyber.com

Xiaomi MiMo-V2.5-Pro Just Hit 1,000 Tokens Per Second!

 🤖Data science
gizchina.com
