🤖 LLM Inference
Model Serving, Quantization, vLLM, ONNX Runtime
Scoured 173832 posts in 12.4 ms
Accuracy vs. Speed in Local LLMs: Finding Your Sweet Spot
grigio.org · 9h · Discuss: Hacker News
👁️ Multimodal LLMs
Optimizing LLM Inference: Sparse Activation, MoE, and Gated-MLP Efficiency
hackernoon.com · 1d
👁️ Multimodal LLMs
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference
arxiv.org · 2d · Discuss: Hacker News
👁️ Multimodal LLMs
On language models and intuition
aleksei.dev · 32m
👁️ Multimodal LLMs
Min-p sampling for LLMs
thoughtworks.com · 21h
👁️ Multimodal LLMs
Scaling ML Inference on Databricks: Liquid or Partitioned? Salted or Not?
towardsdatascience.com · 6h
⚡ FastAPI
ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization
arxiv.org · 1d
👁️ Multimodal LLMs
QORA - Native Rust LLM Inference Engine
huggingface.co · 1h · Discuss: DEV
⚡ FastAPI
Large model inference container – latest capabilities and performance enhancements
aws.amazon.com · 2d
⚡ FastAPI
Probabilistic Graph Neural Inference for bio-inspired soft robotics maintenance with ethical auditability baked in
dev.to · 9h · Discuss: DEV
👁️ Multimodal LLMs
Some notes on unreliability of LLM APIs
andrewpwheeler.com · 1d · Discuss: Hacker News
⚡ FastAPI
Unsloth Dynamic 2.0 GGUFs
unsloth.ai · 10h · Discuss: Hacker News, r/LocalLLaMA
👁️ Multimodal LLMs
brendanhogan/base-model-agents
github.com · 10h
🔍 Retrieval-Augmented Generation
How I'm using Local Large Language Models
jvt.me · 5h · Discuss: Hacker News
👁️ Multimodal LLMs
Qwen3.5-35B-A3B-GGUF from Unsloth
huggingface.co · 1h · Discuss: Hacker News
👁️ Multimodal LLMs
dReLU Sparsification: Recovering LLM Performance with 150B Token Pretraining
hackernoon.com · 18h
🔍 Retrieval-Augmented Generation
🚀 Stop Guessing Which LLM Runs on Your Machine
dev.to · 8h · Discuss: DEV
👁️ Multimodal LLMs
Asura: Looped Language Models done better
neel04.github.io · 2d · Discuss: Hacker News
👁️ Multimodal LLMs
The 4 LLM Evaluation Frameworks: How to Benchmark AI Like Google and OpenAI Do
pub.towardsai.net · 1d
👁️ Multimodal LLMs
Reinforcement Learning for LLMs
mesuvash.github.io · 2d · Discuss: Hacker News
👁️ Multimodal LLMs