LLM Inference
Quantization, Attention Mechanisms, Batch Processing, KV Caching
Scoured 26922 posts in 1.23 s

PackInfer: Compute- and I/O-Efficient Attention for Batched LLM Inference
arxiv.org · 2d
LLM Infrastructure

hirako2000/latent-energy: An Energy Based Model to solve nonograms via self supervised CNN
github.com · 17h · Discuss: Hacker News
Edge AI Optimization

Near-Oracle KV Selection via Pre-hoc Sparsity for Long-Context Inference
arxiv.org · 1d
Sparse Embeddings

Introspective Interpretability: a Definition, Motivation, and Open Problems
lesswrong.com · 1d
AI Interpretability

Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy
developer.nvidia.com · 1d
LLM Infrastructure

Parallel Track Transformers: Enabling Fast GPU Inference with Reduced Synchronization
machinelearning.apple.com · 1d
Batch Embeddings

Guney-olu/nanoslg: A from-scratch implementation of distributed LLM inference in simple readable Python
github.com · 1d · Discuss: Hacker News, r/LLM
LLM Infrastructure

Machine learning reveals hidden landscape of robust information storage
phys.org · 14h
Persistence Strategies

OSTEP Chapter 8
muratbuffalo.blogspot.com · 1h · Discuss: Blogger
MCP

The distinct generative models
breno.bearblog.dev · 2d
Embeddings

Import AI 444: LLM societies; Huawei makes kernels with AI; ChipBench
importai.substack.com · 1d · Discuss: Substack
LLM Benchmarking

Mechanistic Interpretability: Peeking Inside an LLM
towardsdatascience.com · 5d
LLM Infrastructure

How We Built Platybot: An AI-Powered Analytics Assistant
pulumi.com · 5h
ClickHouse

[AINews] Qwen Image 2 and Seedance 2
latent.space · 15m
LLM Infrastructure

Reasoning: A smarter way for AI to understand text and images
techxplore.com · 8h
AI Interpretability

Quantization-Aware Distillation
ternarysearch.blogspot.com · 3d · Discuss: Hacker News
BitNet

AI-augmented data quality engineering
infoworld.com · 1d
AI Interpretability

AI Disruption
ma.tt · 4h
AI Security

Large Language Models Live in Time
lesswrong.com · 1d
LLM Infrastructure

AFMTJ Model For In-Memory Computing (University of Arizona)
semiengineering.com · 12h
In-process Databases