LLM Inference
Quantization, Attention Mechanisms, Batch Processing, KV Caching
Scoured 26853 posts in 2.73 s
PackInfer: Compute- and I/O-Efficient Attention for Batched LLM Inference
arxiv.org · 2d
LLM Infrastructure
hirako2000/latent-energy: An Energy Based Model to solve nonograms via self supervised CNN
github.com · 22h · Discuss: Hacker News
Edge AI Optimization
Near-Oracle KV Selection via Pre-hoc Sparsity for Long-Context Inference
arxiv.org · 1d
Sparse Embeddings
Introspective Interpretability: a Definition, Motivation, and Open Problems
lesswrong.com · 1d
AI Interpretability
Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy
developer.nvidia.com · 1d
LLM Infrastructure
Parallel Track Transformers: Enabling Fast GPU Inference with Reduced Synchronization
machinelearning.apple.com · 1d
Batch Embeddings
First look: Run LLMs locally with LM Studio
infoworld.com · 1h
Ollama
Guney-olu/nanoslg: A from-scratch implementation of distributed LLM inference in simple readable Python
github.com · 1d · Discuss: Hacker News, r/LLM
LLM Infrastructure
Machine learning reveals hidden landscape of robust information storage
phys.org · 19h
Persistence Strategies
OSTEP Chapter 8
muratbuffalo.blogspot.com · 6h · Discuss: Blogger
MCP
The distinct generative models
breno.bearblog.dev · 2d
Embeddings
Import AI 444: LLM societies; Huawei makes kernels with AI; ChipBench
importai.substack.com · 1d · Discuss: Substack
LLM Benchmarking
Logic That Patterns Find
udara.io · 5h · Discuss: Hacker News
Programming languages
Open weight models are here: more choice, more speed, less cost
kiro.dev · 2h
New AI
A data-efficient foundation model for porous materials based on expert-guided supervised learning
nature.com · 2h
Sparse Embeddings
How We Built Platybot: An AI-Powered Analytics Assistant
pulumi.com · 10h
ClickHouse
[AINews] Qwen Image 2 and Seedance 2
latent.space · 5h
LLM Infrastructure
Reasoning: A smarter way for AI to understand text and images
techxplore.com · 13h
AI Interpretability
Large Language Models Live in Time
lesswrong.com · 1d
LLM Infrastructure
AFMTJ Model For In-Memory Computing (University of Arizona)
semiengineering.com · 17h
In-process Databases