Attention Optimization, Memory Efficiency, Transformer Acceleration, IO-Aware

HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing
arxiv.org·1h
🧩Attention Kernels
Building Intelligent Retail Signage with BrightSign's NPU: A Deep Dive into Real-Time Gaze Detection
dev.to·12h·Discuss: DEV
🔍Nsight
Anthropic's Performance Take-Home: A 65x Optimization (For Dummies)
ikot.blog·16h·Discuss: Hacker News
🎛️CUDA Optimization
Poly-attention: a general scheme for higher-order self-attention
arxiv.org·1d
🧩Attention Kernels
Attention Optimization
aussieai.com·5d
👁️Attention Optimization
Physics 1 – Attention can’t exactly simulate uniform linear motion
kindxiaoming.github.io·17h
👁️Attention Optimization
New Google Agentic Vision Sharpens Gemini 3 Enabling it to Rethink Images, Then Act
geeky-gadgets.com·18h
🤖AI Coding Tools
MemAlign: Building Better LLM Judges From Human Feedback With Scalable Memory
databricks.com·1d
📊Gradient Accumulation
A New AI Architecture Without Prior Distributions: Stream-Based AI and Compositional Inference
dev.to·1d·Discuss: DEV
👁️Attention Optimization
Training Design for Text-to-Image Models: Lessons from Ablations
huggingface.co·19h
📊Gradient Accumulation
Inference Energy Consumption Diagnosed: LLM Tasks Show 25% Energy Differences
quantumzeitgeist.com·11h
🏎️TensorRT
Solving Real-World AI Bottlenecks
semiengineering.com·1d
🧠CPU Architecture
Why Your Browser Benchmark is Lying to You About AI Performance
speedpower.run·22h·Discuss: DEV
⏱️Benchmarking
Using Nsight Compute with large codebases - Part 2 : Profiling large code bases
blog.ncompass.tech·13h·Discuss: Hacker News
🔍Nsight
Edit Mind : AI-Powered Local Video Search & Analysis
producthunt.com·8h
👁️Attention Optimization
SeenThis and Lumen Research launch attention measurement model powered by SeenThis’ adaptive streaming
lumen-research.com·16h
👁️Attention Optimization
Future leakage in block-quantized attention
matx.com·1d·Discuss: Hacker News
📉Model Quantization
MichiAI: A 530M Full-Duplex Speech LLM with ~75ms Latency Using Flow Matching
ketsuilabs.io·14h·Discuss: Hacker News
📉Model Quantization
CutefishOS
cutefishos.com·5h
🦀Rust
Show HN: Stanislavski Protocol – Real-Time Emotional Logic Framework
news.ycombinator.com·17h·Discuss: Hacker News
ONNX Runtime