Model Quantization, ONNX Runtime, Embedded Inference, TinyML

QMC: Efficient SLM Edge Inference via Outlier-Aware Quantization and Emergent Memories Co-Design
arxiv.org·48m
💻Local LLMs
Edge AI: The future of AI inference is smarter local compute
infoworld.com·2d
🏗️AI Infrastructure
Everything Moe
ianbarber.blog·1d·
Discuss: Hacker News
🤖Transformers
2026-01-21 Daily Ai News
dev.to·6h·
Discuss: DEV
🏗️AI Infrastructure
IGAA: Intent-Driven General Agentic AI for Edge Services Scheduling using Generative Meta Learning
arxiv.org·1d
🏗️AI Infrastructure
AI Systems Performance Engineering
github.com·4h·
Discuss: Hacker News
🏗️AI Infrastructure
Deep learning as program synthesis
lesswrong.com·1d·
🤖AI Inference
Artificial Intelligence
radiofreemobile.com·22h
🤖Anthropic Claude
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
machinelearning.apple.com·1d
Incremental Computation
a transport layer for agentic apps
ably.com·8h·
Discuss: Hacker News
🌊Event Streaming
I replaced my ChatGPT subscription with a 12GB GPU and never looked back
xda-developers.com·9h
Hardware Acceleration
Qdrant - Vector Database
qdrant.tech·1d
🎯Vector Databases
MIT’s new ‘recursive’ framework lets LLMs process 10 million tokens without context rot
venturebeat.com·1d·
🏗️AI Infrastructure
YC Spring – Full-Stack AI Consulting Company
news.ycombinator.com·14h·
Discuss: Hacker News
🤖AI Coding Tools
Why I Moved My ML Model from Flask to AWS Lambda (A Student’s Guide to $0 Hosting)
dev.to·1h·
Discuss: DEV
☁️Serverless Rust
Learning from Models
rodney.bearblog.dev·1d
🤖Reinforcement Learning
Co-optimization Approaches For Reliable and Efficient AI Acceleration (Peking University et al.)
semiengineering.com·12h
Hardware Acceleration
Why AI Needs GPUs and TPUs: The Hardware Behind LLMs
blog.bytebytego.com·2d
Hardware Acceleration
Ensemble Listening Model (ELM): State-of-the Art Foundation Model Accuracy. A Fraction of the Cost.
ensemblelisteningmodel.com·1d·
Discuss: Hacker News
🎙️Whisper
Finally! Proof That Agentic AI Scales (For Creating Broken Software)
codemanship.wordpress.com·20h·
Discuss: Hacker News
🤖AI agents