Scour
🔲 ML Hardware: GPU, TPU, inference hardware, AI accelerators, CUDA
Scoured 12,456 posts in 15.1 ms
Seeing the ML Compiler Stack Live on AMD GPU
🔧 Compilers · compilersutra.com · 2d · DEV
LLM inference engine written from the ground up in C#/.NET
🧠 LLMs · dotllm.dev · 13h · Hacker News
Beledarian/wgpu-llm: A from-scratch LLM inference engine that uses wgpu (the cross-platform WebGPU implementation) to dispatch WGSL compute shaders for every math operation a Transformer needs. No CUDA. No Python. No massive framework dependencies. Just Rust, raw shaders, and your GPU.
🧠 LLMs · github.com · 3d · Hacker News
20260324_snn_vs_gpu_en
🤖 AI Research · dev.to · 20h · DEV
The GPU Moat Has a Side Door: AI Research Outside the Frontier Labs
🤖 AI Research · mangeshgupte.substack.com · 1d · Substack
Finetuned a 270M model on CPU only - full weights, no LoRA, no GPU
🧠 LLMs · promptinjection.net · 4d · r/LocalLLaMA
GAIA – Open-source framework for building AI agents that run on local hardware
⚡ Performance Engineering · news.ycombinator.com · 21h · Hacker News
The nextAI Solution to the NeurIPS 2023 LLM Efficiency Challenge
🧠 LLMs · arxiv.org · 2d
Self-Hosted AI on a 24GB GPU: OpenClaw + Ollama Setup Guide for Windows
🧠 LLMs · blog.zolty.systems · 12h
The Beginning of Scarcity in AI
🤖 AI Research · tomtunguz.com · 2d · Hacker News
Leveraging CPU memory for faster, cost-efficient TPU LLM training
🧠 LLMs · opensource.googleblog.com · 4d · Blogger
Nvidia slaps forehead: AI, that’s what quantum needs!
🤖 AI Research · theregister.com · 11h · Hacker News
Nvidia says AI cuts 10-month, eight-engineer GPU design task to overnight job — company is still 'a long way' from AI designing chips without human input
🤖 AI Research · tomshardware.com · 20h · r/singularity
Scaling PaddleOCR to Zero: A Multi-Cloud GPU Pipeline with KEDA
☁️ Cloud Computing · silverlining.cloud · 5d · DEV
How screwed am I on Windows 11 PC?
🥧 Raspberry Pi · techcommunity.microsoft.com · 23h
Superintelligence With a 26B Model? It Might Actually Be Possible
🤖 AI Research · agentbazaar.tech · 3d · DEV
RT by @awnihannun: DFlash speculative decoding on Apple Silicon
🔌 Embedded Systems · twitter.macworks.dev · 3d
Nvidia's moat is not what it used to be
⚡ Performance Engineering · news.ycombinator.com · 2d · Hacker News
Hot Experts in your VRAM! Dynamic expert cache in llama.cpp for 27% faster CPU+GPU token generation with Qwen3.5-122B-A10B compared to layer-based single-GPU p...
⚡ Performance Engineering · github.com · 2h · r/LocalLLaMA
Building a Voice-Controlled Local AI Agent on a 4GB GPU
🧠 LLMs · dev.to · 2d · DEV