🎮 GPU Architecture
SIMT, Warp, Memory Hierarchy, Compute Units
Scoured 137 posts in 8.2 ms
SK Hynix bets HBM, wins Nvidia jackpot
🟩 CUDA
jonpeddie.com · 1 day ago
Unreleased RTX 3050 Ti engineering sample appears in photos and benchmarks — the RTX 3060 alternative that never happened
💡 FlashAttention
Content type: News
tomshardware.com · 5 days ago
Big Blue’s Redbook on Storage Scale KV Cache management
💡 FlashAttention
Content type: News
blocksandfiles.com · 2 days ago
[News] HBF Spurs Equipment Race; Hanmi Semiconductor Eyes First TC Bonder Deliveries in 2H26
💡 FlashAttention
Content type: News
trendforce.com · 6 days ago · r/hardware
Profiling in PyTorch (Part 2): From nn.Linear to a Fused MLP
💻 OS
Content type: Blog
huggingface.co · 19 hours ago · Hacker News
APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing
💻 OS
Content type: Academic
arxiv.org · 2 days ago
2026 budget phones to bring back the waterdrop notch, and that’s not the only downgrade
🐧 Kernel Dev
gizmochina.com · 9 hours ago
Unreleased RTX 3050 Ti graphics card spotted in the wild, GA106 GPU with 6GB VRAM
💡 FlashAttention
Content type: News
tweaktown.com · 4 days ago
The Inference Alpha: Maximizing Frontier Models on AMD
💻 OS
Content type: Blog
digitalocean.com · 1 day ago
AMD Board Partners Allegedly Tip When RDNA 5 GPUs Might Arrive
🟩 CUDA
Content type: News
hothardware.com · 4 days ago
Vortex 3.0 Released As Full-Stack, Open-Source RISC-V GPU Now With 3D Pipeline
🟩 CUDA
phoronix.com · 2 days ago
'The thing that gives me hope is there is an enormous amount of capacity being built' - AMD's head of Ryzen and Radeon is pinning hopes of an end to the memory ...
💡 FlashAttention
Content type: News
pcgamer.com · 2 days ago
Nvidia selects top three vendors for critical AI memory
💡 FlashAttention
techzine.eu · 6 days ago
Google's latest DiffusionGemma open AI model comes with a 4x speed boost
🟩 CUDA
Content type: News
arstechnica.com · 23 hours ago
Chip Industry Week In Review
🐧 Kernel Dev
semiengineering.com · 6 days ago
Holding the FP8 Quality Ceiling at 8-Bit Weights and Activations: INT8 and GGUF Post-Training Quantization of Ideogram 4.0 for Consumer GPUs
💻 OS
Content type: Academic
arxiv.org · 15 hours ago
bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss
💻 OS
Content type: Code
github.com · 2 days ago · r/LocalLLaMA
Google's new open model DiffusionGemma generates text from noise instead of word by word
⚙ MLSys
the-decoder.com · 23 hours ago
Two old GPUs I salvaged are doing more AI work than a brand new $2000 card, and I won't be upgrading anytime soon
💻 OS
xda-developers.com · 3 hours ago
Industry coalition urges Trump administration to take urgent action as AI data centers' extreme memory consumption threatens other industries — AI-driven memory chip shortage could raise prices in automotive, medical, telecommunications sectors
💡 FlashAttention
Content type: News
tomshardware.com · 6 days ago