🖥 GPUs
GPU Pricing, Serverless GPU Hosting, Cloud AI Model Deployment
Taming GPU Underutilization via Static Partitioning and Fine-grained CPU Offloading
⚙️ Mechanical Sympathy · arxiv.org · 13h

AI Training Goes Off the Grid With Solar Homes and Spare GPUs
🌐 Distributed systems · spectrum.ieee.org · 3d · Hacker News

Running AI Workloads on Rack-Scale Supercomputers: From Hardware to Topology-Aware Scheduling
🏗️ LLM Infrastructure · developer.nvidia.com · 2d

A developer's guide to architecting reliable GPU infrastructure at scale
🏗️ LLM Infrastructure · cloud.google.com · 19h

Nvidia can't have every data center workload
🤖 AI · runtime.news · 18h

Nvidia Pascal GPUs debuted 10 years ago today, best known for the GTX 1060 and GTX 1080 Ti — architecture kicked off with the Tesla P100
🤖 AI · tomshardware.com · 5d

CUDA Programming for NVIDIA H100s
⚡ Hardware Acceleration · freecodecamp.org · 19h

Vultr says its Nvidia-powered AI infrastructure costs 50% to 90% less than hyperscalers
📊 Model Serving Economics · thenewstack.io · 6d

How Nvidia learned to embrace the light in its quest for scale
📊 Model Serving Economics · theregister.com · 5d · Hacker News

Fine-Grained Power and Energy Attribution on AMD GPU/APU-Based Exascale Nodes
⚡ Hardware Acceleration · arxiv.org · 2d

janit/viiwork: LLM inference load balancer optimized for AMD Radeon VII GPUs
🏗️ LLM Infrastructure · github.com · 4d · Hacker News

What is an AI Native Cloud?
🆕 New AI · together.ai · 3d

Integrate Physical AI Capabilities into Existing Apps with NVIDIA Omniverse Libraries
✨ Gemini · developer.nvidia.com · 2d

pmady/keda-gpu-scaler: KEDA External gRPC Scaler for GPU workloads — native NVML metrics via DaemonSet, no Prometheus required
🏗️ LLM Infrastructure · github.com · 3d

Foundry: Template-Based CUDA Graph Context Materialization for Fast LLM Serving Cold Start
🏗️ LLM Infrastructure · arxiv.org · 1d

Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC
🏗️ LLM Infrastructure · arxiv.org · 13h

ai-infos/vllm-gfx906-mobydick: A high-throughput and memory-efficient inference and serving engine for LLMs, optimized for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
🏗️ LLM Infrastructure · github.com · 4d · r/LocalLLaMA

GTaP: A GPU-Resident Fork-Join Task-Parallel Runtime with a Pragma-Based Interface
⚡ Hardware Acceleration · arxiv.org · 2d

Minos: Systematically Classifying Performance and Power Characteristics of GPU Workloads on HPC Clusters
⚡ Systems Performance · arxiv.org · 3d

Characterizing WebGPU Dispatch Overhead for LLM Inference Across Four GPU Vendors, Three Backends, and Three Browsers
🏗️ LLM Infrastructure · arxiv.org · 4d · Hacker News