Compute Costs
Specific
GPU cost, training cost, inference cost, FLOP pricing, cloud spend
Scoured 233 posts in 5.3 ms
harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.
🗄️
KV Cache
Content type:
Code
github.com
·
5d
5 days ago
·
Hacker News
,
r/LLM
lightmetal: GPU LLM Inference From a Single Java 25 JAR
🖥️
Inference Engineering
Content type:
Blog
adambien.blog
·
2d
2 days ago
TileFuse: A Fused Mixed-Precision Kernel Library for Efficient Quantized LLM Inference on AMD NPUs
🖥️
Inference Engineering
Content type:
Academic
arxiv.org
·
9h
9 hours ago
Less-relevant results
A Complete Beginner's Guide to Local LLM Inference
🖥️
Inference Engineering
Content type:
Blog
khnsakhnm.medium.com
·
9h
9 hours ago
Introducing a new database category - the predictive database
💰
AI Economics
Content type:
Blog
aito.ai
·
2d
2 days ago
·
Hacker News
'The best solution is to murder him in his sleep': AI can learn violent tendencies from each other despite zero references to violence in training data
🤖
AI
Content type:
News
livescience.com
·
6d
6 days ago
A system programmer’s guide to LLM inference
🖥️
Inference Engineering
Content type:
Blog
blog.xiangpeng.systems
·
3d
3 days ago
·
Hacker News
Shadow AI Governance: How to Secure Employee AI Use in 2026
💰
AI Economics
Content type:
Blog
cswithsanjay.blogspot.com
·
1d
1 day ago
What to look for in an AI assistant
🤖
AI
proton.me
·
2d
2 days ago
Running LLM Inference on Kubernetes: What It Actually Takes
🖥️
Inference Engineering
Content type:
Blog
fairwinds.com
·
5d
5 days ago
I built a "pay as you go" dictation app because I'm tired of all the subscriptions everywhere. Am looking for beta testers for feedback :)
🔤
Tokenization
Content type:
Discussion
getvoxa.app
·
1h
1 hour ago
·
r/SideProject
Intro — Sehastrajit
🔤
Tokenization
Content type:
Blog
medium.com
·
2d
2 days ago
Show HN: Ext-Infer
🖥️
Inference Engineering
infer.displace.tech
·
4d
4 days ago
·
Hacker News
PagedAttention vs Traditional KV Cache: How vLLM Reinvented GPU Memory for LLM Inference
🗄️
KV Cache
Content type:
Blog
medium.com
·
2d
2 days ago
Huawei chips refine DeepSeek model in major leap for China’s AI self-reliance
🗄️
KV Cache
oodaloop.com
·
5d
5 days ago
ASUS ExpertBook Ultra Flagship Business Laptop Debuts In SEA Markets, Featuring Sub-1kg Chassis & Intel Core Ultra X7 Processor
💰
API Pricing
pokde.net
·
22h
22 hours ago
Intelligent inference scheduling with llm-d on Red Hat AI
🖥️
Inference Engineering
developers.redhat.com
·
13h
13 hours ago
Unlawful by design: Exposing the human rights costs of generative AI
💰
AI Economics
Content type:
PDF
amnesty.org
·
6d
6 days ago
Autonomous AI worm uses local models to exploit networks and repair its own code
🤖
AI
4sysops.com
·
1d
1 day ago
PoQ-Judge: A Multi-Architecture Evaluation Framework for Cost-Aware Proof-of-Quality in Decentralized LLM Inference
🖥️
Inference Engineering
Content type:
Academic
arxiv.org
·
9h
9 hours ago