Triton
Scoured 11 posts in 10.1 ms
No high-quality results found.
Less-relevant results
Show HN: Ext-Infer
🔄 ONNX · infer.displace.tech · 4 days ago · Hacker News
Integrate OpenShift AI and PG Airman MCP Server
🛠 Ml-eng · developers.redhat.com · 2 days ago
mirkolenz/llmhop: Tiny, stateless Go router that dispatches OpenAI-compatible requests to single-model vLLM and sglang backends with zero external dependencies
🛠 Ml-eng · Content type: Code · github.com · 6 days ago · Hacker News
1-bit and 1.58 bit LLM Benchmarking on Jetson Orin Nano Super | Bonsai LM
⏱️ Benchmarking · smolhub.com · 2 days ago · r/LocalLLaMA
Where to Host Your Open-Source Model (Under 10B Parameters)
🛠 Ml-eng · digitalocean.com · 6 days ago
The hidden bottleneck in LLM inference and the impact on MLPerf benchmarking
⚡ ONNX Runtime · edn.com · 6 days ago
xander-jp/audio-sentinel: Audio Sentinel: Ultra-Low-Power Sound Event Detection Node with RP2350 + NDP120
⚡ Flash Attention · Content type: Code · github.com · 3 days ago · r/embedded
How to Measure Time To First Token (TTFT) in AI Systems
⏱️ CUDA Events · qainsights.com · 4 days ago · Hacker News
Running LLM Inference on Kubernetes: What It Actually Takes
🛠 Ml-eng · Content type: Blog · fairwinds.com · 5 days ago
JinXSuper/gwenland: GwenLand — AI toolkit. Local-first, <50MB, zero Python.
💻 CLI Tools · Content type: Code · github.com · 3 days ago · DEV
Build a local voice agent with Red Hat OpenShift AI
🔥 PyTorch · developers.redhat.com · 3 days ago