⚙️ Inference
model inference, serving, quantization, throughput, vLLM
Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell
🚀
Model Releases
Content type:
News
Content type:
Blog
developer.nvidia.com
·
2d
2 days ago
Nvidia DGX Spark GB10 – AI Models and Guide with vLLM and Autonomous Script
🧠
AI
Content type:
Code
github.com
·
5d
5 days ago
·
Hacker News
Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks
🤖
Machine Learning
aarushgupta.io
·
1d
1 day ago
·
Lobsters, Hacker News
For Robotaxis, Safety Must Be Built In, Not Bolted On
🧠
AI
Content type:
Blog
blogs.nvidia.com
·
2h
2 hours ago
Vadzo Imaging Introduces HDR MIPI CSI-2 Embedded Cameras Recommended for Drone and UAV Applications
🤖
Machine Learning
Content type:
News
einpresswire.com
·
14h
14 hours ago
google/gemma-4-31B-it · fix: chat template — null handling, reasoning preservation, turn-tag balance, input validation
💬
LLMs
huggingface.co
·
2d
2 days ago
·
r/LocalLLaMA
Show HN: Ext-Infer
🦀
Rust
infer.displace.tech
·
3d
3 days ago
·
Hacker News
🇳🇱 Go/Golang job: Senior Backend Engineer (Go) | Studio AI at Creative Fabrica (Amsterdam, Netherlands)
👨🏫
Karpathy
golangprojects.com
·
7h
7 hours ago
Why agentic AI needs an open inference stack
🕵️
AI Agents
redhat.com
·
2d
2 days ago
MLPerf and the rise of latency-aware LLM benchmarking
⚡
Transformers
edn.com
·
5d
5 days ago
TFLite Edge Model Quantizer Snippet
💬
LLMs
itsevilduck.gumroad.com
·
2d
2 days ago
·
DEV
AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support
🎨
Diffusion Models
phoronix.com
·
5h
5 hours ago
LC-QAT: Data-Efficient 2-Bit QAT for LLMs via Linear-Constrained Vector Quantization
💬
LLMs
Content type:
Academic
arxiv.org
·
17h
17 hours ago
The latest Gemma 4 models use a training trick to slash their on-device memory footprint
🧠
AI
androidauthority.com
·
5d
5 days ago
What's in the Box? A Field Guide to AI Models
🧠
AI
Content type:
Blog
iankduncan.com
·
1d
1 day ago
Google’s DiffusionGemma is 4x faster than its other Gemma models
🎨
Diffusion Models
thenewstack.io
·
4h
4 hours ago
MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 TPS
🎯
Fine-Tuning
Content type:
Blog
mimo.xiaomi.com
·
2d
2 days ago
·
Hacker News, r/LocalLLaMA
A field journal on Ray Data and Daft for multimodal data lake (14 minute read)
📊
AI Evals
Content type:
Blog
mehulbatra.medium.com
·
6d
6 days ago
Azure OpenAI Architecture: The Decisions That Actually Matter (Part 2)
💬
LLMs
techcommunity.microsoft.com
·
2d
2 days ago
Latest technical articles & videos.
🎯
Fine-Tuning
certdepot.net
·
4d
4 days ago