Scour
🤔 inference
Scoured 161 posts in 6.1 ms

DiffusionGemma: 4x Faster Text Generation
🤖 AI · Content type: News, Blog
blog.google · 1 day ago · Hacker News, r/LocalLLaMA, r/singularity

google/gemma-4-31B-it · fix: chat template — null handling, reasoning preservation, turn-tag balance, input validation
🤖 AI
huggingface.co · 3 days ago · r/LocalLLaMA

Two Leaps to 1000 Tokens/s on a 1T-Parameter Model: On Inference Systems, Execution Boundaries, and Co-Design
🤖 AI · Content type: Blog
tilert.ai · 3 days ago · Hacker News

PoQ-Judge: A Multi-Architecture Evaluation Framework for Cost-Aware Proof-of-Quality in Decentralized LLM Inference
🤖 AI · Content type: Academic
arxiv.org · 18 hours ago

Redis vs Memorystore: key differences in 2026
🤖 AI · Content type: Blog
redis.io · 22 hours ago

Autonomous AI worm uses local models to exploit networks and repair its own code
🤖 AI
4sysops.com · 2 days ago

I Processed 2.4 Billion Tokens Across 52 AI Models for $0.52. Here's the Full Breakdown.
🤖 AI
saintlex.sbs · 18 hours ago · DEV

🇳🇱 Go/Golang job: Senior Backend Engineer (Go) | Studio AI at Creative Fabrica (Amsterdam, Netherlands)
🤖 AI
golangprojects.com · 1 day ago

Why I care so much about energy per token
🤖 AI · Content type: Blog
ziraph.com · 4 days ago · Hacker News

The Death of the Four Golden Signals: Designing Telemetry for Non-Deterministic Infrastructure
🤖 AI
devops.com · 6 days ago

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support
🤖 AI
phoronix.com · 1 day ago · r/artificial

[AINews] Open Models, Model Labs vs Agent Labs, and What's Untrainable — Sarah Guo
🤖 AI · Content type: News
latent.space · 18 hours ago

Rate Limits & Anti-Bots in Agentic Scraping
🤖 AI
alterlab.io · 11 hours ago · DEV

Intro — Sehastrajit
🤖 AI · Content type: Blog
medium.com · 3 days ago

[AINews] FrontierCode: Benchmarking for Code Quality over Slop
🤖 AI · Content type: News
latent.space · 2 days ago

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU
🤖 AI
local-llm.utop.workers.dev · 4 days ago · Hacker News

Nvidia DGX Spark GB10 – AI Models and Guide with vLLM and Autonomous Script
🤖 AI · Content type: Code
github.com · 6 days ago · Hacker News

What Arm-based innovations happened in May 2026?
🤖 AI · Content type: Blog
newsroom.arm.com · 6 days ago

The 1-Second Timeout Hack: Running Infinite Parallel Workloads Natively on Google Apps Script
🤖 AI · Content type: Blog
medium.com · 1 day ago

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation
🤖 AI · Content type: Academic
arxiv.org · 1 day ago