AI Engineering


Two Leaps to 1000 Tokens/s on a 1T-Parameter Model: On Inference Systems, Execution Boundaries, and Co-Design

 ⚙️Hardware Architecture  Content type: Blog

NVIDIA RTX Pro 6000 Blackwell: 96GB GDDR7 and the End of VRAM Anxiety

 🎮GPU Programming  Content type: Blog
fitservers.com·

Defense Against Prompt Inversion Attacks: An Information-Theoretic Approach for LLM Collaborative Inference

 🧠LLM Research  Content type: Academic
arxiv.org·

Why are cached input tokens cheaper with AI services?

 🎙️Speech AI
xeiaso.net·

Azure OpenAI Architecture: The Decisions That Actually Matter (Part 2)

 🌐Distributed Systems

🇳🇱 Go/Golang job: Senior Backend Engineer (Go) | Studio AI at Creative Fabrica (Amsterdam, Netherlands)

 🔧Backend Dev
golangprojects.com·

146th airhacks tv: Rust, Java 25, AI Agents, BCE, Web Components, zunit, zb

 🔧Backend Dev  Content type: Blog
adambien.blog·

Valkey: Unlocked Seattle: The Best Systems Let You Sleep At Night

 🔧Backend Dev  Content type: Blog
valkey.io·

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support

 🎮GPU Programming

Issue #390 - The ML Engineer 🤖

 🔧Backend Dev  Content type: News  Content type: Blog

Agentic AI Architecture: How CockroachDB Supports Memory, Context, and Control

 🌐Distributed Systems  Content type: Blog
cockroachlabs.com·

The Bill Arrives: How to Manage Agentic AI Costs at Scale

 🧠LLM Research  Content type: Blog
cockroachlabs.com·

Ask HN: Is software engineering still a good career choice for new students?

 🔧Backend Dev  Content type: Discussion

4× RTX Pro 6000 Blackwell on Water, and the One Card That Wouldn't Behave

 🎮GPU Programming  Content type: Blog

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

 🔮Multimodal AI

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?

 🎮GPU Programming

Predicting the World Cup Winner: Live Coding with Hopswor...

 ⚙️Systems Programming
hopsworks.ai · Hacker News

Intro — Sehastrajit

 🧠LLM Research  Content type: Blog
medium.com·

MiniPIC: Flexible Position-Independent Caching in <100LOC

 🗄️Database Internals  Content type: Academic
arxiv.org·

vicharak-in/Gati: Gati Accelerates Your CNN Algorithms!

 ⚙️Hardware Architecture  Content type: Code
github.com · Hacker News
