rishabh's Feed


defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes

 🚀ML Inference  Content type: Code
github.com · Hacker News

Near-Optimal Distributed 2-Ruling Sets on Graphs with Low Arboricity

 🌐Distributed Systems  Content type: Academic
arxiv.org

Achieving Cloud-Grade SLOs for Local Mixture-of-Experts Inference through CPU-GPU Hybrid Design

 📄Systems Papers  Content type: Academic
arxiv.org

nomp: A Framework for Building Domain Specific Compilers

 🖥️GPU Computing  Content type: Academic
arxiv.org

Multiversion Concurrency Control for Multiversion B-Trees

 🗄️Databases  Content type: Academic
arxiv.org

Real-Time Language Model Jamming: A Case Study for Live Music Accompaniment Generation

 🚀ML Inference  Content type: Academic
arxiv.org

From Fork-Join to Asynchronous Tasks: Parallelizing Tiled Cholesky Decomposition with OpenMP and HPX

 🛠️Compilers  Content type: Academic
arxiv.org

AgentCompile: An LLM-Guided Compiler for Direct CUDA Inference

 🧠Deep Learning  Content type: Academic
arxiv.org

M*: A Modular, Extensible, Serving System for Multimodal Models

 ⚙️ML Systems  Content type: Academic
arxiv.org

FlashCP: Load-Balanced Communication-Efficient Context Parallelism for LLM Training

 🗄️Databases  Content type: Academic
arxiv.org

APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing

 🖥️GPU Computing  Content type: Academic
arxiv.org

Beyond Per-Token Pricing: A Concurrency-Aware Methodology for LLM Infrastructure Cost Estimation

 🚀ML Inference  Content type: Academic
arxiv.org

Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels

 🧠Deep Learning  Content type: Academic
arxiv.org

SNN-MLIR: An MLIR Dialect for Compiling Neuromorphic SNNs from NIR to Bare-Metal C

 🛠️Compilers  Content type: Academic
arxiv.org

Defeat the Heap: Zero-Copy Data Movement in AXI4MLIR

 🛠️Compilers  Content type: Academic
arxiv.org · Hacker News

Clairvoyant: Predictive SJF Scheduling to Mitigate Head-of-Line Blocking in Serial LLM Backends

 🚀ML Inference  Content type: Academic
arxiv.org

Dynamic Software Updates using CRDTs

 📄Systems Papers  Content type: Academic
arxiv.org

Toward Compiler World Models: Learning Latent Dynamics for Efficient Tensor Program Search

 🧠Deep Learning  Content type: Academic
arxiv.org

Energy-Efficient On-Device RAG on a Mobile NPU: System Design and Benchmark on Snapdragon X Elite

 🖥️GPU Computing  Content type: Academic
arxiv.org

ASTRA-sim 3.0: Next-Level Distributed Machine Learning Simulations via High-Fidelity GPU and Infrastructure Modeling

 🚀ML Inference  Content type: Academic
arxiv.org
