rishabh's Feed
Scoured 52 posts in 27.4 ms
14 interests · 8 feeds · 0 likes
defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes
🚀 ML Inference · Content type: Code · github.com · 3 days ago · Hacker News
Near-Optimal Distributed 2-Ruling Sets on Graphs with Low Arboricity
🌐 Distributed Systems · Content type: Academic · arxiv.org · 2 days ago
Achieving Cloud-Grade SLOs for Local Mixture-of-Experts Inference through CPU-GPU Hybrid Design
📄 Systems Papers · Content type: Academic · arxiv.org · 3 days ago
nomp: A Framework for Building Domain Specific Compilers
🖥️ GPU Computing · Content type: Academic · arxiv.org · 1 day ago
Multiversion Concurrency Control for Multiversion B-Trees
🗄️ Databases · Content type: Academic · arxiv.org · 4 days ago
Real-Time Language Model Jamming: A Case Study for Live Music Accompaniment Generation
🚀 ML Inference · Content type: Academic · arxiv.org · 2 days ago
From Fork-Join to Asynchronous Tasks: Parallelizing Tiled Cholesky Decomposition with OpenMP and HPX
🛠️ Compilers · Content type: Academic · arxiv.org · 2 days ago
AgentCompile: An LLM-Guided Compiler for Direct CUDA Inference
🧠 Deep Learning · Content type: Academic · arxiv.org · 4 days ago
M*: A Modular, Extensible, Serving System for Multimodal Models
⚙️ ML Systems · Content type: Academic · arxiv.org · 1 day ago
FlashCP: Load-Balanced Communication-Efficient Context Parallelism for LLM Training
🗄️ Databases · Content type: Academic · arxiv.org · 4 days ago
APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing
🖥️ GPU Computing · Content type: Academic · arxiv.org · 4 days ago
Beyond Per-Token Pricing: A Concurrency-Aware Methodology for LLM Infrastructure Cost Estimation
🚀 ML Inference · Content type: Academic · arxiv.org · 2 days ago
Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels
🧠 Deep Learning · Content type: Academic · arxiv.org · 4 days ago
SNN-MLIR: An MLIR Dialect for Compiling Neuromorphic SNNs from NIR to Bare-Metal C
🛠️ Compilers · Content type: Academic · arxiv.org · 4 days ago
Defeat the Heap: Zero-Copy Data Movement in AXI4MLIR
🛠️ Compilers · Content type: Academic · arxiv.org · 3 days ago · Hacker News
Clairvoyant: Predictive SJF Scheduling to Mitigate Head-of-Line Blocking in Serial LLM Backends
🚀 ML Inference · Content type: Academic · arxiv.org · 5 days ago
Dynamic Software Updates using CRDTs
📄 Systems Papers · Content type: Academic · arxiv.org · 3 days ago
Toward Compiler World Models: Learning Latent Dynamics for Efficient Tensor Program Search
🧠 Deep Learning · Content type: Academic · arxiv.org · 4 days ago
Energy-Efficient On-Device RAG on a Mobile NPU: System Design and Benchmark on Snapdragon X Elite
🖥️ GPU Computing · Content type: Academic · arxiv.org · 2 days ago
ASTRA-sim 3.0: Next-Level Distributed Machine Learning Simulations via High-Fidelity GPU and Infrastructure Modeling
🚀 ML Inference · Content type: Academic · arxiv.org · 3 days ago