ARMS Continuous Profiling Upgrade for Efficient and Accurate Performance Bottleneck Localization
dev.to · 19h
Discuss: DEV
👁️ Observability
Which Chip Is Best?
blog.confident.security · 8h
Discuss: Hacker News
📊 Columnar Engines
H-FA: A Hybrid Floating-Point and Logarithmic Approach to Hardware Accelerated FlashAttention
arxiv.org · 2d
🏗️ Hardware Architecture
Boosting React Performance: A Guide to Optimization
dev.to · 1d
Discuss: DEV
🔬 Code Analysis
My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io · 4d
Discuss: Hacker News
📊 Columnar Engines
Perfetto: Swiss Army Knife for Linux Client Tracing
lalitm.com · 6d
📋 Tokei
Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
daft.ai · 2d
Discuss: Hacker News
📊 Columnar Engines
Why is AI Generated Rust slow when compared with Go/C#/Node/JavaScript
srid68.github.io · 2d
Discuss: Hacker News
📋 Tokei
eBPF Tutorial by Example: Monitoring GPU Driver Activity with Kernel Tracepoints
dev.to · 2d
Discuss: DEV
🔍 Memory Profilers
Co-Optimizing GPU Architecture And SW To Enhance Edge Inference Performance (NVIDIA)
semiengineering.com · 1d
📊 Columnar Engines
The Production Generative AI Stack: Architecture and Components
thenewstack.io · 11h
📊 Columnar Engines
Dynamic Resource Allocation in CXL-Enabled Heterogeneous Compute Clusters
dev.to · 4d
Discuss: DEV
👁️ Observability
Strix Halo's Memory Subsystem: Tackling iGPU Challenges
chipsandcheese.com · 6d
Discuss: Hacker News
💾 Cache Optimization
Accelerating AI inferencing with external KV Cache on Managed Lustre
cloud.google.com · 6d
📊 Columnar Engines
10 Smart Performance Hacks For Faster Python Code
blog.jetbrains.com · 1d
🔢 NumPy
Show HN: a Rust ray tracer that runs on any GPU – even in the browser
github.com · 3d
Discuss: Hacker News
🦀 Rust Scientific
Cycle-accurate 6502 emulator as coroutine in Rust
github.com · 5d
🦀 Rust Scientific
Why Multimodal AI Broke the Data Pipeline - And How Daft Is Beating Ray and Spark to Fix It
hackernoon.com · 3d
📊 Columnar Engines
Free Functions Don't Change Performance (Much)
16bpp.net · 3d
Discuss: Hacker News, r/cpp
🔬 Code Analysis