H-FA: A Hybrid Floating-Point and Logarithmic Approach to Hardware Accelerated FlashAttention
arxiv.org · 9h
⚡Flash Attention
(PR) ASUS IoT Launches APC-125U Ultra-Slim Panel PC Series
techpowerup.com · 3h
⏱️Benchmarking
flowengineR: A Modular and Extensible Framework for Fair and Reproducible Workflow Design in R
arxiv.org · 9h
🔄ONNX
A portable picokernel for async I/O
ryansepassi.com · 3d · Discuss: Hacker News
📊Profiling Tools
Predicting & Mitigating Data Corruption in Pure Storage Flash Arrays via Adaptive Bit Error Rate Modeling
dev.to · 3h · Discuss: DEV
⏱️Benchmarking
I was tired of 50ms+ shell latency, so I built a sub-millisecond prompt in Rust (prmt)
reddit.com · 19h · Discuss: r/rust
🦀PyO3
A more native experience for Cloud TPUs with Ray on GKE
cloud.google.com · 21h
🚀MLOps
AMD releases statement about new game support for older Radeon GPUs
tweaktown.com · 1d
🎮NVIDIA
Moving past speculation: How deterministic CPUs deliver predictable AI performance
venturebeat.com · 2d
🧠CPU Architecture
Project Banana
404wolf.com · 1d
Distributed Computing
A behind-the-scenes look at Broadcom’s design labs
techbrew.com · 17h
⏱️CUDA Events
Armada Launches Bridge to Power the Next Generation of AI Infrastructure
prnewswire.com · 1d
🔗NCCL
Utilizing Chiplet-Locality For Efficient Memory Mapping In MCM GPUs (ETRI, Sungkyunkwan Univ.)
semiengineering.com · 4d
📈Occupancy Optimization
On Async Mutexes
matklad.github.io · 14h · Discuss: Hacker News
Ruff
This feels like the early Internet moment for AI.
threadreaderapp.com · 5h
⚡ONNX Runtime
Playing Around with ARM Assembly
blog.nobaralabs.com · 10h · Discuss: Hacker News
📊Profiling Tools
Building Yantra: A Visual Workflow Automation Engine
patali.dev · 1d · Discuss: Hacker News
🤖Automation
Samsung and Nvidia join forces for AI megafactory with 50,000 GPUs
techspot.com · 19h
Nsight
Tetris: An SLA-aware Application Placement Strategy in the Edge-Cloud Continuum
arxiv.org · 9h
Distributed Computing
Implementing virtual list view with variable row heights
judi.systems · 1h · Discuss: r/programming
✂️CUTLASS