⚡ Hardware Acceleration - jhcha.oyo · Scour

The Hardware That Makes AI Possible

towardsdatascience.com·

Exploring the Classic Xilinx XC5202-6PQ100I FPGA

💾Computer Architecture

FlexNPU: Transparent NPU Virtualization for Dynamic LLM Prefill-Decode Co-location

⚡HFT Academic

I stopped using most of Rust’s advanced features for my ML library

🤖AI Code

github.com··r/rust

Founding Engineer - FPGA, RTL, & ASIC Architect at Zettascale

💾Computer Architecture

ycombinator.com··Hacker News

Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks

aarushgupta.io··Lobsters, Hacker News

Why my SIMD code was silently running as scalar, and what debugging it taught me about production environment assumptions

🎮Game Engines Blog

coloneltoad.substack.com··Substack

Latency-Aware, High-Throughput Homomorphic AES Evaluation with CKKS

🔐Cryptography

eprint.iacr.org·

The Edge LLM Offload Story

semiengineering.com·

1-bit and 1.58 bit LLM Benchmarking on Jetson Orin Nano Super | Bonsai LM

smolhub.com··r/LocalLLaMA

DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200

💬LLMs News

newsletter.semianalysis.com··Hacker News

Towards Autonomous Accelerator Design: FPGA Accelerator Generation with SECDA

💾Computer Architecture Academic

DiffusionGemma: 4x Faster Text Generation

💬LLMs News Blog

blog.google··Hacker News, r/LocalLLaMA, r/singularity

Niobium Opens Developer Partner Program for The Fog, the First IaaS Purpose-Built for Fully Homomorphic Encryption

🔐Cryptography

Why Compiler Engineers Rarely Use Strassen's Algorithm for Fast Matrix Multiplications

🧮Complexity Theory News Blog

leetarxiv.substack.com··Substack, r/programming

The copy_if Speedup That Wasn't About copy_if, Or AVX-512

hftuniversity.com··Substack

Unpacking AI: The Hardware Behind AI

🤖AI News

pathtostaff.com··Hacker News

Build settings in binary crates via cargo install and crates.io

nnethercote.github.io··r/rust

Arithmetic Packing on Wide Integer Datapaths in DSP Primitives of Modern FPGA Devices

💾Computer Architecture Academic

SWIFT: Shallow and SIMD-Aware CKKS Functional Bootstrapping for Low-Latency

⚡Speculative Decoding

eprint.iacr.org·
