⚡ Hardware Acceleration - hello · Scour

🔬Deep Learning DEV Community·

TPUs vs GPUs: How Google's Tensor Processing Units Actually Work

Discussed on DEV

🔌FPGA Programming lil.law.harvard.edu·

An Open Hardware TPU on Your Desk

·

Korean Designer Juntae Kim Updates the Asics Gel-Kinetic With Signature Flower Motif

🔌FPGA Programming unsafeperform.io·

Retrocomputing with Clash – Haskell for FPGA Hardware Design

Discussed on Hacker News

🔧HPCToolkit GitHub·

Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch

Discussed on Hacker News

🎮SIMT Execution arxiv.org·

From Tokens to Regions: CUDA-Sensitive Instruction Tuning for GPU Kernel Generation

🌊Seastar Framework Jon Peddie Research·

Arm Ethos-N78 scales NPU IP to 10 TOPS

💬Prompt Engineering thefrontierlab.ai·

Six months on the Strix Halo chip AMD now markets as "first-class ROCm"

Discussed on Hacker News

🔬Deep Learning Semiconductor Engineering·

Google Details Five Generations Of TPU Training Supercomputers

🔢Intel AMX Forbes·

Asics Expands Tennis Footwear Research And Product Updates Coming

🔄Concurrency developers.googleblog.com·

Unlocking the Power of the TPU Stack: Introducing our new Developer Hub

🔀SIMD Programming indianspeedster.github.io·

Occupancy Math on the AMD MI355X: A From-First-Principles Guide

Discussed on Hacker News (three threads)

🔌FPGA Programming Embedded·

Efinix Launches Titanium Edge FPGA Family

📊Performance Tools DEV Community·

AMD ATOM + ATOMesh: Prefill/decode Disaggregation on ROCm

Discussed on DEV

🦙Ollama GitHub·

Running a 35B MoE model on a 2017 AMD RX 580 8GB via Vulkan (no ROCm/CUDA)

Discussed on Hacker News

📊Performance Tools arxiv.org·

Latency Prediction for LLM Inference on NPU Systems

🎮SIMT Execution GitHub·

Show HN: cuTile Rust: Safe, data-race-free GPU kernels in Rust

Covers 2 stories including AlterLang InterCode: A Native Intercomprehension Paradigm in Programming, Powered by GuruDev

Covered by indiehacker.news

Discussed on Hacker News and DEV

🔌FPGA Programming arxiv.org·

NeuronFabric: A Software Reference Architecture for On-Chip Transformer Training with Local Adam

🔬Deep Learning GitHub·

pytorch/executorch ciflow/cuda/20288

🔬Deep Learning arxiv.org·

Google's Training Supercomputers from TPU v2 to Ironwood: Architectural Stability, Scale, Resilience, Power Efficiency, and Sustainability Across Five Generatio...

Covered by Semiconductor Engineering

Discussed on Hacker News
