🏎️ TensorRT - miterion · Scour

michelangeloromerochisco/ternative: Inference engine for ternary-weight LLMs with runtime LoRA - the llama.cpp of BitNet models 🔄ONNX

github.com·1d·Hacker News

Running DramaBox on Strix Halo ⚡Flash Attention

sleepingrobots.com·6d

FSR 4.1 just made AMD's last-gen flagship a smarter buy than its current one 🔍Nsight

xda-developers.com·2d

KV Cache Optimization: 3x Faster LLM Inference on 24GB VRAM 🎛️CUDA Optimization

tildalice.io·6d

Runtime-Certified Bounded-Error Quantized Attention 👁️Attention Optimization

When Arm Meets RISC-V: SiPearl, Semidynamics to Co-Develop Sovereign AI Platform 🌊CUDA Streams

eetimes.com·1d

I got Qwen3-VL-Embedding-2B working with rkllm on an Orange Pi 5b 📉Model Quantization

huggingface.co·21h·r/LocalLLaMA

Ollama Cheat Sheet: Local LLMs, Models, API & Integration (2026) 💡LSP

meshworld.in·2d·DEV

Flipper One – A Rockchip RK3576-powered portable Arm Linux computer and networking multi-tool 🔧PTX

cnx-software.com·5h

Quantization From First Principles: Build Your Own INT8 Inference Engine 📉Model Quantization

·5d

Inside SambaNova's Inference Architecture ⚡ONNX Runtime

viksnewsletter.com·1d

FSRS core fields 🧠BF16

blog.est.im·4d

Less-relevant results

NASA builds space-grade AI compute 🧠CPU Architecture

jonpeddie.com·19h

Flipper One is a pocket-sized Linux computer and network hacking tool 📊Profiling Tools

liliputing.com·5h

Intel's Crescent Island PCB Leaks, Showing a Massive Xe3P GPU, 16-Pin Connector, 160GB LPDDR5X as Intel Sidesteps the HBM Shortage ⚡CUDA Programming Patterns

wccftech.com·2d·r/LocalLLaMA

AMD surprises RDNA 3 and RDNA 2 owners with FSR 4.1 support, arriving in July for RX 7000 series 🔍Nsight

tweaktown.com·6d

AMD FSR4 · Issue #2 · Korthos-Software/low_latency_layer 📈GPU Occupancy

github.com·2d·r/linux_gaming

Rockchip unveils RK3572 processor with 4 TOPS NPU and LPDDR5X support 🧠CPU Architecture

linuxgizmos.com·3d

China’s Plan for Winning the AI Race Hinges on the Token Economy, Not Chips 🤖AI Coding Tools

thediplomat.com·2d

A cheap fix that saves the AI $400M dollars a year and brings 4B people online 🔄ONNX

codecai.net·4d·Hacker News
