⚡ Parallel Computing - def4ultx · Scour

Kosovo votes again amid political deadlock, seeking EU and NATO progress

🔀Concurrency Video News

aljazeera.com

Issue 753

🔭Observability

main--iosdevweekly.netlify.app

A Double Victory for Web Speed: Chrome Breaks Records Again on Speedometer 3.1 and Jetstream 3

🗂️Data Structures News Blog

blog.google · Hacker News

AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis

🎮CUDA Academic

arxiv.org · Hacker News

Fast Exact Nearest-Neighbor Learning for High-Frequency Financial Time Series

📈Investing Academic

jeffhuen/RustyCSV: High-performance CSV parsing for Elixir. Rust NIF with SIMD acceleration, parallel parsing, and bounded-memory streaming. Drop-in NimbleCSV replacement.

λFunctional Programming Code

github.com · Hacker News

APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing

🎮CUDA Academic

FlashCP: Load-Balanced Communication-Efficient Context Parallelism for LLM Training

🌐Distributed Systems Academic

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

🎮CUDA Code

github.com · Hacker News

A remark on diagnosability verification

🔀Concurrency Academic

Structuring agentic AI for HPC code modernization

🚀High Performance Computing Academic

llama.cpp - Qwen3.6/3.5-MTP - Share your benchmarks t/s

🖼️GPU Computing Code

github.com · r/LocalLLaMA

docs: document release audit scripts · openclaw/openclaw@72547a1

📈Performance Code

zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability

🐧Linux Code

github.com · Hacker News

test(docker): cap npm scheduler concurrency · openclaw/openclaw@023427b

🦀Rust Code

LLM-Based Porting of Optimized C++ to CUDA Through Deoptimization and Reoptimization

🚀High Performance Computing Academic

Does anyone know what PCIe mode was used for these benchmarks?

🖼️GPU Computing Code

github.com · r/LocalLLaMA

SET: Stream-Event-Triggered Scheduling for Efficient CUDA Graph Pipelines

🖼️GPU Computing Academic

jdalang/jda-lang: Jda: A high-performance systems language bootstrapped from assembly. Beats C on sudoku & LZ77. Self-hosted compiler, no GC, built-in concurrency & ML.

🐧Linux Code

github.com · DEV

YouZhi: Towards High-Concurrency Financial LLMs via Adaptive GQA-to-MLA Transition

🌲LSM Trees Academic
