bugrakadirhan's Feed

Writing GPU shaders in Rust (gpn24)

Rust-GPU compiles (embedded) Rust to GPU shaders. You can then use these shaders in the bevy game engine, your custom wgpu renderer, or whatever else needs shaders. We'll look at how it compares to DSLs for GPU programming, such as glsl, wgsl or burn, and at why we target SPIR-V yet can compile to wgsl and run on the web. And if we really can compile ordinary Rust, what could we run on graphics cards? Read more ›

⚡ML Inference digitalocean.com·

Multi-Model API Cost Governance with the Inference Router

Learn how to use DigitalOcean’s Inference Router to govern multi-model API costs, route requests by task complexity, and reduce LLM inference spend. Read more ›

Covers GitHub - vllm-project/semantic-router: Intelligent Router for Mixture-of-Models

⚙️Model Training arXiv·

Statistically Valid Hyperparameter Selection: From Tuning to Guarantees

Hyperparameter selection is a critical step in the deployment of modern artificial intelligence systems, given the need to tune degrees of freedom such as inference-time parameters, implementation-level settings, and thresholds driving decision rules. Despite its practical importance, hyperparameter selection is typically performed using best-effort empirical methods such as grid search or Bayesian optimization, which provide no formal statistic... Read more ›

🦀Rust medium.com

Axum: Building Rust Web Apps Has Never Been Easier

Learn Axum fundamentals to build Rust web apps without sacrificing development speed. Read more ›

🛠️ML Frameworks medium.com

Train Neural Networks without Draining your Pocket: Can PyTorch use XLA for GPUs?

Learn if PyTorch models can leverage XLA to boost model training on GPUs. Explore how XLA integration works in PyTorch for GPUs and TPUs. Read more ›

🔄MLOps TildAlice·

MLflow Quickstart 2026: Track Your First Experiment in 10 Minutes

Track ML experiments with MLflow in under 10 minutes — log params, metrics, and models in 3 lines of Python. Real benchmarks on sklearn and PyTorch. Read more ›

📐Model Architecture Ai2·

Which tokens does a hybrid model predict better?

New token-level analyses of Olmo 3 and Olmo Hybrid show that hybrid models predict meaning-bearing, context-dependent tokens better than transformers, while transformers retain an edge on verbatim copying. Read more ›

🤖Machine Learning MoTaverse·

Deep Learning

Definitions of Deep Learning Read more ›

🕸️Neural Networks medium.com

HOW AI CREATE IMAGINARY PHOTOS?

One way is by pitting two convolutional neural networks (CNNs) against each other in a “contest” called a generative adversarial network… Read more ›

🖥️Systems ML medium.com

What is distributed training in AI?

Distributed training is a technique in AI and machine learning where a model is trained using multiple computers, GPUs, or servers working… Read more ›

🧠Deep Learning astledsa.substack.com·

Tree Transformers

A step towards generalizing the transformer architecture Read more ›

Discussed on Substack

⚙️Systems Programming cascade-router.github.io·

Show HN: Cascade – A bare-metal C++ proxy that cuts LLM API bills by 70%

A bare-metal C++ AI proxy that predicts prompt complexity in 4.59 milliseconds and dynamically routes traffic to the most cost-effective LLM. Read more ›

Discussed on Hacker News

🗜️Quantization alper.bearblog.dev·

Activate Gemma 4 MTP

Update llama first; tried with llama b9704: llama serve -m "..\gemma-4-12B-it-qat\gemma-4-12B-it-qat-UD-Q4_K_XL.gguf" --spec-type draft-mtp --spec-draft-n-max 2 --spec-draft-model "..\gemma-4-12B-it-qat\mtp-gemma-4-12B-it.gguf". Using MTP increases performance from 10 to 14.8 tokens per second. Someone recommends trying --spec-draft-n-max 3 for coding workloads. Read more ›

🔧MLIR arXiv·

Reading AI Model Compilation in MLIR Through the Lens of Formal Theories

Compiler infrastructures such as MLIR rest on a set of design principles: IR abstractions, interfaces, match-and-rewrite, flow analysis, type conversion, staged lowering, and so on. These concepts have proven themselves in practice. Good designs typically arrive through engineering knowledge, intuition and experience. Many of them, however, have correspondences in formal theory. MLIR's match-and-rewrite engine has correspondence to a term-... Read more ›

🎮GPU Programming Jon Peddie Research·

European AI factories everywhere, still Nvidia

Nvidia’s European AI factory expansion also strengthens the quantum-classical stack. Read more ›

🔗Distributed Training linaro.org·

Introducing Linaro 26.0

Linaro Forge 26.0 introduces NCCL collective profiling in MAP and Performance Reports, giving full visibility into GPU-to-GPU communication at scale. We put it to the test on a multi-node cluster with zero code changes required; read this blog to see what we found. Read more ›

⚡ML Inference GitHub·

Show HN: mlx-chronos - benchmark MLX inference engines on Apple Silicon

Community-driven benchmark suite for MLX inference engines on Apple Silicon - igurss/mlx-chronos Read more ›

Discussed on Hacker News

🦀Rust blog.rust-lang.org·

The many journeys of learning Rust

Empowering everyone to build reliable and efficient software. Read more ›

🕸️Neural Networks medium.com

Deep Learning (Part-02): Basics of Deep Learning & Neural Networks

Understanding Neurons, Neural Networks, Neural Connections, Activation Functions & More Read more ›

🔄MLOps docs.vultr.com·

Deploying MLflow Open-Source Machine Learning Experiment Tracking on Ubuntu 24.04

MLflow is an open-source platform for managing the machine learning lifecycle — experiment tracking, model registry, and reproducible runs. This guide deploys MLflow using Docker Compose with a PostgreSQL backend, S3-compatible artifact storage, basic-auth, and Traefik handling automatic HTTPS, then logs a sample scikit-learn run. By the end, you'll have MLflow recording experiments at your domain over HTTPS. Prerequisite: An S3-compatible bucket (e.g. Vultr Object Storage) with access key, s... Read more ›

Discussed on DEV