baidu/ERNIE-4.5-VL-28B-A3B-Thinking released. Curious case.
huggingface.co·7h·
Discuss: r/LocalLLaMA
🏗️LLM Infrastructure
The Underwear Fixed Point
notes.hella.cheap·18h
🎨Chroma
Fast and Affordable LLMs serving on Intel Arc Pro B-Series GPUs with vLLM
blog.vllm.ai·12h
🏗️LLM Infrastructure
Nested Learning: How Your Neural Network Already Learns at Multiple Timescales
rewire.it·18h
🧠LLM Inference
Meta’s Generative Ads Model (GEM): The Central Brain Accelerating Ads Recommendation AI Innovation
engineering.fb.com·19h
📊Feed Optimization
Think SMART: New NVIDIA Dynamo Integrations Simplify AI Inference at Data Center Scale
blogs.nvidia.com·22h
🏗️LLM Infrastructure
This is a wild use case!
threadreaderapp.com·21h
🏗️LLM Infrastructure
Exploring RTEB, a New Benchmark To Evaluate Embedding Models
thenewstack.io·18h
🌏BGE Embeddings
Scaling Laws: How to Allocate Compute for Training Language Models
pub.towardsai.net·20m
📱Edge AI Optimization
Show HN: Autogenerate efficient backward kernels for Triton
github.com·2h·
Discuss: Hacker News
Glommio
Nested Learning: The Illusion of Deep Learning Architectures
arxiviq.substack.com·19h·
Discuss: Substack
🧠LLM Inference
AI Black&Blonde for a 230% boost on inference speed
reddit.com·10h·
Discuss: r/LocalLLaMA
🖥GPUs
Lessons from the DeepChip Wars: What a Decade-old Debate Teaches Us About Tech Evolution
semiwiki.com·18h
💻Chips
GKE: From containers to agents, the unified platform for every modern workload
cloud.google.com·23m
🏗️LLM Infrastructure
Synth: The New Data Frontier
pleias.fr·6h·
Discuss: Hacker News
🏗️LLM Infrastructure
Show HN: Charl – ML language with native tensors and autograd
charlbase.org·22h·
Discuss: Hacker News
🔥Burn
AI Memory: Enabling The Next Era Of High-Performance Computing
semiengineering.com·4h
💻Chips
Mojo: MLIR-Based Performance-Portable HPC Science Kernels on GPUs for the Python Ecosystem
arxiv.org·6h·
Discuss: Lobsters
🕯️Candle
Raycore: GPU accelerated and modular ray intersections
makie.org·21h·
Discuss: Hacker News
Glommio
Lossless Compression with Asymmetric Numeral Systems (2020)
bjlkeng.io·23h·
Discuss: Hacker News
🗜️Vector Compression