🔓 Open Source AI - marlonp · Scour

google/gemma-4-12B-it-qat-q4_0-gguf

huggingface.co·

Google fills out the middle with the Gemma 4 12B

jonpeddie.com·

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA

everylocalai.com··DEV

fix(ollama): use provider thinking default in SDK session factory (#9… · openclaw/openclaw@4f3c2cd

🧠LLMs Code

NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI

🧠LLMs Blog

blogs.nvidia.com·

How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions

🧠LLMs Academic

Unsloth Gemma 4 QAT

What Ollama Reveals About Local AI, Agents, and Open Models

🕵️AI Agents Blog

odsc.medium.com·

Google Gemma 4 12B brings native multimodal AI to standard laptops

🕵️AI Agents

Intelligent inference scheduling with llm-d on Red Hat AI

developers.redhat.com·

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

🧠LLMs News Blog

blog.google··Hacker News

MTP Isn't Always a Win: 1.95x on My 3090, but Speculative Decoding Is Hardware-Dependent

🧠LLMs Blog

bric.pe.kr··DEV

Government aims to make UK top spot for open source AI

🧠LLMs News

computerweekly.com·

Google unveils DiffusionGemma, delivering up to 4x faster inference on dedicated GPUs

alternativeto.net·

lightmetal: GPU LLM Inference From a Single Java 25 JAR

🧠LLMs Blog

adambien.blog·

Improved performance and model support with GGUF

🧠LLMs Blog

Google's new open model DiffusionGemma generates text from noise instead of word by word

the-decoder.com·

Fixing a stuck Ollama runner and building a GPU watchdog

🤖Autonomous Systems

patrickmccanna.net··Hacker News

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support

phoronix.com··r/artificial

The latest Gemma 4 models use a training trick to slash their on-device memory footprint

androidauthority.com·
