🚀 LLM Deployment - ibrahimsharaf · Scour

froggeric/Qwen3.6-27B-MTP-GGUF ⚡Quantization

huggingface.co·3d·DEV

Nitsum: Serving Tiered LLM Requests with Adaptive Tensor Parallelism 🎯LLM Finetuning

mlsys.wuklab.io·2d·Hacker News

DFlash: The Trick That Makes LLMs Stop Crawling One Token at a Time 🎯LLM Finetuning

abvcreative.medium.com·5d

Blazing fast on-device GenAI with LiteRT-LM 🔬Small LMs

developers.googleblog.com·1d

Ollama on Mac: Setup and Optimization Guide (2026) 🎯LLM Finetuning

insiderllm.com·4d

Will TurboQuant save us from the RAM apocalypse? ⚡Quantization

Meta's WhatsApp Incognito Chat puts AI conversations in a black box 💻Local AI

ImpactArbiter – A PyTorch autograd trap for LLM memory bugs 🎯LLM Finetuning

github.com·2d·Hacker News

An LLM on a Sony PSP 🧠LLMs

·5d

SpecSA: Bridging Speculative Decoding and Sparse Attention for Efficient LLM Inference 🧠LLMs

Context pruning: cut LLM tokens without losing quality (9 minute read) 🎯LLM Finetuning

The Best Open Source and Open-Weight LLM Models to Run Locally in 2026 💻Local AI

huggingface.co·2d

not much happened today 🤖AI Agents

news.smol.ai·5d

A cheap fix that saves the AI $400M dollars a year and brings 4B people online ⚡Quantization

codecai.net·3d·Hacker News

Why Vision LLMs Force A Rethink Of Edge AI Hardware 🎯LLM Finetuning

semiengineering.com·6d

Find bugs in YOUR code using OpenCode, Llama.cpp and Qwen3.6 💻Local AI

wtarreau.blogspot.com·3d·Lobsters, Hacker News, wtarreau.blogspot.com

Lever: Speculative LLM Inference on Smartphones 🧠LLMs

Maker packs an opinionated, googly-eyed AI chatbot into a mobile suitcase, powered by an Nvidia Jetson — entirely local machine entity runs Gemma 4 E4B and can respond in 200ms 🔓Open Source AI

tomshardware.com·3d

michelangeloromerochisco/ternative: Inference engine for ternary-weight LLMs with runtime LoRA - the llama.cpp of BitNet models 💻Local AI

github.com·1d·Hacker News

KVDrive: A Holistic Multi-Tier KV Cache Management System for Long-Context LLM Inference 💻Local AI
