⚡ Quantization - jhcha.oyo · Scour

SEAM: Shortcut-Aware Real-Time Detection of Scripted vs. Spontaneous Speech for Interview Guardrails

📈Optimization Academic

Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models

💬LLMs Academic

Qwen3.6 + MTP: Calculated context size is smaller when I use `--spec-draft-type-* q4_0`. is this normal? · ggml-org llama.cpp · Discussion #24102

🤖AI Discussion Code

github.com · r/LocalLLaMA

QuBLAST: A Framework for Quantizing Large Language Models with Block-Level Compression Approach and Activation Scaling Strategy

💬LLMs Academic

STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models

💬LLMs Academic

Does anyone know what PCIe mode was used for these benchmarks?

💬LLMs Code

github.com · r/LocalLLaMA

MorphoQuant: Modality-Aware Quantization for Omni-modal Large Language Models

👁️Computer Vision Academic

Recover-LoRA for Aggressive Quantization: Reclaiming Accuracy in 2-Bit Language Models via Low-Rank Adaptation with Knowledge Distillation on Synthetic Data

🎛️Fine-tuning Academic

vla.cpp: A Unified Inference Runtime for Vision-Language-Action Models

🤖AI Academic

model: Granite4 Vision by gabe-l-hart · Pull Request #23545 · ggml-org/llama.cpp

🖥️GPU Programming Code

github.com · r/LocalLLaMA

Value-and-Structure Alignment for Routing-Consistent Quantization of Mixture-of-Experts Models

📊Vector Quantization Academic

[PoC] server: support requantizing kv cache by wadealexc · Pull Request #24134 · ggml-org/llama.cpp

💬LLMs Code

github.com · r/LocalLLaMA

LLMCodec: Adapting Video Codecs for Efficient Weight Compression of Large Language Models

💬LLMs Academic

Knowledge Distillation for Visual Autoregressive Models

👁️Computer Vision Academic

not much happened today | AINews

SecRL-Prune: Structured Reinforcement Learning-Based Pruning of CodeLLMs for Preserving Adversarial Code Mutation

🎮Reinforcement Learning Academic

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

🎮Reinforcement Learning Academic
