Quantization


SEAM: Shortcut-Aware Real-Time Detection of Scripted vs. Spontaneous Speech for Interview Guardrails

 📈Optimization  Content type: Academic
arxiv.org

Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models

 💬LLMs  Content type: Academic
arxiv.org

Qwen3.6 + MTP: Calculated context size is smaller when I use `--spec-draft-type-* q4_0`. is this normal? · ggml-org llama.cpp · Discussion #24102

 🤖AI  Content type: Discussion, Code
github.com · r/LocalLLaMA

QuBLAST: A Framework for Quantizing Large Language Models with Block-Level Compression Approach and Activation Scaling Strategy

 💬LLMs  Content type: Academic
arxiv.org

STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models

 💬LLMs  Content type: Academic
arxiv.org

Does anyone know what PCIe mode was used for these benchmarks?

 💬LLMs  Content type: Code
github.com · r/LocalLLaMA

MorphoQuant: Modality-Aware Quantization for Omni-modal Large Language Models

 👁️Computer Vision  Content type: Academic
arxiv.org

Recover-LoRA for Aggressive Quantization: Reclaiming Accuracy in 2-Bit Language Models via Low-Rank Adaptation with Knowledge Distillation on Synthetic Data

 🎛️Fine-tuning  Content type: Academic
arxiv.org

vla.cpp: A Unified Inference Runtime for Vision-Language-Action Models

 🤖AI  Content type: Academic
arxiv.org

model: Granite4 Vision by gabe-l-hart · Pull Request #23545 · ggml-org/llama.cpp

 🖥️GPU Programming  Content type: Code
github.com · r/LocalLLaMA

Value-and-Structure Alignment for Routing-Consistent Quantization of Mixture-of-Experts Models

 📊Vector Quantization  Content type: Academic
arxiv.org

[PoC] server: support requantizing kv cache by wadealexc · Pull Request #24134 · ggml-org/llama.cpp

 💬LLMs  Content type: Code
github.com · r/LocalLLaMA

LLMCodec: Adapting Video Codecs for Efficient Weight Compression of Large Language Models

 💬LLMs  Content type: Academic
arxiv.org

Knowledge Distillation for Visual Autoregressive Models

 👁️Computer Vision  Content type: Academic
arxiv.org

not much happened today | AINews

 🤖AI
news.smol.ai

SecRL-Prune: Structured Reinforcement Learning-Based Pruning of CodeLLMs for Preserving Adversarial Code Mutation

 🎮Reinforcement Learning  Content type: Academic
arxiv.org

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

 🎮Reinforcement Learning  Content type: Academic
arxiv.org
