Quantization of LLMs
LLM Research Papers: The 2026 List (January to May)
🧠 Large Language Models (LLMs)
Content type: News
magazine.sebastianraschka.com · 5 days ago · Hacker News
stable-diffusion.cpp/docs/quantization_and_gguf.md at master · leejet/stable-diffusion.cpp
✨ Model optimizations in LLMs
Content type: Code
github.com · 4 days ago · r/StableDiffusion
Holding the FP8 Quality Ceiling at 8-Bit Weights and Activations: INT8 and GGUF Post-Training Quantization of Ideogram 4.0 for Consumer GPUs
✨ Model optimizations in LLMs
Content type: Academic
arxiv.org · 21 hours ago
Show HN: Ext-Infer
🧠 Large Language Models (LLMs)
infer.displace.tech · 4 days ago · Hacker News
Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency
✨ Model optimizations in LLMs
Content type: News, Blog
blog.google · 6 days ago · Hacker News
The Order Matters: Sequential Fine-Tuning of LLaMA for Coherent Automated Essay Scoring
🧠 Large Language Models (LLMs)
Content type: Academic
arxiv.org · 1 day ago
Google Shrank Gemma 4 by 72% and Unsloth Fixed the 4-Bit Bug Nobody Else Caught on One 4090, and 4-Bit Shouldn’t Be This Good
🧠 Large Language Models (LLMs)
Content type: Blog
towardsai.net · 3 days ago
NeuroBait: I fine-tuned a model to spark dopamine for ADHD brain
✨ Model optimizations in LLMs
Content type: Blog
huggingface.co · 2 days ago
Florian Brand, Prime Intellect research engineer, adopts Gemma 4 E4B 6-bit quantized as his primary local Mac LLM
🧠 Large Language Models (LLMs)
Content type: News
digg.com · 4 days ago · Hacker News
I wired a fully offline voice loop to Ollama + LM Studio: 100% CPU, no GPU, nothing leaves your machine (Silero VAD + Parakeet STT + Supertonic TTS 3)
🚀 LLM serving frameworks
Content type: Code
github.com · 22 hours ago · r/LocalLLaMA
2x GH200 for LLM inference, Part 2: vLLM, DeepSeek V4 Flash, and MTP
🔧 Systems-level optimizations for LLM serving
Content type: Blog
dnhkng.github.io · 4 days ago
not much happened today | AINews
✨ Model optimizations in LLMs
news.smol.ai · 6 days ago
Launch HN: General Instinct (YC P26) – Frontier models on edge devices
🧠 Large Language Models (LLMs)
Content type: Discussion
news.ycombinator.com · 6 days ago · Hacker News
1-bit and 1.58 bit LLM Benchmarking on Jetson Orin Nano Super | Bonsai LM
🧠 Large Language Models (LLMs)
smolhub.com · 3 days ago · r/LocalLLaMA
Google Colab CLI opens runtimes to Claude Code and Codex
🤖 Agents using LLMs
helpnetsecurity.com · 3 days ago · r/ClaudeAI
harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.
🔧 Systems-level optimizations for LLM serving
Content type: Code
github.com · 5 days ago · Hacker News, r/LLM
google/gemma-4-12B-it-qat-q4_0-gguf
🧠 Large Language Models (LLMs)
huggingface.co · 6 days ago
146th airhacks tv: Rust, Java 25, AI Agents, BCE, Web Components, zunit, zb
🧠 Large Language Models (LLMs)
Content type: Blog
adambien.blog · 1 day ago
Semantic Grading of Written Answers in Low-Resource Language Bangla Using a Fine-Tuned Lightweight Language Model
✨ Model optimizations in LLMs
Content type: Academic
arxiv.org · 21 hours ago
techjarves/Portable-AI-USB: A 100% offline, fully portable, zero-trace AI (Ollama + Llama 3 + AnythingLLM) that runs natively from a USB drive on Windows and Mac.
🧠 Large Language Models (LLMs)
Content type: Code
github.com · 2 days ago