⚡ Quantization
Model Compression, INT8, Weight Quantization
Scoured 46 posts in 12.4 ms
youyeetoo updates R1 SBC and lists K1 N100-based x86 computer
⚙️ Zstandard · linuxgizmos.com · 6 hours ago
harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.
🎲 Probabilistic Inference · Content type: Code · github.com · 4 days ago · Hacker News, r/LLM
The Order Matters: Sequential Fine-Tuning of LLaMA for Coherent Automated Essay Scoring
🗣️ Natural Language Parsing · Content type: Academic · arxiv.org · 1 day ago
Running Qwen 35B MoE at 450k Context on a Single 32GB GPU
🏷️ Named Entity Recognition · local-llm.utop.workers.dev · 4 days ago · Hacker News
Density Field State Space Models: 1-Bit Distillation, Efficient Inference, and Knowledge Organization in Mamba-2
🎲 Probabilistic Inference · Content type: Academic · arxiv.org · 1 day ago
Where to Host Your Open-Source Model (Under 10B Parameters)
🗂️ Hash Tables · digitalocean.com · 6 days ago
[AINews] not much happened today
🏷️ Named Entity Recognition · Content type: News · latent.space · 5 days ago
Less-relevant results
Day 8 of #100DaysOfClickHouse: Understanding ClickHouse® Data Types
🗂️ Columnar Storage · quantrail-data.com · 2 days ago · DEV
Nvidia's RTX Spark is a developer's dream, but AMD's Ryzen AI Max+ is what most people actually need for local AI
🗂️ Columnar Storage · xda-developers.com · 3 days ago
CoreML vs TFLite: iPhone 15 Pro GPU 2.3x Faster
🗜️ Compression Algorithms · Content type: Blog, Discussion · tildalice.io · 4 days ago
bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss
🗂️ Hash Tables · Content type: Code · github.com · 2 days ago · r/LocalLLaMA
Deep X XM2 NPU: 80 TOPS Generative AI Accelerator at 5W
🗜️ Compression Algorithms · armdevices.net · 6 days ago
Correlation Is Not Enough: Embedding Human Metadata for Individual Causal Discovery
🎲 Probabilistic Inference · Content type: Academic · arxiv.org · 2 days ago
AMD's Frank Azor pushes back against claim that FSR 4.1 won't be ported to RDNA 3.5 GPUs, says 'no such decision' has been made
🔀 CRDTs · tomshardware.com · 6 days ago
AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis
🌲 Binary Search Trees · Content type: Academic · arxiv.org · 2 days ago · Hacker News
Does anyone know what PCIe mode was used for these benchmarks?
🗂️ Hash Tables · Content type: Code · github.com · 4 days ago · r/LocalLLaMA
Nvidia RTX Spark: The $2,900 Floor Tells You Everything
🗂️ Columnar Storage · Content type: Blog, Discussion · tildalice.io · 6 days ago
Thundercomm TurboX C7790 Android and Linux development kit features Qualcomm Dragonwing Q-7790 Edge AI SoC - CNX Software
🗜️ Compression Algorithms · Content type: News · cnx-software.com · 5 days ago
Beyond Generative Decoding: Discriminative Hidden-State Readout from a Native Omni-Modal LLM for Multimodal Sentiment Analysis
🗜️ Compression Algorithms · Content type: Academic · arxiv.org · 6 days ago
Nvidia DGX Spark GB10 – AI Models and Guide with vLLM and Autonomous Script
🗂️ Hash Tables · Content type: Code · github.com · 5 days ago · Hacker News