Model Quantization, Inference Optimization, GGUF Format, Privacy-preserving AI

PRIMAL: Processing-In-Memory Based Low-Rank Adaptation for LLM Inference Accelerator
arxiv.org·22h
🏗️AI Infrastructure
From 75% to 99.6%: The Math of LLM Ensembles
shibaprasadb.com·19h·
Discuss: Hacker News
🎯Hindley-Milner
Privacy-Preserving Active Learning for heritage language revitalization programs with zero-trust governance guarantees
dev.to·18h·
Discuss: DEV
🏠Self-hosted AI
MLSN #18: Adversarial Diffusion, Activation Oracles, Weird Generalization
lesswrong.com·1d
🏠Self-hosted AI
The three types of LLM workloads and how to serve them
modal.com·11h·
Discuss: Hacker News
🏗️AI Infrastructure
Beyond Memorization: Testing LLM Reasoning on Unseen Theory of Computation Tasks
arxiv.org·22h
🧩Constraint Programming
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
machinelearning.apple.com·1d
Incremental Computation
Using Local LLMs to Discover High-Performance Algorithms
towardsdatascience.com·2d
⚙️LLVM
MIT’s new ‘recursive’ framework lets LLMs process 10 million tokens without context rot
venturebeat.com·1d·
🏗️AI Infrastructure
Quiz: How to Integrate Local LLMs With Ollama and Python
realpython.com·15h
🚀MLOps
featurestorebook/mlfs-book: O'Reilly book - Building Machine Learning Systems with a feature store: batch, real-time, and LLMs
github.com·2h·
Discuss: Hacker News
🚀MLOps
Everything Moe
ianbarber.blog·1d·
Discuss: Hacker News
📱Edge AI
A Visual Guide to Quantization
newsletter.maartengrootendorst.com·2d
📱Edge AI
Co-optimization Approaches For Reliable and Efficient AI Acceleration (Peking University et al.)
semiengineering.com·10h
Hardware Acceleration
Evolution of LLMs use by a programmer
asfaload.com·10h·
Discuss: Hacker News
💬Language Servers
The coming industrialisation of exploit generation with LLMs
dev.to·1d·
Discuss: DEV
⚙️LLVM
Ensemble Listening Model (ELM): State-of-the Art Foundation Model Accuracy. A Fraction of the Cost.
ensemblelisteningmodel.com·1d·
Discuss: Hacker News
🎙️Whisper
Norm-Preserving Biprojected Abliteration
huggingface.co·2d·
Discuss: Hacker News
🔥PyTorch
Local LLMs became useful when I wired them to Home Assistant
xda-developers.com·3d
🏠Self-hosted AI
Can We Build an NX Bit for LLMs
bogdandeac.com·1d·
Discuss: Hacker News
Hardware Acceleration