🏠 Local LLM Deployment

Model Optimization, GPU Acceleration, Inference, Privacy

eLLM: Elastic Memory Management Framework for Efficient LLM Serving
arxiv.org·4h
🗃️SQLite
Show HN: DeepThink Plugin – Bring Gemini 2.5's parallel reasoning to open models
news.ycombinator.com·6h·
Discuss: Hacker News
🗃️SQLite
Praxos: Kernel for AI Agents
praxos.ai·10h·
Discuss: Hacker News
🖥️Self-hosted apps
Deploying the Magistral vLLM Server on Modal
kdnuggets.com·1d
🪟Awesome windows command-line
LiteGD: Lightweight and dynamic GPU Dispatching for Large-scale Heterogeneous Clusters
arxiv.org·4h
🖥️Self-hosted apps
Introduction to vLLM: A High-Performance LLM Serving Engine
thenewstack.io·5d
🖥️Self-hosted apps
[Promotional] HighNoon LLM: Open-Source AI That Thinks Like Humans, Runs Locally
reddit.com·3d·
Discuss: r/opensource
🖥️Self-hosted apps
WFGY: Instantly Boost LLM Reasoning & Stability (Open Source, +22% Accuracy)
dev.to·1d·
Discuss: DEV
⭐Awesome lists
SecFwT: Efficient Privacy-Preserving Fine-Tuning of Large Language Models Using Forward-Only Passes
arxiv.org·4h
🖥️Self-hosted apps
Efficient Serving of LLM Applications with Probabilistic Demand Modeling
arxiv.org·4h
🗃️SQLite
Show HN: Portle – A Client-Side LLM Interface That Doesn't Store Your Data
portle.ai·17h·
Discuss: Hacker News
🖥️Self-hosted apps
Predicting Onflow Parameters Using Transfer Learning for Domain and Task Adaptation
arxiv.org·4h
🗃️SQLite
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
arxiv.org·1d
🗃️SQLite
Parallel Paradigms in Modern HPC: A Comparative Analysis of MPI, OpenMP, and CUDA
arxiv.org·4h
🖥Home Lab Setup
A Multi-Agent SQL Assistant You Can Trust with Human-in-Loop Checkpoint & LLM Cost Control
towardsdatascience.com·14h
🗃️SQLite
MLA: K/V cache compression with low-rank projection
huggingface.co·1d·
Discuss: Hacker News
🗃️SQLite
[D] 500+ Case Studies of Machine Learning and LLM System Design
reddit.com·11h·
Discuss: r/MachineLearning
🖥️Self-hosted apps
Modular: Modular 25.4: One Container, AMD and NVIDIA GPUs, No Lock-In
modular.com·1d·
Discuss: Hacker News
🖥️Self-hosted apps
Cost-Efficient Serving of LLM Agents via Test-Time Plan Caching
arxiv.org·4h
🗃️SQLite
In-Memory C++ Leap in Blockchain Analysis
caudena.com·12h·
Discuss: Hacker News
🗃️SQLite