🤖 AI - feibilanceon · Scour

SLUUG Talk: Demystifying Large Language Models on Linux

🤖LLM Code

github.com··DEV·Cited by 1 article

LLM KV Cache Optimization, Open Model Evaluation, & Agent Engineering Skills for Local Deployment

🤖LLM Blog

DiffusionGemma: 4x Faster Text Generation

🤖LLM News Blog 21

blog.google··Hacker News, r/LocalLLaMA, r/singularity·Cited by 21 articles

PyTorch from Scratch — Part 1: Tensors, Gradients & Activations

Framework Desktop AMD 395+ (RDNA 3.5) cannot run ComfyUI err Fix 2026

⚡Vite Blog

runaihome.com··DEV

Teaching a Reranker the Language of Security Tickets (+41% MRR@10)

linkedin.com··DEV·Cited by 1 article

Three sleep intervals for three APIs: Steam 250ms, GitHub 100ms, HuggingFace none

🔄TanStack Query Reference

docs.github.com··DEV·Cited by 1 article

FlashAttention Explained: The Optimization That Made Modern LLMs Practical

🤖LLM Blog

Stop Downloading 8GB Models on Every Pod Restart - Use OCI Object Storage as a Model Cache

🚀DevOps Blog

Flowork: Self-Hosted AI Stack with Sovereign Agent OS and LLM Gateway

🚀DevOps Blog

Why JAX Is a Much Better Backend for Quantum Circuit Simulation Than PyTorch

🔶Svelte Code

··DEV

8GB to 70B: A Real Hardware Guide for Local LLMs

🤖LLM Blog

Run Codex CLI with Local LLM - Gemma4 with llama.cpp on WSL2

🤖LLM Blog

Token Cost Optimization: How to Cut LLM Inference Spend Without Cutting Quality

🤖LLM Blog

I Made Two AI Models Fight Each Other. They Agreed Way Too Much.

🤖LLM Blog

Local Ai Deployment Cost Analysis 2024

🤖LLM Blog

RFC: pluggable publisher verification as a trust tier for community skills · Issue #40555 · NousResearch/hermes-agent

⚡Vite Discussion Code

github.com··DEV

Quantization formats compared: GGUF vs GPTQ vs AWQ vs NF4

🤖LLM Blog

Mixture of Experts (MoE): what it actually does under the hood, and when it pays off

🤖LLM Blog

I Built a Python Agent That Uses a Vector DB as Memory, Not Retrieval

🤖LLM Blog
