🤖 AI - hop1.ng.1357 · Scour

Can You Run LLMs Locally Without a GPU? I Tested 8 Models on Linux ⚙️MLOps

itsfoss.com·5d·Hacker News

ndom91/llama-dash: The operations layer for your local LLM stack 💉Prompt Injection

github.com·1d·Hacker News

Andrej Karpathy Joined Anthropic. What It Says About Where AI Is Heading. 🛡️Anthropic PBC

firethering.com·19h·Hacker News

I tried 4 LLM speedup techniques on CPU. Three made it slower. 🔓Side-Channel Attacks

deemwar-products.github.io·9h·Hacker News

Running PyTorch Models on Apple Silicon GPUs with the ExecuTorch MLX Delegate ⚙️MLOps

pytorch.org·2d·Hacker News

Find bugs in YOUR code using OpenCode, Llama.cpp and Qwen3.6 💉Prompt Injection

wtarreau.blogspot.com·3d·Lobsters, Hacker News, wtarreau.blogspot.com

GPU Memory Math for LLMs: Formula That Tells You What Fits on Your GPU 📱Edge AI Optimization

theahmadosman.substack.com·7h·Substack, r/LocalLLaMA

HPC-LLM: Practical Domain Adaptation and Retrieval-Augmented Generation for HPC Support 📚RAG

dcostenco/prism-coder: The Mind Palace for AI Agents - HIPAA-hardened Cognitive Architecture with on-device LLM (prism-coder:7b), Hebbian learning, ACT-R spreading activation, adversarial evaluation, persistent memory, multi-agent Hivemind and visual dashboard. Zero API keys required. 🔧Agent Tooling

github.com·11h·Hacker News

From Compute Overhang to Compute Crunch 🇨🇳Chinese AI

secondthoughts.ai·1d·Hacker News

Towards local plug-and-play AI 📱Edge AI Optimization

adlrocha.substack.com·3d·Substack

The Evaluation Game: Beyond Static LLM Benchmarking 🤖LLM

Agent harnesses, like OpenClaw, are changing how we build and run AI models 🔧Agent Tooling

theregister.com·3d·Hacker News

slokam-ai/localgcp: LocalStack for GCP. One Go binary emulating 14 Google Cloud services locally: Vertex AI, BigQuery, Spanner, Firestore, Pub/Sub, Cloud Storage, Bigtable, Cloud SQL, Memorystore, Cloud Tasks, KMS, Secret Manager, Cloud Run, Cloud Logging. Zero cloud bills. 💾Local-First Software

github.com·12h·Lobsters, Hacker News

What can a local model do for you in early May 2026? 🤖Agent Payments

manichord.com·2d·Hacker News

SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference on Superchips 🔌Embedded Systems

supercomputing-system-ai-lab.github.io·2d·Hacker News

The Ultimate LLM Fine-Tuning Guide ✨LLMs

promptinjection.net·3d·Hacker News

Nitsum: Serving Tiered LLM Requests with Adaptive Tensor Parallelism ⚙️MLOps

mlsys.wuklab.io·2d·Hacker News

Fixing LLM Writing with Distribution Fine Tuning 📋Text Quality

rosmine.ai·2d·Hacker News

Recursive Self-Improvement Delivers New SOTA Coding Performance 🇨🇳Chinese AI

poetiq.ai·6d·Hacker News, r/singularity
