🤖 AI - bjtitus

martidu4/honey-ai: 🍯 All-in-one AI honeypot powered by local LLMs. SSH, HTTP, FTP, Telnet, SMTP, MySQL, Redis, Git, VNC, RDP — with canary tokens, tarpits, GZIP bombs, and threat intel reporting.

⌨️Command Line Interfaces Code

github.com··Hacker News

Breaking the Ice: Analyzing Cold Start Latency in vLLM

💾Local-First Academic

arxiv.org··Hacker News

Large companies can add a local LLM filter layer to considerably reduce their AI costs

💾Local-First

umrashrf.github.io··Hacker News

local AI agents for Cursor with pre-tuned marketplace/commu

💾Local-First

locaible.com··Hacker News

DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200

💾Local-First News

newsletter.semianalysis.com··Hacker News

Apple WWDC On-Device AI Deep Dive - Google Docs

🦅Swift

gist.is··Hacker News

DiffusionGemma: The Developer Guide - Google Developers Blog

💾Local-First Blog

developers.googleblog.com··r/LocalLLaMA

google/gemma-4-31B-it · fix: chat template — null handling, reasoning preservation, turn-tag balance, input validation

💾Local-First

huggingface.co··r/LocalLLaMA

OpenAI Help: Lockdown Mode

💾Local-First

simonwillison.net

Show HN: See what your AIs know about you that the others don't

💾Local-First

memkeeper.eu··Hacker News

Fixing a stuck Ollama runner and building a GPU watchdog

💾Local-First

patrickmccanna.net··Hacker News

On-device AI is a margin decision

🦅Swift Blog

ziraph.com··Hacker News

DiffusionGemma: 4x Faster Text Generation

💾Local-First News Blog

blog.google··Hacker News, r/LocalLLaMA, r/singularity

How we fight GPU scarcity without compromise

💾Local-First Blog

equixly.com··Hacker News

A system programmer’s guide to LLM inference

💾Local-First Blog

blog.xiangpeng.systems··Hacker News

Dynamic ReACT Loop with Conductor

⌨️Command Line Interfaces

conductor-oss.github.io··Hacker News

Show HN: RiskKernel, kill -9 an AI agent and resume it without paying twice

💾Local-First

riskkernel.com··Hacker News

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%

Token4Token — pay-per-token inference on Gnosis + Swarm
