Llama

Running Ollama on a 15W CPU sounded ridiculous until I got it working with decent results

 🤖LLM
xda-developers.com·

"AI" Is Eating Platform Monopolist Free Cash Flow, Not the World: CHART OF THE DAY

 🤖LLM  Content type: News  Content type: Blog

martidu4/honey-ai: 🍯 All-in-one AI honeypot powered by local LLMs. SSH, HTTP, FTP, Telnet, SMTP, MySQL, Redis, Git, VNC, RDP — with canary tokens, tarpits, GZIP bombs, and threat intel reporting.

 🤖LLM  Content type: Code
github.com··Hacker News

Using local LLMs for agentic coding

 🤖LLM  Content type: Blog
blog.alexewerlof.com·

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

 Inference Optimization

Running LLM Inference on Kubernetes: What It Actually Takes

 Kubernetes  Content type: Blog
fairwinds.com·

How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions

 🎯Fine-tuning  Content type: Academic
arxiv.org·

Burning 2.1M Tokens Version of Misadventures in Vibe-Programming: LAUGH OF THE DAY

 🤖LLM
substackcdn.com··Substack

Would a prepaid pass for a coding agent solve a real need or is it just my itch?

 🤖Agent

vishal-dehurdle/state-harness: Runtime safety net for LLM agents. Detects token spirals, kills doomed tasks early, tells you exactly why. Rust core, Python SDK. pip install state-harness

 🤖Agent  Content type: Code
github.com··Hacker News

Unsloth Gemma 4 QAT

 Inference Optimization
unsloth.ai·

How to Measure Time To First Token (TTFT) in AI Systems

 🧠OpenAI

Shared Latent Structures Enable Unified Backdoor Detection and Mitigation in LLMs

 🤖LLM  Content type: Academic
arxiv.org·

Anthropic Oceanus leaks 🤖, ChatGPT Dreaming 💭, recursive self improvement 🚀

 🤖Agent
tldr.tech·

Creating ADK Agent using locally running Gemma 4

 🤖LLM  Content type: Blog
medium.com·

KaiFelixBennett/gemma4-turboquant-rdna4: Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.

 Inference Optimization  Content type: Code
github.com··Hacker News

When AI builds itself 👷, AI is not a line item 📝, local LLMs for agentic coding 🤖

 🤖LLM
tldr.tech·

I built an open-source persistent memory layer for AI coding agents

 🧠OpenAI  Content type: Code

The Amplifying Mirror: Locating and Steering the Partisan Direction inside a Large Language Model

 🤖LLM  Content type: Academic
arxiv.org·