🦙 Llama
Scoured 233 posts in 6.4 ms
Running Ollama on a 15W CPU sounded ridiculous until I got it working with decent results
🤖
LLM
xda-developers.com
·
6d
6 days ago
"AI" Is Eating Platform Monopolist Free Cash Flow, Not the World: CHART OF THE DAY
🤖
LLM
Content type:
News
Content type:
Blog
braddelong.substack.com
·
2d
2 days ago
·
Substack
martidu4/honey-ai: 🍯 All-in-one AI honeypot powered by local LLMs. SSH, HTTP, FTP, Telnet, SMTP, MySQL, Redis, Git, VNC, RDP — with canary tokens, tarpits, GZIP bombs, and threat intel reporting.
🤖
LLM
Content type:
Code
github.com
·
14h
14 hours ago
·
Hacker News
Using local LLMs for agentic coding
🤖
LLM
Content type:
Blog
blog.alexewerlof.com
·
6d
6 days ago
Running Qwen 35B MoE at 450k Context on a Single 32GB GPU
⚡
Inference Optimization
local-llm.utop.workers.dev
·
3d
3 days ago
·
Hacker News
Running LLM Inference on Kubernetes: What It Actually Takes
⚓
Kubernetes
Content type:
Blog
fairwinds.com
·
5d
5 days ago
How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions
🎯
Fine-tuning
Content type:
Academic
arxiv.org
·
2d
2 days ago
Burning 2.1M Tokens Version of Misadventures in Vibe-Programming: LAUGH OF THE DAY
🤖
LLM
substackcdn.com
·
4d
4 days ago
·
Substack
Would a prepaid pass for a coding agent solve a real need or is it just my itch?
🤖
Agent
codehamr.com
·
5d
5 days ago
·
r/SideProject
vishal-dehurdle/state-harness: Runtime safety net for LLM agents. Detects token spirals, kills doomed tasks early, tells you exactly why. Rust core, Python SDK. pip install state-harness
🤖
Agent
Content type:
Code
github.com
·
1d
1 day ago
·
Hacker News
Unsloth Gemma 4 QAT
⚡
Inference Optimization
unsloth.ai
·
5d
5 days ago
How to Measure Time To First Token (TTFT) in AI Systems
🧠
OpenAI
qainsights.com
·
4d
4 days ago
·
Hacker News
Shared Latent Structures Enable Unified Backdoor Detection and Mitigation in LLMs
🤖
LLM
Content type:
Academic
arxiv.org
·
2d
2 days ago
Anthropic Oceanus leaks 🤖, ChatGPT Dreaming 💭, recursive self improvement 🚀
🤖
Agent
tldr.tech
·
6d
6 days ago
Creating ADK Agent using locally running Gemma 4
🤖
LLM
Content type:
Blog
medium.com
·
3d
3 days ago
KaiFelixBennett/gemma4-turboquant-rdna4: Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.
⚡
Inference Optimization
Content type:
Code
github.com
·
13h
13 hours ago
·
Hacker News
When AI builds itself 👷, AI is not a line item 📝, local LLMs for agentic coding 🤖
🤖
LLM
tldr.tech
·
6d
6 days ago
How to Train Your Goblin
🎮
Reinforcement Learning
goblins.mchen.workers.dev
·
3d
3 days ago
·
Hacker News
I built an open-source persistent memory layer for AI coding agents
🧠
OpenAI
Content type:
Code
github.com
·
1d
1 day ago
·
r/GithubCopilot
The Amplifying Mirror: Locating and Steering the Partisan Direction inside a Large Language Model
🤖
LLM
Content type:
Academic
arxiv.org
·
2d
2 days ago