AI models


harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

馃language modelsContent type: Code
github.comHacker News, r/LLM

DiffusionGemma: 4x Faster Text Generation

💬 LLM | Content type: News, Blog

Initial impressions of Claude Fable 5

📝 Git

Show HN: Ext-Infer

馃Ollama

A wild idea: Abstract reality using ontology

馃language modelsContent type: Discussion

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

馃Ollama

Show HN: Audit any AI/data pairing with Veritrooper

🆕 New AI
veritrooper.com | Hacker News

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

🐧 Linux

How to Train Your Goblin

🆕 New AI

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

馃Ollama

vishal-dehurdle/state-harness: Runtime safety net for LLM agents. Detects token spirals, kills doomed tasks early, tells you exactly why. Rust core, Python SDK. pip install state-harness

🆕 New AI | Content type: Code

How to Set Up Codebase Indexing in Kilo Code

馃OllamaContent type: NewsContent type: Blog
blog.kilo.ai

Unsloth Gemma 4 QAT

馃Ollama
unsloth.ai

ninoxAI/nightwatch: Open-source, local-first, read-only AI SRE: clusters alert storms, investigates root cause over your live systems, proposes human-gated fixes.

馃AI AgentContent type: Code
github.comHacker News
