Scour
🧠 LLMs
Specific
Large Language Models, GPT, Transformers
Scoured 93 posts in 7.8 ms
Nvidia Nemotron 3 Ultra
🤖 Large Language Models
research.nvidia.com · 6 days ago · Hacker News
Tokenminning: Because Tokenmaxxing Is a Bad Idea
💬 Prompt Engineering
tokenminning.com · 1 day ago · Hacker News
Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation
🧠 LLM
aermia.com · 4 days ago · Hacker News
KaiFelixBennett/gemma4-turboquant-rdna4: Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.
🧠 LLM
Content type: Code
github.com · 12 hours ago · Hacker News
Show HN: Ext-Infer
💬 Prompt Engineering
infer.displace.tech · 4 days ago · Hacker News
Running Qwen 35B MoE at 450k Context on a Single 32GB GPU
🧠 LLM
local-llm.utop.workers.dev · 3 days ago · Hacker News
Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM
🤖 AI
deemwar-products.github.io · 5 days ago · Hacker News
google/gemma-4-12B-it-qat-q4_0-gguf
🤖 AI
huggingface.co · 5 days ago
Less-relevant results
AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis
🤖 AI
Content type: Academic
arxiv.org · 2 days ago · Hacker News
Appraising Artworks with Joins and LLMs (Ultorg Database UI)
🤖 ChatGPT
ultorg.com · 6 days ago · Hacker News
Don't dethrone consciousness
🧠 LLM
Content type: News
theintrinsicperspective.com · 5 days ago · Hacker News
Arithmetic Without Numbers – How LLMs Do Math
🧠 LLM
alvaro-videla.com · 5 days ago · Hacker News
How to Measure Time To First Token (TTFT) in AI Systems
🤖 AI
qainsights.com · 4 days ago · Hacker News
defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes
🤖 AI
Content type: Code
github.com · 1 day ago · Hacker News
Show HN: Audit any AI/data pairing with Veritrooper
🧠 LLM
veritrooper.com · 5 days ago · Hacker News
Introducing Granite Libraries and Project Granite Switch
💬 Prompt Engineering
Content type: Blog
research.ibm.com · 6 days ago · Hacker News
Show HN: Axiomax – Cryptographic proof of AI inference carbon footprint
💬 Prompt Engineering
axiomaxllc.com · 3 days ago · Hacker News
What an LLM Actually Does With Your Prompt First
🧠 LLM
siliconopera.com · 5 days ago
GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)
🤖 AI
vettedconsumer.com · 4 days ago · Hacker News
vishal-dehurdle/state-harness: Runtime safety net for LLM agents. Detects token spirals, kills doomed tasks early, tells you exactly why. Rust core, Python SDK. pip install state-harness
🤖 AI
Content type: Code
github.com · 1 day ago · Hacker News