Fine-tuning
LoRA, PEFT, instruction tuning, model fine-tuning, QLoRA
Scoured 168 posts in 9.4 ms
How to Train Your Goblin
🎮 Reinforcement Learning · goblins.mchen.workers.dev · 4 days ago · Hacker News
Architecture-Aware Reinforcement Learning Makes Sliding-Window Attention Competitive in Math Reasoning
🧠 Transformers · Content type: Academic · arxiv.org · 15 hours ago
Running Qwen 35B MoE at 450k Context on a Single 32GB GPU
🧠 LLM Inference · local-llm.utop.workers.dev · 4 days ago · Hacker News
Claude Fable 5 and new AI safety fables
🧩 Cognitive Science · Content type: News · interconnects.ai · 1 day ago · Hacker News
It blocked us at 'hello!' Anthropic Fable 5 refusing innocuous prompts
🪟 Context Windows · Content type: News · theregister.com · 23 hours ago · Hacker News
Substrate Asymmetry in User-Side Memory: A Diagnostic Framework
🧠 LLMs · Content type: Academic · arxiv.org · 15 hours ago
Show HN: Bosun – a small model that keeps an agent's memory graph clean
🔤 Tokenization · huggingface.co · 1 hour ago · Hacker News
MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better
🧠 LLM Inference · Content type: News, Blog · kaitchup.substack.com · 5 days ago · r/LocalLLaMA
"North Mini Code"; open weights, 30B param, Canadian coding model
🤖 Data science · Content type: Blog · cohere.com · 2 days ago · Hacker News
KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++
🔬 Deep Learning · Content type: Code · github.com · 4 days ago · Hacker News
GPT-2: Too Dangerous To Release (2019)
🧠 Transformers · Content type: Blog · naokishibuya.github.io · 2 days ago · Hacker News
GraphInfer-Bench: Benchmarking LLM's Inference Capability on Graphs
🕸️ Knowledge Graphs · Content type: Academic · arxiv.org · 15 hours ago
Stack Overflow didn't just help AI learn to code
🤖 LLM · zozo123.github.io · 4 days ago · Hacker News
When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff
🎮 Reinforcement Learning · Content type: Academic · arxiv.org · 1 day ago
PhysicsIntern: From an Autonomous Benchmark-Runner to a Research Sidekick
🤖 Data science · Content type: Blog · huggingface.co · 6 hours ago · Hacker News
ApodexAI/AgentHarness: Evaluation harness for Apodex-1.0 on public deep-research benchmarks.
🤖 Data science · Content type: Code · github.com · 2 days ago · Hacker News
Pythia 1.4B reproduces 3.6% of training samples verbatim given 950-token prompts
🤖 Data science · Content type: Blog · ret2libc.com · 4 days ago · Hacker News
Doc-to-Atom: Learning to Compile and Compose Memory Atoms
🧠 LLM Inference · Content type: Academic · arxiv.org · 15 hours ago
The Philosophy of the Out-of-Office Email
🪨 Obsidian · Content type: News · theatlantic.com · 5 days ago · Hacker News
A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design
🎮 Reinforcement Learning · Content type: Academic · arxiv.org · 1 day ago