LLM Training
LLM training, pretraining, RLHF, model training, arxiv ML
Scoured 231 posts in 22.1 ms
fareedkhan-dev.github.io · 4 days ago
Train LLM from Scratch
Discussed on Hacker News
arXiv · 2 days ago
Structured Hyperedge Adaptation for Parameter-Efficient Fine-Tuning of Vision Transformers
codingsprints.medium.com · 12 hours ago
Making Machine Learning Accessible: Building Interactive AI Demos with Gradio and Hugging Face BLIP
Hugging Face · 2 days ago
PP-OCRv6 on Hugging Face: 50-Language OCR from 1.5M to 34.5M Parameters
medium.com · 3 hours ago
Enterprise-grade AI infrastructure with AWS SageMaker HyperPod
Nature · 6 days ago
Memorization in large language models in medicine prevalence characteristics and implications
Bloomberg · 2 days ago
Tech Disruptors: Invisible Technologies on RLHF and LLM Training
Blogccasion · 1 day ago
Guest blog post on Cross-Origin Storage in Transformers.js
GitHub · 5 days ago
Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch
Discussed on Hacker News
mlx-lora-studio.netlify.app · 5 days ago
MLX LoRA Studio — Fine-tune LLMs on your Mac
Covers ml-explore/mlx
krea.ai · 1 day ago
Krea 2 Technical Report
Covers 8 stories, including DeepSeek-V3 Technical Report
Covered by 3 sources, including tldr.tech, VentureBeat
Discussed on Hacker News
VentureBeat · 11 hours ago
Alibaba's model never trained as an agent — and improved agent performance across seven benchmarks
kaggle.com · 3 days ago
LoRA: I Trained <1% of a 1.5B Model and Matched a Full Fine-Tune
Discussed on DEV
agenticresourcediscovery.org · 5 days ago
Agentic Resource Discovery Specification
Covered by 8 sources, including InfoWorld, Hugging Face
Discussed on Hacker News
mail.bycloud.ai · 1 day ago
What even is a > < former)
Covers 5 stories, including GLM-5.2 (6 minute read)
Machine Learning Mastery · 1 day ago
Clustering Unstructured Text with LLM Embeddings and HDBSCAN
ByteByteGo Newsletter · 15 hours ago
Large Language Models vs Small Language Models
Covers 6 stories, including Attention is all you need (2017)
biorxiv.org · 5 days ago
Tox21mer, A transformer foundation model for Tox21 high-throughput concentration-response curves data
Microsoft Tech Community · 14 hours ago
Inside Llama 3.1 405B MLPerf Training on Azure: System-Level Insights at 8K+ GPU Scale
pyimagesearch.com · 2 days ago
Google DeepMind’s Gemma 4: MoE, Efficiency Tricks, and Benchmarks
Covers 7 stories, including: GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inferen...