Giles' blog
gilesthomas.com
Writing an LLM from scratch, part 32d -- Interventions: adding attention bias
gilesthomas.com · 1w · Discuss: Hacker News
Writing an LLM from scratch, part 32c – Interventions: removing dropout
gilesthomas.com · 1w · Discuss: Hacker News
Writing an LLM from scratch, part 32b -- Interventions: gradient clipping
gilesthomas.com · 1w · Discuss: Hacker News
Writing an LLM from scratch, part 32a -- Interventions: training a baseline model
gilesthomas.com · 1w · Discuss: Hacker News, Hacker News
Getting a custom PyTorch LLM onto the Hugging Face Hub (Transformers: AutoModel, pipeline, and Trainer)
gilesthomas.com · 2w · Discuss: Hacker News
Writing an LLM from scratch, part 31 -- the models are now on Hugging Face
gilesthomas.com · 4w · Discuss: Hacker News
Writing an LLM from scratch, part 30 -- digging into the LLM-as-a-judge results
gilesthomas.com · 5w
Writing an LLM from scratch, part 29 -- using DistributedDataParallel to train a base model from scratch in the cloud
gilesthomas.com · 5w · Discuss: Hacker News, Hacker News
Writing an LLM from scratch, part 28 – training a base model from scratch on an RTX 3090
gilesthomas.com · 10w · Discuss: Hacker News
original ↗
gilesthomas.com · 15w · Discuss: Hacker News
Writing an LLM from scratch, part 27 – what's left, and what's next?
gilesthomas.com · 14w · Discuss: Hacker News
Writing an LLM from scratch, part 26 – evaluating the fine-tuned model
gilesthomas.com · 14w · Discuss: Hacker News
Writing an LLM from scratch, part 25 – instruction fine-tuning
gilesthomas.com · 15w · Discuss: Hacker News
original ↗
gilesthomas.com · 19w
original ↗
gilesthomas.com · 19w
Retro Language Models: Rebuilding Karpathy's RNN in PyTorch
gilesthomas.com · 16w · Discuss: Hacker News
Writing an LLM from scratch, part 23 – fine-tuning for classification
gilesthomas.com · 16w · Discuss: Hacker News
Writing an LLM from scratch, part 22 – training our LLM
gilesthomas.com · 17w · Discuss: Hacker News