arxiv.org, 28 weeks ago

[2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Covered by 6 sources, including KDnuggets and DEV Community.

Abstract page for arXiv paper 2305.18290: Direct Preference Optimization: Your Language Model is Secretly a Reward Model
