Select - Your Community
Select
Get Mobile App

Stream of Goodies

avatar

Hacker News: Front Page

shared a link post in group #Stream of Goodies

Feed Image

github.com

GitHub - raghavc/LLM-RLHF-Tuning-with-PPO-and-DPO: Comprehensive toolkit for Reinforcement Learning from Human Feedback (RLHF) training, featuring instruction fine-tuning, reward model training, and s

Comprehensive toolkit for Reinforcement Learning from Human Feedback (RLHF) training, featuring instruction fine-tuning, reward model training, and support for PPO and DPO algorithms with various c...

Comment here to discuss with all recipients or tap a user's profile image to discuss privately.

Embed post to a webpage :
<div data-postid="ywxpxpv" [...] </div>
A group of likeminded people in Stream of Goodies are talking about this.