vwxyzjn's Collections
Async RLHF Paper Checkpoints
lm-human-preference-details
TL;DR summarization checkpoints
RLOO / PPOv2 TL;DR summarize checkpoints
Updated Jun 11, 2024
vwxyzjn/ppo_tldr • Text Generation • 1B • Updated May 24, 2024 • 11 • 1
vwxyzjn/ppo_tldr_6.9b • Text Generation • 7B • Updated Jun 7, 2024 • 2
vwxyzjn/rloo_tldr • Text Generation • 1B • Updated Jun 11, 2024 • 8
vwxyzjn/rloo_tldr_6.9b • Text Generation • 7B • Updated Jun 7, 2024 • 1