Hugging Face
pt-sk/GPT2-IMDB-Sentiment-FineTuned-with-PPO
Tags: Text Generation, Transformers, Safetensors, pt-sk/imdb, gpt2, PPO, RLHF, text-generation-inference, Inference Endpoints
License: mit
Path: GPT2-IMDB-Sentiment-FineTuned-with-PPO/reference materials (revision 5e189fd)
1 contributor
History: 2 commits
Latest commit: pt-sk, "Upload 5 files" (5e189fd, verified), 7 months ago
File | Scan | Size | Storage | Last commit | Updated
Direct Preference Optimization (DPO).pdf | Safe | 2.21 MB | LFS | Upload 5 files | 7 months ago
HIGH-DIMENSIONAL CONTINUOUS CONTROL USING GENERALIZED ADVANTAGE ESTIMATION.pdf | Safe | 1.8 MB | LFS | Upload 5 files | 7 months ago
Proximal Policy Optimization Algorithms.pdf | Safe | 2.92 MB | LFS | Upload 5 files | 7 months ago
Slides.pdf | Safe | 903 kB | - | Upload 5 files | 7 months ago
Training language models to follow instructions.pdf | Safe | 1.8 MB | LFS | Upload 5 files | 7 months ago
sample.py | - | 0 Bytes | - | Create reference materials/sample.py | 7 months ago