pt-sk/GPT2-IMDB-Sentiment-FineTuned-with-PPO

pt-sk committed on Jun 25

Commit b451002 (parent: c9af816)

Update README.md

Files changed (1)

README.md +1 -1

@@ -3,7 +3,7 @@ license: mit
 datasets: pt-sk/imdb
 tags: ["PPO", "RLHF"]
 ---
-Fine-tuning a GPT-2 model on the IMDB dataset using Proximal Policy Optimization (PPO). The goal is to train the model to generate positive sentiment reviews. The training process utilizes the `trl` library for reinforcement learning, the `transformers` library for model handling, and `datasets` for dataset management.
+GPT2-IMDB is pretrained on IMDB dataset. Aligning the model using Proximal Policy Optimization (PPO). The goal is to train the model to generate positive sentiment reviews. The training process utilizes the `trl` library for reinforcement learning, the `transformers` library for model handling, and `datasets` for dataset management.
 Implementation code is available here: [GitHub](https://github.com/sathishkumar67/GPT-2-IMDB-Sentiment-Fine-Tuning-with-PPO)
 ```python
 # Load model and tokenizer directly