Update README.md
README.md
CHANGED
@@ -4,7 +4,7 @@ datasets: pt-sk/imdb
 tags: ["PPO", "RLHF"]
 ---
 Fine-tuning a GPT-2 model on the IMDB dataset using Proximal Policy Optimization (PPO). The goal is to train the model to generate positive sentiment reviews. The training process utilizes the `trl` library for reinforcement learning, the `transformers` library for model handling, and `datasets` for dataset management.
-Implementation code is available here: [
+Implementation code is available here: [GitHub](https://github.com/sathishkumar67/GPT-2-IMDB-Sentiment-Fine-Tuning-with-PPO)
 ```python
 # Load model and tokenizer directly
 from transformers import AutoTokenizer, AutoModelForCausalLM