license: apache-2.0
datasets:
  - scientific_papers
metrics:
  - bertscore
  - rouge
tags:
  - text-generation-inference
  - rlhf
  - PPO
language:
  - en

This model is fine-tuned with NLPO, a PPO-based reinforcement learning algorithm, on the ccdv/arxiv-summarization dataset. The base model is a pre-tuned version of flan-t5-base.
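
A minimal usage sketch, assuming the checkpoint loads as a standard T5-style sequence-to-sequence model through the `transformers` library. The `model_id` argument is a placeholder for this repository's id, which is not stated here, and the `"summarize: "` prefix, the 512-token input limit, and the generation length are assumptions rather than settings confirmed by this card.

```python
def build_prompt(article: str, prefix: str = "summarize: ") -> str:
    """Collapse whitespace and prepend the task prefix (the prefix is an assumption)."""
    return prefix + " ".join(article.split())


def summarize(article: str, model_id: str, max_new_tokens: int = 128) -> str:
    """Generate a summary with a seq2seq checkpoint.

    model_id is a placeholder: pass this repository's id or a local path.
    """
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long papers to an assumed 512-token input window.
    inputs = tokenizer(
        build_prompt(article),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(paper_text, model_id="<this-repo-id>")` would return the generated abstract as a string; adjust the prefix and length limits if the training setup used different values.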