Utilitarian Deberta 01
Model description
This is a DeBERTa model fine-tuned to compute utility estimates of experiences described in first-person sentences. It was trained on human-annotated pairwise utility comparisons from the ETHICS dataset.
Intended use
The main use case is the computation of utility estimates of first-person text scenarios.
Limitations
The model was trained on a limited number of scenarios, all written as first-person sentences. It cannot be expected to interpret highly complex or unusual scenarios, and there are no hard guarantees about the range of inputs on which its estimates are accurate.
How to use
The model receives a sentence describing a scenario in the first person and outputs a scalar utility estimate.
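A minimal sketch of this sentence-in, scalar-out interface using the transformers library is given below. It rests on assumptions that this card does not confirm: that the checkpoint loads through AutoModelForSequenceClassification, that it has a single output logit that can be read as the utility, and that the weights are at a local path (the path in the usage note is hypothetical). The helper names load_utility_fn and rank_scenarios are illustrative only.

```python
from typing import Callable, List, Tuple


def load_utility_fn(model_path: str) -> Callable[[str], float]:
    """Build a sentence -> utility function from a saved checkpoint.

    Assumption: the checkpoint loads as a sequence-classification model
    with a single output logit, which is read as the utility estimate.
    """
    # Imported lazily so rank_scenarios below works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()

    def utility(sentence: str) -> float:
        inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits  # shape (1, num_labels)
        # Read the first (assumed only) logit as the scalar utility.
        return logits[0, 0].item()

    return utility


def rank_scenarios(
    scenarios: List[str], utility: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Score each scenario and sort from highest to lowest utility."""
    scored = [(s, utility(s)) for s in scenarios]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the weights available locally, usage might look like utility = load_utility_fn("path/to/utilitarian-deberta-01") followed by rank_scenarios(["I went for a walk in the park.", "I missed my flight."], utility). Since the model was trained on pairwise comparisons, the ordering of the scores is probably more meaningful than their absolute values.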
Training data
The training data is the train split from the Utilitarianism part of the ETHICS dataset.
Training procedure
Training can be reproduced by running tune.py as follows:
python tune.py --ngpus 1 --model microsoft/deberta-v3-large --learning_rate 1e-5 --batch_size 16 --nepochs 2
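tune.py itself is not reproduced in this card, so the exact objective is not stated here. A common way to learn scalar utilities from pairwise comparisons is a logistic (Bradley-Terry style) loss on the difference between the two utilities; the sketch below illustrates that idea in plain Python under the assumption that this is the kind of objective used. The function name pairwise_logistic_loss is hypothetical.

```python
import math
from typing import Sequence


def pairwise_logistic_loss(
    u_preferred: Sequence[float], u_other: Sequence[float]
) -> float:
    """Mean of -log(sigmoid(u_preferred - u_other)) over all pairs.

    u_preferred[i] is the model's utility for the scenario annotators
    judged more pleasant in pair i; u_other[i] is the utility of the
    other scenario in that pair.
    """
    if len(u_preferred) != len(u_other):
        raise ValueError("both sequences must have the same length")
    if len(u_preferred) == 0:
        raise ValueError("at least one pair is required")

    total = 0.0
    for up, uo in zip(u_preferred, u_other):
        diff = up - uo
        # -log(sigmoid(diff)) == log(1 + exp(-diff)), written in a
        # numerically stable form that avoids overflow for large |diff|.
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(u_preferred)
```

The loss is log(2) when both scenarios receive the same utility and shrinks toward zero as the preferred scenario's utility rises above the other's, which pushes the model to order each pair the way the annotators did.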
Evaluation results
The model achieves 92.2% accuracy on the Moral Uncertainty Research Competition, which consists of a subset of the ETHICS dataset.
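The card does not spell out how this accuracy is computed. For pairwise utility comparisons, a natural reading is the fraction of pairs in which the model assigns the higher utility to the scenario that annotators preferred; the sketch below computes that quantity under this assumption. The function name pairwise_accuracy is hypothetical.

```python
from typing import Sequence


def pairwise_accuracy(
    u_preferred: Sequence[float], u_other: Sequence[float]
) -> float:
    """Fraction of pairs where the preferred scenario scores strictly higher."""
    if len(u_preferred) != len(u_other):
        raise ValueError("both sequences must have the same length")
    if len(u_preferred) == 0:
        raise ValueError("at least one pair is required")

    correct = sum(1 for up, uo in zip(u_preferred, u_other) if up > uo)
    return correct / len(u_preferred)


# Three of the four pairs below are ordered correctly.
print(pairwise_accuracy([1.0, 0.5, 2.0, -1.0], [0.0, 0.7, 1.0, -2.0]))  # 0.75
```

Ties count as errors in this sketch, which is one of several reasonable conventions.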