Conditional Utilitarian Deberta 01
Model description
This is a DeBERTa-based model. It was first fine-tuned to compute utility estimates of experiences (see utilitarian-deberta-01). It was then further fine-tuned on 160 examples of pairwise comparisons of conditional utilities.
Intended use
The main use case is computing utility estimates of first-person text scenarios, given extra contextual information.
Limitations
The model was fine-tuned on only 160 examples, so it should be expected to have limited performance.
Further, while the base model was trained on ~10000 examples, that data is still narrow in scope and consists only of first-person sentences. The model cannot be expected to interpret highly complex or unusual scenarios, and there are no hard guarantees on its domain of accuracy.
How to use
Given a scenario S under a context C, and the model U, one computes the estimated conditional utility with U(f'{C} {S}') - U(C).
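The formula can be sketched in Python as follows. This is a minimal illustration, not the authors' code: `conditional_utility` and `toy_U` are hypothetical names, and `toy_U` is a crude word-counting stand-in for the real model. In practice, `U` would be any callable that maps a string to the model's scalar utility estimate.

```python
from typing import Callable


def conditional_utility(U: Callable[[str], float], scenario: str, context: str) -> float:
    """Estimate U(S | C) as U(f'{C} {S}') - U(C)."""
    return U(f"{context} {scenario}") - U(context)


def toy_U(text: str) -> float:
    """Hypothetical stand-in scorer (NOT the real model): counts a few sentiment words."""
    positive = {"won", "sunny"}
    negative = {"lost", "rained"}
    words = text.lower().replace(".", "").split()
    return float(sum(w in positive for w in words) - sum(w in negative for w in words))


if __name__ == "__main__":
    context = "It rained all day."
    scenario = "I won the race."
    print(conditional_utility(toy_U, scenario, context))
```

Subtracting U(C) removes the utility contributed by the context alone, so the result reflects only what the scenario adds once the context is known.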
Training data
The first training data is the train split from the Utilitarianism part of the ETHICS dataset.
The second training data consists of 160 crowdsourced examples of triples (S, C0, C1) consisting of one scenario and two possible contexts, where U(S | C0) > U(S | C1).
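To make the triple format concrete, here is a small sketch of how one (S, C0, C1) example could be represented and checked against a scorer. The `ConditionalPair` class, the `prefers_c0` helper, the example sentences, and the lookup-table scorer are all invented for illustration; none of them come from the actual dataset or model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ConditionalPair:
    scenario: str        # S
    better_context: str  # C0, labeled so that U(S | C0) > U(S | C1)
    worse_context: str   # C1


def prefers_c0(U: Callable[[str], float], ex: ConditionalPair) -> bool:
    """Return True if the scorer ranks the scenario higher under C0 than under C1."""
    u0 = U(f"{ex.better_context} {ex.scenario}") - U(ex.better_context)
    u1 = U(f"{ex.worse_context} {ex.scenario}") - U(ex.worse_context)
    return u0 > u1


# Invented example and invented scores, used only to exercise the helper.
example = ConditionalPair(
    scenario="I ate a sandwich.",
    better_context="I was starving.",
    worse_context="I was already full.",
)
toy_scores = {
    "I was starving.": -2.0,
    "I was starving. I ate a sandwich.": 1.0,
    "I was already full.": 0.0,
    "I was already full. I ate a sandwich.": -1.0,
}


def toy_U(text: str) -> float:
    """Hypothetical lookup-table scorer standing in for the real model."""
    return toy_scores.get(text, 0.0)


if __name__ == "__main__":
    print(prefers_c0(toy_U, example))
```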
Training procedure
Starting from utilitarian-deberta-01, we fine-tune the model over the training data of 160 examples, with a learning rate of 1e-5, a batch size of 8, and for 2 epochs.
Evaluation results
The model achieves ~80% accuracy over 40 crowdsourced examples, from the same distribution as the training data.
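The card does not state how accuracy is computed. One plausible reading, given the triple format of the training data, is the fraction of held-out triples (S, C0, C1) for which the model ranks C0 above C1. The sketch below implements that assumed metric. `pairwise_accuracy` is a hypothetical helper, and `U` is any callable returning a scalar utility for a string.

```python
from typing import Callable, Iterable, Tuple

Triple = Tuple[str, str, str]  # (S, C0, C1), labeled so that U(S | C0) > U(S | C1)


def pairwise_accuracy(U: Callable[[str], float], triples: Iterable[Triple]) -> float:
    """Fraction of triples where the estimated U(S | C0) exceeds U(S | C1).

    Assumed metric: the model card does not specify the exact definition.
    """
    triples = list(triples)
    if not triples:
        raise ValueError("need at least one triple")
    correct = 0
    for scenario, c0, c1 in triples:
        u0 = U(f"{c0} {scenario}") - U(c0)
        u1 = U(f"{c1} {scenario}") - U(c1)
        correct += int(u0 > u1)
    return correct / len(triples)
```

Under this reading, ~80% accuracy on 40 examples would correspond to roughly 32 correctly ordered triples.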