Upload README.md
README.md
CHANGED
```diff
@@ -5,7 +5,7 @@ license: mit
 
 # model-card-testing
 
-model-card-testing is a
+model-card-testing is a distilled language model that can be used for text generation. Users of this model card should also consider information about the design, training, and limitations of gpt2.
 
 ## Model Details
 
```
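The new description says the model is intended for text generation, but this hunk carries no usage snippet. Below is a minimal sketch of what such a snippet might look like, assuming a transformers-compatible checkpoint; `distilgpt2` is used purely as a stand-in model ID, since the card's repository name here is a placeholder.

```python
# Illustrative only: "distilgpt2" is a stand-in ID, not this card's actual repository.
from transformers import pipeline, set_seed

set_seed(42)  # make the sampled continuations reproducible
generator = pipeline("text-generation", model="distilgpt2")

# Generate two short continuations of a prompt.
print(generator("Hello, I'm a language model,", max_length=30, num_return_sequences=2))
```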
```diff
@@ -24,8 +24,6 @@ Use the code below to get started with the model.
 
 
 
-
-
 Here is how to use this model to get the features of a given text in PyTorch:
 
 NOTE: This will need customization/fixing.
```
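The "Here is how to use this model to get the features of a given text in PyTorch" line is still a placeholder, as the NOTE beneath it acknowledges. A minimal sketch of what that snippet might look like, assuming the checkpoint loads with the standard transformers Auto classes; `distilgpt2` again stands in for the real model ID:

```python
# Illustrative sketch: replace "distilgpt2" with the actual model repository ID.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModel.from_pretrained("distilgpt2")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="pt")
output = model(**encoded_input)

# last_hidden_state holds one feature vector per input token:
# shape (batch_size, sequence_length, hidden_size)
features = output.last_hidden_state
print(features.shape)
```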
```diff
@@ -78,6 +76,11 @@ Using the model in high-stakes settings is out of scope for this model. The mod
 Significant research has explored bias and fairness issues with models for language generation (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). This model also has persistent bias issues, as highlighted in these demonstrative examples below. Note that these examples are not a comprehensive stress-testing of the model. Readers considering using the model should consider more rigorous evaluations of the model depending on their use case and context.
 
 
+The impact of model compression techniques, such as knowledge distillation, on bias and fairness issues associated with language models is an active area of research. For example:
+- [Silva, Tambwekar and Gombolay (2021)](https://aclanthology.org/2021.naacl-main.189.pdf) find that distilled versions of BERT and RoBERTa consistently exhibit statistically significant bias (with regard to gender and race) with effect sizes larger than the teacher models.
+- [Xu and Hu (2022)](https://arxiv.org/pdf/2201.08542.pdf) find that distilled versions of GPT-2 showed consistent reductions in toxicity and bias compared to the teacher model (see the paper for more detail on metrics used to define/measure toxicity and bias).
+- [Gupta et al. (2022)](https://arxiv.org/pdf/2203.12574.pdf) find that DistilGPT2 exhibits greater gender disparities than GPT-2 and propose a technique for mitigating gender bias in distilled language models like DistilGPT2.
+
 
 
 
```