Marissa committed on
Commit 601f379
1 Parent(s): 3cba355

Upload README.md

Files changed (1)
  1. README.md +6 -3
README.md CHANGED
@@ -5,7 +5,7 @@ license: mit
 
  # model-card-testing
 
- model-card-testing is a pretrained language model that can be used for text generation. Users of this model card should also consider information about the design, training, and limitations of gpt2.
+ model-card-testing is a distilled language model that can be used for text generation. Users of this model card should also consider information about the design, training, and limitations of gpt2.
 
  ## Model Details
 
@@ -24,8 +24,6 @@ Use the code below to get started with the model.
 
 
 
-
-
  Here is how to use this model to get the features of a given text in Pytorch:
 
  NOTE: This will need customization/fixing.
@@ -78,6 +76,11 @@ Using the model in high-stakes settings is out of scope for this model. The mod
  Significant research has explored bias and fairness issues with models for language generation (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). This model also has persistent bias issues, as highlighted in these demonstrative examples below. Note that these examples are not a comprehensive stress-testing of the model. Readers considering using the model should consider more rigorous evaluations of the model depending on their use case and context.
 
 
+ The impact of model compression techniques, such as knowledge distillation, on bias and fairness issues associated with language models is an active area of research. For example:
+ - [Silva, Tambwekar and Gombolay (2021)](https://aclanthology.org/2021.naacl-main.189.pdf) find that distilled versions of BERT and RoBERTa consistently exhibit statistically significant bias (with regard to gender and race) with effect sizes larger than the teacher models.
+ - [Xu and Hu (2022)](https://arxiv.org/pdf/2201.08542.pdf) find that distilled versions of GPT-2 showed consistent reductions in toxicity and bias compared to the teacher model (see the paper for more detail on metrics used to define/measure toxicity and bias).
+ - [Gupta et al. (2022)](https://arxiv.org/pdf/2203.12574.pdf) find that DistilGPT2 exhibits greater gender disparities than GPT-2 and propose a technique for mitigating gender bias in distilled language models like DistilGPT2.
+
 
 
 
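The context lines above reference a snippet for getting the features of a given text in PyTorch, flagged in the card as "NOTE: This will need customization/fixing." A minimal sketch of what that snippet could look like, assuming the standard Hugging Face `transformers` feature-extraction pattern, is below; the repository id `model-card-testing` is a placeholder taken from the card title, not a confirmed Hub identifier:

```python
# Hedged sketch only: "model-card-testing" is a placeholder taken from the card
# title; replace it with the model's actual Hugging Face Hub identifier.
from transformers import AutoModel, AutoTokenizer

model_id = "model-card-testing"  # placeholder, not a confirmed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="pt")
output = model(**encoded_input)

# Per-token feature vectors from the model's final hidden layer.
features = output.last_hidden_state
```

Since the card describes a model distilled from gpt2, a GPT-2-style tokenizer typically has no padding token; for batched inputs one would usually also set `tokenizer.pad_token = tokenizer.eos_token`.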