Tomlim committed
Commit c48b619
1 Parent(s): 9f0dc4c

Update README.md

Files changed (1):
  1. README.md (+90, -14)
README.md (updated):
 
---
license: llama2
language:
- en
---
# DAMA

<!-- Provide a quick summary of what the model is/does. -->

## Model

LLaMA model adapted to mitigate gender bias in text generation.
For adaptation, we used the **D**ebiasing **A**lgorithm through **M**odel **A**daptation (**DAMA**) method described in [Limisiewicz et al., 2024](https://openreview.net/pdf?id=XIZEFyVGC9).

### Model Description

- **Developed by:** Tomasz Limisiewicz, David Mareček, Tomáš Musil
- **Funded by:** Grant Agency of the Czech Republic
- **Language(s) (NLP):** English
- **Adapted from model:** LLaMA

### Model Sizes

- **[7B](https://huggingface.co/ufal/DAMA-7B)**
- **[13B](https://huggingface.co/ufal/DAMA-13B)**
- **[33B](https://huggingface.co/ufal/DAMA-33B)**
- **[65B](https://huggingface.co/ufal/DAMA-65B)**
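
Below is a minimal loading sketch with the Transformers library, shown for the 7B checkpoint; it assumes the checkpoints are published as standard LLaMA-architecture causal-LM weights, and the dtype/device settings are only illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ufal/DAMA-7B"  # or ufal/DAMA-13B, ufal/DAMA-33B, ufal/DAMA-65B

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision helps fit the larger sizes
    device_map="auto",
)

prompt = "The nurse said that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```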

### Model Sources

<!-- Provide the basic links for the model. -->

- **[Repository](https://github.com/tomlimi/DAMA)**
- **[Paper](https://openreview.net/pdf?id=XIZEFyVGC9)**

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

The model mitigates the gender bias of the original model.
It is therefore better suited for generating and processing texts in sensitive domains.
However, we recommend caution in such use cases because the models still retain some bias.

## Adaptation

<!-- Include image. -->

![DAMA schema](DamaSchema.png)

Schema (b) shows the DAMA intervention in a LLaMA layer.
Even though `I - P_c` is depicted as a separate module, in practice it is multiplied into the output matrix of the feed-forward layer (`W_FF`).
Therefore, DAMA changes neither the model's parameter count nor its architecture.
Panel (a) shows the behavior of the model when presented with a stereotypical prompt.
Panel (c) shows the projections of the feed-forward latent vector (`u`) onto the output space.
With DAMA (lower arrow), we nullify the gender component of the representation.
This results in balanced probabilities of gendered tokens in the model's output, as shown in panel (d).

The method for obtaining `P_c` is based on the Partial Least Squares algorithm.
For more details, please refer to the [paper](https://openreview.net/pdf?id=XIZEFyVGC9).
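
As an illustration only (not the reference implementation; in DAMA the projection is obtained with PLS on the selected layers, as described in the paper and repository), folding `I - P_c` into a feed-forward output matrix could look roughly like this, where `gender_dirs` is an assumed orthonormal basis of the identified gender subspace:

```python
import torch

def fold_projection(W_ff: torch.Tensor, gender_dirs: torch.Tensor) -> torch.Tensor:
    """Return (I - P_c) @ W_ff, a debiased feed-forward output matrix.

    W_ff:        output ("down") projection of a feed-forward layer,
                 shape (d_model, d_ff); it maps the latent vector u into
                 the residual stream as W_ff @ u.
    gender_dirs: assumed orthonormal basis of the gender subspace in the
                 output space, shape (d_model, k).
    """
    d_model = W_ff.shape[0]
    # P_c projects onto the gender subspace; (I - P_c) removes that component.
    P_c = gender_dirs @ gender_dirs.T
    eye = torch.eye(d_model, dtype=W_ff.dtype, device=W_ff.device)
    # The projection is folded into the existing weights, so parameter count
    # and architecture stay unchanged.
    return (eye - P_c) @ W_ff

# Hypothetical application to one layer of a Transformers LLaMA model:
# down = model.model.layers[i].mlp.down_proj
# down.weight.data = fold_projection(down.weight.data, gender_dirs)
```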

## Evaluation

We evaluate the models on multiple benchmarks to assess gender bias and language understanding capabilities.
DAMA models are compared with the original LLaMA models.

### Bias Evaluation

We introduced a metric for evaluating gender bias in text generation.
It measures the extent to which the model's output is affected by the stereotypical (`a_s`) and factual (`a_f`) gender signals in the prompts.

Moreover, we provide scores for two established bias benchmarks: **WinoBias** and **StereoSet**.
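
As a rough sketch of how the coefficients `a_s`, `a_f`, and `b` reported below can be estimated (a simplified stand-in for the paper's procedure; the per-prompt score `y` and the annotated signals `x_s`, `x_f` are assumed inputs), one could fit a linear model `y ≈ a_s * x_s + a_f * x_f + b`:

```python
import numpy as np

def fit_bias_coefficients(y, x_s, x_f):
    """Least-squares fit of  y ≈ a_s * x_s + a_f * x_f + b.

    y:   per-prompt gendered-output score of the model (assumed input, e.g.
         a difference between probabilities of gendered continuations).
    x_s: stereotypical gender score annotated for each prompt's subject.
    x_f: factual gender score annotated for each prompt's subject.
    """
    y, x_s, x_f = (np.asarray(a, dtype=float) for a in (y, x_s, x_f))
    X = np.column_stack([x_s, x_f, np.ones_like(x_s)])
    (a_s, a_f, b), *_ = np.linalg.lstsq(X, y, rcond=None)
    return a_s, a_f, b
```

Smaller absolute values of `a_s` and `b` indicate weaker reliance on the stereotypical signal and a smaller overall gender offset.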

### Results

| Model | Bias in LM | | | WinoBias | | | StereoSet | | |
|-----------|--------|-------|--------|--------|--------|--------|------|------|------|
| | `a_s` | `a_f` | `b` | Acc | `ΔS` | `ΔG` | lms | ss | ICAT |
| LLaMA 7B | 0.235 | 0.320 | 0.072 | 59.1% | 40.3% | 3.0% | 95.5 | 71.9 | 53.7 |
| DAMA 7B | -0.005 | 0.038 | -0.006 | 57.3% | 31.5% | 2.3% | 95.5 | 69.3 | 58.5 |
| LLaMA 13B | 0.270 | 0.351 | 0.070 | 70.5% | 35.7% | -1.5% | 95.2 | 71.4 | 54.4 |
| DAMA 13B | 0.148 | 0.222 | 0.059 | 66.4% | 31.1% | -1.1% | 94.4 | 68.6 | 59.4 |
| LLaMA 33B | 0.265 | 0.343 | 0.092 | 71.0% | 36.0% | -4.0% | 94.7 | 68.4 | 59.9 |
| DAMA 33B | 0.105 | 0.172 | 0.059 | 63.7% | 26.7% | -3.7% | 94.8 | 65.7 | 65.0 |
| LLaMA 65B | 0.249 | 0.316 | 0.095 | 73.3% | 35.7% | 1.4% | 94.9 | 69.5 | 57.9 |
| DAMA 65B | 0.185 | 0.251 | 0.100 | 71.1% | 27.2% | 0.8% | 92.8 | 67.1 | 61.1 |

Bias evaluation for the LLaMA models and their debiased instances.

### Performance Evaluation

To check the effect of debiasing on language modeling capabilities, we compute perplexity on a Wikipedia corpus.
We also test performance on four language understanding end-tasks: **OpenBookQA**, the **AI2 Reasoning Challenge** (Easy and Challenge sets), and **Massive Multitask Language Understanding** (MMLU).
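
For context, a minimal perplexity sketch (simplified: it scores each text independently rather than with a sliding window over a long corpus; `model` and `tokenizer` are loaded as in the snippet above):

```python
import math
import torch

@torch.no_grad()
def corpus_perplexity(model, tokenizer, texts):
    """exp(total negative log-likelihood / total number of predicted tokens)."""
    total_nll, total_tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt").to(model.device)
        out = model(**enc, labels=enc["input_ids"])
        n_pred = enc["input_ids"].shape[1] - 1   # tokens actually predicted
        total_nll += out.loss.item() * n_pred    # out.loss is the mean NLL per token
        total_tokens += n_pred
    return math.exp(total_nll / total_tokens)
```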

### Results

| Model | Perplexity | ARC-C | ARC-E | OBQA | MMLU |
|-----------|------------|-------|-------|------|-------|
| LLaMA 7B | 26.1 | 42.2 | 69.1 | 57.2 | 30.3 |
| DAMA 7B | 28.9 | 41.8 | 68.3 | 56.2 | 30.8 |
| LLaMA 13B | 19.8 | 44.9 | 70.6 | 55.4 | 43.3 |
| DAMA 13B | 21.0 | 44.7 | 70.3 | 56.2 | 43.5 |
| LLaMA 33B | 20.5 | 47.4 | 72.9 | 59.2 | 55.7* |
| DAMA 33B | 19.6 | 45.2 | 71.6 | 58.2 | 56.1* |
| LLaMA 65B | 19.5 | 44.5 | 73.9 | 59.6 | ---* |
| DAMA 65B | 20.1 | 40.5 | 67.7 | 57.2 | ---* |

Performance evaluation for the LLaMA models and their debiased instances (asterisks mark the affected MMLU scores).
Due to hardware limitations, we could not run MMLU inference for the 65B models.
In the evaluation of the 33B models, we excluded the 4% longest prompts.

## Citation

```
url={https://openreview.net/forum?id=XIZEFyVGC9}
}
```

## Model Card Author

[Tomasz Limisiewicz](mailto:limisewicz@ufal.mff.cuni.cz)