Files changed (1)
  1. README.md +108 -0
README.md ADDED
@@ -0,0 +1,108 @@
+ ---
+ # For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
+ # Doc / guide: https://huggingface.co/docs/hub/model-cards
+ {}
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+ Supervised Fine-Tuning (SFT) on chosen examples and Direct Preference Optimization (DPO) on preference data.
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
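The two stages named above consume preference data in different ways. As a minimal sketch, assuming each record carries hypothetical `prompt`, `chosen`, and `rejected` fields (this card does not document the real dataset schema, and the helper names are illustrative only), one plausible reading is that SFT trains on the prompt plus the chosen response, while DPO keeps the full triple:

```python
from typing import TypedDict


class PreferenceRecord(TypedDict):
    # Hypothetical schema: the card does not document the real field names.
    prompt: str
    chosen: str
    rejected: str


def to_sft_example(record: PreferenceRecord) -> dict[str, str]:
    """SFT stage: keep only the prompt and the preferred (chosen) response."""
    return {"prompt": record["prompt"], "completion": record["chosen"]}


def to_dpo_example(record: PreferenceRecord) -> dict[str, str]:
    """DPO stage: keep the full triple so chosen and rejected can be contrasted."""
    return {
        "prompt": record["prompt"],
        "chosen": record["chosen"],
        "rejected": record["rejected"],
    }


# Toy record used only to illustrate the two views of the same data.
record: PreferenceRecord = {
    "prompt": "What is 2 + 2?",
    "chosen": "4",
    "rejected": "5",
}
sft_example = to_sft_example(record)
dpo_example = to_dpo_example(record)
```

The rejected response is dropped for SFT because that stage only imitates good answers. DPO needs both responses to learn a preference between them.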
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ DPO hyperparameters:
+ * `beta=0.1`
+ * `learning_rate=5e-6`
+ * `gradient_accumulation=8`
+ * `num_train_epochs=2`
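To show where `beta` enters, here is a minimal sketch of the standard DPO loss for a single preference pair, written in plain Python. The function name `dpo_loss` and its argument names are illustrative assumptions. The card does not say which training library or loss variant was used, so this should not be read as the authors' implementation.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # value listed in this card
) -> float:
    """Standard DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # beta scales the margin between the two log-ratios.
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) equals softplus(-margin); computed stably.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# When the policy matches the reference, the margin is zero.
loss_at_init = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# When the policy favors the chosen response, the loss decreases.
loss_improved = dpo_loss(-5.0, -15.0, -10.0, -10.0)
```

A smaller `beta` lets the policy drift further from the reference model before the loss saturates. The other listed values (`learning_rate`, `gradient_accumulation`, `num_train_epochs`) configure the optimizer and training loop rather than the loss. The effective batch size cannot be derived here because the per-device batch size is not stated.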
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+ [More Information Needed]
+
+
+
+ ## Technical Specifications
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+
+
+ ## Model Card Authors and Contacts
+ **DebuggingFace**
+ - Antonio Mari (antonio.mari@epfl.ch)
+ - Matteo Santelmo (matteo.santelmo@epfl.ch)
+ - Stefano Viel (stefano.viel@epfl.ch)