wissamantoun committed
Commit: 2cf4cca
Parent(s): ecbc748

Upload folder using huggingface_hub
- README.md +161 -0
- camembertav2_base_p2_17k_last_layer.yaml +32 -0
- fr_sequoia-ud-dev.parsed.conllu +0 -0
- fr_sequoia-ud-test.parsed.conllu +0 -0
- model/config.json +1 -0
- model/lexers/camembertav2_base_p2_17k_last_layer/config.json +1 -0
- model/lexers/camembertav2_base_p2_17k_last_layer/model/config.json +41 -0
- model/lexers/camembertav2_base_p2_17k_last_layer/model/special_tokens_map.json +51 -0
- model/lexers/camembertav2_base_p2_17k_last_layer/model/tokenizer.json +0 -0
- model/lexers/camembertav2_base_p2_17k_last_layer/model/tokenizer_config.json +57 -0
- model/lexers/char_level_embeddings/config.json +1 -0
- model/lexers/fasttext/config.json +1 -0
- model/lexers/fasttext/fasttext_model.bin +3 -0
- model/lexers/word_embeddings/config.json +0 -0
- model/weights.pt +3 -0
- train.log +111 -0
README.md
ADDED
@@ -0,0 +1,161 @@
---
language: fr
license: mit
tags:
- deberta-v2
- token-classification
base_model: almanach/camembertav2-base
datasets:
- Sequoia
metrics:
- las
- upos
model-index:
- name: almanach/camembertav2-base-sequoia
  results:
  - task:
      type: token-classification
      name: Part-of-Speech Tagging
    dataset:
      type: Sequoia
      name: Sequoia
    metrics:
    - name: upos
      type: upos
      value: 0.99423
      verified: false
  - task:
      type: token-classification
      name: Dependency Parsing
    dataset:
      type: Sequoia
      name: Sequoia
    metrics:
    - name: las
      type: las
      value: 0.94883
      verified: false
---

# Model Card for almanach/camembertav2-base-sequoia

almanach/camembertav2-base-sequoia is a deberta-v2 model for token classification. It is trained on the Sequoia dataset for the tasks of Part-of-Speech Tagging and Dependency Parsing.
The model achieves a UPOS accuracy of 0.99423 and a labeled attachment score (LAS) of 0.94883 on the Sequoia test set.

The model is part of the almanach/camembertav2-base family of fine-tuned models.

## Model Details

### Model Description

- **Developed by:** Wissam Antoun (PhD student at ALMAnaCH, Inria Paris)
- **Model type:** deberta-v2
- **Language(s) (NLP):** French
- **License:** MIT
- **Finetuned from model:** almanach/camembertav2-base

### Model Sources

- **Repository:** https://github.com/WissamAntoun/camemberta
- **Paper:** https://arxiv.org/abs/2411.08868

## Uses

The model can be used for token classification tasks in French, namely Part-of-Speech Tagging and Dependency Parsing.

## Bias, Risks, and Limitations

The model may exhibit biases present in its training data, and it may not generalize well to other datasets or tasks.

## How to Get Started with the Model

You can use the model directly with the [hopsparser](https://github.com/hopsparser/hopsparser) library, for instance in server mode: https://github.com/hopsparser/hopsparser/blob/main/docs/server.md
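As a quick illustration, the snippet below queries such a server from Python once it is running for this model. The server address, the `/parse` endpoint, and the payload fields are assumptions based on the hopsparser server documentation linked above; double-check them against those docs.

```python
# Minimal sketch (not from the original card): query a running hopsparser
# server for this model. The endpoint name and payload fields are assumptions
# taken from the hopsparser server docs; verify against docs/server.md.
import requests

SERVER_URL = "http://localhost:8000"  # placeholder address of a local server

payload = {
    # "horizontal" input: one whitespace-tokenized sentence per line.
    "data": "Le chat dort sur le canapé .",
    "input": "horizontal",
}
response = requests.post(f"{SERVER_URL}/parse", json=payload)
response.raise_for_status()
# The response is expected to carry the parse in CoNLL-U format.
print(response.text)
```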
## Training Details

### Training Procedure

The model was trained with the [hopsparser](https://github.com/hopsparser/hopsparser) library on the Sequoia dataset.

#### Training Hyperparameters

```yml
# Layer dimensions
mlp_input: 1024
mlp_tag_hidden: 16
mlp_arc_hidden: 512
mlp_lab_hidden: 128
# Lexers
lexers:
  - name: word_embeddings
    type: words
    embedding_size: 256
    word_dropout: 0.5
  - name: char_level_embeddings
    type: chars_rnn
    embedding_size: 64
    lstm_output_size: 128
  - name: fasttext
    type: fasttext
  - name: camembertav2_base_p2_17k_last_layer
    type: bert
    model: /scratch/camembertv2/runs/models/camembertav2-base-bf16/post/ckpt-p2-17000/pt/discriminator/
    layers: [11]
    subwords_reduction: "mean"
# Training hyperparameters
encoder_dropout: 0.5
mlp_dropout: 0.5
batch_size: 8
epochs: 64
lr:
  base: 0.00003
  schedule:
    shape: linear
    warmup_steps: 100
```
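
For reference, here is a hedged sketch of launching a comparable run with the hopsparser command line from the configuration above. The `hopsparser train` argument order and flag names are assumptions based on the hopsparser documentation, and the Sequoia treebank paths are placeholders.

```python
# Hedged sketch (not from the original card): kick off a hopsparser training
# run from Python. The CLI signature is an assumption; check
# `hopsparser train --help`. Treebank paths are placeholders.
import subprocess

subprocess.run(
    [
        "hopsparser", "train",
        "camembertav2_base_p2_17k_last_layer.yaml",  # the config shown above
        "fr_sequoia-ud-train.conllu",                # placeholder train split
        "output/",                                   # directory for the trained model
        "--dev-file", "fr_sequoia-ud-dev.conllu",    # placeholder dev split
        "--test-file", "fr_sequoia-ud-test.conllu",  # placeholder test split
    ],
    check=True,
)
```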

#### Results

**UPOS:** 0.99423
**LAS:** 0.94883

## Technical Specifications

### Model Architecture and Objective

A custom deberta-v2 model for token classification.
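
To make "custom model" concrete, the sketch below shows a generic Dozat-Manning-style biaffine arc scorer of the kind used by graph-based parsers such as hopsparser (see the Grobol & Crabbé citation below). It is an illustrative reimplementation, not hopsparser's actual code; only the dimensions (`mlp_input: 1024`, `mlp_arc_hidden: 512`) come from the configuration above.

```python
# Illustrative sketch, NOT hopsparser's code: a generic biaffine arc scorer
# with dimensions borrowed from the config above (mlp_input=1024, mlp_arc_hidden=512).
import torch
import torch.nn as nn


class BiaffineArcScorer(nn.Module):
    def __init__(self, input_dim: int = 1024, arc_dim: int = 512):
        super().__init__()
        # Separate MLPs specialize encoder states for head/dependent roles.
        self.head_mlp = nn.Sequential(nn.Linear(input_dim, arc_dim), nn.ReLU())
        self.dep_mlp = nn.Sequential(nn.Linear(input_dim, arc_dim), nn.ReLU())
        self.U = nn.Parameter(torch.randn(arc_dim, arc_dim) * 0.01)  # bilinear weights
        self.w = nn.Parameter(torch.zeros(arc_dim))  # head-prior bias

    def forward(self, encodings: torch.Tensor) -> torch.Tensor:
        # encodings: (batch, seq_len, input_dim), the concatenated lexer outputs.
        heads = self.head_mlp(encodings)  # (batch, seq_len, arc_dim)
        deps = self.dep_mlp(encodings)    # (batch, seq_len, arc_dim)
        # scores[b, d, h] = dep_d . U . head_h + w . head_h
        bilinear = deps @ self.U @ heads.transpose(1, 2)  # (batch, seq, seq)
        bias = (heads @ self.w).unsqueeze(1)              # (batch, 1, seq)
        return bilinear + bias  # softmax over candidate heads per dependent


# Toy usage: score head candidates for an 8-token sentence.
scores = BiaffineArcScorer()(torch.randn(1, 8, 1024))
print(scores.shape)  # torch.Size([1, 8, 8])
```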
## Citation

**BibTeX:**

```bibtex
@misc{antoun2024camembert20smarterfrench,
      title={CamemBERT 2.0: A Smarter French Language Model Aged to Perfection},
      author={Wissam Antoun and Francis Kulumba and Rian Touchent and Éric de la Clergerie and Benoît Sagot and Djamé Seddah},
      year={2024},
      eprint={2411.08868},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.08868},
}

@inproceedings{grobol:hal-03223424,
    title = {Analyse en dépendances du français avec des plongements contextualisés},
    author = {Grobol, Loïc and Crabbé, Benoît},
    url = {https://hal.archives-ouvertes.fr/hal-03223424},
    booktitle = {Actes de la 28ème Conférence sur le Traitement Automatique des Langues Naturelles},
    eventtitle = {TALN-RÉCITAL 2021},
    venue = {Lille, France},
    pdf = {https://hal.archives-ouvertes.fr/hal-03223424/file/HOPS_final.pdf},
    hal_id = {hal-03223424},
    hal_version = {v1},
}
```
camembertav2_base_p2_17k_last_layer.yaml
ADDED
@@ -0,0 +1,32 @@
# Layer dimensions
mlp_input: 1024
mlp_tag_hidden: 16
mlp_arc_hidden: 512
mlp_lab_hidden: 128
# Lexers
lexers:
  - name: word_embeddings
    type: words
    embedding_size: 256
    word_dropout: 0.5
  - name: char_level_embeddings
    type: chars_rnn
    embedding_size: 64
    lstm_output_size: 128
  - name: fasttext
    type: fasttext
  - name: camembertav2_base_p2_17k_last_layer
    type: bert
    model: /scratch/camembertv2/runs/models/camembertav2-base-bf16/post/ckpt-p2-17000/pt/discriminator/
    layers: [11]
    subwords_reduction: "mean"
# Training hyperparameters
encoder_dropout: 0.5
mlp_dropout: 0.5
batch_size: 8
epochs: 64
lr:
  base: 0.00003
  schedule:
    shape: linear
    warmup_steps: 100
fr_sequoia-ud-dev.parsed.conllu
ADDED
The diff for this file is too large to render.
fr_sequoia-ud-test.parsed.conllu
ADDED
The diff for this file is too large to render.
model/config.json
ADDED
@@ -0,0 +1 @@
{"mlp_input": 1024, "mlp_tag_hidden": 16, "mlp_arc_hidden": 512, "mlp_lab_hidden": 128, "biased_biaffine": true, "default_batch_size": 8, "encoder_dropout": 0.5, "extra_annotations": {}, "labels": ["acl", "acl:relcl", "advcl", "advcl:cleft", "advmod", "amod", "appos", "aux:caus", "aux:pass", "aux:tense", "case", "cc", "ccomp", "conj", "cop", "csubj", "csubj:pass", "dep", "det", "discourse", "dislocated", "expl:comp", "expl:pass", "expl:subj", "fixed", "flat:foreign", "flat:name", "goeswith", "iobj", "iobj:agent", "mark", "nmod", "nsubj", "nsubj:caus", "nsubj:pass", "nummod", "obj", "obj:agent", "obl:agent", "obl:arg", "obl:mod", "orphan", "parataxis", "punct", "root", "vocative", "xcomp"], "mlp_dropout": 0.5, "tagset": ["ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X"], "lexers": {"word_embeddings": "words", "char_level_embeddings": "chars_rnn", "fasttext": "fasttext", "camembertav2_base_p2_17k_last_layer": "bert"}, "multitask_loss": "sum"}
model/lexers/camembertav2_base_p2_17k_last_layer/config.json
ADDED
@@ -0,0 +1 @@
{"layers": [11], "subwords_reduction": "mean", "weight_layers": false}
model/lexers/camembertav2_base_p2_17k_last_layer/model/config.json
ADDED
@@ -0,0 +1,41 @@
{
  "_name_or_path": "/scratch/camembertv2/runs/models/camembertav2-base-bf16/post/ckpt-p2-17000/pt/discriminator/",
  "architectures": [
    "DebertaV2Model"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 1,
  "conv_act": "gelu",
  "conv_kernel_size": 0,
  "embedding_size": 768,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-07,
  "max_position_embeddings": 1024,
  "max_relative_positions": -1,
  "model_name": "camembertav2-base-bf16",
  "model_type": "deberta-v2",
  "norm_rel_ebd": "layer_norm",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_dropout": 0,
  "pooler_hidden_act": "gelu",
  "pooler_hidden_size": 768,
  "pos_att_type": [
    "p2c",
    "c2p"
  ],
  "position_biased_input": false,
  "position_buckets": 256,
  "relative_attention": true,
  "share_att_key": true,
  "torch_dtype": "float32",
  "transformers_version": "4.44.2",
  "type_vocab_size": 0,
  "vocab_size": 32768
}
model/lexers/camembertav2_base_p2_17k_last_layer/model/special_tokens_map.json
ADDED
@@ -0,0 +1,51 @@
{
  "bos_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
model/lexers/camembertav2_base_p2_17k_last_layer/model/tokenizer.json
ADDED
The diff for this file is too large to render.
model/lexers/camembertav2_base_p2_17k_last_layer/model/tokenizer_config.json
ADDED
@@ -0,0 +1,57 @@
{
  "add_prefix_space": true,
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "4": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "[CLS]",
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "eos_token": "[SEP]",
  "errors": "replace",
  "mask_token": "[MASK]",
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "tokenizer_class": "RobertaTokenizer",
  "trim_offsets": true,
  "unk_token": "[UNK]"
}
model/lexers/char_level_embeddings/config.json
ADDED
@@ -0,0 +1 @@
{"char_embeddings_dim": 64, "output_dim": 128, "special_tokens": ["<root>"], "charset": ["<pad>", "<special>", " ", "!", "\"", "$", "%", "&", "'", "(", ")", "+", ",", "-", ".", "/", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", ":", ";", "<", "=", "?", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "[", "]", "^", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "\u00a9", "\u00b0", "\u00b1", "\u00bd", "\u00c0", "\u00c9", "\u00ce", "\u00df", "\u00e0", "\u00e1", "\u00e2", "\u00e4", "\u00e7", "\u00e8", "\u00e9", "\u00ea", "\u00eb", "\u00ee", "\u00ef", "\u00f3", "\u00f4", "\u00f6", "\u00f9", "\u00fb"]}
model/lexers/fasttext/config.json
ADDED
@@ -0,0 +1 @@
{"special_tokens": ["<root>"]}
model/lexers/fasttext/fasttext_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cf09011cf6593888c882b0464e1b82ae5a3d05fce8e5c2861014f45557861568
size 801050258
model/lexers/word_embeddings/config.json
ADDED
The diff for this file is too large to render.
model/weights.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6626b13e315a9bdd4c63e4321755366a968ab2595f4cb625cf45b67312ba5790
size 1745757420
train.log
ADDED
@@ -0,0 +1,111 @@
[hops] 2024-09-24 16:01:20.681 | INFO | Initializing a parser from /workspace/configs/exp_camembertv2/camembertav2_base_p2_17k_last_layer.yaml
[hops] 2024-09-24 16:01:20.730 | INFO | Generating a FastText model from the treebank
[hops] 2024-09-24 16:01:20.745 | INFO | Training fasttext model
[hops] 2024-09-24 16:01:28.399 | INFO | Start training on cuda:0
[hops] 2024-09-24 16:01:28.403 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-24 16:01:47.740 | INFO | Epoch 0: train loss 2.7794 dev loss 1.9428 dev tag acc 49.62% dev head acc 27.95% dev deprel acc 53.97%
[hops] 2024-09-24 16:01:47.741 | INFO | New best model: head accuracy 27.95% > 0.00%
[hops] 2024-09-24 16:02:09.374 | INFO | Epoch 1: train loss 1.5546 dev loss 1.0005 dev tag acc 72.25% dev head acc 61.63% dev deprel acc 80.28%
[hops] 2024-09-24 16:02:09.375 | INFO | New best model: head accuracy 61.63% > 27.95%
[hops] 2024-09-24 16:02:31.031 | INFO | Epoch 2: train loss 0.9383 dev loss 0.6218 dev tag acc 80.12% dev head acc 79.32% dev deprel acc 87.02%
[hops] 2024-09-24 16:02:31.032 | INFO | New best model: head accuracy 79.32% > 61.63%
[hops] 2024-09-24 16:02:51.968 | INFO | Epoch 3: train loss 0.6428 dev loss 0.4519 dev tag acc 87.60% dev head acc 84.25% dev deprel acc 90.09%
[hops] 2024-09-24 16:02:51.969 | INFO | New best model: head accuracy 84.25% > 79.32%
[hops] 2024-09-24 16:03:13.587 | INFO | Epoch 4: train loss 0.4912 dev loss 0.3727 dev tag acc 91.58% dev head acc 85.25% dev deprel acc 92.18%
[hops] 2024-09-24 16:03:13.588 | INFO | New best model: head accuracy 85.25% > 84.25%
[hops] 2024-09-24 16:03:34.880 | INFO | Epoch 5: train loss 0.3884 dev loss 0.3005 dev tag acc 95.05% dev head acc 88.23% dev deprel acc 93.64%
[hops] 2024-09-24 16:03:34.881 | INFO | New best model: head accuracy 88.23% > 85.25%
[hops] 2024-09-24 16:03:56.210 | INFO | Epoch 6: train loss 0.3135 dev loss 0.2582 dev tag acc 96.33% dev head acc 90.19% dev deprel acc 94.54%
[hops] 2024-09-24 16:03:56.211 | INFO | New best model: head accuracy 90.19% > 88.23%
[hops] 2024-09-24 16:04:18.250 | INFO | Epoch 7: train loss 0.2602 dev loss 0.2364 dev tag acc 96.99% dev head acc 90.92% dev deprel acc 95.35%
[hops] 2024-09-24 16:04:18.251 | INFO | New best model: head accuracy 90.92% > 90.19%
[hops] 2024-09-24 16:04:39.889 | INFO | Epoch 8: train loss 0.2206 dev loss 0.2207 dev tag acc 97.62% dev head acc 91.55% dev deprel acc 95.82%
[hops] 2024-09-24 16:04:39.890 | INFO | New best model: head accuracy 91.55% > 90.92%
[hops] 2024-09-24 16:05:00.941 | INFO | Epoch 9: train loss 0.1885 dev loss 0.2092 dev tag acc 97.86% dev head acc 92.34% dev deprel acc 96.26%
[hops] 2024-09-24 16:05:00.942 | INFO | New best model: head accuracy 92.34% > 91.55%
[hops] 2024-09-24 16:05:22.018 | INFO | Epoch 10: train loss 0.1633 dev loss 0.1818 dev tag acc 98.23% dev head acc 92.87% dev deprel acc 96.84%
[hops] 2024-09-24 16:05:22.019 | INFO | New best model: head accuracy 92.87% > 92.34%
[hops] 2024-09-24 16:05:42.839 | INFO | Epoch 11: train loss 0.1444 dev loss 0.1800 dev tag acc 98.40% dev head acc 93.47% dev deprel acc 96.77%
[hops] 2024-09-24 16:05:42.840 | INFO | New best model: head accuracy 93.47% > 92.87%
[hops] 2024-09-24 16:06:04.127 | INFO | Epoch 12: train loss 0.1289 dev loss 0.1718 dev tag acc 98.58% dev head acc 93.67% dev deprel acc 97.08%
[hops] 2024-09-24 16:06:04.128 | INFO | New best model: head accuracy 93.67% > 93.47%
[hops] 2024-09-24 16:06:26.120 | INFO | Epoch 13: train loss 0.1136 dev loss 0.1875 dev tag acc 98.59% dev head acc 93.56% dev deprel acc 97.00%
[hops] 2024-09-24 16:06:45.617 | INFO | Epoch 14: train loss 0.1033 dev loss 0.1923 dev tag acc 98.85% dev head acc 93.73% dev deprel acc 97.03%
[hops] 2024-09-24 16:06:45.618 | INFO | New best model: head accuracy 93.73% > 93.67%
[hops] 2024-09-24 16:07:07.243 | INFO | Epoch 15: train loss 0.0938 dev loss 0.1859 dev tag acc 98.89% dev head acc 94.23% dev deprel acc 97.14%
[hops] 2024-09-24 16:07:07.244 | INFO | New best model: head accuracy 94.23% > 93.73%
[hops] 2024-09-24 16:07:29.285 | INFO | Epoch 16: train loss 0.0855 dev loss 0.1829 dev tag acc 98.89% dev head acc 94.30% dev deprel acc 97.31%
[hops] 2024-09-24 16:07:29.287 | INFO | New best model: head accuracy 94.30% > 94.23%
[hops] 2024-09-24 16:07:50.544 | INFO | Epoch 17: train loss 0.0789 dev loss 0.1872 dev tag acc 98.91% dev head acc 94.57% dev deprel acc 97.40%
[hops] 2024-09-24 16:07:50.545 | INFO | New best model: head accuracy 94.57% > 94.30%
[hops] 2024-09-24 16:08:12.269 | INFO | Epoch 18: train loss 0.0730 dev loss 0.1901 dev tag acc 98.95% dev head acc 94.59% dev deprel acc 97.29%
[hops] 2024-09-24 16:08:12.270 | INFO | New best model: head accuracy 94.59% > 94.57%
[hops] 2024-09-24 16:08:33.803 | INFO | Epoch 19: train loss 0.0665 dev loss 0.1852 dev tag acc 98.91% dev head acc 94.72% dev deprel acc 97.43%
[hops] 2024-09-24 16:08:33.804 | INFO | New best model: head accuracy 94.72% > 94.59%
[hops] 2024-09-24 16:08:55.220 | INFO | Epoch 20: train loss 0.0620 dev loss 0.2002 dev tag acc 98.96% dev head acc 95.00% dev deprel acc 97.45%
[hops] 2024-09-24 16:08:55.221 | INFO | New best model: head accuracy 95.00% > 94.72%
[hops] 2024-09-24 16:09:16.939 | INFO | Epoch 21: train loss 0.0574 dev loss 0.2063 dev tag acc 99.04% dev head acc 94.79% dev deprel acc 97.56%
[hops] 2024-09-24 16:09:36.037 | INFO | Epoch 22: train loss 0.0534 dev loss 0.2002 dev tag acc 99.06% dev head acc 95.04% dev deprel acc 97.55%
[hops] 2024-09-24 16:09:36.038 | INFO | New best model: head accuracy 95.04% > 95.00%
[hops] 2024-09-24 16:09:57.103 | INFO | Epoch 23: train loss 0.0503 dev loss 0.2077 dev tag acc 99.06% dev head acc 95.07% dev deprel acc 97.55%
[hops] 2024-09-24 16:09:57.103 | INFO | New best model: head accuracy 95.07% > 95.04%
[hops] 2024-09-24 16:10:18.368 | INFO | Epoch 24: train loss 0.0469 dev loss 0.2004 dev tag acc 99.06% dev head acc 95.43% dev deprel acc 97.68%
[hops] 2024-09-24 16:10:18.369 | INFO | New best model: head accuracy 95.43% > 95.07%
[hops] 2024-09-24 16:10:40.222 | INFO | Epoch 25: train loss 0.0432 dev loss 0.2043 dev tag acc 99.02% dev head acc 95.30% dev deprel acc 97.67%
[hops] 2024-09-24 16:10:59.739 | INFO | Epoch 26: train loss 0.0416 dev loss 0.2225 dev tag acc 99.06% dev head acc 95.08% dev deprel acc 97.48%
[hops] 2024-09-24 16:11:18.934 | INFO | Epoch 27: train loss 0.0375 dev loss 0.2118 dev tag acc 99.08% dev head acc 95.39% dev deprel acc 97.62%
[hops] 2024-09-24 16:11:37.455 | INFO | Epoch 28: train loss 0.0369 dev loss 0.2139 dev tag acc 99.08% dev head acc 95.28% dev deprel acc 97.76%
[hops] 2024-09-24 16:11:57.066 | INFO | Epoch 29: train loss 0.0351 dev loss 0.2086 dev tag acc 99.12% dev head acc 95.46% dev deprel acc 97.72%
[hops] 2024-09-24 16:11:57.067 | INFO | New best model: head accuracy 95.46% > 95.43%
[hops] 2024-09-24 16:12:18.271 | INFO | Epoch 30: train loss 0.0329 dev loss 0.2321 dev tag acc 99.06% dev head acc 95.49% dev deprel acc 97.58%
[hops] 2024-09-24 16:12:18.272 | INFO | New best model: head accuracy 95.49% > 95.46%
[hops] 2024-09-24 16:12:40.183 | INFO | Epoch 31: train loss 0.0320 dev loss 0.2237 dev tag acc 99.14% dev head acc 95.79% dev deprel acc 97.80%
[hops] 2024-09-24 16:12:40.184 | INFO | New best model: head accuracy 95.79% > 95.49%
[hops] 2024-09-24 16:13:01.544 | INFO | Epoch 32: train loss 0.0291 dev loss 0.2373 dev tag acc 99.10% dev head acc 95.38% dev deprel acc 97.79%
[hops] 2024-09-24 16:13:20.203 | INFO | Epoch 33: train loss 0.0288 dev loss 0.2393 dev tag acc 99.13% dev head acc 95.61% dev deprel acc 97.77%
[hops] 2024-09-24 16:13:38.371 | INFO | Epoch 34: train loss 0.0261 dev loss 0.2499 dev tag acc 99.08% dev head acc 95.62% dev deprel acc 97.76%
[hops] 2024-09-24 16:13:57.799 | INFO | Epoch 35: train loss 0.0254 dev loss 0.2435 dev tag acc 99.11% dev head acc 95.74% dev deprel acc 97.81%
[hops] 2024-09-24 16:14:15.970 | INFO | Epoch 36: train loss 0.0235 dev loss 0.2542 dev tag acc 99.15% dev head acc 95.53% dev deprel acc 97.85%
[hops] 2024-09-24 16:14:34.739 | INFO | Epoch 37: train loss 0.0226 dev loss 0.2540 dev tag acc 99.07% dev head acc 95.39% dev deprel acc 97.71%
[hops] 2024-09-24 16:14:53.771 | INFO | Epoch 38: train loss 0.0218 dev loss 0.2529 dev tag acc 99.09% dev head acc 95.24% dev deprel acc 97.74%
[hops] 2024-09-24 16:15:13.379 | INFO | Epoch 39: train loss 0.0206 dev loss 0.2571 dev tag acc 99.09% dev head acc 95.57% dev deprel acc 97.83%
[hops] 2024-09-24 16:15:32.562 | INFO | Epoch 40: train loss 0.0195 dev loss 0.2649 dev tag acc 99.19% dev head acc 95.64% dev deprel acc 97.80%
[hops] 2024-09-24 16:15:50.944 | INFO | Epoch 41: train loss 0.0194 dev loss 0.2632 dev tag acc 99.14% dev head acc 95.73% dev deprel acc 97.77%
[hops] 2024-09-24 16:16:09.974 | INFO | Epoch 42: train loss 0.0178 dev loss 0.2683 dev tag acc 99.21% dev head acc 95.81% dev deprel acc 97.85%
[hops] 2024-09-24 16:16:09.975 | INFO | New best model: head accuracy 95.81% > 95.79%
[hops] 2024-09-24 16:16:31.272 | INFO | Epoch 43: train loss 0.0162 dev loss 0.2753 dev tag acc 99.20% dev head acc 95.74% dev deprel acc 97.80%
[hops] 2024-09-24 16:16:49.960 | INFO | Epoch 44: train loss 0.0233 dev loss 0.2764 dev tag acc 99.21% dev head acc 95.75% dev deprel acc 97.84%
[hops] 2024-09-24 16:17:09.618 | INFO | Epoch 45: train loss 0.0157 dev loss 0.2860 dev tag acc 99.23% dev head acc 95.84% dev deprel acc 97.92%
[hops] 2024-09-24 16:17:09.619 | INFO | New best model: head accuracy 95.84% > 95.81%
[hops] 2024-09-24 16:17:31.311 | INFO | Epoch 46: train loss 0.0141 dev loss 0.2782 dev tag acc 99.24% dev head acc 95.67% dev deprel acc 97.92%
[hops] 2024-09-24 16:17:50.429 | INFO | Epoch 47: train loss 0.0135 dev loss 0.2823 dev tag acc 99.20% dev head acc 95.93% dev deprel acc 97.84%
[hops] 2024-09-24 16:17:50.430 | INFO | New best model: head accuracy 95.93% > 95.84%
[hops] 2024-09-24 16:18:12.154 | INFO | Epoch 48: train loss 0.0139 dev loss 0.2830 dev tag acc 99.17% dev head acc 95.77% dev deprel acc 97.81%
[hops] 2024-09-24 16:18:30.927 | INFO | Epoch 49: train loss 0.0129 dev loss 0.2882 dev tag acc 99.17% dev head acc 95.87% dev deprel acc 97.78%
[hops] 2024-09-24 16:18:50.677 | INFO | Epoch 50: train loss 0.0120 dev loss 0.2876 dev tag acc 99.18% dev head acc 95.82% dev deprel acc 97.80%
[hops] 2024-09-24 16:19:09.749 | INFO | Epoch 51: train loss 0.0115 dev loss 0.2998 dev tag acc 99.19% dev head acc 95.69% dev deprel acc 97.85%
[hops] 2024-09-24 16:19:28.709 | INFO | Epoch 52: train loss 0.0116 dev loss 0.2948 dev tag acc 99.22% dev head acc 95.76% dev deprel acc 97.95%
[hops] 2024-09-24 16:19:47.697 | INFO | Epoch 53: train loss 0.0102 dev loss 0.3000 dev tag acc 99.22% dev head acc 95.95% dev deprel acc 97.90%
[hops] 2024-09-24 16:19:47.698 | INFO | New best model: head accuracy 95.95% > 95.93%
[hops] 2024-09-24 16:20:09.017 | INFO | Epoch 54: train loss 0.0104 dev loss 0.3013 dev tag acc 99.23% dev head acc 96.02% dev deprel acc 97.89%
[hops] 2024-09-24 16:20:09.018 | INFO | New best model: head accuracy 96.02% > 95.95%
[hops] 2024-09-24 16:20:30.175 | INFO | Epoch 55: train loss 0.0105 dev loss 0.2964 dev tag acc 99.26% dev head acc 96.01% dev deprel acc 97.95%
[hops] 2024-09-24 16:20:50.301 | INFO | Epoch 56: train loss 0.0098 dev loss 0.2959 dev tag acc 99.23% dev head acc 95.97% dev deprel acc 97.89%
[hops] 2024-09-24 16:21:09.359 | INFO | Epoch 57: train loss 0.0091 dev loss 0.3062 dev tag acc 99.25% dev head acc 95.88% dev deprel acc 97.87%
[hops] 2024-09-24 16:21:28.289 | INFO | Epoch 58: train loss 0.0089 dev loss 0.3102 dev tag acc 99.25% dev head acc 95.87% dev deprel acc 97.85%
[hops] 2024-09-24 16:21:48.019 | INFO | Epoch 59: train loss 0.0089 dev loss 0.3086 dev tag acc 99.24% dev head acc 95.97% dev deprel acc 97.91%
[hops] 2024-09-24 16:22:06.944 | INFO | Epoch 60: train loss 0.0086 dev loss 0.3091 dev tag acc 99.25% dev head acc 95.98% dev deprel acc 97.91%
[hops] 2024-09-24 16:22:25.872 | INFO | Epoch 61: train loss 0.0080 dev loss 0.3121 dev tag acc 99.27% dev head acc 95.96% dev deprel acc 97.91%
[hops] 2024-09-24 16:22:44.686 | INFO | Epoch 62: train loss 0.0087 dev loss 0.3127 dev tag acc 99.25% dev head acc 96.02% dev deprel acc 97.91%
[hops] 2024-09-24 16:23:03.731 | INFO | Epoch 63: train loss 0.0083 dev loss 0.3123 dev tag acc 99.25% dev head acc 96.03% dev deprel acc 97.92%
[hops] 2024-09-24 16:23:03.732 | INFO | New best model: head accuracy 96.03% > 96.02%
[hops] 2024-09-24 16:23:11.038 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-24 16:23:16.913 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-24 16:23:18.926 | INFO | Metrics for Sequoia-camembertav2_base_p2_17k_last_layer+rand_seed=42
───────────────────────────────
Split    UPOS    UAS     LAS
───────────────────────────────
Dev      99.25   96.08   94.89
Test     99.42   95.98   94.88
───────────────────────────────