Token Classification
Flair
PyTorch
Spanish
sequence-tagger-model
mmarimon committed on
Commit 1238bf1
1 Parent(s): e5877f6

Update README.md

Files changed (1)
  1. README.md +59 -22
README.md CHANGED
@@ -12,34 +12,19 @@ widget:
  - text: "PODACESA OBRAS Y SERVICIOS, S.A, y ECR INFRAESTRUCTURAS Y SERVICIOS HIDRÁULICOS S.L., constituidos en UTE PODACESA-ECR realizan la siguiente oferta:"
  - text: "PODACESA OBRAS Y SERVICIOS, S.A realiza la siguiente oferta:"
  ---
- ## Recognition of UTEs and company mentions in Flair

- This is a model trained using [Flair](https://github.com/flairNLP/flair/) to recognise mentions of UTEs (Unión Temporal de Empresas) and companies in public tenders.
+ # Recognition of UTEs and company mentions in Flair

- It is a finetune of the flair/ner-spanish-large model (retrained from scratch to include additional tags).
-
- ```
- Results:
- - F-score (micro) 0.7431
- - F-score (macro) 0.7429
- - Accuracy 0.5944
-
- By class:
- precision recall f1-score support
+ This is a model trained using [Flair](https://github.com/flairNLP/flair/) to recognise mentions of UTEs (Unión Temporal de Empresas)
+ and companies in public tenders.

- UTE 0.7568 0.7887 0.7724 71
- SINGLE_COMPANY 0.6538 0.7846 0.7133 65
-
- micro avg 0.7039 0.7868 0.7431 136
- macro avg 0.7053 0.7867 0.7429 136
- weighted avg 0.7076 0.7868 0.7442 136
- ```
+ It is a finetune of the flair/ner-spanish-large model (retrained from scratch to include additional tags).

  Based on document-level XLM-R embeddings and [FLERT](https://arxiv.org/pdf/2011.06993v1.pdf/).

  ---

- ### Demo: How to use in Flair
+ ## Demo: How to use in Flair

  Requires: **[Flair](https://github.com/flairNLP/flair/)** (`pip install flair`)

@@ -78,7 +63,7 @@ Span[0:6]: "PODACESA OBRAS Y SERVICIOS, S.A" _ SINGLE_COMPANY (1.0)

  ---

- ### Training: Script to train this model
+ ## Training: Script to train this model

  The following Flair script was used to train this model (**TODO: update**):

@@ -128,6 +113,58 @@ trainer.train('resources/taggers/ner-spanish-large',
  )
  ```

+ ## Evaluation Results
+
+ ```
+ Results:
+ - F-score (micro) 0.7431
+ - F-score (macro) 0.7429
+ - Accuracy 0.5944
+
+ By class:
+ precision recall f1-score support
+
+ UTE 0.7568 0.7887 0.7724 71
+ SINGLE_COMPANY 0.6538 0.7846 0.7133 65
+
+ micro avg 0.7039 0.7868 0.7431 136
+ macro avg 0.7053 0.7867 0.7429 136
+ weighted avg 0.7076 0.7868 0.7442 136
+ ```
+
+ ## Additional information

+ ### Author
+ The Language Technologies Unit from Barcelona Supercomputing Center.

- ---
+ ### Contact
+ For further information, please send an email to <langtech@bsc.es>.
+
+ ### Copyright
+ Copyright(c) 2023 by Language Technologies Unit, Barcelona Supercomputing Center.
+
+ ### License
+ [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+
+ ### Funding
+ This work has been promoted and financed by the European Commission Health and Digital Executive Agency, Connecting Europe Facility,
+ Grant Agreement Nº INEA/CEF/ICT/A2020/2373713,
+ Action Title Open Harmonized and Enriched Procurement Data Platform (nextProcurement),
+ Action number 2020-ES-IA-0255.
+
+ ### Disclaimer
+ <details>
+ <summary>Click to expand</summary>
+
+ The model published in this repository is intended for a generalist purpose and is available to third parties under a permissive Apache License, Version 2.0.
+
+ Be aware that the model may have biases and/or any other undesirable distortions.
+
+ When third parties deploy or provide systems and/or services to other parties using this model (or any system based on it)
+ or become users of the model, they should note that it is their responsibility to mitigate the risks arising from its use and,
+ in any event, to comply with applicable regulations, including regulations regarding the use of Artificial Intelligence.
+
+ In no event shall the owner and creator of the model (Barcelona Supercomputing Center)
+ be liable for any results arising from the use made by third parties.
+
+ </details>
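
Note on the evaluation table moved by this commit: the per-class F1 values and the macro and weighted averages can be cross-checked from the precision, recall and support columns alone. The sketch below is not part of the README or of the training script. It assumes the usual conventions (F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes, weighted average as the support-weighted mean), and the names `REPORTED`, `f1_score` and `recompute` are illustrative only.

```python
# Precision, recall, reported F1 and support, copied from the evaluation table.
REPORTED = {
    "UTE": (0.7568, 0.7887, 0.7724, 71),
    "SINGLE_COMPANY": (0.6538, 0.7846, 0.7133, 65),
}
REPORTED_MACRO_F1 = 0.7429
REPORTED_WEIGHTED_F1 = 0.7442


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recompute(reported=REPORTED):
    """Recompute per-class, macro and support-weighted F1 from the table rows."""
    per_class = {
        label: f1_score(row[0], row[1]) for label, row in reported.items()
    }
    total_support = sum(row[3] for row in reported.values())
    # Macro average: plain mean over classes (assumed convention).
    macro = sum(per_class.values()) / len(per_class)
    # Weighted average: mean weighted by class support (assumed convention).
    weighted = (
        sum(per_class[label] * reported[label][3] for label in reported)
        / total_support
    )
    return per_class, macro, weighted


if __name__ == "__main__":
    per_class, macro, weighted = recompute()
    for label, value in per_class.items():
        print(f"{label}: recomputed {value:.4f}, reported {REPORTED[label][2]:.4f}")
    print(f"macro: recomputed {macro:.4f}, reported {REPORTED_MACRO_F1:.4f}")
    print(f"weighted: recomputed {weighted:.4f}, reported {REPORTED_WEIGHTED_F1:.4f}")
```

Because the table values are already rounded to four decimals, the recomputed figures should only be expected to agree with the reported ones to within rounding.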