Update README.md
README.md
CHANGED
@@ -1,3 +1,56 @@
---
language: de
license: cc-by-nc-sa-4.0
---

## jobBERT-de

This is a domain-adapted language model for German-language job advertisements.

It is based on [bert-base-german-cased](https://huggingface.co/bert-base-german-cased) and adapted to the domain of job advertisements through continued in-domain pretraining on 4 million German-language job ads from Switzerland (5.9 GB).

3k unused slots in the vocabulary of the base model were filled with the most frequent domain-specific words, subtokens, and abbreviations.
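
The vocabulary adaptation described above can be illustrated with a minimal sketch. It is not the authors' actual procedure: the function name `fill_unused_slots`, the `[unused` placeholder prefix, and the toy vocabulary and counts are assumptions made for illustration only.

```python
from collections import Counter


def fill_unused_slots(vocab, token_counts, placeholder_prefix="[unused"):
    """Return a copy of `vocab` in which placeholder entries are replaced
    by the most frequent tokens that the vocabulary does not contain yet."""
    known = set(vocab)
    # Most frequent domain tokens first; skip tokens the vocabulary already covers.
    candidates = (tok for tok, _ in token_counts.most_common() if tok not in known)
    new_vocab = list(vocab)
    for index, entry in enumerate(new_vocab):
        if entry.startswith(placeholder_prefix):
            replacement = next(candidates, None)
            if replacement is None:
                break  # no new domain tokens left to insert
            new_vocab[index] = replacement
    return new_vocab


# Toy data (hypothetical): a tiny vocabulary with two placeholder slots.
base_vocab = ["[PAD]", "[unused1]", "[unused2]", "[UNK]", "und", "Stelle"]
domain_counts = Counter({"Pensum": 120, "und": 900, "##stelle": 80, "z.Hd.": 15})

adapted_vocab = fill_unused_slots(base_vocab, domain_counts)
print(adapted_vocab)
```

Because placeholder entries are overwritten in place, the vocabulary size and all existing token ids stay the same, so the embedding matrix of the base checkpoint keeps its shape.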

### Overview

**Architecture:** BERT base <br>
**Language:** German <br>
**Domain:** Job advertisements <br>
**See also:** [agne/jobGBERT](https://huggingface.co/agne/jobGBERT)

### License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (cc-by-nc-sa-4.0)

Please use the following citation when using our model:

```bibtex
@inproceedings{gnehm-etal-2022-evaluation,
    title = "Evaluation of Transfer Learning and Domain Adaptation for Analyzing German-Speaking Job Advertisements",
    author = "Gnehm, Ann-Sophie and
      Bühlmann, Eva and
      Clematide, Simon",
    booktitle = "Proceedings of the 13th Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
}
```

### Intended usage and limitations

You can use the model for masked language modeling, but it's intended to be fine-tuned on a downstream task.
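
As a rough illustration of the masked language modeling use mentioned above, the following sketch uses the `fill-mask` pipeline from the `transformers` library. The repository id `agne/jobBERT-de` is an assumption (this README does not state it), the example sentence is invented, and the helper names `mask_word` and `top_predictions` are hypothetical.

```python
def mask_word(sentence, word, mask_token="[MASK]"):
    """Replace the first occurrence of `word` in `sentence` with the mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, mask_token, 1)


def top_predictions(masked_sentence, model_id="agne/jobBERT-de", top_k=5):
    """Return (token, score) pairs for the masked position.

    `model_id` is an assumed repository id; adjust it to the actual model location.
    """
    # Imported lazily so that `mask_word` works without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    return [(p["token_str"], p["score"]) for p in fill(masked_sentence, top_k=top_k)]


# Invented example sentence; only the masking step runs here.
masked = mask_word("Wir suchen eine Pflegefachfrau mit Berufserfahrung.", "Pflegefachfrau")
print(masked)
```

Calling `top_predictions(masked)` downloads the checkpoint on first use and returns the highest-scoring fillers for the masked position.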

The model was trained on German-language job ads from Switzerland. It inherits the potential biases of its base model and may contain biases and stereotypes common in job advertisements.

### About us

Ann-Sophie Gnehm: `gnehm [at] soziologie.uzh.ch` <br>
Eva Bühlmann: `bühlmann [at] soziologie.uzh.ch` <br>
Simon Clematide: `clematide [at] cl.uzh.ch` <br>

The [Swiss Job Market Monitor](https://www.stellenmarktmonitor.uzh.ch/en.html) aims at systematically expanding scientific knowledge about the job market and improving labour market transparency by informing the general public about current developments on the job market.

**Get in touch:** [Mail](mailto:gnehm@soziologie.uzh.ch) [Website](https://www.stellenmarktmonitor.uzh.ch/en.html) [Zenodo](https://doi.org/10.5281/zenodo.6497853) [SWISSUbase](https://www.swissubase.ch/de/catalogue/studies/11998/18157/overview)