GPT2 model for German Leichte Sprache (Easy language)

A GPT-2 language model for German Leichte Sprache (Easy Language), fine-tuned from german-gpt2.
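Because the model is a GPT-2 style causal language model, it can probably be used through the standard `transformers` text-generation pipeline. The snippet below is a minimal sketch, not the authors' own usage code. `MODEL_ID` is a hypothetical placeholder that must be replaced with this model's actual Hub ID or a local checkpoint path, and the sampling settings are illustrative assumptions rather than values taken from the paper.

```python
from typing import Any, Dict

# Hypothetical placeholder: replace with this model's Hub ID or a local path.
MODEL_ID = "path-or-hub-id-of-this-model"


def generation_kwargs(max_new_tokens: int = 40) -> Dict[str, Any]:
    """Return sampling settings (illustrative assumptions, not from the paper)."""
    return {
        "max_new_tokens": max_new_tokens,  # upper bound on generated tokens
        "do_sample": True,                 # sample instead of greedy decoding
        "top_p": 0.95,                     # nucleus sampling threshold
        "num_return_sequences": 1,
    }


def generate_easy_text(prompt: str, model_id: str = MODEL_ID) -> str:
    """Continue a German prompt with the causal language model."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **generation_kwargs())
    # The pipeline returns a list of dicts, one per returned sequence.
    return outputs[0]["generated_text"]
```

Calling `generate_easy_text("Das ist")` with a valid model ID should return the prompt followed by a sampled continuation. Since sampling is enabled, the output differs between runs.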

See our code here: https://github.com/MiriUll/Language-Models-German-Simplification
See our paper here: Language Models for German Text Simplification: Overcoming Parallel Data Scarcity through Style-specific Pre-training (https://aclanthology.org/2023.findings-acl.74)

Dataset

This model was fine-tuned on a collection of monolingual Leichte Sprache data. This corpus can be recreated here.

Citation

If you use this model, please cite our paper:
@inproceedings{anschutz-etal-2023-language,
  title = "Language Models for {G}erman Text Simplification: Overcoming Parallel Data Scarcity through Style-specific Pre-training",
  author = {Ansch{\"u}tz, Miriam and Oehms, Joshua and Wimmer, Thomas and Jezierski, Bart{\l}omiej and Groh, Georg},
  booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
  month = jul,
  year = "2023",
  address = "Toronto, Canada",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2023.findings-acl.74",
  pages = "1147--1158",
}

Model size: 137M params (Safetensors; tensor types F32 and U8)