---
license: cc-by-nc-sa-4.0
tags:
- generated_from_trainer
- simplification
task_categories:
- text2text-generation
task_ids:
- text-simplification
language:
- nl
datasets:
- BramVanroy/chatgpt-dutch-simplification
metrics:
- rouge
- sari
model-index:
- name: BramVanroy/ul2-base-dutch-simplification-mai-2023
  results:
  - task:
      type: text-simplification
      name: Text Simplification
    dataset:
      type: BramVanroy/chatgpt-dutch-simplification
      name: ChatGPT Dutch Simplification
    metrics:
    - type: rouge
      value: 41.5749
      name: Eval Rouge-1
    - type: rouge
      value: 19.9
      name: Eval Rouge-2
    - type: rouge
      value: 36.3204
      name: Eval RougeL
    - type: rouge
      value: 36.2596
      name: Eval RougeLsum
    - type: sari
      value: 53.0091
      name: Eval SARI
    - type: rouge
      value: 44.2877
      name: Test Rouge-1
    - type: rouge
      value: 20.8132
      name: Test Rouge-2
    - type: rouge
      value: 39.0951
      name: Test RougeL
    - type: rouge
      value: 39.2709
      name: Test RougeLsum
    - type: sari
      value: 52.9621
      name: Test SARI
widget:
- example_title: "Cooking"
  text: "Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties welke door de ambachtelijke expertise van mijn grootmoeder zijn vervaardigd."
---

# ul2-base-dutch-simplification-mai-2023

This model simplifies Dutch sentences. It is a fine-tuned version of [yhavinga/ul2-base-dutch](https://huggingface.co/yhavinga/ul2-base-dutch), trained on the [BramVanroy/chatgpt-dutch-simplification](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification) dataset.

The model was created in the context of the master's thesis of Charlotte Van de Velde in the Master of Science in Artificial Intelligence (MAI) at KU Leuven in 2023. Charlotte created the dataset and Bram trained the model.

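A minimal inference sketch, assuming the checkpoint is available on the Hugging Face Hub under the name given in the metadata and loads through the standard `transformers` seq2seq classes. The helper names `build_input` and `simplify`, the `max_new_tokens` value, and the optional `prefix` argument are illustrative and not part of this card. The card does not state whether a task prefix is required, so none is applied by default. `num_beams=3` matches the beam size reported for the results in this card.

```python
from typing import List

# Model name taken from the card's metadata.
MODEL_ID = "BramVanroy/ul2-base-dutch-simplification-mai-2023"


def build_input(sentence: str, prefix: str = "") -> str:
    """Strip surrounding whitespace and prepend an optional prefix.

    The prefix is a hypothetical hook: this card does not say whether
    the model expects one, so the default is no prefix at all.
    """
    return f"{prefix}{sentence.strip()}"


def simplify(
    sentences: List[str],
    prefix: str = "",
    num_beams: int = 3,         # beam size reported in this card
    max_new_tokens: int = 128,  # assumption, not taken from the card
) -> List[str]:
    """Generate one simplification per input sentence."""
    # Imported lazily so build_input can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize the whole batch at once, padding to the longest sentence.
    inputs = tokenizer(
        [build_input(s, prefix) for s in sentences],
        return_tensors="pt",
        padding=True,
        truncation=True,
    )
    outputs = model.generate(
        **inputs,
        num_beams=num_beams,
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `simplify([...])` with the widget sentence from the metadata returns a list containing one generated sentence. The first call downloads the model weights.
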
## Quick links

- [Repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep): includes training code and model creation log
- [Dataset](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification): `BramVanroy/chatgpt-dutch-simplification`
- [Parent model](https://huggingface.co/yhavinga/ul2-base-dutch): this model was fine-tuned from `yhavinga/ul2-base-dutch`

## Intended uses & limitations, and dataset

The model is intended for sentence-level simplification of Dutch. It might extend to document-level simplification, but most of the dataset consists of single sentences, so document-level performance is not guaranteed.

The dataset was generated automatically (cf. [dataset description](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)) and has not been manually verified. On top of that, this model is a fine-tuned version of a parent model, and we did not scrutinize that parent model or its training data. The output of this model may therefore be unexpected or incorrect, as with most if not all neural networks.

Because the dataset was generated with ChatGPT, this model cannot be used for commercial purposes.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.00026885245616406115
- train_batch_size: 12
- optimizer: Adafactor
- num_epochs: 26

These hyperparameters were found through a Bayesian hyperparameter search with `wandb`. The search is described in the [repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep).

### Training results

`eval` results are on the evaluation set, `predict` results are on the test set. These were achieved with beam search (`num_beams=3`).

```json
{
  "eval_gen_len": 21.206349206349206,
  "eval_loss": 2.598172903060913,
  "eval_rouge1": 41.5749,
  "eval_rouge2": 19.9,
  "eval_rougeL": 36.3204,
  "eval_rougeLsum": 36.2596,
  "eval_sari": 53.0091,
  "predict_gen_len": 22.40625,
  "predict_loss": 2.517918586730957,
  "predict_rouge1": 44.2877,
  "predict_rouge2": 20.8132,
  "predict_rougeL": 39.0951,
  "predict_rougeLsum": 39.2709,
  "predict_sari": 52.9621
}
```

### Framework versions

- Transformers 4.29.2
- PyTorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3
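
The hyperparameters listed under "Training hyperparameters" could be expressed as `Seq2SeqTrainingArguments` roughly as follows. This is a sketch and not the authors' training script, which lives in the linked repository. Mapping `train_batch_size` to `per_device_train_batch_size` assumes training on a single device, and the `output_dir` value and the `build_training_args` helper are hypothetical.

```python
# Values copied from the "Training hyperparameters" list in this card.
HYPERPARAMS = {
    "learning_rate": 0.00026885245616406115,
    # Assumption: the card only says train_batch_size; treated as per-device here.
    "per_device_train_batch_size": 12,
    "optim": "adafactor",
    "num_train_epochs": 26,
}

# Generation settings used for evaluation, matching the reported beam size.
GENERATION = {
    "predict_with_generate": True,
    "generation_num_beams": 3,
}


def build_training_args(output_dir: str = "ul2-simplification-out"):
    """Build Seq2SeqTrainingArguments from the settings above.

    output_dir is a hypothetical placeholder path.
    """
    # Imported lazily so the plain dictionaries can be inspected
    # without transformers and torch installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        **HYPERPARAMS,
        **GENERATION,
    )
```

Keeping the values in plain dictionaries makes it easy to log them or to override a single setting before building the arguments object.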