FemkeBakker
/

AmsterdamDocClassificationGEITje200T1Epochs

@@ -8,6 +8,10 @@ tags:
 model-index:
 - name: AmsterdamDocClassificationGEITje200T1Epochs
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,24 +19,26 @@ should probably proofread and complete it, then remove this comment. -->
 # AmsterdamDocClassificationGEITje200T1Epochs
-This model is a fine-tuned version of [Rijgersberg/GEITje-7B-chat-v2](https://huggingface.co/Rijgersberg/GEITje-7B-chat-v2) on the [AmsterdamDocClassification](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.5900
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -57,6 +63,7 @@ The following hyperparameters were used during training:
 | 0.4401        | 0.7952 | 492  | 0.5907          |
 | 0.6746        | 0.9939 | 615  | 0.5900          |
 ### Framework versions
@@ -64,3 +71,7 @@ The following hyperparameters were used during training:
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1

 model-index:
 - name: AmsterdamDocClassificationGEITje200T1Epochs
   results: []
+datasets:
+- FemkeBakker/AmsterdamBalancedFirst200Tokens
+language:
+- nl
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # AmsterdamDocClassificationGEITje200T1Epochs
+As part of the Assessing Large Language Models for Document Classification project by the Municipality of Amsterdam, we fine-tune Mistral, Llama, and GEITje for document classification.
+The fine-tuning is performed using the [AmsterdamBalancedFirst200Tokens](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset, which consists of documents truncated to the first 200 tokens.
+In our research, we evaluate the fine-tuning of these LLMs across one, two, and three epochs.
+This model is a fine-tuned version of [Rijgersberg/GEITje-7B-chat-v2](https://huggingface.co/Rijgersberg/GEITje-7B-chat-v2) and has been fine-tuned for one epoch.
+It achieves the following results on the evaluation set:
+- Loss: 0.5900
 ## Training and evaluation data
+- The training data consists of 9900 documents and their labels formatted into conversations.
+- The evaluation data consists of 1100 documents and their labels formatted into conversations.
 ## Training procedure
+See the [GitHub](https://github.com/Amsterdam-Internships/document-classification-using-large-language-models) for specifics about the training and the code.
 ### Training hyperparameters
 The following hyperparameters were used during training:
 | 0.4401        | 0.7952 | 492  | 0.5907          |
 | 0.6746        | 0.9939 | 615  | 0.5900          |
+Training time: it took in total 49 minutes to fine-tune the model for one epoch.
 ### Framework versions
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
+### Acknowledgements
+This model was trained as part of [insert thesis info] in collaboration with Amsterdam Intelligence for the City of Amsterdam.