JanPf committed · Commit a2f572a · verified · 1 Parent(s): f4dcb55

Update README.md

Files changed (1)
  1. README.md +11 -51
README.md CHANGED
@@ -1,63 +1,23 @@
 ---
 library_name: peft
-license: other
 base_model: LSX-UniWue/LLaMmlein_1B
 tags:
 - trl
 - sft
 - generated_from_trainer
 model-index:
-- name: LLaMmlein_1B_chat_selected
+- name: LLaMmlein_1b_chat_all
 results: []
+datasets:
+- LSX-UniWue/Guanako
+- FreedomIntelligence/sharegpt-deutsch
+- FreedomIntelligence/alpaca-gpt4-deutsch
+language:
+- de
+license: other
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
-# LLaMmlein_1B_chat_selected
-
-This model is a fine-tuned version of [LSX-UniWue/LLaMmlein_1B](https://huggingface.co/LSX-UniWue/LLaMmlein_1B) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 1.5717
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 8
-- eval_batch_size: 8
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- num_epochs: 3.0
-
-### Training results
-
-| Training Loss | Epoch | Step  | Validation Loss |
-|:-------------:|:-----:|:-----:|:---------------:|
-| 1.2071        | 1.0   | 8239  | 1.5818          |
-| 1.5301        | 2.0   | 16478 | 1.5746          |
-| 2.5497        | 3.0   | 24717 | 1.5717          |
-
-
-### Framework versions
+# LLäMmlein 1B Chat
 
-- PEFT 0.13.2
-- Transformers 4.44.2
-- Pytorch 2.4.1+cu121
-- Datasets 3.0.1
-- Tokenizers 0.19.1
+This is a chat adapter for the German Tinyllama 1B language model.
+Find more details on our [page](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/) and our [preprint](arxiv.org/abs/2411.11171)!
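Since the card declares `library_name: peft` with `base_model: LSX-UniWue/LLaMmlein_1B`, the artifact is an adapter that must be attached to the base model at load time. A minimal sketch using the standard `peft`/`transformers` API; the adapter repo id passed in is an assumption and should be replaced with this repository's actual id:

```python
def load_llammlein_chat(adapter_id: str, base_id: str = "LSX-UniWue/LLaMmlein_1B"):
    """Load the base model and attach the chat adapter on top of it.

    Calling this downloads weights; the imports live inside the function so
    that merely defining it requires no network access or installed models.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Base model and its tokenizer (the adapter reuses the base tokenizer).
    base = AutoModelForCausalLM.from_pretrained(base_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)

    # Wrap the base model with the PEFT adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    return model, tokenizer
```

For example, `model, tok = load_llammlein_chat("LSX-UniWue/LLaMmlein_1b_chat_all")` (hypothetical id, taken from the `model-index` name above) would yield a model ready for `model.generate(...)`.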