---
license: apache-2.0
base_model: jeff31415/TinyLlama-1.1B-1T-OpenOrca
tags:
- generated_from_trainer
model-index:
- name: results
  results: []
---

# results

This model is a fine-tuned version of [jeff31415/TinyLlama-1.1B-1T-OpenOrca](https://huggingface.co/jeff31415/TinyLlama-1.1B-1T-OpenOrca) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.5156

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 9e-07
- train_batch_size: 20
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 80
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.1726 | 0.03 | 8 | 2.3170 |
| 2.1444 | 0.05 | 16 | 2.2937 |
| 2.1036 | 0.08 | 24 | 2.2707 |
| 2.0703 | 0.1 | 32 | 2.2478 |
| 2.0604 | 0.13 | 40 | 2.2248 |
| 2.046 | 0.15 | 48 | 2.2013 |
| 1.9919 | 0.18 | 56 | 2.1780 |
| 1.9842 | 0.21 | 64 | 2.1547 |
| 1.9234 | 0.23 | 72 | 2.1320 |
| 1.9235 | 0.26 | 80 | 2.1099 |
| 1.9096 | 0.28 | 88 | 2.0884 |
| 1.8722 | 0.31 | 96 | 2.0679 |
| 1.8594 | 0.34 | 104 | 2.0479 |
| 1.8438 | 0.36 | 112 | 2.0283 |
| 1.7581 | 0.39 | 120 | 2.0089 |
| 1.7852 | 0.41 | 128 | 1.9901 |
| 1.7634 | 0.44 | 136 | 1.9714 |
| 1.7296 | 0.46 | 144 | 1.9531 |
| 1.6976 | 0.49 | 152 | 1.9353 |
| 1.6861 | 0.52 | 160 | 1.9173 |
| 1.6683 | 0.54 | 168 | 1.8993 |
| 1.6255 | 0.57 | 176 | 1.8826 |
| 1.619 | 0.59 | 184 | 1.8673 |
| 1.6455 | 0.62 | 192 | 1.8534 |
| 1.5784 | 0.65 | 200 | 1.8399 |
| 1.6078 | 0.67 | 208 | 1.8259 |
| 1.5703 | 0.7 | 216 | 1.8124 |
| 1.5215 | 0.72 | 224 | 1.7989 |
| 1.542 | 0.75 | 232 | 1.7852 |
| 1.5147 | 0.77 | 240 | 1.7721 |
| 1.5092 | 0.8 | 248 | 1.7589 |
| 1.4564 | 0.83 | 256 | 1.7456 |
| 1.4985 | 0.85 | 264 | 1.7324 |
| 1.4505 | 0.88 | 272 | 1.7189 |
| 1.4447 | 0.9 | 280 | 1.7052 |
| 1.4436 | 0.93 | 288 | 1.6924 |
| 1.4132 | 0.95 | 296 | 1.6799 |
| 1.3791 | 0.98 | 304 | 1.6680 |
| 1.3877 | 1.01 | 312 | 1.6565 |
| 1.3807 | 1.03 | 320 | 1.6453 |
| 1.3391 | 1.06 | 328 | 1.6352 |
| 1.3232 | 1.08 | 336 | 1.6251 |
| 1.3293 | 1.11 | 344 | 1.6159 |
| 1.3029 | 1.14 | 352 | 1.6074 |
| 1.3173 | 1.16 | 360 | 1.5992 |
| 1.3006 | 1.19 | 368 | 1.5926 |
| 1.2547 | 1.21 | 376 | 1.5863 |
| 1.2704 | 1.24 | 384 | 1.5805 |
| 1.2964 | 1.26 | 392 | 1.5749 |
| 1.277 | 1.29 | 400 | 1.5695 |
| 1.2718 | 1.32 | 408 | 1.5657 |
| 1.2379 | 1.34 | 416 | 1.5619 |
| 1.2746 | 1.37 | 424 | 1.5585 |
| 1.2349 | 1.39 | 432 | 1.5559 |
| 1.2264 | 1.42 | 440 | 1.5531 |
| 1.2365 | 1.45 | 448 | 1.5505 |
| 1.2242 | 1.47 | 456 | 1.5484 |
| 1.2094 | 1.5 | 464 | 1.5462 |
| 1.2196 | 1.52 | 472 | 1.5444 |
| 1.2447 | 1.55 | 480 | 1.5426 |
| 1.2127 | 1.57 | 488 | 1.5407 |
| 1.2278 | 1.6 | 496 | 1.5391 |
| 1.2089 | 1.63 | 504 | 1.5377 |
| 1.2069 | 1.65 | 512 | 1.5361 |
| 1.2264 | 1.68 | 520 | 1.5350 |
| 1.2027 | 1.7 | 528 | 1.5338 |
| 1.2138 | 1.73 | 536 | 1.5325 |
| 1.207 | 1.75 | 544 | 1.5313 |
| 1.2155 | 1.78 | 552 | 1.5304 |
| 1.2192 | 1.81 | 560 | 1.5295 |
| 1.2223 | 1.83 | 568 | 1.5287 |
| 1.2281 | 1.86 | 576 | 1.5278 |
| 1.1977 | 1.88 | 584 | 1.5269 |
| 1.2101 | 1.91 | 592 | 1.5261 |
| 1.2099 | 1.94 | 600 | 1.5254 |
| 1.1873 | 1.96 | 608 | 1.5245 |
| 1.204 | 1.99 | 616 | 1.5242 |
| 1.21 | 2.01 | 624 | 1.5239 |
| 1.242 | 2.04 | 632 | 1.5231 |
| 1.1696 | 2.06 | 640 | 1.5224 |
| 1.1803 | 2.09 | 648 | 1.5218 |
| 1.1692 | 2.12 | 656 | 1.5213 |
| 1.212 | 2.14 | 664 | 1.5208 |
| 1.1977 | 2.17 | 672 | 1.5204 |
| 1.187 | 2.19 | 680 | 1.5201 |
| 1.1858 | 2.22 | 688 | 1.5199 |
| 1.1824 | 2.25 | 696 | 1.5194 |
| 1.1914 | 2.27 | 704 | 1.5190 |
| 1.1815 | 2.3 | 712 | 1.5187 |
| 1.2021 | 2.32 | 720 | 1.5184 |
| 1.1872 | 2.35 | 728 | 1.5181 |
| 1.1901 | 2.37 | 736 | 1.5178 |
| 1.1933 | 2.4 | 744 | 1.5177 |
| 1.1773 | 2.43 | 752 | 1.5175 |
| 1.1935 | 2.45 | 760 | 1.5172 |
| 1.2118 | 2.48 | 768 | 1.5170 |
| 1.1816 | 2.5 | 776 | 1.5169 |
| 1.1842 | 2.53 | 784 | 1.5167 |
| 1.1891 | 2.55 | 792 | 1.5165 |
| 1.1883 | 2.58 | 800 | 1.5164 |
| 1.1506 | 2.61 | 808 | 1.5163 |
| 1.1708 | 2.63 | 816 | 1.5162 |
| 1.1944 | 2.66 | 824 | 1.5160 |
| 1.1575 | 2.68 | 832 | 1.5159 |
| 1.1698 | 2.71 | 840 | 1.5160 |
| 1.1525 | 2.74 | 848 | 1.5158 |
| 1.1767 | 2.76 | 856 | 1.5157 |
| 1.1943 | 2.79 | 864 | 1.5158 |
| 1.1727 | 2.81 | 872 | 1.5157 |
| 1.195 | 2.84 | 880 | 1.5157 |
| 1.1771 | 2.86 | 888 | 1.5157 |
| 1.1731 | 2.89 | 896 | 1.5156 |
| 1.191 | 2.92 | 904 | 1.5157 |
| 1.1903 | 2.94 | 912 | 1.5156 |
| 1.1821 | 2.97 | 920 | 1.5156 |
| 1.2 | 2.99 | 928 | 1.5156 |

### Framework versions

- Transformers 4.36.0.dev0
- Pytorch 2.1.0+cu118
- Datasets 2.15.1.dev0
- Tokenizers 0.15.0
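
### Cross-checking the reported numbers

The hyperparameters and the final evaluation loss reported in this card can be cross-checked with a little arithmetic. The sketch below is illustrative only and rests on assumptions the card does not confirm: that training ran on a single device (so the total batch size is just `train_batch_size` times `gradient_accumulation_steps`), that the loss is a mean per-token cross-entropy in nats (so its exponential is a perplexity), and that the `Step` column counts optimizer updates. All variable names are chosen here for illustration.

```python
import math

# Values copied from the hyperparameter list and the last table row of this card.
train_batch_size = 20
gradient_accumulation_steps = 4
final_eval_loss = 1.5156
last_logged_step = 928
last_logged_epoch = 2.99

# Assumption: a single training device, so there is no extra device multiplier.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: the loss is mean per-token cross-entropy in nats.
perplexity = math.exp(final_eval_loss)

# Rough estimate only: the Epoch column is rounded to two decimals.
steps_per_epoch = last_logged_step / last_logged_epoch
approx_examples_per_epoch = steps_per_epoch * effective_batch_size

print(f"effective batch size: {effective_batch_size}")
print(f"evaluation perplexity: {perplexity:.2f}")
print(f"approx. optimizer steps per epoch: {steps_per_epoch:.0f}")
print(f"approx. training examples per epoch: {approx_examples_per_epoch:.0f}")
```

Under these assumptions the effective batch size matches the reported `total_train_batch_size` of 80. The per-epoch figures are estimates derived from rounded table values, not statements about the actual dataset size, which the card does not report.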