---
license: other
tags:
- generated_from_trainer
datasets:
- AlekseyKorshuk/dalio-book-handwritten-io-sorted-v2
metrics:
- accuracy
model-index:
- name: 125m-dalio-book-handwritten-io-constant-1e-6-v2
  results:
  - task:
      name: Causal Language Modeling
      type: text-generation
    dataset:
      name: AlekseyKorshuk/dalio-book-handwritten-io-sorted-v2
      type: AlekseyKorshuk/dalio-book-handwritten-io-sorted-v2
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.23359387091781458
---

# 125m-dalio-book-handwritten-io-constant-1e-6-v2

This model is a fine-tuned version of [facebook/opt-125m](https://huggingface.co/facebook/opt-125m) on the AlekseyKorshuk/dalio-book-handwritten-io-sorted-v2 dataset.
It achieves the following results on the evaluation set:
- Loss: 3.0859
- Accuracy: 0.2336
- Perplexity: 21.8880

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-06
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 8
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 1.0
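For reference, the list above maps onto the 🤗 Transformers `Trainer` API roughly as sketched below. This is a minimal sketch under the assumption that the standard HF `Trainer` was used (consistent with the `generated_from_trainer` tag); the `output_dir` is a placeholder, and the total batch size of 8 comes from 1 example per device across 8 GPUs rather than gradient accumulation:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="125m-dalio-book-handwritten-io-constant-1e-6-v2",
    learning_rate=1e-6,
    per_device_train_batch_size=1,  # x 8 GPUs -> total_train_batch_size: 8
    per_device_eval_batch_size=1,   # x 8 GPUs -> total_eval_batch_size: 8
    seed=42,
    lr_scheduler_type="constant",   # constant LR, no warmup or decay
    num_train_epochs=1.0,
    adam_beta1=0.9,                 # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```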
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Perplexity |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|
| 3.3352 | 0.01 | 1 | 3.1738 | 0.2305 | 23.8988 |
| 3.3091 | 0.03 | 2 | 3.1738 | 0.2305 | 23.8988 |
| 3.3347 | 0.04 | 3 | 3.1738 | 0.2305 | 23.8988 |
| 3.1445 | 0.05 | 4 | 3.1738 | 0.2305 | 23.8988 |
| 2.8918 | 0.07 | 5 | 3.1738 | 0.2305 | 23.8988 |
| 3.2068 | 0.08 | 6 | 3.1738 | 0.2305 | 23.8988 |
| 3.6245 | 0.09 | 7 | 3.1719 | 0.2305 | 23.8522 |
| 3.2256 | 0.11 | 8 | 3.1719 | 0.2305 | 23.8522 |
| 2.9991 | 0.12 | 9 | 3.1699 | 0.2305 | 23.8056 |
| 3.3257 | 0.13 | 10 | 3.1680 | 0.2306 | 23.7592 |
| 3.1199 | 0.15 | 11 | 3.1660 | 0.2306 | 23.7128 |
| 3.3735 | 0.16 | 12 | 3.1660 | 0.2306 | 23.7128 |
| 3.0051 | 0.17 | 13 | 3.1641 | 0.2307 | 23.6665 |
| 3.2695 | 0.19 | 14 | 3.1621 | 0.2308 | 23.6204 |
| 3.2004 | 0.2 | 15 | 3.1602 | 0.2309 | 23.5743 |
| 3.2075 | 0.21 | 16 | 3.1582 | 0.2308 | 23.5283 |
| 3.321 | 0.23 | 17 | 3.1562 | 0.2308 | 23.4824 |
| 3.4026 | 0.24 | 18 | 3.1543 | 0.2309 | 23.4366 |
| 3.0383 | 0.25 | 19 | 3.1523 | 0.2309 | 23.3908 |
| 3.166 | 0.27 | 20 | 3.1504 | 0.2309 | 23.3452 |
| 3.144 | 0.28 | 21 | 3.1484 | 0.2310 | 23.2996 |
| 3.1624 | 0.29 | 22 | 3.1484 | 0.2310 | 23.2996 |
| 3.0332 | 0.31 | 23 | 3.1465 | 0.2310 | 23.2542 |
| 3.3745 | 0.32 | 24 | 3.1445 | 0.2311 | 23.2088 |
| 3.0823 | 0.33 | 25 | 3.1426 | 0.2312 | 23.1635 |
| 3.6021 | 0.35 | 26 | 3.1406 | 0.2312 | 23.1183 |
| 3.1125 | 0.36 | 27 | 3.1387 | 0.2313 | 23.0732 |
| 3.1406 | 0.37 | 28 | 3.1387 | 0.2314 | 23.0732 |
| 3.1736 | 0.39 | 29 | 3.1367 | 0.2314 | 23.0282 |
| 3.1104 | 0.4 | 30 | 3.1348 | 0.2315 | 22.9832 |
| 3.1301 | 0.41 | 31 | 3.1328 | 0.2316 | 22.9384 |
| 3.3376 | 0.43 | 32 | 3.1309 | 0.2315 | 22.8936 |
| 3.218 | 0.44 | 33 | 3.1309 | 0.2316 | 22.8936 |
| 3.0786 | 0.45 | 34 | 3.1289 | 0.2316 | 22.8490 |
| 3.0125 | 0.47 | 35 | 3.1270 | 0.2317 | 22.8044 |
| 3.2634 | 0.48 | 36 | 3.1270 | 0.2317 | 22.8044 |
| 2.9888 | 0.49 | 37 | 3.125 | 0.2318 | 22.7599 |
| 3.1624 | 0.51 | 38 | 3.1230 | 0.2318 | 22.7155 |
| 2.9807 | 0.52 | 39 | 3.1211 | 0.2319 | 22.6712 |
| 3.446 | 0.53 | 40 | 3.1211 | 0.2319 | 22.6712 |
| 3.1338 | 0.55 | 41 | 3.1191 | 0.2320 | 22.6269 |
| 3.1841 | 0.56 | 42 | 3.1191 | 0.2320 | 22.6269 |
| 3.1079 | 0.57 | 43 | 3.1172 | 0.2320 | 22.5828 |
| 3.0918 | 0.59 | 44 | 3.1152 | 0.2321 | 22.5387 |
| 3.0302 | 0.6 | 45 | 3.1152 | 0.2322 | 22.5387 |
| 3.1123 | 0.61 | 46 | 3.1133 | 0.2323 | 22.4947 |
| 2.9985 | 0.63 | 47 | 3.1113 | 0.2324 | 22.4508 |
| 3.3816 | 0.64 | 48 | 3.1113 | 0.2324 | 22.4508 |
| 3.0813 | 0.65 | 49 | 3.1094 | 0.2324 | 22.4070 |
| 3.2024 | 0.67 | 50 | 3.1094 | 0.2325 | 22.4070 |
| 3.0178 | 0.68 | 51 | 3.1074 | 0.2325 | 22.3633 |
| 3.1646 | 0.69 | 52 | 3.1074 | 0.2326 | 22.3633 |
| 3.0046 | 0.71 | 53 | 3.1055 | 0.2327 | 22.3197 |
| 3.0266 | 0.72 | 54 | 3.1055 | 0.2327 | 22.3197 |
| 3.3857 | 0.73 | 55 | 3.1035 | 0.2327 | 22.2761 |
| 3.064 | 0.75 | 56 | 3.1035 | 0.2328 | 22.2761 |
| 3.176 | 0.76 | 57 | 3.1016 | 0.2328 | 22.2327 |
| 3.1851 | 0.77 | 58 | 3.1016 | 0.2329 | 22.2327 |
| 3.0811 | 0.79 | 59 | 3.0996 | 0.2329 | 22.1893 |
| 3.0205 | 0.8 | 60 | 3.0996 | 0.2330 | 22.1893 |
| 3.26 | 0.81 | 61 | 3.0977 | 0.2330 | 22.1460 |
| 3.2922 | 0.83 | 62 | 3.0977 | 0.2331 | 22.1460 |
| 3.5349 | 0.84 | 63 | 3.0957 | 0.2331 | 22.1028 |
| 3.3525 | 0.85 | 64 | 3.0957 | 0.2331 | 22.1028 |
| 3.135 | 0.87 | 65 | 3.0938 | 0.2331 | 22.0596 |
| 3.1707 | 0.88 | 66 | 3.0938 | 0.2332 | 22.0596 |
| 3.0127 | 0.89 | 67 | 3.0918 | 0.2332 | 22.0166 |
| 3.0952 | 0.91 | 68 | 3.0918 | 0.2332 | 22.0166 |
| 3.1023 | 0.92 | 69 | 3.0898 | 0.2334 | 21.9736 |
| 3.3821 | 0.93 | 70 | 3.0898 | 0.2334 | 21.9736 |
| 3.1118 | 0.95 | 71 | 3.0879 | 0.2334 | 21.9308 |
| 3.1143 | 0.96 | 72 | 3.0879 | 0.2335 | 21.9308 |
| 3.1118 | 0.97 | 73 | 3.0879 | 0.2335 | 21.9308 |
| 3.0596 | 0.99 | 74 | 3.0859 | 0.2336 | 21.8880 |
| 3.1033 | 1.0 | 75 | 3.0859 | 0.2336 | 21.8880 |

### Framework versions

- Transformers 4.25.0.dev0
- Pytorch 1.12.1+cu113
- Datasets 2.3.2
- Tokenizers 0.12.1
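Note that the reported perplexity is simply the exponential of the validation loss: exp(3.0859) ≈ 21.888. A minimal inference sketch follows; the Hub repository ID is an assumption inferred from the model name above, so substitute the actual path if it differs:

```python
import math

from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical Hub path, inferred from the model name; adjust as needed.
repo_id = "AlekseyKorshuk/125m-dalio-book-handwritten-io-constant-1e-6-v2"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "The most important principle is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Perplexity is derived from the cross-entropy loss reported above.
print(math.exp(3.0859))  # ~21.888, matching the Loss/Perplexity pair
```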