caomingjun/storyteller
This model is a fine-tuned version of gpt2 on the roneneldan/TinyStories dataset. It achieves the following results on the evaluation set:
- Loss: 1.2057
- Accuracy: 0.6681
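For readers who prefer perplexity, the loss above can be converted with a one-line calculation. This is a minimal sketch that assumes the reported loss is the mean per-token cross-entropy in nats, as is usual for causal language model evaluation; the card itself does not state this.

```python
import math

# Evaluation loss reported above; assumed to be mean per-token
# cross-entropy in nats (an assumption, not stated in the card).
eval_loss = 1.2057

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")  # prints "perplexity ~ 3.34"
```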
Model description
More information needed
Intended uses & limitations
More information needed
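Until this section is filled in, the following is a minimal sketch of how a GPT-2 fine-tune like this one is typically loaded for text generation with the Transformers `pipeline` API. The helper name `generate_story` and its defaults are hypothetical, and the sketch assumes the repository ships tokenizer files alongside the weights.

```python
MODEL_ID = "caomingjun/storyteller"  # repository name from the card title


def generate_story(prompt, max_new_tokens=100):
    """Return one sampled continuation of `prompt` as a string.

    Hypothetical helper for illustration; not part of the model repository.
    """
    # Imported lazily so that defining this helper does not require the library.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    # For a single string prompt the pipeline returns a list of dicts.
    return outputs[0]["generated_text"]
```

Calling `generate_story("Once upon a time")` downloads the weights on first use. Because sampling is enabled, the output differs from run to run.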
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
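The learning rate at any step can be sketched from the values above. The sketch assumes the linear schedule decays to zero with no warmup (no warmup setting is listed) and estimates the steps per epoch from the last logged row of the training results (step 458000 at epoch 1.9923). Because the epoch column is rounded, that estimate is only approximate.

```python
BASE_LR = 5e-05   # learning_rate from the list above
NUM_EPOCHS = 2    # num_epochs from the list above

# Estimated from the last logged row (step 458000 at epoch 1.9923).
# The epoch value is rounded, so treat this as approximate.
STEPS_PER_EPOCH = round(458_000 / 1.9923)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Learning rate under linear decay to zero, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


for step in (0, 230_000, 458_000):
    print(step, f"{linear_lr(step):.3e}")
```

Under these assumptions the rate has fallen to roughly half its initial value by the end of the first epoch and is close to zero at the last logged step.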
Training results
Training Loss | Epoch | Step | Accuracy | Validation Loss |
---|---|---|---|---|
1.764 | 0.0087 | 2000 | 0.5790 | 1.6696 |
1.6751 | 0.0174 | 4000 | 0.5903 | 1.6043 |
1.6222 | 0.0261 | 6000 | 0.5974 | 1.5656 |
1.6077 | 0.0348 | 8000 | 0.6023 | 1.5383 |
1.5761 | 0.0435 | 10000 | 0.6066 | 1.5165 |
1.5634 | 0.0522 | 12000 | 0.6100 | 1.4985 |
1.5397 | 0.0609 | 14000 | 0.6125 | 1.4854 |
1.5492 | 0.0696 | 16000 | 0.6141 | 1.4753 |
1.5189 | 0.0783 | 18000 | 0.6164 | 1.4622 |
1.5083 | 0.0870 | 20000 | 0.6180 | 1.4540 |
1.4956 | 0.0957 | 22000 | 0.6201 | 1.4435 |
1.488 | 0.1044 | 24000 | 0.6218 | 1.4359 |
1.4922 | 0.1131 | 26000 | 0.6233 | 1.4283 |
1.4846 | 0.1218 | 28000 | 0.6241 | 1.4230 |
1.4676 | 0.1305 | 30000 | 0.6255 | 1.4168 |
1.459 | 0.1392 | 32000 | 0.6263 | 1.4103 |
1.4594 | 0.1479 | 34000 | 0.6278 | 1.4048 |
1.4594 | 0.1566 | 36000 | 0.6285 | 1.3996 |
1.4525 | 0.1653 | 38000 | 0.6297 | 1.3940 |
1.4608 | 0.1740 | 40000 | 0.6304 | 1.3907 |
1.4432 | 0.1827 | 42000 | 0.6311 | 1.3856 |
1.4267 | 0.1914 | 44000 | 0.6319 | 1.3828 |
1.4265 | 0.2001 | 46000 | 0.6327 | 1.3796 |
1.4204 | 0.2088 | 48000 | 0.6337 | 1.3734 |
1.4161 | 0.2175 | 50000 | 0.6340 | 1.3709 |
1.413 | 0.2262 | 52000 | 0.6352 | 1.3666 |
1.4134 | 0.2349 | 54000 | 0.6358 | 1.3631 |
1.4231 | 0.2436 | 56000 | 0.6363 | 1.3607 |
1.4091 | 0.2523 | 58000 | 0.6369 | 1.3574 |
1.3985 | 0.2610 | 60000 | 0.6373 | 1.3546 |
1.3957 | 0.2697 | 62000 | 0.6379 | 1.3524 |
1.3939 | 0.2784 | 64000 | 0.6385 | 1.3487 |
1.3964 | 0.2871 | 66000 | 0.6390 | 1.3471 |
1.3822 | 0.2958 | 68000 | 0.6396 | 1.3437 |
1.3922 | 0.3045 | 70000 | 0.6399 | 1.3426 |
1.3865 | 0.3132 | 72000 | 0.6403 | 1.3393 |
1.3853 | 0.3219 | 74000 | 0.6410 | 1.3369 |
1.3976 | 0.3306 | 76000 | 0.6415 | 1.3337 |
1.3964 | 0.3393 | 78000 | 0.6416 | 1.3332 |
1.3846 | 0.3480 | 80000 | 0.6422 | 1.3310 |
1.3825 | 0.3567 | 82000 | 0.6425 | 1.3288 |
1.3758 | 0.3654 | 84000 | 0.6431 | 1.3259 |
1.3685 | 0.3741 | 86000 | 0.6432 | 1.3243 |
1.3812 | 0.3828 | 88000 | 0.6440 | 1.3215 |
1.3763 | 0.3915 | 90000 | 0.6441 | 1.3209 |
1.3637 | 0.4002 | 92000 | 0.6446 | 1.3188 |
1.371 | 0.4089 | 94000 | 0.6452 | 1.3164 |
1.3548 | 0.4176 | 96000 | 0.6453 | 1.3154 |
1.3533 | 0.4263 | 98000 | 0.6455 | 1.3141 |
1.3479 | 0.4350 | 100000 | 0.6459 | 1.3110 |
1.3479 | 0.4437 | 102000 | 0.6461 | 1.3098 |
1.3543 | 0.4524 | 104000 | 0.6466 | 1.3087 |
1.3491 | 0.4611 | 106000 | 0.6470 | 1.3064 |
1.3645 | 0.4698 | 108000 | 0.6473 | 1.3052 |
1.3603 | 0.4785 | 110000 | 0.6478 | 1.3032 |
1.3528 | 0.4872 | 112000 | 0.6480 | 1.3018 |
1.3508 | 0.4959 | 114000 | 0.6481 | 1.3010 |
1.3488 | 0.5046 | 116000 | 0.6484 | 1.2984 |
1.3435 | 0.5133 | 118000 | 0.6486 | 1.2983 |
1.3635 | 0.5220 | 120000 | 0.6487 | 1.2969 |
1.3462 | 0.5307 | 122000 | 0.6493 | 1.2947 |
1.3508 | 0.5394 | 124000 | 0.6494 | 1.2935 |
1.364 | 0.5481 | 126000 | 0.6495 | 1.2925 |
1.3409 | 0.5568 | 128000 | 0.6502 | 1.2907 |
1.3402 | 0.5655 | 130000 | 0.6503 | 1.2907 |
1.339 | 0.5742 | 132000 | 0.6508 | 1.2880 |
1.325 | 0.5829 | 134000 | 0.6508 | 1.2869 |
1.3432 | 0.5916 | 136000 | 0.6508 | 1.2865 |
1.3478 | 0.6003 | 138000 | 0.6513 | 1.2844 |
1.3345 | 0.6090 | 140000 | 0.6516 | 1.2836 |
1.3194 | 0.6177 | 142000 | 0.6514 | 1.2822 |
1.3342 | 0.6264 | 144000 | 0.6522 | 1.2813 |
1.3333 | 0.6351 | 146000 | 0.6523 | 1.2807 |
1.3367 | 0.6438 | 148000 | 0.6522 | 1.2801 |
1.3293 | 0.6525 | 150000 | 0.6525 | 1.2787 |
1.3337 | 0.6612 | 152000 | 0.6528 | 1.2770 |
1.3355 | 0.6699 | 154000 | 0.6530 | 1.2765 |
1.3288 | 0.6786 | 156000 | 0.6532 | 1.2753 |
1.3362 | 0.6873 | 158000 | 0.6534 | 1.2738 |
1.3142 | 0.6960 | 160000 | 0.6534 | 1.2733 |
1.3109 | 0.7047 | 162000 | 0.6539 | 1.2720 |
1.3264 | 0.7134 | 164000 | 0.6542 | 1.2710 |
1.3143 | 0.7221 | 166000 | 0.6543 | 1.2698 |
1.3118 | 0.7308 | 168000 | 0.6544 | 1.2698 |
1.3121 | 0.7395 | 170000 | 0.6546 | 1.2683 |
1.3368 | 0.7482 | 172000 | 0.6550 | 1.2670 |
1.3077 | 0.7569 | 174000 | 0.6550 | 1.2668 |
1.3104 | 0.7656 | 176000 | 0.6552 | 1.2663 |
1.316 | 0.7743 | 178000 | 0.6554 | 1.2649 |
1.3209 | 0.7830 | 180000 | 0.6558 | 1.2632 |
1.3153 | 0.7917 | 182000 | 0.6553 | 1.2649 |
1.3025 | 0.8004 | 184000 | 0.6560 | 1.2626 |
1.3146 | 0.8091 | 186000 | 0.6562 | 1.2619 |
1.3291 | 0.8178 | 188000 | 0.6563 | 1.2608 |
1.3062 | 0.8265 | 190000 | 0.6564 | 1.2598 |
1.3009 | 0.8352 | 192000 | 0.6566 | 1.2592 |
1.2943 | 0.8439 | 194000 | 0.6566 | 1.2588 |
1.2977 | 0.8526 | 196000 | 0.6567 | 1.2578 |
1.3073 | 0.8613 | 198000 | 0.6571 | 1.2565 |
1.2835 | 0.8700 | 200000 | 0.6575 | 1.2560 |
1.3019 | 0.8787 | 202000 | 0.6574 | 1.2554 |
1.3134 | 0.8874 | 204000 | 0.6578 | 1.2544 |
1.3103 | 0.8961 | 206000 | 0.6579 | 1.2534 |
1.2897 | 0.9048 | 208000 | 0.6579 | 1.2531 |
1.3014 | 0.9135 | 210000 | 0.6577 | 1.2524 |
1.304 | 0.9222 | 212000 | 0.6583 | 1.2514 |
1.3043 | 0.9309 | 214000 | 0.6581 | 1.2515 |
1.2887 | 0.9396 | 216000 | 0.6585 | 1.2497 |
1.3022 | 0.9483 | 218000 | 0.6585 | 1.2490 |
1.2773 | 0.9570 | 220000 | 0.6587 | 1.2490 |
1.3003 | 0.9657 | 222000 | 0.6589 | 1.2479 |
1.295 | 0.9744 | 224000 | 0.6589 | 1.2477 |
1.2978 | 0.9831 | 226000 | 0.6593 | 1.2466 |
1.3013 | 0.9918 | 228000 | 0.6593 | 1.2460 |
1.2879 | 1.0005 | 230000 | 0.6594 | 1.2450 |
1.2959 | 1.0092 | 232000 | 0.6595 | 1.2455 |
1.2831 | 1.0179 | 234000 | 0.6600 | 1.2436 |
1.2678 | 1.0266 | 236000 | 0.6599 | 1.2437 |
1.2723 | 1.0353 | 238000 | 0.6598 | 1.2435 |
1.2792 | 1.0440 | 240000 | 0.6599 | 1.2429 |
1.2707 | 1.0527 | 242000 | 0.6601 | 1.2422 |
1.2788 | 1.0614 | 244000 | 0.6604 | 1.2414 |
1.2667 | 1.0701 | 246000 | 0.6604 | 1.2410 |
1.2792 | 1.0788 | 248000 | 0.6605 | 1.2407 |
1.2748 | 1.0875 | 250000 | 0.6608 | 1.2399 |
1.2669 | 1.0962 | 252000 | 0.6611 | 1.2392 |
1.2729 | 1.1049 | 254000 | 0.6608 | 1.2391 |
1.263 | 1.1136 | 256000 | 0.6610 | 1.2387 |
1.2684 | 1.1223 | 258000 | 0.6611 | 1.2380 |
1.2638 | 1.1310 | 260000 | 0.6612 | 1.2374 |
1.2993 | 1.1397 | 262000 | 0.6615 | 1.2366 |
1.2842 | 1.1484 | 264000 | 0.6614 | 1.2364 |
1.2669 | 1.1571 | 266000 | 0.6618 | 1.2350 |
1.2698 | 1.1658 | 268000 | 0.6617 | 1.2353 |
1.264 | 1.1745 | 270000 | 0.6617 | 1.2347 |
1.278 | 1.1832 | 272000 | 0.6618 | 1.2342 |
1.269 | 1.1919 | 274000 | 0.6619 | 1.2341 |
1.271 | 1.2006 | 276000 | 0.6618 | 1.2345 |
1.2727 | 1.2093 | 278000 | 0.6618 | 1.2333 |
1.2703 | 1.2180 | 280000 | 0.6624 | 1.2328 |
1.2691 | 1.2267 | 282000 | 0.6625 | 1.2316 |
1.2771 | 1.2354 | 284000 | 0.6628 | 1.2304 |
1.2805 | 1.2441 | 286000 | 0.6626 | 1.2305 |
1.2646 | 1.2528 | 288000 | 0.6627 | 1.2304 |
1.2523 | 1.2615 | 290000 | 0.6628 | 1.2300 |
1.2802 | 1.2702 | 292000 | 0.6630 | 1.2288 |
1.2734 | 1.2789 | 294000 | 0.6628 | 1.2295 |
1.2625 | 1.2876 | 296000 | 0.6631 | 1.2287 |
1.2798 | 1.2963 | 298000 | 0.6632 | 1.2279 |
1.2524 | 1.3050 | 300000 | 0.6634 | 1.2274 |
1.2658 | 1.3137 | 302000 | 0.6634 | 1.2268 |
1.2692 | 1.3224 | 304000 | 0.6635 | 1.2267 |
1.26 | 1.3311 | 306000 | 0.6637 | 1.2261 |
1.2598 | 1.3398 | 308000 | 0.6636 | 1.2261 |
1.2689 | 1.3485 | 310000 | 0.6637 | 1.2258 |
1.2619 | 1.3572 | 312000 | 0.6638 | 1.2253 |
1.2382 | 1.3659 | 314000 | 0.6640 | 1.2247 |
1.2665 | 1.3746 | 316000 | 0.6638 | 1.2249 |
1.2451 | 1.3833 | 318000 | 0.6642 | 1.2231 |
1.2633 | 1.3920 | 320000 | 0.6643 | 1.2231 |
1.2521 | 1.4007 | 322000 | 0.6644 | 1.2224 |
1.2804 | 1.4094 | 324000 | 0.6644 | 1.2224 |
1.2505 | 1.4181 | 326000 | 0.6646 | 1.2220 |
1.2626 | 1.4268 | 328000 | 0.6646 | 1.2213 |
1.2631 | 1.4355 | 330000 | 0.6647 | 1.2211 |
1.2586 | 1.4442 | 332000 | 0.6646 | 1.2212 |
1.2642 | 1.4529 | 334000 | 0.6647 | 1.2210 |
1.2738 | 1.4616 | 336000 | 0.6648 | 1.2204 |
1.2564 | 1.4703 | 338000 | 0.6650 | 1.2198 |
1.2683 | 1.4790 | 340000 | 0.6649 | 1.2197 |
1.2591 | 1.4877 | 342000 | 0.6650 | 1.2194 |
1.2593 | 1.4964 | 344000 | 0.6651 | 1.2191 |
1.2528 | 1.5051 | 346000 | 0.6650 | 1.2190 |
1.2658 | 1.5138 | 348000 | 0.6654 | 1.2182 |
1.2568 | 1.5225 | 350000 | 0.6653 | 1.2179 |
1.2478 | 1.5312 | 352000 | 0.6653 | 1.2179 |
1.2649 | 1.5399 | 354000 | 0.6655 | 1.2171 |
1.271 | 1.5486 | 356000 | 0.6655 | 1.2172 |
1.2506 | 1.5573 | 358000 | 0.6656 | 1.2167 |
1.2516 | 1.5660 | 360000 | 0.6657 | 1.2165 |
1.2484 | 1.5747 | 362000 | 0.6657 | 1.2161 |
1.2417 | 1.5834 | 364000 | 0.6658 | 1.2159 |
1.2707 | 1.5921 | 366000 | 0.6660 | 1.2153 |
1.2597 | 1.6008 | 368000 | 0.6659 | 1.2151 |
1.2522 | 1.6095 | 370000 | 0.6660 | 1.2148 |
1.2593 | 1.6182 | 372000 | 0.6661 | 1.2143 |
1.2579 | 1.6269 | 374000 | 0.6661 | 1.2145 |
1.2385 | 1.6356 | 376000 | 0.6662 | 1.2139 |
1.25 | 1.6443 | 378000 | 0.6663 | 1.2136 |
1.2412 | 1.6530 | 380000 | 0.6664 | 1.2131 |
1.2242 | 1.6617 | 382000 | 0.6665 | 1.2132 |
1.2516 | 1.6704 | 384000 | 0.6665 | 1.2128 |
1.2533 | 1.6791 | 386000 | 0.6666 | 1.2122 |
1.2474 | 1.6878 | 388000 | 0.6667 | 1.2120 |
1.2405 | 1.6965 | 390000 | 0.6667 | 1.2119 |
1.2466 | 1.7052 | 392000 | 0.6666 | 1.2119 |
1.2443 | 1.7139 | 394000 | 0.6667 | 1.2115 |
1.2422 | 1.7226 | 396000 | 0.6668 | 1.2112 |
1.2298 | 1.7313 | 398000 | 0.6669 | 1.2111 |
1.2333 | 1.7400 | 400000 | 0.6669 | 1.2105 |
1.2491 | 1.7487 | 402000 | 0.6669 | 1.2105 |
1.2368 | 1.7574 | 404000 | 0.6671 | 1.2102 |
1.2435 | 1.7661 | 406000 | 0.6673 | 1.2097 |
1.2552 | 1.7748 | 408000 | 0.6673 | 1.2094 |
1.2509 | 1.7835 | 410000 | 0.6675 | 1.2089 |
1.2477 | 1.7922 | 412000 | 0.6673 | 1.2093 |
1.2395 | 1.8009 | 414000 | 0.6673 | 1.2087 |
1.2417 | 1.8096 | 416000 | 0.6674 | 1.2088 |
1.2526 | 1.8183 | 418000 | 0.6674 | 1.2085 |
1.2516 | 1.8270 | 420000 | 0.6675 | 1.2082 |
1.2542 | 1.8357 | 422000 | 0.6675 | 1.2083 |
1.2336 | 1.8444 | 424000 | 0.6676 | 1.2078 |
1.2376 | 1.8531 | 426000 | 0.6675 | 1.2079 |
1.2481 | 1.8618 | 428000 | 0.6678 | 1.2076 |
1.2409 | 1.8705 | 430000 | 0.6677 | 1.2073 |
1.2646 | 1.8792 | 432000 | 0.6677 | 1.2072 |
1.2329 | 1.8879 | 434000 | 0.6678 | 1.2070 |
1.2492 | 1.8966 | 436000 | 0.6679 | 1.2067 |
1.2362 | 1.9053 | 438000 | 0.6678 | 1.2069 |
1.2625 | 1.9140 | 440000 | 0.6679 | 1.2066 |
1.2336 | 1.9227 | 442000 | 0.6680 | 1.2065 |
1.2393 | 1.9314 | 444000 | 0.6680 | 1.2063 |
1.2393 | 1.9401 | 446000 | 0.6680 | 1.2062 |
1.2454 | 1.9488 | 448000 | 0.6680 | 1.2060 |
1.2429 | 1.9575 | 450000 | 0.6680 | 1.2059 |
1.2477 | 1.9662 | 452000 | 0.6681 | 1.2059 |
1.2356 | 1.9749 | 454000 | 0.6681 | 1.2059 |
1.242 | 1.9836 | 456000 | 0.6681 | 1.2057 |
1.2324 | 1.9923 | 458000 | 0.6681 | 1.2058 |
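Rows in this layout are easy to load for plotting or further analysis. The sketch below parses a four-row excerpt copied from the table and picks the row with the lowest validation loss. The names `parse_row` and `ROWS` are illustrative, and pasting the full table in place of the excerpt should work the same way.

```python
# Excerpt of rows copied from the table above, in the same column order:
# Training Loss | Epoch | Step | Accuracy | Validation Loss |
ROWS = """\
1.764 | 0.0087 | 2000 | 0.5790 | 1.6696 |
1.2879 | 1.0005 | 230000 | 0.6594 | 1.2450 |
1.242 | 1.9836 | 456000 | 0.6681 | 1.2057 |
1.2324 | 1.9923 | 458000 | 0.6681 | 1.2058 |
"""


def parse_row(line):
    """Turn one pipe-separated table row into a dict of numbers."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    train_loss, epoch, step, accuracy, val_loss = cells
    return {
        "train_loss": float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "accuracy": float(accuracy),
        "val_loss": float(val_loss),
    }


records = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
best = min(records, key=lambda r: r["val_loss"])
print(best["step"], best["val_loss"])  # prints "456000 1.2057"
```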
Framework versions
- Transformers 4.41.2
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
Model tree for caomingjun/storyteller
Base model
openai-community/gpt2