---
tags:
- generated_from_trainer
model-index:
- name: 100Kopenwebtextgptlite_OpenWebText100K
  results: []
datasets:
- Elriggs/openwebtext-100k
language:
- en
pipeline_tag: text-generation
---

# 100Kopenwebtextgptlite_OpenWebText100K

This model was trained on the [openwebtext-100k](https://huggingface.co/datasets/Elriggs/openwebtext-100k) dataset.
It achieves the following results on the evaluation set:
- Loss: 5.3490

## Model description

The model uses the GPT-2 architecture.

## Intended uses & limitations

The main limitation of this model is that its evaluation loss is still quite high: the final loss of 5.3490 corresponds to a perplexity of roughly exp(5.3490) ≈ 210, so the model is not suitable for practical text generation.

## Training and evaluation data

Training and evaluation loss curves are available in [this Weights & Biases run](https://wandb.ai/111202113467/huggingface/runs/55wdfxkx/workspace?workspace=user-111202113467).

## Training procedure

- Dataset: `openwebtext-100k`
- Training hardware: 2x T4 GPUs from Kaggle
- Steps: download the dataset -> set up hyperparameters -> train (sketched under "Reproduction sketch" at the end of this card)

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 7.2717        | 0.36  | 1000  | 6.6089          |
| 6.4412        | 0.71  | 2000  | 6.2425          |
| 6.1733        | 1.07  | 3000  | 6.0212          |
| 5.9827        | 1.42  | 4000  | 5.8614          |
| 5.8549        | 1.78  | 5000  | 5.7380          |
| 5.7444        | 2.13  | 6000  | 5.6440          |
| 5.6548        | 2.49  | 7000  | 5.5686          |
| 5.5952        | 2.84  | 8000  | 5.5093          |
| 5.5363        | 3.2   | 9000  | 5.4604          |
| 5.4867        | 3.55  | 10000 | 5.4216          |
| 5.4578        | 3.91  | 11000 | 5.3911          |
| 5.4288        | 4.27  | 12000 | 5.3697          |
| 5.4082        | 4.62  | 13000 | 5.3555          |
| 5.4009        | 4.98  | 14000 | 5.3490          |

### Framework versions

- Transformers 4.38.1
- Pytorch 2.1.2
- Datasets 2.17.1
- Tokenizers 0.15.1
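
## Reproduction sketch

The training pipeline described above (download the dataset, set up hyperparameters, train) can be sketched with the `datasets` and `transformers` libraries. This is a minimal sketch rather than the exact training script: it assumes the dataset exposes a `text` column (as the original OpenWebText does) and uses `max_length=512`, which this card does not specify.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load the 100k-document OpenWebText subset named in this card.
dataset = load_dataset("Elriggs/openwebtext-100k", split="train")

# GPT-2 tokenizer, matching the GPT-2 architecture stated above.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token

def tokenize(batch):
    # max_length=512 is an assumption; the card does not state the context length.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
```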
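
Continuing the sketch, the hyperparameter list above maps directly onto `transformers.TrainingArguments`. Assumptions beyond what the card states: the reported batch sizes are treated as per-device values, evaluation runs every 1000 steps (consistent with the results table), the eval split is a 5% hold-out, and training starts from a freshly initialized GPT-2 config, which fits a loss curve that begins above 7. The Adam betas and epsilon listed above are the `Trainer` defaults, so they need no explicit arguments.

```python
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2Config,
    GPT2LMHeadModel,
    Trainer,
    TrainingArguments,
)

# Freshly initialized GPT-2 weights (an assumption; see the lead-in above).
model = GPT2LMHeadModel(GPT2Config())

# `tokenized` and `tokenizer` come from the previous block.
splits = tokenized.train_test_split(test_size=0.05, seed=42)  # assumed eval ratio

training_args = TrainingArguments(
    output_dir="100Kopenwebtextgptlite_OpenWebText100K",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    fp16=True,                    # Native AMP mixed precision
    evaluation_strategy="steps",
    eval_steps=1000,              # validation loss was logged every 1000 steps
    logging_steps=1000,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```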
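
## How to use

Despite the limitations noted above, the model can be loaded for quick experiments with the standard `pipeline` API. The repo id below is a hypothetical placeholder, since this card does not state the model's Hub path; substitute the actual repository name.

```python
from transformers import pipeline

# "your-username/100Kopenwebtextgptlite_OpenWebText100K" is a hypothetical repo id.
generator = pipeline(
    "text-generation",
    model="your-username/100Kopenwebtextgptlite_OpenWebText100K",
)

print(generator("The meaning of life is", max_new_tokens=40)[0]["generated_text"])
```

Given the perplexity of roughly 210 noted above, outputs will be largely incoherent; that is expected at this loss level.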