# text_generation_bangla_model
The BanglaCLM corpus (≈26.24 GB in total) combines four sources:
- OSCAR: 12.84 GB
- Wikipedia dump: 6.24 GB
- ProthomAlo: 3.92 GB
- Kalerkantho: 3.24 GB
## Model description
- Context size: 128
## Training and evaluation data
The BanglaCLM dataset is divided into a training set (90%) and a validation set (10%).
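The 90/10 split can be sketched as follows. This is an illustrative reconstruction, not the authors' actual preprocessing code; `corpus` and the seed are placeholders.

```python
import random

def split_corpus(corpus, valid_fraction=0.1, seed=42):
    """Shuffle a list of documents and split it into train/validation sets.

    Illustrative only: the seed and shuffling strategy are assumptions,
    not details stated in the model card.
    """
    rng = random.Random(seed)
    docs = list(corpus)
    rng.shuffle(docs)
    n_valid = int(len(docs) * valid_fraction)
    return docs[n_valid:], docs[:n_valid]

# Example with 100 dummy documents: 90 train, 10 validation.
train_docs, valid_docs = split_corpus([f"doc{i}" for i in range(100)])
```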
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- Batch size: 32
- Initial learning rate: 5e-5
- Number of warmup steps: 10000
- Weight decay rate: 0.01
- Tokenization algorithm: BPE
- Vocabulary size of tokenizer: 50256
- Total trainable params: 124,439,808
- Epochs: 40
- Number of training steps: 40772228
- training_precision: float32
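The learning-rate schedule implied by the hyperparameters above (initial rate 5e-5, 10,000 warmup steps) can be sketched as a warmup-then-decay function. The linear decay shape after warmup is an assumption for illustration; the card does not state the exact schedule.

```python
def lr_at_step(step, init_lr=5e-5, warmup_steps=10_000, total_steps=40_772_228):
    """Learning rate at a given training step.

    Linear warmup from 0 to `init_lr` over `warmup_steps`, then linear
    decay to 0 over the remaining steps. The decay shape is an assumed
    convention, not confirmed by the model card.
    """
    if step < warmup_steps:
        return init_lr * step / warmup_steps
    remaining = total_steps - step
    return init_lr * max(0.0, remaining / (total_steps - warmup_steps))
```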
### Training results
The model achieved a perplexity score of 2.86.
### Framework versions
- Transformers 4.26.1
- TensorFlow 2.11.0
- Datasets 2.10.0
- Tokenizers 0.13.2
### Citation
If you find this model helpful, please cite the following paper:
```
@INPROCEEDINGS{10303383,
author={Salim, Md. Shahidul and Murad, Hasan and Das, Dola and Ahmed, Faisal},
booktitle={2023 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD)},
title={BanglaGPT: A Generative Pretrained Transformer-Based Model for Bangla Language},
year={2023},
volume={},
number={},
pages={56-59},
doi={10.1109/ICICT4SD59951.2023.10303383}}
```