Update README.md

---
license: apache-2.0
tags:
- generated_from_trainer
- polyglot-ko
- gpt-neox
- KoAlpaca
model-index:
- name: polyglot-12.8b-koalpaca-v1.1b
  results: []
language:
- ko
datasets:
- KoAlpaca-v1.1b
pipeline_tag: text-generation
---

# polyglot-12.8b-koalpaca-v1.1b

This model is a fine-tuned version of [EleutherAI/polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) on the KoAlpaca v1.1b dataset.
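
Since the card sets `pipeline_tag: text-generation`, a minimal inference sketch with the Hugging Face Transformers `pipeline` API follows. The hub id is a placeholder for this repository, and the question/answer prompt format is an assumption borrowed from KoAlpaca conventions rather than something this card states.

```python
# Minimal inference sketch. Assumptions: the model id below is a placeholder
# for this repo's full hub id, and the prompt follows the KoAlpaca Q/A style.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="polyglot-12.8b-koalpaca-v1.1b",  # placeholder: replace with the full hub id
    torch_dtype=torch.float16,  # assumption: fp16 so the 12.8B model fits in GPU memory
    device_map="auto",          # requires accelerate; spreads layers across available GPUs
)

# "Question: What is the capital of Korea? / Answer:" in the assumed KoAlpaca format
prompt = "### 질문: 한국의 수도는 어디인가요?\n\n### 답변:"
result = generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```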

## Training procedure

The following hyperparameters were used during training (see the sketch after the list for how the batch-size figures combine):
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU (A100 80G)
- num_devices: 4
- gradient_accumulation_steps: 64
- total_train_batch_size: 256
- lr_scheduler_type: linear
- num_epochs: 2.0
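
The batch-size figures above are mutually consistent: 1 per device × 4 devices × 64 accumulation steps = 256. Below is a short sketch of that accounting using the standard `TrainingArguments` mapping; `output_dir` and the `num_devices` variable are illustrative, not taken from this card.

```python
# Sanity check: how the reported batch-size numbers combine.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="polyglot-12.8b-koalpaca-v1.1b",  # illustrative name
    per_device_train_batch_size=1,   # train_batch_size above
    per_device_eval_batch_size=8,    # eval_batch_size above
    gradient_accumulation_steps=64,
    lr_scheduler_type="linear",
    num_train_epochs=2.0,
    seed=42,
)

num_devices = 4  # reported above (4 x A100 80G)
total_train_batch_size = (
    args.per_device_train_batch_size * num_devices * args.gradient_accumulation_steps
)
assert total_train_batch_size == 256  # matches total_train_batch_size above
```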

### Framework versions

- Transformers 4.28.1
- Pytorch 2.0.0+cu117
- Datasets 2.11.0
- Tokenizers 0.13.3