---
language: ko
license: cc-by-nc-sa-4.0
tags:
- gpt2
---
# Model Card for kogpt2-base-v2
# Model Details
## Model Description
[GPT-2](https://openai.com/blog/better-language-models/) is a language model trained to predict the next word of a given text, and it is optimized for sentence generation. `KoGPT2` is a Korean decoder language model trained on more than 40GB of text to overcome the limited Korean-language performance.
- **Developed by:** SK Telecom
- **Shared by [Optional]:** SK Telecom
- **Model type:** Text Generation
- **Language(s) (NLP):** Korean
- **License:** cc-by-nc-sa-4.0
- **Parent Model:** GPT-2
- **Resources for more information:**
  - [GitHub Repo](https://github.com/SKT-AI/KoGPT2/tree/master)
  - [Model Demo Space](https://huggingface.co/spaces/gogamza/kogpt2-base-v2)
# Uses
## Direct Use
This model can be used for text generation.
## Downstream Use [Optional]
More information needed.
## Out-of-Scope Use
The model should not be used to intentionally create hostile or alienating environments for people.
# Bias, Risks, and Limitations
Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
## Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
# Training Details
## Training Data
The model authors note in the [GitHub Repo](https://github.com/SKT-AI/KoGPT2/tree/master):
The tokenizer was trained with the `Character BPE tokenizer` from the [`tokenizers`](https://github.com/huggingface/tokenizers) package.
The vocabulary size is 51,200, and emoticons and emoji that are common in conversation, such as the ones below, were added to improve recognition of those tokens.
> 😀, 😁, 😆, 😅, 🤣, .. , `:-)`, `:)`, `-)`, `(-:`...

In addition to [Korean Wikipedia](https://ko.wikipedia.org/), a variety of data such as news, [Modu Corpus v1.0](https://corpus.korean.go.kr/), and [Blue House National Petitions](https://github.com/akngs/petitions) was used to train the model.
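The tokenizer setup described above (a Character BPE tokenizer from the `tokenizers` package, with emoticons added to the vocabulary) can be sketched roughly as follows. This is a minimal illustration, not the authors' training script: the toy corpus, the tiny `vocab_size`, and the choice to register the emoticons as special tokens are all assumptions made for the example.

```python
from tokenizers import CharBPETokenizer

# Hypothetical toy corpus; the real tokenizer was trained on far more text.
corpus = [
    "안녕하세요 :)",
    "오늘 날씨가 좋아요 😀",
    "만나서 반갑습니다 :-)",
]

# A few of the emoticons and emoji listed above. Registering them as special
# tokens is an assumption about how they were "added" to the vocabulary.
emoticons = ["😀", "😁", ":-)", ":)"]

tokenizer = CharBPETokenizer()
tokenizer.train_from_iterator(
    corpus,
    vocab_size=200,   # toy value; the model card states 51,200
    min_frequency=1,
    special_tokens=["<unk>"] + emoticons,
    show_progress=False,
)

# Registered emoticons are matched as whole tokens instead of being split
# into punctuation characters.
tokens = tokenizer.encode("좋아요 :)").tokens
print(tokens)
```

Without the registration step, a sequence such as `:)` would typically be broken into separate punctuation pieces, which is the behaviour the added tokens are meant to avoid.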
## Training Procedure
### Preprocessing
More information needed
### Speeds, Sizes, Times
| Model | # of params | Type | # of layers | # of heads | ffn_dim | hidden_dims |
|--------------|:----:|:-------:|--------:|--------:|--------:|--------------:|
| `kogpt2-base-v2` | 125M | Decoder | 12 | 12 | 3072 | 768 |
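As a sanity check on the table, the parameter count can be approximated from the listed dimensions. The sketch below assumes the standard GPT-2 block layout (fused QKV projection, two-layer MLP, two layer norms per block, output embeddings tied to the input embeddings) and a context length of 1024 positions; neither assumption is stated in this card. The vocabulary size of 51,200 is taken from the Training Data section.

```python
def gpt2_param_count(vocab_size: int, n_positions: int, n_layers: int,
                     hidden: int, ffn_dim: int) -> int:
    """Approximate parameter count of a GPT-2 style decoder (tied output embeddings)."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * hidden + n_positions * hidden
    # Fused QKV projection and attention output projection, weights plus biases.
    attention = (hidden * 3 * hidden + 3 * hidden) + (hidden * hidden + hidden)
    # Two-layer feed-forward network, weights plus biases.
    mlp = (hidden * ffn_dim + ffn_dim) + (ffn_dim * hidden + hidden)
    # Two layer norms per block, each with a scale and a bias vector.
    layer_norms = 2 * (2 * hidden)
    # Final layer norm after the last block.
    final_layer_norm = 2 * hidden
    return embeddings + n_layers * (attention + mlp + layer_norms) + final_layer_norm


total = gpt2_param_count(
    vocab_size=51_200,   # from the Training Data section
    n_positions=1024,    # assumed; the GPT-2 default context length
    n_layers=12,
    hidden=768,
    ffn_dim=3072,
)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
# 125,164,032 parameters (~125M)
```

The number of attention heads does not enter the count, because the heads only partition the hidden dimension.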
# Evaluation
## Testing Data, Factors & Metrics
### Testing Data
More information needed
### Factors
More information needed
### Metrics
More information needed
## Results
### Classification or Regression
| | [NSMC](https://github.com/e9t/nsmc) (accuracy) | [KorSTS](https://github.com/kakaobrain/KorNLUDatasets) (Spearman) |
|---|---|---|
| **KoGPT2 2.0** | 89.1 | 77.8 |
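For readers unfamiliar with the two metrics in the table, the sketch below computes accuracy (as used for NSMC sentiment classification) and Spearman rank correlation (as used for KorSTS similarity scores) in plain Python. The labels and scores are made-up toy values, not model outputs, and the table's numbers appear to be on a 0 to 100 scale, which is assumed here when printing.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data, invented for illustration only.
gold_labels, predicted_labels = [1, 0, 1, 1], [1, 0, 0, 1]
gold_scores, predicted_scores = [0.5, 2.0, 3.5, 5.0], [0.1, 0.4, 0.3, 0.9]

print(f"accuracy: {100 * accuracy(gold_labels, predicted_labels):.1f}")
print(f"spearman: {100 * spearman(gold_scores, predicted_scores):.1f}")
```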
# Model Examination
More information needed
# Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** More information needed
- **Hours used:** More information needed
- **Cloud Provider:** More information needed
- **Compute Region:** More information needed
- **Carbon Emitted:** More information needed
# Technical Specifications [optional]
## Model Architecture and Objective
More information needed
## Compute Infrastructure
More information needed
### Hardware
More information needed
### Software
More information needed.
# Citation
**BibTeX:**
More information needed
# Glossary [optional]
More information needed
# More Information [optional]
More information needed
# Model Card Authors [optional]
SK Telecom in collaboration with Ezi Ozoani and the Hugging Face team
# Model Card Contact
The model authors note in the [GitHub Repo](https://github.com/SKT-AI/KoGPT2/tree/master):
> Please post `KoGPT2`-related issues [here](https://github.com/SKT-AI/KoGPT2/issues).
# How to Get Started with the Model
Use the code below to get started with the model.
<details>
<summary> Click to expand </summary>
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Download (or load from the local cache) the tokenizer and the model weights.
tokenizer = AutoTokenizer.from_pretrained("skt/kogpt2-base-v2")
model = AutoModelForCausalLM.from_pretrained("skt/kogpt2-base-v2")
```
</details>
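Loading the model is only the first step, so a minimal generation sketch may be useful. The special-token strings mirror the usage snippet in the linked GitHub repo. The helper name `generate_text` and the decoding settings (greedy decoding, `repetition_penalty=2.0`, 32 new tokens) are illustrative assumptions, not recommendations from the model authors.

```python
# Special-token strings as used in the linked GitHub repo's usage snippet.
SPECIAL_TOKENS = {
    "bos_token": "</s>",
    "eos_token": "</s>",
    "unk_token": "<unk>",
    "pad_token": "<pad>",
    "mask_token": "<mask>",
}


def generate_text(prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy continuation of `prompt` (hypothetical helper, not an official API)."""
    # Imported here so that defining the helper does not load the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("skt/kogpt2-base-v2", **SPECIAL_TOKENS)
    model = AutoModelForCausalLM.from_pretrained("skt/kogpt2-base-v2")
    model.eval()

    # Encode the prompt as a batch of one sequence of token ids.
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=False,         # greedy decoding
        repetition_penalty=2.0,  # illustrative value
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )
    # The decoded string contains the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("안녕하세요")` downloads the weights on first use and requires PyTorch. Sampling-based decoding (`do_sample=True` with `top_k` or `top_p`) is a common alternative when more varied output is wanted.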