---
license: mit
datasets:
- Calvin-Xu/FLFL-Aozora-Speech-Train
language:
- ja
metrics:
- sacrebleu
pipeline_tag: text2text-generation
---

# FLFL フリフリ
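
A furigana (ruby) generation model: given a Japanese sentence, it returns the same sentence with the reading of each kanji span marked up in `<ruby>`/`<rt>` tags.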

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Prefer bfloat16 on GPUs that support it; otherwise fall back to float16.
torch_dtype = torch.bfloat16 if torch.cuda.is_available() and hasattr(torch.cuda, "is_bf16_supported") and torch.cuda.is_bf16_supported() else torch.float16
model = AutoModelForCausalLM.from_pretrained("Calvin-Xu/FLFL", device_map="auto", torch_dtype=torch_dtype)
tokenizer = AutoTokenizer.from_pretrained("Calvin-Xu/FLFL")

prompt_template = """[INST] {instruction}\n{input}\n[/INST]\n"""
sentence = "国境の長いトンネルを抜けると雪国であった"

# The instruction reads: "Please add accurate furigana to the following sentence."
inputs = tokenizer(prompt_template.format(instruction="次の文に正確に振り仮名を付けてください", input=sentence), return_tensors="pt").to(model.device)
with torch.no_grad():
    tokens = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# skip_special_tokens=False keeps the <ruby>/<rt> control tokens and <|endoftext|> in the decoded text.
output = tokenizer.decode(tokens[0], skip_special_tokens=False)
print(output)
# Output (after the echoed prompt):
# <ruby>国境<rt>くにざかい</rt></ruby>の<ruby>長<rt>なが</rt></ruby>いトンネルを<ruby>抜<rt>ぬ</rt></ruby>けると<ruby>雪国<rt>ゆきぐに</rt></ruby>であった<|endoftext|>
```
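
Because `generate` returns the prompt followed by the new tokens, the decoded string usually needs trimming before use. The following is a minimal sketch of that step using plain string handling; the helper names `build_prompt` and `extract_annotation` are hypothetical and not part of the model or of `transformers`.

```python
PROMPT_TEMPLATE = "[INST] {instruction}\n{input}\n[/INST]\n"
INSTRUCTION = "次の文に正確に振り仮名を付けてください"
END_OF_PROMPT = "[/INST]"
EOS = "<|endoftext|>"


def build_prompt(sentence: str) -> str:
    """Fill the prompt template shown above with a single sentence."""
    return PROMPT_TEMPLATE.format(instruction=INSTRUCTION, input=sentence)


def extract_annotation(decoded: str) -> str:
    """Return only the ruby-annotated sentence from a decoded generation.

    Hypothetical helper: keeps the text after the last [/INST] marker
    (or the whole string if the marker is absent), cuts at the first
    <|endoftext|>, and trims surrounding whitespace.
    """
    _, _, tail = decoded.rpartition(END_OF_PROMPT)
    return tail.split(EOS, 1)[0].strip()


decoded = build_prompt("国境") + "<ruby>国境<rt>くにざかい</rt></ruby><|endoftext|>"
print(extract_annotation(decoded))
```

Slicing the generated token IDs at the prompt length before decoding would be an alternative; the string-based version is shown only because it needs no model to try out.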

### Finetuned from

[stockmark/gpt-neox-japanese-1.4b](https://huggingface.co/stockmark/gpt-neox-japanese-1.4b)

### Training Dataset

Trained for slightly over one epoch on [Calvin-Xu/FLFL-Aozora-Speech-Train](https://huggingface.co/datasets/Calvin-Xu/FLFL-Aozora-Speech-Train).

### Training Settings

HuggingFace Trainer, PEFT (r=64, alpha=128)

Control tokens added: `[INST]`, ` [/INST]`, `<ruby>`, `</ruby>`, `<rt>`, `</rt>`
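
Since the annotation is built from the four ruby control tokens listed above, a generated string can be sanity-checked for tag order before it is rendered. The sketch below is an illustration only: `is_well_formed` is a hypothetical helper, and it checks the sequence of tags, not whether the readings are correct.

```python
import re

RUBY_TOKENS = ("<ruby>", "<rt>", "</rt>", "</ruby>")
# Split on the tokens while keeping them in the result (capturing group).
_SPLIT = re.compile("(" + "|".join(re.escape(t) for t in RUBY_TOKENS) + ")")

# For each state, the only ruby token allowed to come next.
_NEXT = {"text": "<ruby>", "<ruby>": "<rt>", "<rt>": "</rt>", "</rt>": "</ruby>"}


def is_well_formed(annotation: str) -> bool:
    """True if tags always appear as <ruby> ... <rt> ... </rt></ruby>."""
    state = "text"
    for piece in _SPLIT.split(annotation):
        if piece not in RUBY_TOKENS:
            continue  # plain text is allowed anywhere; only tag order is checked
        if piece != _NEXT[state]:
            return False
        state = "text" if piece == "</ruby>" else piece
    return state == "text"  # no ruby element left open


print(is_well_formed("<ruby>長<rt>なが</rt></ruby>いトンネル"))
print(is_well_formed("<ruby>長<rt>なが</ruby>いトンネル"))
```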

### Output Examples

```
[INST] 次の文に正確に振り仮名を付けてください

国境の長いトンネルを抜けると雪国であった

[/INST]

<ruby>国境<rt>くにざかい</rt></ruby>の<ruby>長<rt>なが</rt></ruby>いトンネルを<ruby>抜<rt>ぬ</rt></ruby>けると<ruby>雪国<rt>ゆきぐに</rt></ruby>であった<|endoftext|>
```

- <ruby>鰤<rt>ぶり</rt></ruby>の<ruby>照<rt>て</rt></ruby>り<ruby>焼<rt>や</rt></ruby>き、<ruby>八宝菜<rt>はっぽうさい</rt></ruby>、ハンバーグ。<|endoftext|>

- <ruby>主菜<rt>しゅさい</rt></ruby><ruby>関連<rt>かんれん</rt></ruby>は、<ruby>見事<rt>みごと</rt></ruby>なまでの<ruby>和洋<rt>わよう</rt></ruby><ruby>中<rt>ちゅう</rt></ruby><ruby>折衷<rt>せっちゅう</rt></ruby>。<|endoftext|>

- <ruby>別<rt>べつ</rt></ruby>の<ruby>者<rt>もの</rt></ruby>の<ruby>目<rt>め</rt></ruby>を<ruby>通<rt>つう</rt></ruby>じて<ruby>歴史<rt>れきし</rt></ruby>を<ruby>垣間見<rt>かいまみ</rt></ruby>ることは、<ruby>想像<rt>そうぞう</rt></ruby>を<ruby>超<rt>こ</rt></ruby>えた<ruby>体験<rt>たいけん</rt></ruby>に<ruby>違<rt>ちが</rt></ruby>いない!<|endoftext|>
|
|
|
- <ruby>ๆญข<rt>ใจ</rt></ruby>ใใใชใใใใฎ<ruby>ๅคงๆฌ<rt>ใใใใจ</rt></ruby>ใ<ruby>ๆ น็ตถ<rt>ใญใ </rt></ruby>ใใใซใใชใใจ<ruby>ๅนๆ<rt>ใใใ</rt></ruby>ใใชใใ<|endoftext|> |
|
|
|
- <ruby>ไธไบบๆฐ<rt>ใตใซใใ</rt></ruby><ruby>้ๆ<rt>ใใถ</rt></ruby>ใงใใ<ruby>ไปฅไธ<rt>ใใใใ</rt></ruby><ruby>ไพกๅค<rt>ใใก</rt></ruby>ใ<ruby>ไธ<rt>ใ</rt></ruby>ใใใใใชใใใใใปใจใใฉ<ruby>ๅบๅค<rt>ใใใญ</rt></ruby>ใ <|endoftext|> |
|
|
|
- <ruby>ๆ้<rt>ใใใ</rt></ruby>ใฎ<ruby>ๆพฑ<rt>ใใ</rt></ruby>ใฎ<ruby>ไธญ<rt>ใชใ</rt></ruby>ใซ<ruby>ๆฒๆฎฟ<rt>ใกใใใ</rt></ruby>ใใฆใใใใใ ใ<|endoftext|> |
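
Outputs in this format can be turned into (base, reading) pairs, or collapsed back to plain text, with a single regular expression. This is a minimal sketch assuming the flat, non-nested `<ruby>base<rt>reading</rt></ruby>` shape seen in the examples above; the function names are hypothetical.

```python
import re

# One ruby element: base text, then its reading inside <rt>...</rt>.
RUBY = re.compile(r"<ruby>(.*?)<rt>(.*?)</rt></ruby>")


def ruby_pairs(annotation: str) -> list[tuple[str, str]]:
    """List every (base, reading) pair in order of appearance."""
    return RUBY.findall(annotation)


def strip_ruby(annotation: str) -> str:
    """Drop the readings and tags, recovering the original sentence."""
    return RUBY.sub(r"\1", annotation)


def reading_only(annotation: str) -> str:
    """Replace each annotated span with its reading."""
    return RUBY.sub(r"\2", annotation)


example = (
    "<ruby>国境<rt>くにざかい</rt></ruby>の<ruby>長<rt>なが</rt></ruby>いトンネルを"
    "<ruby>抜<rt>ぬ</rt></ruby>けると<ruby>雪国<rt>ゆきぐに</rt></ruby>であった"
)
print(ruby_pairs(example))
print(strip_ruby(example))
print(reading_only(example))
```

Comparing `strip_ruby(output)` with the input sentence is a cheap way to detect generations that altered the base text.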