---
license: bsd-3-clause
language:
- ja
tags:
- vibrato
---

# Vibrato Model Archive (BSD License Only)

This archive collects every model from the [Vibrato](https://github.com/daac-tools/vibrato) GitHub releases that is identified as BSD-licensed.
Models that were originally compressed with zstd have been decompressed, so they can be downloaded and used directly.

## Available Models

- bccwj-suw+unidic-cwj-3_1_1+compact-dual
- bccwj-suw+unidic-cwj-3_1_1+compact
- bccwj-suw+unidic-cwj-3_1_1-extracted+compact-dual
- bccwj-suw+unidic-cwj-3_1_1-extracted+compact
- bccwj-suw+unidic-cwj-3_1_1
- jumandic-mecab-7_0
- unidic-cwj-3_1_1+compact-dual
- unidic-cwj-3_1_1+compact
- unidic-cwj-3_1_1
- unidic-mecab-2_1_2

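Each name in the list above is used as a directory prefix when downloading, following the `<<model_name>>/system.dic` pattern from the Usage section. The sketch below shows one way to build that path and catch typos in the model name before any download starts. `AVAILABLE_MODELS` and `dictionary_filename` are hypothetical names introduced for illustration; they are not part of `vibrato` or `huggingface_hub`.

```python
# Model names copied from the "Available Models" list in this README.
AVAILABLE_MODELS = (
    "bccwj-suw+unidic-cwj-3_1_1+compact-dual",
    "bccwj-suw+unidic-cwj-3_1_1+compact",
    "bccwj-suw+unidic-cwj-3_1_1-extracted+compact-dual",
    "bccwj-suw+unidic-cwj-3_1_1-extracted+compact",
    "bccwj-suw+unidic-cwj-3_1_1",
    "jumandic-mecab-7_0",
    "unidic-cwj-3_1_1+compact-dual",
    "unidic-cwj-3_1_1+compact",
    "unidic-cwj-3_1_1",
    "unidic-mecab-2_1_2",
)


def dictionary_filename(model_name: str) -> str:
    """Return the repo-relative path of a model's dictionary file.

    Hypothetical helper: it only assumes the `<model_name>/system.dic`
    layout shown in the Usage section.
    """
    if model_name not in AVAILABLE_MODELS:
        raise ValueError(
            f"unknown model {model_name!r}; expected one of: "
            + ", ".join(AVAILABLE_MODELS)
        )
    return f"{model_name}/system.dic"


print(dictionary_filename("jumandic-mecab-7_0"))  # jumandic-mecab-7_0/system.dic
```

The returned string can be passed as the second argument to `hf_hub_download` in place of the placeholder path.
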
## Usage

```python
from huggingface_hub import hf_hub_download
import vibrato

# Download the dictionary (cached locally by huggingface_hub) and load the tokenizer.
# Replace <<model_name>> with one of the model names listed above.
model_path = hf_hub_download("ryan-minato/vibrato-models-bsdonly", "<<model_name>>/system.dic")
with open(model_path, "rb") as f:
    tokenizer = vibrato.Vibrato(f.read())

# Sample Japanese text to tokenize.
text = """\
「四十二だと!」ルーンクォールが叫んだ。
「七百五十万年かけて、それだけか?」
「何度も徹底的に検算しました」コンピュータが応じた。
「まちがいなくそれが答えです。率直なところ、みなさんのほうで究極の疑問が何であるかわかっていなかったところに問題があるのです」
"""

tokens = tokenizer.tokenize(text)
```

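To inspect the result of `tokenizer.tokenize(text)`, the tokens can be flattened into tab-separated lines. This is a minimal sketch, assuming (as in the python-vibrato documentation) that the returned token list can be iterated and that each token provides `surface()` and `feature()` methods. `tokens_to_rows` and `format_rows` are hypothetical helper names, not part of the library.

```python
from typing import Iterable, List, Tuple


def tokens_to_rows(tokens: Iterable) -> List[Tuple[str, str]]:
    """Collect (surface, feature) pairs from an iterable of tokens.

    Assumes each token has surface() and feature() methods.
    """
    return [(token.surface(), token.feature()) for token in tokens]


def format_rows(rows: Iterable[Tuple[str, str]]) -> str:
    """Render one token per line as 'surface<TAB>feature'."""
    return "\n".join(f"{surface}\t{feature}" for surface, feature in rows)
```

With a loaded tokenizer, `print(format_rows(tokens_to_rows(tokenizer.tokenize(text))))` would print one token per line. The exact feature strings depend on the dictionary that was loaded.
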