metadata
license: bsd-3-clause
language:
  - ja
tags:
  - vibrato

Vibrato Model Archive (BSD License Only)

This archive collects every model from the Vibrato GitHub releases that is distributed under a BSD license. The release files were compressed with zstd; the copies here are already decompressed, so they can be downloaded and loaded directly.

Available Models

  • bccwj-suw+unidic-cwj-3_1_1+compact-dual
  • bccwj-suw+unidic-cwj-3_1_1+compact
  • bccwj-suw+unidic-cwj-3_1_1-extracted+compact-dual
  • bccwj-suw+unidic-cwj-3_1_1-extracted+compact
  • bccwj-suw+unidic-cwj-3_1_1
  • jumandic-mecab-7_0
  • unidic-cwj-3_1_1+compact-dual
  • unidic-cwj-3_1_1+compact
  • unidic-cwj-3_1_1
  • unidic-mecab-2_1_2
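Since each model lives in its own directory, a small helper can validate a model name before anything is downloaded. The sketch below is illustrative only: `AVAILABLE_MODELS`, `dictionary_filename`, and `download_dictionary` are hypothetical names, and the `<model_name>/system.dic` layout is assumed from the usage snippet in this README.

```python
# Model directory names copied from the "Available Models" list.
AVAILABLE_MODELS = (
    "bccwj-suw+unidic-cwj-3_1_1+compact-dual",
    "bccwj-suw+unidic-cwj-3_1_1+compact",
    "bccwj-suw+unidic-cwj-3_1_1-extracted+compact-dual",
    "bccwj-suw+unidic-cwj-3_1_1-extracted+compact",
    "bccwj-suw+unidic-cwj-3_1_1",
    "jumandic-mecab-7_0",
    "unidic-cwj-3_1_1+compact-dual",
    "unidic-cwj-3_1_1+compact",
    "unidic-cwj-3_1_1",
    "unidic-mecab-2_1_2",
)

REPO_ID = "ryan-minato/vibrato-models-bsdonly"


def dictionary_filename(model_name: str) -> str:
    """Return the in-repo path of a model's dictionary (assumed layout)."""
    if model_name not in AVAILABLE_MODELS:
        raise ValueError(f"unknown model name: {model_name!r}")
    return f"{model_name}/system.dic"


def download_dictionary(model_name: str) -> str:
    """Download the dictionary and return the path of the cached local file."""
    # Imported lazily so the name check above works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=dictionary_filename(model_name))
```

For example, `download_dictionary("jumandic-mecab-7_0")` would fetch `jumandic-mecab-7_0/system.dic`, while a misspelled name fails early with a `ValueError` instead of a failed download.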

Usage

from huggingface_hub import hf_hub_download
import vibrato

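# Download the dictionary (cached locally by huggingface_hub) and load the tokenizer.
# Replace <<model_name>> with one of the names listed under "Available Models".
model_path = hf_hub_download("ryan-minato/vibrato-models-bsdonly", "<<model_name>>/system.dic")
with open(model_path, "rb") as f:
    tokenizer = vibrato.Vibrato(f.read())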

text = """\
「四十二だと!」ルーンクォールが叫んだ。
「七百五十万年かけて、それだけか?」
「何度も徹底的に検算しました」コンピュータが応じた。
「まちがいなくそれが答えです。率直なところ、みなさんのほうで究極の疑問が何であるかわかっていなかったところに問題があるのです」
"""

# Keep the result so the tokens can be inspected afterwards.
tokens = tokenizer.tokenize(text)
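The value returned by `tokenize` is a list-like collection of tokens. As a minimal sketch, assuming each token exposes the `surface()`, `feature()`, `start()`, and `end()` accessors shown in the python-vibrato README, the result can be flattened into plain tuples for printing or further processing. The helper name `token_rows` is hypothetical.

```python
from typing import Any, List, Tuple


def token_rows(tokens: Any) -> List[Tuple[str, str, int, int]]:
    """Collect (surface, feature, start, end) for every token.

    Assumes `tokens` supports len() and integer indexing, and that each
    token provides surface(), feature(), start(), and end() methods.
    """
    rows = []
    for i in range(len(tokens)):
        token = tokens[i]
        rows.append((token.surface(), token.feature(), token.start(), token.end()))
    return rows
```

With the tokenizer loaded as above, `for surface, feature, start, end in token_rows(tokenizer.tokenize(text)): print(surface, feature, sep="\t")` would print one token per line. The content of the feature string depends on the dictionary that was loaded.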