shisa-v1 - a shisa-ai Collection

updated May 27

JA/EN Bilingual LLMs

shisa-ai/shisa-v1-llama3-70b

Text Generation • Updated May 26 • 13 • 3

Note 2024-05: The shisa-v1 dataset applied to Llama 3 Instruct 70B outperforms gpt-3.5-turbo.

shisa-ai/shisa-v1-llama3-8b

Text Generation • Updated May 25 • 262 • 3

Note 2024-05: Applying the shisa-v1 dataset to Llama 3 Instruct 8B significantly improves performance.

augmxnt/shisa-gamma-7b-v1

Text Generation • Updated May 19 • 10.8k • 15

Note 2023-12: The shisa-v1 dataset applied to Japanese Stable LM Base Gamma 7B. The tokenizer is less efficient, but overall performance is better.

augmxnt/shisa-7b-v1

Text Generation • Updated Dec 20, 2023 • 1.4k • 29

Note 2023-12: In addition to SFT, this model also underwent a DPO round that improved its human preference ratings.

augmxnt/ultra-orca-boros-en-ja-v1

Viewer • Updated Dec 6, 2023 • 188k • 265 • 10

Note: A largely synthetic dataset combining Airoboros, Ultrachat, and Orca in JA and EN, plus the Jaster train set.

augmxnt/shisa-base-7b-v1

Text Generation • Updated Dec 9, 2023 • 1.38k • 16

Note 2023-12: A continued pre-train (8B tokens, 90% JA) of Mistral 7B v0.1 with tokenizer extension; it probably needs 10B more tokens of pretraining, tbh.

augmxnt/shisa-pretrain-en-ja-v1

Viewer • Updated Dec 5, 2023 • 4.7M • 56 • 7