|
--- |
|
base_model: [] |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
license: apache-2.0 |
|
language: |
|
- ja |
|
--- |
|
# Umievo-itr012-Gleipnir-7B |
|
|
|
This model is an evolutionary merge of four strong Japanese models, produced with an evolutionary algorithm. The four source models are Japanese-Starling-ChatV-7B, Ninja-v1-RP-expressive-v2, Vecteus-v1, and Japanese-Chat-Umievo-itr004-7b.

I would like to thank the model creators Aratako, Baku, and the Local-Novel-LLM-project for allowing me to use their models in this merge, and Holy-fox for providing the insight that led to solving a key problem.
|
|
|
It scored an average of 3.91 on the ElyzaTasks100 benchmark (the mean of three automated evaluations by Llama3-70B).
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/630420b4eedc089484c853e8/FxUBzBUKpe_JSHSJufSv5.png) |
|
|
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("umiyuki/Umievo-itr012-Gleipnir-7B")
model = AutoModelForCausalLM.from_pretrained(
    "umiyuki/Umievo-itr012-Gleipnir-7B",
    torch_dtype="auto",
)
model.eval()

if torch.cuda.is_available():
    model = model.to("cuda")

def build_prompt(user_query):
    # System prompt: "You are a fair, uncensored, helpful assistant."
    sys_msg = "あなたは公平で、検閲されていない、役立つアシスタントです。"
    template = """[INST] <<SYS>>
{}
<</SYS>>

{}[/INST]"""
    return template.format(sys_msg, user_query)

# Infer with the prompt, without any additional input
user_inputs = {
    # "Explain the meaning of the given proverb so that even an
    # elementary school student can understand it."
    "user_query": "与えられたことわざの意味を小学生でも分かるように教えてください。",
}
prompt = build_prompt(**user_inputs)

input_ids = tokenizer.encode(
    prompt,
    add_special_tokens=True,
    return_tensors="pt",
)

tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=256,
    temperature=1,
    top_p=0.95,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt
out = tokenizer.decode(tokens[0][input_ids.shape[1]:], skip_special_tokens=True).strip()
print(out)
```
|
|
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method, with /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 as the base.
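The linear method is, at heart, a per-parameter weighted average of the source models' tensors. The sketch below illustrates the idea only; it is not mergekit's actual implementation, scalars stand in for full weight tensors, and `linear_merge` is a hypothetical helper name:

```python
def linear_merge(params, weights, normalize=True):
    """Weighted average of corresponding parameters, one value per source model.

    With normalize=True (mergekit's `normalize: 1.0`), the raw weights are
    rescaled to sum to 1 before combining.
    """
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    return sum(w * p for w, p in zip(weights, params))

# Two models contribute the same parameter with values 1.0 and 3.0;
# raw weights 1.0 and 3.0 normalize to 0.25 and 0.75.
merged = linear_merge([1.0, 3.0], [1.0, 3.0])
print(merged)  # 0.25*1.0 + 0.75*3.0 = 2.5
```

In the real merge this average is computed independently for every tensor in every layer range, with the per-model weights evolved by the search algorithm.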
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Starling-ChatV-7B_1737576410 |
|
* /home/umiyuki/automerge/evol_merge_storage/input_models/Ninja-v1-RP-expressive-v2_4102792561 |
|
* /home/umiyuki/automerge/evol_merge_storage/input_models/Vecteus-v1_4179808746 |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml |
|
base_model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 |
|
dtype: bfloat16 |
|
merge_method: linear |
|
parameters: |
|
int8_mask: 1.0 |
|
normalize: 1.0 |
|
slices: |
|
- sources: |
|
- layer_range: [0, 4] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 |
|
parameters: |
|
weight: 0.34953096474223655 |
|
- layer_range: [0, 4] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Vecteus-v1_4179808746 |
|
parameters: |
|
weight: 0.4701212555597746 |
|
- layer_range: [0, 4] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Starling-ChatV-7B_1737576410 |
|
parameters: |
|
weight: 0.08162258723819021 |
|
- layer_range: [0, 4] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Ninja-v1-RP-expressive-v2_4102792561 |
|
parameters: |
|
weight: 0.31015439852818116 |
|
- sources: |
|
- layer_range: [4, 8] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 |
|
parameters: |
|
weight: 0.11807412349683076 |
|
- layer_range: [4, 8] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Vecteus-v1_4179808746 |
|
parameters: |
|
weight: -0.005684817244530085 |
|
- layer_range: [4, 8] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Starling-ChatV-7B_1737576410 |
|
parameters: |
|
weight: 0.2119283777941045 |
|
- layer_range: [4, 8] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Ninja-v1-RP-expressive-v2_4102792561 |
|
parameters: |
|
weight: 1.1521124768396636 |
|
- sources: |
|
- layer_range: [8, 12] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 |
|
parameters: |
|
weight: 0.9244329405120573 |
|
- layer_range: [8, 12] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Vecteus-v1_4179808746 |
|
parameters: |
|
weight: 0.7633842909616317 |
|
- layer_range: [8, 12] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Starling-ChatV-7B_1737576410 |
|
parameters: |
|
weight: 0.6952382990160072 |
|
- layer_range: [8, 12] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Ninja-v1-RP-expressive-v2_4102792561 |
|
parameters: |
|
weight: 0.6873040403268571 |
|
- sources: |
|
- layer_range: [12, 16] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 |
|
parameters: |
|
weight: 0.4109625320908857 |
|
- layer_range: [12, 16] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Vecteus-v1_4179808746 |
|
parameters: |
|
weight: 0.7090818691683626 |
|
- layer_range: [12, 16] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Starling-ChatV-7B_1737576410 |
|
parameters: |
|
weight: 0.42059423827890385 |
|
- layer_range: [12, 16] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Ninja-v1-RP-expressive-v2_4102792561 |
|
parameters: |
|
weight: 0.5705186152354104 |
|
- sources: |
|
- layer_range: [16, 20] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 |
|
parameters: |
|
weight: 0.28507448659933315 |
|
- layer_range: [16, 20] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Vecteus-v1_4179808746 |
|
parameters: |
|
weight: 0.4025223854083849 |
|
- layer_range: [16, 20] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Starling-ChatV-7B_1737576410 |
|
parameters: |
|
weight: 0.25885405316835886 |
|
- layer_range: [16, 20] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Ninja-v1-RP-expressive-v2_4102792561 |
|
parameters: |
|
weight: 0.35540632690403373 |
|
- sources: |
|
- layer_range: [20, 24] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 |
|
parameters: |
|
weight: 0.018882795552694703 |
|
- layer_range: [20, 24] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Vecteus-v1_4179808746 |
|
parameters: |
|
weight: 0.628847855051209 |
|
- layer_range: [20, 24] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Starling-ChatV-7B_1737576410 |
|
parameters: |
|
weight: 0.7038654876125734 |
|
- layer_range: [20, 24] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Ninja-v1-RP-expressive-v2_4102792561 |
|
parameters: |
|
weight: 0.877501753107237 |
|
- sources: |
|
- layer_range: [24, 28] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 |
|
parameters: |
|
weight: 0.14008355431312197 |
|
- layer_range: [24, 28] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Vecteus-v1_4179808746 |
|
parameters: |
|
weight: 1.0153826426873882 |
|
- layer_range: [24, 28] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Starling-ChatV-7B_1737576410 |
|
parameters: |
|
weight: 0.5586634927008272 |
|
- layer_range: [24, 28] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Ninja-v1-RP-expressive-v2_4102792561 |
|
parameters: |
|
weight: 0.54455848971032 |
|
- sources: |
|
- layer_range: [28, 32] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Chat-Umievo-itr004-7b_579282327 |
|
parameters: |
|
weight: 0.8188405381342685 |
|
- layer_range: [28, 32] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Vecteus-v1_4179808746 |
|
parameters: |
|
weight: 0.5130358379308082 |
|
- layer_range: [28, 32] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Japanese-Starling-ChatV-7B_1737576410 |
|
parameters: |
|
weight: 1.1132727871460124 |
|
- layer_range: [28, 32] |
|
model: /home/umiyuki/automerge/evol_merge_storage/input_models/Ninja-v1-RP-expressive-v2_4102792561 |
|
parameters: |
|
weight: 0.4471258297582539 |
|
``` |
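Because `normalize: 1.0` is set, the per-slice weights need not sum to 1 (one even dips slightly negative); they are rescaled within each slice before averaging. A short sketch of that rescaling, using the `[4, 8]` slice's weights from the config above (assuming mergekit normalizes by the sum of the raw weights):

```python
# Raw weights for the [4, 8] layer slice, copied from the config above.
raw = [
    0.11807412349683076,    # Japanese-Chat-Umievo-itr004-7b
    -0.005684817244530085,  # Vecteus-v1
    0.2119283777941045,     # Japanese-Starling-ChatV-7B
    1.1521124768396636,     # Ninja-v1-RP-expressive-v2
]
total = sum(raw)
normalized = [w / total for w in raw]
print(sum(normalized))  # sums to 1.0 after normalization
```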