masanorihirano committed
Commit ef0ee2e
1 Parent(s): ce21aa3

Update README.md

Files changed (1): README.md +48 -0
README.md CHANGED
@@ -1,3 +1,51 @@
---
license: cc-by-sa-4.0
datasets:
- izumi-lab/llm-japanese-dataset
language:
- ja
tags:
- llama
- causal-lm
---

This repo contains a low-rank adapter for [CALM](https://huggingface.co/cyberagent/open-calm-7b),
fitted on a dataset specially extracted from [llm-japanese-dataset](https://github.com/masanorihirano/llm-japanese-dataset).

You can try this model at https://huggingface.co/spaces/izumi-lab/stormy-7b-10ep

This version of the weights was trained with the following hyperparameters (a configuration sketch follows the list):

- Epochs: 10
- Batch size: 130
- Cutoff length: 256
- Learning rate: 3e-4
- Lora _r_: 4
- Lora target modules: q_proj, v_proj
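
The listed hyperparameters can be expressed, roughly, as a PEFT `LoraConfig`. The sketch below is only an illustration: `lora_alpha`, `lora_dropout`, and `task_type` are not stated in this card and are assumptions.

```python
# Hedged sketch: a LoraConfig mirroring the hyperparameters listed above.
# lora_alpha and lora_dropout are assumptions; they are not reported in this card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=4,                                  # Lora r
    target_modules=["q_proj", "v_proj"],  # Lora target modules
    lora_alpha=16,                        # assumption: not stated in this card
    lora_dropout=0.05,                    # assumption: not stated in this card
    task_type="CAUSAL_LM",
)
```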

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer in float16.
base_model = "cyberagent/open-calm-7b"
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Apply the low-rank adapter weights from this repository on top of the base model.
model = PeftModel.from_pretrained(
    model,
    "izumi-lab/stormy-7b-10ep",
    torch_dtype=torch.float16,
)
```
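
After loading, a minimal generation example might look like the following. The prompt and decoding settings are illustrative assumptions, not values from this card.

```python
# Illustrative usage only; the prompt and generation settings are assumptions.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

prompt = "日本で一番高い山は何ですか?"  # "What is the tallest mountain in Japan?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```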

For the latest information, please visit [llm.msuzuki.me](https://llm.msuzuki.me).

## Details

- Japanese Paper:
- English Paper:
- Website: [llm.msuzuki.me](https://llm.msuzuki.me)

Citation: TBD

For inquiries such as joint research, data provision, or other forms of support, please email izumi-llm@socsim.org.