|
--- |
|
license: cc-by-sa-4.0 |
|
datasets: |
|
- izumi-lab/llm-japanese-dataset-vanilla |
|
language: |
|
- ja |
|
tags: |
|
- gpt_neox |
|
- japanese |
|
- causal-lm |
|
--- |
|
|
|
This repo contains a low-rank adapter (LoRA) for [CALM](https://huggingface.co/cyberagent/open-calm-7b),

fine-tuned on a dataset specially extracted from [llm-japanese-dataset](https://github.com/masanorihirano/llm-japanese-dataset).
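The extracted data is published as `izumi-lab/llm-japanese-dataset-vanilla` (listed in the metadata above). A minimal sketch for inspecting it with the `datasets` library:

```python
from datasets import load_dataset

# Load the instruction data used to fit this adapter.
dataset = load_dataset("izumi-lab/llm-japanese-dataset-vanilla")
print(dataset)  # shows the available splits and example counts
```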
|
|
|
You can test this model at [izumi-lab/stormy-7b-10ep](https://huggingface.co/spaces/izumi-lab/stormy-7b-10ep).
|
|
|
This version of the weights was trained with the following hyperparameters (a configuration sketch follows the list):
|
|
|
- Epochs: 10 |
|
- Batch size: 128 |
|
- Cutoff length: 300 |
|
- Learning rate: 3e-4 |
|
- LoRA _r_: 4

- LoRA target modules: query_key_value
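For reference, these settings map roughly onto a `peft` `LoraConfig`. Note that `lora_alpha` and `lora_dropout` are not stated in this card, so the values below are placeholder assumptions:

```python
from peft import LoraConfig

# Rough reconstruction of the adapter config from the hyperparameters above.
# lora_alpha and lora_dropout are assumptions: they are not given in this card.
lora_config = LoraConfig(
    r=4,                                 # LoRA rank, from the list above
    target_modules=["query_key_value"],  # GPT-NeoX fused attention projection
    lora_alpha=16,                       # assumption, not stated in the card
    lora_dropout=0.05,                   # assumption, not stated in the card
    task_type="CAUSAL_LM",
)
```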
|
|
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "cyberagent/open-calm-7b"

# Load the base CALM model in float16 along with its tokenizer.
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Attach the LoRA adapter weights from this repo on top of the base model.
model = PeftModel.from_pretrained(
    model,
    "izumi-lab/stormy-7b-10ep",
    torch_dtype=torch.float16,
)
```
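Once the adapter is loaded, generation goes through the usual `generate` API. A minimal sketch, assuming a GPU is available; the prompt and decoding settings are illustrative, not recommendations from this card:

```python
# Move to GPU if available; float16 inference on CPU may not be supported.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()

# Prompt: "What is the highest mountain in Japan?"
inputs = tokenizer("日本で一番高い山は何ですか?", return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```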
|
|
|
For the latest information, please visit [llm.msuzuki.me](https://llm.msuzuki.me).
|
|
|
## Details |
|
|
|
- Japanese Paper: |
|
- English Paper: |
|
- Website: [llm.msuzuki.me](https://llm.msuzuki.me). |
|
|
|
Citation: TBD |
|
|
|
If you have any inquiries, such as joint research, data provision, or various types of support, please email izumi-llm@socsim.org.