RioShiina
/

Llama-3.1-Swallow-8B-Instruct-v0.3-exl2

Model card Files Files and versions Community

Llama-3.1-Swallow-8B-Instruct-v0.3-exl2 / README.md

RioShiina's picture

Update README.md

f081638 verified 26 days ago

|

history blame contribute delete

2.95 kB

	---
	base_model: tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3
	base_model_relation: quantized
	license:
	- llama3.1
	- gemma
	language:
	- ja
	- en
	---

	[4.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-8B-Instruct-v0.3-exl2/tree/4.0bpw)
	[5.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-8B-Instruct-v0.3-exl2/tree/5.0bpw)
	[6.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-8B-Instruct-v0.3-exl2/tree/6.0bpw)
	[7.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-8B-Instruct-v0.3-exl2/tree/7.0bpw)
	[8.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-8B-Instruct-v0.3-exl2/tree/8.0bpw)

	# Llama-3.1-Swallow-8B-Instruct-v0.3-exl2
	- Model creator: [tokyotech-llm](https://huggingface.co/tokyotech-llm)
	- Original model: [Llama-3.1-Swallow-8B-Instruct-v0.3](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3)

	### License

	[META LLAMA 3.1 COMMUNITY LICENSE](https://www.llama.com/llama3_1/license/) and [Gemma Terms of Use](https://ai.google.dev/gemma/terms)

	## Prompt template

	```
	<\|begin_of_text\|><\|start_header_id\|>system<\|end_header_id\|>

	あなたは誠実で優秀な日本人のアシスタントです。<\|eot_id\|><\|start_header_id\|>user<\|end_header_id\|>

	東京の紅葉した公園で、東京タワーと高層ビルを背景に、空を舞うツバメと草地に佇むラマが出会う温かな物語を書いてください。<\|eot_id\|><\|start_header_id\|>assistant<\|end_header_id\|>

	```

	### Citations

	```tex
	@inproceedings{Fujii:COLM2024,
	title={Continual Pre-Training for Cross-Lingual LLM Adaptation:
	Enhancing Japanese Language Capabilities},
	author={Kazuki Fujii and Taishi Nakamura and Mengsay Loem and Hiroki
	Iida and Masanari Ohi and Kakeru Hattori and Hirai Shota and Sakae
	Mizuki and Rio Yokota and Naoaki Okazaki},
	booktitle="Proceedings of the First Conference on Language Modeling",
	series={COLM},
	pages="(to appear)",
	year="2024",
	month=oct,
	address={University of Pennsylvania, USA},
	}

	@inproceedings{Okazaki:COLM2024,
	title={Building a Large Japanese Web Corpus for Large Language Models},
	author={Naoaki Okazaki and Kakeru Hattori and Hirai Shota and Hiroki
	Iida and Masanari Ohi and Kazuki Fujii and Taishi Nakamura and Mengsay
	Loem and Rio Yokota and Sakae Mizuki},
	booktitle="Proceedings of the First Conference on Language Modeling",
	series={COLM},
	pages="(to appear)",
	year="2024",
	month=oct,
	address={University of Pennsylvania, USA},
	}

	@misc{dubey2024llama3herdmodels,
	title={The Llama 3 Herd of Models},
	author={Abhimanyu Dubey and Abhinav Jauhri and Abhinav Pandey and Abhishek Kadian and Ahmad Al-Dahle and Aiesha Letman and Akhil Mathur and Alan Schelten and Amy Yang and Angela Fan et al.},
	year={2024},
	eprint={2407.21783},
	archivePrefix={arXiv},
	primaryClass={cs.AI},
	url={https://arxiv.org/abs/2407.21783},
	}

	```