LLaMAX2-7B-MetaMath

	### Model Sources
	Paper: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages

	Link: https://arxiv.org/pdf/2407

	### Model Description

	🔥 LLaMAX-7B-MetaMath is fully fine-tuned on the MetaMathQA dataset based on the powerful multilingual model LLaMAX-7B.

	🔥 Compared with [MetaMath-7B](https://huggingface.co/meta-math/MetaMath-7B-V1.0), LLaMAX-7B-MetaMath performs significantly better at mathematical reasoning in low-resource languages, improving average accuracy on the low-resource languages of the MGSM dataset by up to 18.8 percentage points.

	🔥 LLaMAX-7B-MetaMath demonstrates good multilingual math reasoning capability, improving average accuracy across all languages in the MGSM dataset by 6.2 percentage points.

	### Model Usage

	Prompt template:
	```python
	def Prompt_template(query):
	    # Wrap the question in the Alpaca-style instruction format used by this model.
	    prompt = (
	        "Below is an instruction that describes a task. "
	        "Write a response that appropriately completes the request.\n\n"
	        f"### Instruction:\n{query}\n\n### Response: Let's think step by step."
	    )
	    return prompt
	```

	Code Example:
	```python
	from transformers import AutoTokenizer, LlamaForCausalLM

	# PATH_TO_CONVERTED_WEIGHTS and PATH_TO_CONVERTED_TOKENIZER are placeholders for your own paths.
	model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
	tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

	query = "Bert fills out the daily crossword puzzle in the newspaper every day. He uses a pencil to fill out the puzzles every two weeks. On average, it takes him 1050 words to use up a pencil. How many words are in each crossword puzzle on average?"
	prompt = Prompt_template(query)
	inputs = tokenizer(prompt, return_tensors="pt")

	# max_length counts the prompt tokens too, and this prompt alone exceeds 30 tokens,
	# so bound only the generated part instead (256 is an assumed budget; adjust as needed).
	generate_ids = model.generate(inputs.input_ids, max_new_tokens=256)
	output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
	print(output)

	# => "If Bert uses up a pencil to fill out the puzzles every two weeks and it takes him 1050
	# words to use up a pencil, then he must be filling out 1050 words of crossword puzzles every
	# two weeks. To find out how many words are in each daily crossword puzzle, we need to divide
	# the total number of words (1050) by the number of days in two weeks (14). So, there are
	# 1050/14 = 75 words in each daily crossword puzzle on average. #### The answer is: 75"
	```
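The decoded string contains the prompt followed by the completion, and the sample response ends with `#### The answer is: 75`. The following is a minimal sketch of how the response and the final number could be pulled out for scoring, assuming other responses follow the same marker format. `split_response` and `extract_answer` are hypothetical helper names, not part of this model or of `transformers`, and the `decoded` string is a shortened stand-in for real model output.

```python
import re

RESPONSE_MARKER = "### Response:"
# Assumption: the final answer follows "The answer is:" as in the sample output.
ANSWER_PATTERN = re.compile(r"The answer is:\s*(-?[\d,]*\.?\d+)")


def split_response(decoded: str) -> str:
    """Return the text after the last '### Response:' marker (whole string if absent)."""
    return decoded.rsplit(RESPONSE_MARKER, 1)[-1].strip()


def extract_answer(response: str):
    """Return the last stated answer as a float, or None when no marker is found."""
    matches = ANSWER_PATTERN.findall(response)
    if not matches:
        return None
    # Drop thousands separators such as "1,050" before converting.
    return float(matches[-1].replace(",", ""))


# Shortened stand-in for a decoded generation (prompt + completion).
decoded = (
    "### Instruction:\nHow many words are in each crossword puzzle on average?\n\n"
    "### Response: Let's think step by step. "
    "So, there are 1050/14 = 75 words in each daily crossword puzzle on average. "
    "#### The answer is: 75"
)

response = split_response(decoded)
answer = extract_answer(response)
print(answer)  # prints 75.0
```

Returning `None` when the marker is missing lets an evaluation loop count such generations as wrong instead of raising an error.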
	### Experiments
	We evaluated LLaMAX-7B-MetaMath on the MGSM dataset. Compared with MetaMath-7B, LLaMAX-7B-MetaMath achieves higher average accuracy on both high-resource languages (Hrl.) and low-resource languages (Lrl.).

	| MGSM | Bn | Th | Sw | Ja | Zh | De | Fr | Ru | Es | En | Lrl. | Hrl. | Avg. |
	|--------------------------|------|------|------|------|------|------|------|------|------|------|------|------|-------|
	| MetaMath-7B (official) | 6.8 | 7.2 | 6.8 | 36.4 | 38.4 | 55.2 | 54.4 | 52.0 | 57.2 | 68.8 | 6.9 | 51.8 | 38.32 |
	| MetaMath-7B (Reproduced) | 6.0 | 10.0 | 4.4 | 36.4 | 42.8 | 52.8 | 56.0 | 48.8 | 58.8 | 64.8 | 6.8 | 51.5 | 38.08 |
	| LLaMAX-7B-MetaMath | 26.8 | 24.0 | 26.0 | 35.6 | 42.4 | 56.8 | 55.2 | 53.6 | 56.8 | 65.6 | 25.6 | 52.3 | 44.28 |
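
The summary columns can be recomputed from the per-language scores. The sketch below assumes that Lrl. averages Bn, Th, and Sw and that Hrl. averages the remaining seven languages. The card does not state this grouping; it is inferred because it reproduces the reported values. Under that assumption, the 18.8 and 6.2 point gains quoted in the model description are the gaps to the reproduced MetaMath-7B baseline.

```python
# Per-language MGSM accuracy (%) copied from the table above.
LANGS = ["Bn", "Th", "Sw", "Ja", "Zh", "De", "Fr", "Ru", "Es", "En"]
LOW_RESOURCE = {"Bn", "Th", "Sw"}  # assumption: inferred from the reported Lrl. values

SCORES = {
    "MetaMath-7B (official)": [6.8, 7.2, 6.8, 36.4, 38.4, 55.2, 54.4, 52.0, 57.2, 68.8],
    "MetaMath-7B (Reproduced)": [6.0, 10.0, 4.4, 36.4, 42.8, 52.8, 56.0, 48.8, 58.8, 64.8],
    "LLaMAX-7B-MetaMath": [26.8, 24.0, 26.0, 35.6, 42.4, 56.8, 55.2, 53.6, 56.8, 65.6],
}


def summarize(scores):
    """Return the mean accuracy over low-resource, high-resource, and all languages."""
    by_lang = dict(zip(LANGS, scores))
    lrl = [v for k, v in by_lang.items() if k in LOW_RESOURCE]
    hrl = [v for k, v in by_lang.items() if k not in LOW_RESOURCE]
    return {
        "Lrl.": sum(lrl) / len(lrl),
        "Hrl.": sum(hrl) / len(hrl),
        "Avg.": sum(scores) / len(scores),
    }


summary = {name: summarize(vals) for name, vals in SCORES.items()}
base = summary["MetaMath-7B (Reproduced)"]
ours = summary["LLaMAX-7B-MetaMath"]
lrl_gain = ours["Lrl."] - base["Lrl."]
avg_gain = ours["Avg."] - base["Avg."]

for name, s in summary.items():
    print(f"{name}: " + ", ".join(f"{k} {v:.2f}" for k, v in s.items()))
print(f"Lrl. gain: {lrl_gain:.1f} points, Avg. gain: {avg_gain:.1f} points")
```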

	### Citation
	If our model helps your work, please cite this paper:

	```
	@inproceedings{Huang2024MindMergerEB,
	  title={LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages},
	  year={2024},
	}
	```