Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch

Model Card for Llama3-8B-ScaleQuest

We introduce ScaleQuest, a scalable and novel data synthesis method that uses small open-source models to generate questions from scratch, without the need for seed data with complex augmentation constraints.

Datasets & Models

Math Dataset: link

We release two question generator models and four problem-solving models.

| Model | Type | MATH | Olympiad Bench | 🤗 HuggingFace Download Link |
| --- | --- | --- | --- | --- |
| ScaleQuest-DeepSeekMath-7B-QGen | question generator | - | - | link |
| ScaleQuest-Qwen2-Math-7B-QGen | question generator | - | - | link |
| Mistral-7B-ScaleQuest | problem solver | 62.9 | 26.8 | link |
| Llama3-8B-ScaleQuest | problem solver | 64.4 | 25.3 | link |
| DeepSeekMath-7B-ScaleQuest | problem solver | 66.6 | 29.9 | link |
| Qwen2-Math-7B-ScaleQuest | problem solver | 73.4 | 38.5 | link |

Demo usage

Below is an example using Llama3-8B-ScaleQuest:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "dyyyyyyyy/Llama3-8B-ScaleQuest"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

question = "Find the value of $x$ that satisfies the equation $4x+5 = 6x+7$."

# Prompt template pieces: system prompt, instruction header, response header.
sys_prompt = "Below is an instruction that describes a task. Write a response that appropriately completes the request." + "\n\n"
query_prompt = "### Instruction:" + "\n"
# {query}: the question is inserted here
prompt_after_query = "\n\n"
resp_prompt = "### Response:" + "\n"
prompt_before_resp = ""
# {resp}: the model's response is generated after this point
delim = "\n\n"

# Assemble the full prompt for a single question.
prefix_prompt = f"{query_prompt}{question}{prompt_after_query}{resp_prompt}{prompt_before_resp}".rstrip(" ")
full_prompt = sys_prompt + delim.join([prefix_prompt])

# print(full_prompt)

inputs = tokenizer(full_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True))
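
The prompt assembly in the demo can be wrapped in a small helper when several questions need the same template. The following is a minimal sketch, not part of the released code: the names `SYSTEM_PROMPT` and `build_prompt` are hypothetical, the second question is made up for illustration, and the template text is copied from the demo above. It only builds prompt strings and does not load the model.

```python
# Hypothetical helper; the template text mirrors the demo above.
SYSTEM_PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_prompt(question: str) -> str:
    """Return the full prompt string for a single question."""
    return f"{SYSTEM_PROMPT}\n\n### Instruction:\n{question}\n\n### Response:\n"


questions = [
    "Find the value of $x$ that satisfies the equation $4x+5 = 6x+7$.",
    # Illustrative question, not taken from the model card.
    "What is the sum of the first 10 positive integers?",
]

# One prompt per question, each in the same format as full_prompt in the demo.
prompts = [build_prompt(q) for q in questions]
print(prompts[0])
```

Each resulting string can be passed to the tokenizer and `model.generate` exactly as `full_prompt` is in the demo.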

Citation

@article{ding2024unleashing,
    title={Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch}, 
    author={Ding, Yuyang and Shi, Xinyu and Liang, Xiaobo and Li, Juntao and Zhu, Qiaoming and Zhang, Min},
    journal={arXiv preprint arXiv:2410.18693},
    url={https://arxiv.org/abs/2410.18693},
    year={2024}
}