metadata

license: mit
language:
  - en

Introduction

MoMo-70B is trained via Supervised Fine-Tuning (SFT) using LoRA, with the Qwen-72B model as its base model.
Note that we did not use any form of weight merging.
For the leaderboard submission, the trained weights are reordered for compatibility with Llama.
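
As a rough illustration of what LoRA fine-tuning learns, the sketch below computes the effective weight of a single LoRA-adapted layer, W' = W + (alpha / r) * B A, in plain Python. The matrix values, the rank `r`, and the scaling factor `alpha` are illustrative assumptions only; this card does not state the hyperparameters used for MoMo-70B, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A).

    W: frozen base weight, shape (d_out, d_in)
    B: trainable low-rank factor, shape (d_out, r)
    A: trainable low-rank factor, shape (r, d_in)
    """
    scale = alpha / r
    delta = matmul(B, A)  # low-rank update, shape (d_out, d_in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


# Toy example with made-up numbers (rank 1, alpha 2).
W = [[1, 0], [0, 1]]
B = [[1], [2]]
A = [[0.5, -1.0]]
print(lora_effective_weight(W, A, B, alpha=2, r=1))
```

Only `A` and `B` are updated during training, so the number of trainable parameters per layer is r * (d_in + d_out) rather than d_in * d_out.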

Details

Used Libraries

torch
peft

Used Datasets

Open-Orca/SlimOrca
No other dataset was used
No benchmark test sets or training sets were used
- Data contamination check results

Model	ARC	MMLU	TruthfulQA	GSM8K
V1.4(result < 0.1, %)	TBU	0.73	0.71	TBU

Used Environments

AMD MI250 & MoAI platform
Please visit https://moreh.io/product for more information about the MoAI platform
Or contact us directly at contact@moreh.io

How to use

# pip install transformers==4.35.2
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("moreh/MoMo-70B-LoRA-V1.4")
model = AutoModelForCausalLM.from_pretrained(
    "moreh/MoMo-70B-LoRA-V1.4"
)
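
Once the model and tokenizer are loaded, text generation could look roughly like the sketch below. The helper name `generate_text` and the `max_new_tokens` default are assumptions for illustration, not part of this model card; the sketch also assumes the model and the tokenized inputs are on the same device, and it does not apply any particular prompt template.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a continuation of `prompt` with a loaded causal LM.

    Hypothetical helper: assumes `model` and the tokenized inputs
    are on the same device.
    """
    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")
    # Generate up to `max_new_tokens` tokens after the prompt.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence back into text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the `model` and `tokenizer` from the snippet above, a call would be `generate_text(model, tokenizer, "Hello, my name is")`.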