README.md · WizardLMTeam/WizardCoder-Python-34B-V1.0 at e79a3dae096cdfc5a75fef0bd12d98075e6c9673

license: llama2

We released WizardCoder-15B-V1.0, which surpasses Claude-Plus (+6.8), Bard (+15.3), and InstructCodeT5+ (+22.3) on the HumanEval benchmark. For more details, please refer to WizardCoder.

| Model | Checkpoint | Paper | HumanEval | MBPP | Demo | License |
| --- | --- | --- | --- | --- | --- | --- |
| WizardCoder-15B-V1.0 | 🤗 HF Link | 📃 [WizardCoder] | 57.3 | 51.8 | | OpenRAIL-M |

Our WizardMath-70B-V1.0 model slightly outperforms some closed-source LLMs on GSM8k, including ChatGPT 3.5, Claude Instant 1, and PaLM 2 540B.
It achieves 81.6 pass@1 on the GSM8k benchmark, 24.8 points higher than the SOTA open-source LLM, and 22.7 pass@1 on the MATH benchmark, 9.2 points higher than the SOTA open-source LLM.
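
The margins quoted above imply baseline scores for the best open-source model at the time. The short sketch below only subtracts each reported margin from the reported score; the resulting baselines are derived values, not figures stated in this README, and the helper name `implied_baseline` is made up for illustration.

```python
# Reported WizardMath-70B-V1.0 pass@1 scores and leads, copied from the text above.
reported = {
    "GSM8k": {"score": 81.6, "margin": 24.8},
    "MATH": {"score": 22.7, "margin": 9.2},
}


def implied_baseline(score: float, margin: float) -> float:
    """Return the baseline score implied by a reported score and its lead (hypothetical helper)."""
    # Round to one decimal to match the precision of the reported numbers.
    return round(score - margin, 1)


# Derived values: the README does not name the baseline model or state these scores.
baselines = {name: implied_baseline(**vals) for name, vals in reported.items()}
print(baselines)
```

Under these assumptions, the previous best open-source pass@1 would be about 56.8 on GSM8k and 13.5 on MATH.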

| Model | Checkpoint | Paper | GSM8k | MATH | Online Demo | License |
| --- | --- | --- | --- | --- | --- | --- |
| WizardMath-70B-V1.0 | 🤗 HF Link | 📃 [WizardMath] | 81.6 | 22.7 | Demo | Llama 2 |
| WizardMath-13B-V1.0 | 🤗 HF Link | 📃 [WizardMath] | 63.9 | 14.0 | Demo | Llama 2 |
| WizardMath-7B-V1.0 | 🤗 HF Link | 📃 [WizardMath] | 54.9 | 10.7 | Demo | Llama 2 |

[08/09/2023] We released the WizardLM-70B-V1.0 model. Here is the Full Model Weight.

| Model | Checkpoint | Paper | MT-Bench | AlpacaEval | GSM8k | HumanEval | License |
| --- | --- | --- | --- | --- | --- | --- | --- |
| WizardLM-70B-V1.0 | 🤗 HF Link | 📃 Coming Soon | 7.78 | 92.91% | 77.6% | 50.6 | Llama 2 License |
| WizardLM-13B-V1.2 | 🤗 HF Link | | 7.06 | 89.17% | 55.3% | 36.6 | Llama 2 License |
| WizardLM-13B-V1.1 | 🤗 HF Link | | 6.76 | 86.32% | | 25.0 | Non-commercial |
| WizardLM-30B-V1.0 | 🤗 HF Link | | 7.01 | | | 37.8 | Non-commercial |
| WizardLM-13B-V1.0 | 🤗 HF Link | | 6.35 | 75.31% | | 24.0 | Non-commercial |
| WizardLM-7B-V1.0 | 🤗 HF Link | 📃 [WizardLM] | | | | 19.1 | Non-commercial |