OpenHermes-2.5-Code-290k-13B / README.md

ajibawa-2023

Update README.md

4ef1ded verified 7 months ago

preview code

raw

history blame

No virus

6.04 kB

	---
	language:
	- en
	license: apache-2.0
	tags:
	- code
	- finetune
	- synthetic data
	- text-generation-inference
	- conversational
	datasets:
	- ajibawa-2023/OpenHermes-2.5-Code-290k
	- teknium/OpenHermes-2.5
	model-index:
	- name: OpenHermes-2.5-Code-290k-13B
	results:
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: AI2 Reasoning Challenge (25-Shot)
	type: ai2_arc
	config: ARC-Challenge
	split: test
	args:
	num_few_shot: 25
	metrics:
	- type: acc_norm
	value: 57.34
	name: normalized accuracy
	source:
	url: >-
	https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/OpenHermes-2.5-Code-290k-13B
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: HellaSwag (10-Shot)
	type: hellaswag
	split: validation
	args:
	num_few_shot: 10
	metrics:
	- type: acc_norm
	value: 80.48
	name: normalized accuracy
	source:
	url: >-
	https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/OpenHermes-2.5-Code-290k-13B
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MMLU (5-Shot)
	type: cais/mmlu
	config: all
	split: test
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 56.53
	name: accuracy
	source:
	url: >-
	https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/OpenHermes-2.5-Code-290k-13B
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: TruthfulQA (0-shot)
	type: truthful_qa
	config: multiple_choice
	split: validation
	args:
	num_few_shot: 0
	metrics:
	- type: mc2
	value: 52.5
	source:
	url: >-
	https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/OpenHermes-2.5-Code-290k-13B
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: Winogrande (5-shot)
	type: winogrande
	config: winogrande_xl
	split: validation
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 74.82
	name: accuracy
	source:
	url: >-
	https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/OpenHermes-2.5-Code-290k-13B
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: GSM8k (5-shot)
	type: gsm8k
	config: main
	split: test
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 58.3
	name: accuracy
	source:
	url: >-
	https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/OpenHermes-2.5-Code-290k-13B
	name: Open LLM Leaderboard
	---

	OpenHermes-2.5-Code-290k-13B

	OpenHermes-2.5-Code-290k-13B is a state of the art Llama-2 Fine-tune, which is trained on additional code dataset.
	This Model is much better than teknium's [model](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B). You can check the Eval results below.
	This model is trained on my existing dataset [OpenHermes-2.5-Code-290k](https://huggingface.co/datasets/ajibawa-2023/OpenHermes-2.5-Code-290k).
	This dataset is amalgamation of two datasets. I have used [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5) a super quality dataset made avaliable by teknium. Other datset is my own [Code-290k-ShareGPT](https://huggingface.co/datasets/ajibawa-2023/Code-290k-ShareGPT).
	Dataset is in Vicuna/ShareGPT format. There are around 1.29 million set of conversations. I have cleaned the dataset provided by Teknium and removed metadata such as "source" & "category" etc. This dataset has primarily synthetically generated instruction and chat samples.

	This model has enhanced coding capabilities besides other capabilities such as Blogging, story generation, Q&A and many more.

	Training:

	Entire model was trained on 4 x A100 80GB. For 2 epoch, training took 21 Days. Fschat & DeepSpeed codebase was used for training purpose. This was trained on Llama-2 by Meta.


	This is a full fine tuned model. Links for quantized models will be updated soon.


	GPTQ, GGUF, AWQ & Exllama

	GPTQ: TBA

	GGUF: [Link](https://huggingface.co/LoneStriker/OpenHermes-2.5-Code-290k-13B-GGUF)

	AWQ: TBA

	Exllama v2: [Link](https://huggingface.co/bartowski/OpenHermes-2.5-Code-290k-13B-exl2)

	Special Thanks to [LoneStriker](https://huggingface.co/LoneStriker) and [bartowski](https://huggingface.co/bartowski/) for quantising.



	Example Prompt:
	```
	This is a conversation with your helpful AI assistant. AI assistant can generate Code in various Programming Languages along with necessary explanation. It can generate Story, Blogs .....

	Context
	You are a helpful AI assistant.

	USER: <prompt>
	ASSISTANT:
	```

	You can modify above Prompt as per your requirement. I have used ShareGPT/Vicuna format v1.1 .

	I want to say special Thanks to the Open Source community for helping & guiding me to better understand the AI/Model development.

	Thank you for your love & support.

	Example Output

	I will update soon.


	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ajibawa-2023__OpenHermes-2.5-Code-290k-13B)

	\| Metric \|Value\|
	\|---------------------------------\|----:\|
	\|Avg. \|63.33\|
	\|AI2 Reasoning Challenge (25-Shot)\|57.34\|
	\|HellaSwag (10-Shot) \|80.48\|
	\|MMLU (5-Shot) \|56.53\|
	\|TruthfulQA (0-shot) \|52.50\|
	\|Winogrande (5-shot) \|74.82\|
	\|GSM8k (5-shot) \|58.30\|