Trained for comparison with the TinyLlama model: 3 epochs with NEFTune on trilobite.
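
For readers unfamiliar with NEFTune: it perturbs the token embeddings during fine-tuning with uniform noise in [-1, 1], scaled by `alpha / sqrt(L * d)`, where `L` is the sequence length and `d` the embedding dimension. The sketch below illustrates that scaling in plain Python. The helper name `add_neftune_noise` is hypothetical, and `alpha=5.0` is an assumed value, since the setting used for this model is not stated here. It is not the training code used for this model.

```python
import math
import random


def add_neftune_noise(embeddings, alpha=5.0, rng=None):
    """Return a noisy copy of a [seq_len][dim] list of embedding vectors.

    alpha is an assumed hyperparameter; the value used for this model
    is not documented in the card.
    """
    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    # NEFTune scaling: noise magnitude shrinks with sequence length and width.
    scale = alpha / math.sqrt(seq_len * dim)
    return [
        [value + rng.uniform(-1.0, 1.0) * scale for value in row]
        for row in embeddings
    ]


# Toy example: 3 tokens with 4-dimensional embeddings, all zeros.
toy = [[0.0] * 4 for _ in range(3)]
noisy = add_neftune_noise(toy, alpha=5.0, rng=random.Random(0))
```

The noise is applied only at training time; inference uses the unperturbed embeddings.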

Prompt Example:

### System:

You are an AI assistant. User will give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.


### Instruction: 

How do you fine tune a large language model? 

### Response:
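
The template above can be assembled programmatically. The following is a minimal sketch, assuming the blank-line spacing shown in the example is what the model expects (the exact whitespace is not confirmed by the card). The function name `build_prompt` is hypothetical.

```python
DEFAULT_SYSTEM = (
    "You are an AI assistant. User will give you a task. "
    "Your goal is to complete the task as faithfully as you can. "
    "While performing the task think step-by-step and justify your steps."
)


def build_prompt(instruction, system=DEFAULT_SYSTEM):
    """Format an instruction in the System / Instruction / Response layout.

    The spacing between sections mirrors the example in this card and is
    an assumption, not a documented requirement.
    """
    return (
        f"### System:\n\n{system}\n\n\n"
        f"### Instruction:\n\n{instruction.strip()}\n\n"
        "### Response:\n\n"
    )


prompt = build_prompt("How do you fine tune a large language model?")
```

The resulting string ends right after the `### Response:` header, so the model's generation continues from there.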

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 35.02 |
| AI2 Reasoning Challenge (25-Shot) | 32.94 |
| HellaSwag (10-Shot) | 57.24 |
| MMLU (5-Shot) | 25.26 |
| TruthfulQA (0-shot) | 38.49 |
| Winogrande (5-shot) | 55.88 |
| GSM8k (5-shot) | 0.30 |
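
The reported average is the unweighted mean of the six benchmark scores. A short check, using only the values listed above (the dictionary name `scores` is illustrative):

```python
# Benchmark scores as reported by the Open LLM Leaderboard for this model.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 32.94,
    "HellaSwag (10-Shot)": 57.24,
    "MMLU (5-Shot)": 25.26,
    "TruthfulQA (0-shot)": 38.49,
    "Winogrande (5-shot)": 55.88,
    "GSM8k (5-shot)": 0.30,
}

# Unweighted mean, rounded to two decimals as on the leaderboard.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 35.02
```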


Model size: 1.31B params (Safetensors, tensor type F32)


Dataset used to train KnutJaegersberg/falcon-1b-t-sft

Evaluation results

Source: Open LLM Leaderboard

| Benchmark | Split | Metric | Value |
|---|---|---|---|
| AI2 Reasoning Challenge (25-Shot) | test | normalized accuracy | 32.940 |
| HellaSwag (10-Shot) | validation | normalized accuracy | 57.240 |
| MMLU (5-Shot) | test | accuracy | 25.260 |
| TruthfulQA (0-shot) | validation | mc2 | 38.490 |
| Winogrande (5-shot) | validation | accuracy | 55.880 |
| GSM8k (5-shot) | test | accuracy | 0.300 |
