---
base_model: mlabonne/NeuralMarcoro14-7B
license: cc-by-nc-4.0
tags:
- mlabonne/NeuralMarcoro14-7B
- dpo
- 7B
- winograd
- mmlu_abstract_algebra
- mistral
datasets:
- hromi/winograd_dpo_basic
---
# Turdus-7B-GGUF

## Description
This repo contains GGUF format model files for udkai/Turdus (Turdus-7B).
## Files Provided
| Name | Quant | Bits | File Size | Remark |
| ---- | ----- | ---- | --------- | ------ |
| turdus-7b.IQ3_XXS.gguf | IQ3_XXS | 3 | 3.02 GB | 3.06 bpw quantization |
| turdus-7b.IQ3_S.gguf | IQ3_S | 3 | 3.18 GB | 3.44 bpw quantization |
| turdus-7b.IQ3_M.gguf | IQ3_M | 3 | 3.28 GB | 3.66 bpw quantization mix |
| turdus-7b.Q4_0.gguf | Q4_0 | 4 | 4.11 GB | 3.56G, +0.2166 ppl |
| turdus-7b.IQ4_NL.gguf | IQ4_NL | 4 | 4.16 GB | 4.25 bpw non-linear quantization |
| turdus-7b.Q4_K_M.gguf | Q4_K_M | 4 | 4.37 GB | 3.80G, +0.0532 ppl |
| turdus-7b.Q5_K_M.gguf | Q5_K_M | 5 | 5.13 GB | 4.45G, +0.0122 ppl |
| turdus-7b.Q6_K.gguf | Q6_K | 6 | 5.94 GB | 5.15G, +0.0008 ppl |
| turdus-7b.Q8_0.gguf | Q8_0 | 8 | 7.70 GB | 6.70G, +0.0004 ppl |
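For a quick start, here is a minimal sketch of loading one of the files above with `llama-cpp-python`; it assumes the package is installed and that `turdus-7b.Q4_K_M.gguf` has already been downloaded into the working directory.

```python
# Minimal sketch: load a GGUF quant with llama-cpp-python and run one completion.
# Assumes turdus-7b.Q4_K_M.gguf has already been downloaded locally.
from llama_cpp import Llama

llm = Llama(
    model_path="turdus-7b.Q4_K_M.gguf",
    n_ctx=4096,        # matches the model's sliding window (see Parameters below)
    n_gpu_layers=-1,   # offload all layers to GPU if available; use 0 for CPU-only
)

output = llm(
    "Q: Name three species of thrush.\nA:",
    max_tokens=128,
    stop=["Q:"],
)
print(output["choices"][0]["text"])
```

Any of the other quants in the table can be swapped in the same way; the lower-bit IQ3 files trade some quality for a smaller memory footprint.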
## Parameters
| path | type | architecture | rope_theta | sliding_window | max_position_embeddings |
| ---- | ---- | ------------ | ---------- | -------------- | ----------------------- |
| udkai/Turdus | mistral | MistralForCausalLM | 10000.0 | 4096 | 32768 |
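For cross-checking, the same values can be read from the source model's configuration; a small sketch, assuming `transformers` is installed and `udkai/Turdus` is reachable on the Hub:

```python
# Sketch: read the source model's config to verify the parameters listed above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("udkai/Turdus")
print(config.model_type)                # "mistral"
print(config.architectures)             # ["MistralForCausalLM"]
print(config.rope_theta)                # 10000.0
print(config.sliding_window)            # 4096
print(config.max_position_embeddings)   # 32768
```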
## Benchmarks

See the comparison table in the Original Model Card section below.

## Specific Purpose Notes
This model handles classification very well. Given the task of evaluating Indonesian clauses, it gives concise output in Indonesian, and performs even better in English (with a slightly different prompt).
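Purely as a hypothetical illustration (the card does not specify the exact classification task or prompt), such a prompt could look like the following, again using `llama-cpp-python` with a downloaded quant:

```python
# Hypothetical illustration only: the card does not specify the exact task or prompt.
from llama_cpp import Llama

llm = Llama(model_path="turdus-7b.Q4_K_M.gguf", n_ctx=4096)

prompt = (
    "Klasifikasikan klausa berikut sebagai positif atau negatif, "
    "dan jawab dengan satu kata saja.\n"
    'Klausa: "Pelayanan di restoran itu sangat mengecewakan."\n'
    "Jawaban:"
)
result = llm(prompt, max_tokens=8, temperature=0.0)
print(result["choices"][0]["text"].strip())  # e.g. "negatif"
```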
## Original Model Card

### udkai_Turdus
A less contaminated version of udkai/Garrulus and the second model to be discussed in the paper *Subtle DPO-Contamination with modified Winogrande increases TruthfulQA, Hellaswag & ARC*.

Unlike Garrulus, which was obtained after two epochs, this model was obtained after a single epoch of direct preference optimization (DPO) of NeuralMarcoro14-7B with [hromi/winograd_dpo](https://huggingface.co/datasets/hromi/winograd_dpo).

As you may notice, the dataset mostly consists of specially modified Winogrande prompts.
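This single-epoch DPO step could be sketched roughly as follows with Hugging Face TRL. The hyperparameters shown are illustrative assumptions, not values taken from this card, and the snippet assumes a recent `trl` release (>= 0.12, where the tokenizer is passed as `processing_class`) and that the dataset exposes the standard `prompt`/`chosen`/`rejected` columns.

```python
# Rough sketch of the single-epoch DPO step described above (illustrative values only).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "mlabonne/NeuralMarcoro14-7B"
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base)

dataset = load_dataset("hromi/winograd_dpo", split="train")

args = DPOConfig(
    output_dir="turdus-dpo",
    num_train_epochs=1,              # the card stresses one single epoch
    per_device_train_batch_size=2,   # illustrative, not taken from the card
    learning_rate=5e-6,              # illustrative
    beta=0.1,                        # illustrative DPO temperature
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```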
But before flagging this (or recommending that it be flagged), consider this:

Subtle DPO contamination with modified Winogrande causes the average accuracy of all five non-Winogrande metrics (including MMLU and GSM8K) to be 0.2% higher than that of the underlying model.
| Model | ARC | HellaSwag | MMLU | TruthfulQA | GSM8K | Average |
| ----- | --- | --------- | ---- | ---------- | ----- | ------- |
| mlabonne/NeuralMarcoro14-7B | 71.42 | 87.59 | 64.84 | 65.64 | 70.74 | 72.046 |
| udkai/Turdus | 73.38 | 88.56 | 64.52 | 67.11 | 67.70 | 72.254 |
Yes, as strange as it may sound, one can indeed increase ARC from 71.42% to 73.38% with a single epoch of circa 1200 repetitive Winograd schemas...
### BibTeX
Should this model (or the quasi-methodology which led to it) be of practical or theoretical interest for you, I would be honored if you referred to it in your work:
    @misc {udk_dot_ai_turdus,
        author    = { {UDK dot AI, Daniel Devatman Hromada} },
        title     = { Turdus (Revision 923c305) },
        year      = 2024,
        url       = { https://huggingface.co/udkai/Turdus },
        doi       = { 10.57967/hf/1611 },
        publisher = { Hugging Face }
    }