---
base_model:
- meta-llama/Llama-3.1-70B
- EVA-UNIT-01/LLaMA-EVA-3.33-70B-v0.0
- EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.0
library_name: transformers
license: other
license_name: eva-llama3.3
tags:
- mergekit
- merge
datasets:
- anthracite-org/kalo-opus-instruct-22k-no-refusal
- Nopm/Opus_WritingStruct
- Gryphe/Sonnet3.5-SlimOrcaDedupCleaned
- Gryphe/Sonnet3.5-Charcard-Roleplay
- Gryphe/ChatGPT-4o-Writing-Prompts
- Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned
- Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned
- nothingiisreal/Reddit-Dirty-And-WritingPrompts
- allura-org/Celeste-1.x-data-mixture
- cognitivecomputations/dolphin-2.9.3
---

# EVA LLaMA 3.33 70B v0.1

A RP/storywriting specialist model, a full-parameter finetune of Llama-3.3-70B-Instruct on a mixture of synthetic and natural data.
It uses the Celeste 70B 0.1 data mixture, greatly expanded to improve the versatility, creativity, and "flavor" of the resulting model.
This model was built with Llama by Meta.

## Version notes for v0.1

A DELLA linear merge of v0.0 with an unreleased checkpoint from a different run. The result shows reduced overfitting, better long-context comprehension and recall, less repetition, and more stability.
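Since the model ships in the standard `transformers` format, it can be loaded like any other Llama 3.x checkpoint. The snippet below is a minimal sketch, not official usage instructions: the repository id `EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1` and the bfloat16/device-map settings are assumptions.

```python
# Minimal sketch: loading the model with transformers.
# The repo id and dtype/device settings below are assumptions, not official guidance.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # 70B weights; expect roughly 140 GB in bf16
    device_map="auto",           # shard across available GPUs
)

prompt = "Write the opening paragraph of a gothic mystery."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```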

Prompt format is Llama3.
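For reference, here is a hedged sketch of producing a Llama 3 formatted prompt through the tokenizer's built-in chat template. It assumes the template is inherited from Llama-3.3-70B-Instruct; the repo id and the messages are illustrative only.

```python
# Sketch: building a Llama 3 prompt with the tokenizer's chat template.
# Assumes the Llama-3.3-Instruct template is bundled with the tokenizer; messages are examples.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1")  # assumed repo id

messages = [
    {"role": "system", "content": "You are a creative co-writer."},
    {"role": "user", "content": "Continue the scene in the abandoned lighthouse."},
]

# Produces the <|start_header_id|>...<|end_header_id|>...<|eot_id|> structure used by Llama 3.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```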


Recommended sampler values:

Recommended SillyTavern preset (via Virt-io):

Training data:

The model was created by Kearm, Auri, and Cahvay.

Special thanks:

## Licensing

Llama-3.3-70B-Instruct by Meta is licensed under the Llama 3.3 Community License Agreement (hereafter referred to as the L3.3 license) and is subject to the Acceptable Use Policy for Llama Materials.
This derivative is free for personal, research, and commercial use under the terms of the L3.3 license, with one extra clause:
- Infermatic Inc and any of its employees or paid associates cannot utilize, distribute, download, or otherwise make use of EVA models for any purpose.

---

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the linear [DELLA](https://arxiv.org/abs/2406.11617) merge method, with [meta-llama/Llama-3.1-70B](https://huggingface.co/meta-llama/Llama-3.1-70B) as the base.

### Models Merged

The following models were included in the merge:

* [EVA-UNIT-01/LLaMA-EVA-3.33-70B-v0.0](https://huggingface.co/EVA-UNIT-01/LLaMA-EVA-3.33-70B-v0.0)
* [EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.0](https://huggingface.co/EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.0)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.0
    parameters:
      density: 0.6
      weight: 0.3
      lambda: 1.1
      epsilon: 0.35
  - model: EVA-UNIT-01/LLaMA-EVA-3.33-70B-v0.0
    parameters:
      density: 0.45
      weight: 0.7
      lambda: 1.1
      epsilon: 0.4
merge_method: della_linear
base_model: meta-llama/Llama-3.1-70B
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
```
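For intuition, the sketch below illustrates the rough idea behind `della_linear`: form per-model deltas against the base, keep only the largest-magnitude entries at the configured density, rescale the survivors, and combine them as a normalized weighted sum scaled by lambda. This is a simplified, illustrative sketch only, not mergekit's actual implementation (real DELLA samples drop probabilities from delta magnitudes, modulated by epsilon); the function name and toy tensors are made up for this example.

```python
# Simplified, illustrative sketch of the della_linear idea.
# NOT mergekit's implementation: DELLA uses magnitude-derived drop probabilities
# (controlled by epsilon); here a plain magnitude-ranked keep mask is used for clarity.
import torch

def della_linear_sketch(base, finetunes, densities, weights, lam=1.1):
    """Merge finetuned tensors into `base` via pruned, rescaled task deltas."""
    merged_delta = torch.zeros_like(base)
    total_weight = sum(weights)  # 'normalize: true' divides by the weight sum
    for ft, density, weight in zip(finetunes, densities, weights):
        delta = ft - base                        # task vector for this model
        n = delta.numel()
        k = max(1, int(density * n))             # number of entries to keep
        threshold = delta.abs().flatten().kthvalue(n - k + 1).values
        mask = delta.abs() >= threshold          # keep the largest-magnitude deltas
        pruned = torch.where(mask, delta / density, torch.zeros_like(delta))
        merged_delta += (weight / total_weight) * pruned
    return base + lam * merged_delta             # lambda rescales the merged delta

# Toy usage with random tensors standing in for one parameter of each model.
base = torch.randn(4, 4)
ft_a = base + 0.1 * torch.randn(4, 4)
ft_b = base + 0.1 * torch.randn(4, 4)
merged = della_linear_sketch(base, [ft_a, ft_b], densities=[0.6, 0.45], weights=[0.3, 0.7])
```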