Update README.md
README.md
CHANGED
@@ -18,7 +18,7 @@ pipeline_tag: text-classification
 
 # Llama3-8B-SuperNova-Spectrum-dare_ties
 
-Llama3-8B-SuperNova-Spectrum-dare_ties is a `DARE_TIES` merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+Llama3-8B-SuperNova-Spectrum-dare_ties is a `dare_ties` merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
 * [yuvraj17/Llama-3-8B-spectrum-25](https://huggingface.co/yuvraj17/Llama-3-8B-spectrum-25)
 * [ruggsea/Llama3-stanford-encyclopedia-philosophy-QA](https://huggingface.co/ruggsea/Llama3-stanford-encyclopedia-philosophy-QA)
 * [arcee-ai/Llama-3.1-SuperNova-Lite](https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite)
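As a quick orientation, here is a minimal, hypothetical sketch of running the merged model with 🤗 Transformers. The repo id `yuvraj17/Llama3-8B-SuperNova-Spectrum-dare_ties` is assumed from the model name and may differ from the actual repository:

```python
# Hypothetical usage sketch; the repo id below is assumed, not confirmed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "yuvraj17/Llama3-8B-SuperNova-Spectrum-dare_ties"  # assumed
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "What is model merging?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```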
@@ -31,6 +31,16 @@ Llama3-8B-SuperNova-Spectrum-dare_ties is a `DARE_TIES` merge of the following m
 * **Redundancy Removal**: Identifies and eliminates overlapping or unnecessary information between models, making the final model more efficient.
 * **Conflict Resolution**: Reconciles differences between models by creating a unified sign vector that represents the most dominant direction of change across all models.
 
+**TIES** stands for **T**R**I**M, **E**LECT **S**IGN & MERGE (TIES-Merging).
+
+<figure>
+
+<img src="https://cdn-uploads.huggingface.co/production/uploads/66137d95e8d2cda230ddcea6/2vBgcGko-tcsaAkLUzHnU.png" width="1000" height="768">
+<figcaption> How TIES-Merging Works <a href="https://arxiv.org/pdf/2306.01708">Reference</a> </figcaption>
+
+</figure>
+
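To make the trim, elect-sign, and merge steps above concrete, here is a minimal sketch of TIES-style merging on toy tensors in plain PyTorch. It is an editor's illustration under assumptions, not Mergekit's actual implementation; the helper name `ties_merge` and the `density` parameter (fraction of weights kept after trimming) are made up for the example.

```python
# Editor's sketch of TIES-style merging: trim -> elect sign -> merge.
# Illustrative only; Mergekit's real implementation differs in detail.
import torch

def ties_merge(base: torch.Tensor, finetuned: list[torch.Tensor],
               density: float = 0.5) -> torch.Tensor:
    # Task vectors: what each fine-tune changed relative to the base.
    deltas = [ft - base for ft in finetuned]

    # Trim: keep only the top-`density` fraction of each delta by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.numel()))
        threshold = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= threshold, d, torch.zeros_like(d)))

    # Elect sign: the dominant direction of change per parameter, taken
    # from the magnitude-weighted sum across all trimmed deltas.
    sign = torch.stack(trimmed).sum(dim=0).sign()

    # Merge: average only the deltas that agree with the elected sign.
    agree = [torch.where(d.sign() == sign, d, torch.zeros_like(d)) for d in trimmed]
    counts = torch.stack([(a != 0).float() for a in agree]).sum(dim=0).clamp(min=1)
    return base + torch.stack(agree).sum(dim=0) / counts

# Toy usage: merge three fine-tunes of one 2x2 "layer".
base = torch.zeros(2, 2)
merged = ties_merge(base, [torch.randn(2, 2) for _ in range(3)], density=0.5)
```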
 ### DARE Merging
 
 Introduced by Yu et al. (2023), [DARE](https://arxiv.org/abs/2311.03099) uses an approach similar to TIES with two main differences:
@@ -38,9 +48,13 @@ Introduced by Yu et al. (2023), [DARE](https://arxiv.org/abs/2311.03099) uses an
 * **Weight Pruning**: Randomly resets some fine-tuned weights to their original values, reducing model complexity.
 * **Weight Scaling**: Adjusts the remaining weights by scaling and combining them with the base model's weights to maintain consistent performance.
 
-
+**DARE** stands for **D**ROP **A**ND **RE**SCALE.
+
+Mergekit’s implementation of DARE-Merging has two flavours: with the sign election step of TIES (`dare_ties`) or without (`dare_linear`). I have chosen `dare_ties` for this merge.
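The drop-and-rescale step itself is small enough to sketch. Below is an editor's illustration in PyTorch, not Mergekit's actual code; the `drop_prob` parameter name is made up. In `dare_ties`, the rescaled deltas from each model would then go through the same sign election and merge as in the TIES sketch above, while `dare_linear` skips the election and combines them linearly.

```python
# Editor's sketch of DARE (DROP AND RESCALE) on a single weight tensor.
# Illustrative only; Mergekit's real implementation differs in detail.
import torch

def dare(base: torch.Tensor, finetuned: torch.Tensor,
         drop_prob: float = 0.9) -> torch.Tensor:
    # Task vector: what fine-tuning changed relative to the base model.
    delta = finetuned - base

    # Drop: randomly reset a `drop_prob` fraction of the delta to zero,
    # so those weights fall back to their base-model values.
    mask = (torch.rand_like(delta) >= drop_prob).to(delta.dtype)

    # Rescale: divide the surviving delta by (1 - drop_prob) so the
    # expected magnitude of the overall update is preserved.
    return base + delta * mask / (1.0 - drop_prob)
```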
 
 For more information, refer to [Merge Large Language Models with MergeKit by Maxime Labonne](https://towardsdatascience.com/merge-large-language-models-with-mergekit-2118fb392b54).
+
+Also, if you want to get in-depth knowledge about model merging and its different types, I highly recommend this [YouTube video by Julien Simon](https://youtu.be/cvOpX75Kz4M?si=d5crVWSxcjvNUm6a).
 
 ## 🧩 Configuration
 
@@ -96,4 +110,5 @@ Coming soon
 
 ## Special thanks & Reference
 - Maxime Labonne for their easy-to-use Colab notebook [Merging LLMs with MergeKit](https://github.com/mlabonne/llm-course/blob/main/Mergekit.ipynb) and [Blog](https://towardsdatascience.com/merge-large-language-models-with-mergekit-2118fb392b54)
+- Authors of [Mergekit](https://github.com/arcee-ai/mergekit)
 