ibivibiv committed
Commit
4eca82b
1 Parent(s): 777b582

Update README.md

Files changed (1)
  1. README.md +18 -4
README.md CHANGED
@@ -9,9 +9,24 @@ tags:
 
# Aegolius Acadicus 24B V2

+ # Aegolius Acadicus 30B
+
![img](./aegolius-acadicus.png)

- I like to call this model line "The little professor". They are MOE merges of 7B fine tuned models to cover general knowledge use cases.
+ I like to call this model "The little professor". It is simply an MoE merge of LoRA-merged models across Llama 2 and Mistral. I am using it as a test case to move to larger models and get my gate discrimination set correctly. This model is best suited for knowledge-related use cases; I did not give it a specific workload target as I did with some of the other models in the "Owl Series".
+
+ In this particular run I am starting to collapse data sets and model count to see whether that helps or hurts.
+
+ This model is merged from the following sources:
+
+ [Fine Tuned Mistral of Mine](https://huggingface.co/ibivibiv/temp_tuned_mistral)
+ [WestLake-7B-v2-laser](https://huggingface.co/cognitivecomputations/WestLake-7B-v2-laser)
+ [openchat-nectar-0.5](https://huggingface.co/andysalerno/openchat-nectar-0.5)
+ [WestSeverus-7B-DPO](https://huggingface.co/PetroGPT/WestSeverus-7B-DPO)
+
+ Unless those source models are "contaminated", this one is not. This is a proof-of-concept version of the series, and you can find others where I am tuning my own models and using mergekit's MoE merging (sketched below) to combine them into MoE models that I can run on lower-tier hardware with better results.
+
+ The goal here is to create specialized models that can collaborate and run as one model.
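For orientation, a merge like the one described above is normally driven by a small mergekit MoE configuration. The sketch below is hypothetical: the choice of `base_model`, `gate_mode`, `dtype`, and the `positive_prompts` used to steer the gate are illustrative assumptions, not the settings actually used to build this model.

```python
# Hypothetical sketch of a mergekit MoE config for a four-expert merge like this one.
# Field values (base_model, gate_mode, dtype, positive_prompts) are illustrative
# assumptions, not the author's actual settings. Requires PyYAML.
import yaml

moe_config = {
    "base_model": "ibivibiv/temp_tuned_mistral",  # assumed donor of the shared (non-expert) weights
    "gate_mode": "hidden",                        # route tokens by hidden-state similarity to the prompts
    "dtype": "bfloat16",
    "experts": [
        {"source_model": "ibivibiv/temp_tuned_mistral",
         "positive_prompts": ["general knowledge", "explain a concept"]},
        {"source_model": "cognitivecomputations/WestLake-7B-v2-laser",
         "positive_prompts": ["reasoning", "step-by-step analysis"]},
        {"source_model": "andysalerno/openchat-nectar-0.5",
         "positive_prompts": ["helpful chat reply", "follow the conversation"]},
        {"source_model": "PetroGPT/WestSeverus-7B-DPO",
         "positive_prompts": ["instruction following", "detailed answer"]},
    ],
}

# Write the config so it can be handed to mergekit's MoE tooling,
# e.g. something like `mergekit-moe moe-config.yml ./aegolius-acadicus-out`.
with open("moe-config.yml", "w") as f:
    yaml.safe_dump(moe_config, f, sort_keys=False)
```

Four 7B experts sharing their attention and embedding weights is roughly where the "24B" in the model name comes from; at inference time the gate routes each token to only a subset of the experts, which is why the merged model can still run on comparatively modest hardware.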

# Prompting

@@ -48,12 +63,11 @@ print(text)
* **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers) (a minimal loading sketch follows this list)
* **Model type:** **aegolius-acadicus-24b-v2** is an auto-regressive MoE language model built from Llama 2 and Mistral transformer-architecture models.
* **Language(s)**: English
- * **Purpose**: This model is an iteration of an moe model (the original Aegolius Acadicus) to lower the model size and maintain capabilities.
+ * **Purpose**: This model is an attempt at an MoE model covering multiple disciplines, using fine-tuned Llama 2 and Mistral models as the base models.
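Since the card lists HuggingFace Transformers as the library, here is a minimal loading-and-generation sketch for orientation. It assumes the repository id `ibivibiv/aegolius-acadicus-24b-v2` and a plain-text prompt; the "Prompting" section above remains the card's own example.

```python
# Minimal sketch: load the merged MoE model and generate a completion.
# The repo id and prompt are assumptions; see the card's "Prompting" section
# for the intended prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibivibiv/aegolius-acadicus-24b-v2"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merged weights are large; bf16 + device_map helps them fit
    device_map="auto",
)

prompt = "Explain the difference between nuclear fission and fusion."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```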

# Benchmark Scores

- pending
-
+ coming soon
 
## Citations