Commit e6c7422 by nkpz
1 Parent(s): 37fec0e

Add some clarification on what exactly this model is

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -2,7 +2,7 @@
 license: other
 ---
 **What is it?**
-Llama 2 13b expanded to the size of a Llama 1 33b model in certain areas, with the empty surrounding space filled with llama 33b data. (Base Model: https://huggingface.co/chargoddard/llama2-22b-blocktriangular) This is then finetuned on a 3090 by creating large loras and merging them. When I first started with 22b models, I looked for signs of knowledge transfer but didn't see it, so that's not a goal - the goal is just to throw lots of data at it until it adapts well to its surgically implanted parts.
+Llama 2 13b expanded to the size of a Llama 1 33b model in certain areas, with the empty surrounding space filled with llama 33b data. (Base Model: https://huggingface.co/chargoddard/llama2-22b-blocktriangular) This is then finetuned on a 3090 by creating large loras and merging them. When I first started with 22b models, I looked for signs of knowledge transfer but didn't see it, so that's not a goal - the goal is just to throw lots of data at it until it adapts well to its surgically implanted parts. Datasets used are a mix of instruction, roleplay, and conversational data, often curated.
 
 
 
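
The README text describes finetuning by "creating large loras and merging them". As a rough illustration of what merging a LoRA into a base weight usually means, here is a minimal pure-Python sketch. It assumes the standard LoRA formulation, `W' = W + (alpha / r) * B @ A`; the function names, shapes, and values are hypothetical, and the README does not say which tooling or hyperparameters were actually used.

```python
# Minimal sketch of merging a LoRA update into a base weight matrix.
# Names, shapes, and values are illustrative assumptions, not the author's code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def merge_lora(weight, lora_a, lora_b, alpha):
    """Return weight + (alpha / r) * (lora_b @ lora_a).

    weight is d_out x d_in, lora_b is d_out x r, lora_a is r x d_in.
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # low-rank update, d_out x d_in
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: a 2x2 base weight with a rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # r x d_in
lora_b = [[0.5], [0.25]]   # d_out x r
merged = merge_lora(base, lora_a, lora_b, alpha=2.0)
```

After merging, the adapter matrices are no longer needed at inference time, which is why a sequence of adapters can be trained and folded into the same base weights one after another.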