jubba committed on
Commit
909c1ff
1 Parent(s): 16b8365

Update README.md

Files changed (1)
  1. README.md +20 -0
README.md CHANGED
@@ -1,3 +1,23 @@
1
  ---
2
  license: apache-2.0
3
  ---
4
+ # nano nextgpt Weights
5
+
6
+ This repository contains the weights for nano nextgpt, a minimalist re-implementation of NExT-GPT in the style of Andrej Karpathy's nanoGPT. The underlying architecture is described on the [NExT-GPT project page](https://next-gpt.github.io/).
7
+
8
+ ## Repository for nano nextgpt
9
+ You can find the main repository and source code for nano nextgpt here: [nano nextgpt GitHub Repository](https://github.com/NomanTrips/nano_nextgpt).
10
+
11
+ ## About nano nextgpt
12
+ nano nextgpt is a stripped-down version of NExT-GPT that handles only images and text; the video and audio capabilities are omitted. The model was trained in two stages:
13
+
14
+ 1. **Linear Layer Training**: A linear layer was trained to map ImageBind embeddings into the embedding space of the LLM (large language model). The training set consisted of 20,000 image-text pairs sourced from COCO 3m.
15
+
16
+ 2. **Instruction Tuning**: The entire model, including the linear layer and the LLM, was then trained end-to-end using QLoRA and PEFT on 80,000 image-text pairs in a conversational format, taken from the LLaVA project.
17
+
18
+ ## Usage
19
+ For detailed usage instructions, including how to integrate these weights into your applications, please refer to the [nano nextgpt GitHub repository](https://github.com/NomanTrips/nano_nextgpt).
20
+
21
+ ---
22
+
23
+ Please note that this README is for the weights of the nano nextgpt model. For more information on the model architecture, training procedures, or any other inquiries, refer to the main nano nextgpt repository linked above.
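
The linear layer training stage described in the README maps an ImageBind embedding into the LLM embedding space. The following is a minimal, dependency-free sketch of that idea, not the project's actual code. The names `make_linear` and `project` and the dimensions `IMAGEBIND_DIM` and `LLM_DIM` are hypothetical placeholders chosen for illustration. In a PyTorch codebase this role would typically be filled by a `torch.nn.Linear` module.

```python
import random

# Hypothetical, deliberately small sizes for illustration only.
# The real ImageBind and LLM embedding widths are much larger.
IMAGEBIND_DIM = 8
LLM_DIM = 16


def make_linear(in_dim, out_dim, seed=0):
    """Create a randomly initialised weight matrix (out_dim x in_dim) and a zero bias."""
    rng = random.Random(seed)
    scale = in_dim ** -0.5
    weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
              for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def project(embedding, weight, bias):
    """Apply y = W x + b, mapping an image embedding to an LLM-sized vector."""
    if len(embedding) != len(weight[0]):
        raise ValueError("embedding size does not match the weight matrix")
    return [sum(w * x for w, x in zip(row, embedding)) + b
            for row, b in zip(weight, bias)]


if __name__ == "__main__":
    weight, bias = make_linear(IMAGEBIND_DIM, LLM_DIM)
    image_embedding = [0.1] * IMAGEBIND_DIM  # stand-in for an ImageBind output
    llm_vector = project(image_embedding, weight, bias)
    print(len(llm_vector))  # length equals LLM_DIM
```

In the first training stage only `weight` and `bias` would be updated, so that the projected vector can be consumed by the language model alongside its ordinary token embeddings. For the actual implementation, see the nano nextgpt repository linked in the README.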