Update README.md

Weights (align and finetune stages) for the DinoV2-SigLIP-Phi3(LoRA) model. The model details are as follows:

* **Vision Encoder** - DinoV2 + SigLIP @384px resolution.
* **Connector** - MLP (Dino and SigLIP features are concatenated and then projected to Phi3 representation space)
* **Language Model** - Phi3 + LoRA
* **Pre-train (Align) Dataset** - LLaVA-CC3M-Pretrain-595K
* **Fine-tune (Instruction) Dataset** - LLaVA-v1.5-Instruct + LRV-Instruct
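
The connector described above can be sketched roughly as follows. This is a minimal NumPy illustration of "concatenate, then project", not the repository's actual implementation: the class name `FusedMLPConnector`, the two-layer GELU design, and the feature widths (1024 for DinoV2, 1152 for SigLIP, 3072 for Phi3) are all assumptions, since the README does not state which model variants or MLP depth are used.

```python
import numpy as np

# Assumed widths (not stated in this README): DinoV2 = 1024, SigLIP = 1152,
# Phi3 hidden size = 3072. Adjust to the variants actually used.
DINO_DIM, SIGLIP_DIM, LLM_DIM = 1024, 1152, 3072


def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


class FusedMLPConnector:
    """Hypothetical two-layer MLP mapping fused patch features to the LLM width."""

    def __init__(self, rng):
        in_dim = DINO_DIM + SIGLIP_DIM
        # Randomly initialised weights, for shape illustration only.
        self.w1 = rng.normal(0.0, 0.02, size=(in_dim, LLM_DIM))
        self.b1 = np.zeros(LLM_DIM)
        self.w2 = rng.normal(0.0, 0.02, size=(LLM_DIM, LLM_DIM))
        self.b2 = np.zeros(LLM_DIM)

    def __call__(self, dino_feats, siglip_feats):
        # Both inputs: (batch, num_patches, dim) with matching patch counts.
        fused = np.concatenate([dino_feats, siglip_feats], axis=-1)
        hidden = gelu(fused @ self.w1 + self.b1)
        return hidden @ self.w2 + self.b2


rng = np.random.default_rng(0)
connector = FusedMLPConnector(rng)
dino = rng.normal(size=(1, 16, DINO_DIM))      # stand-in for DinoV2 patch features
siglip = rng.normal(size=(1, 16, SIGLIP_DIM))  # stand-in for SigLIP patch features
tokens = connector(dino, siglip)
print(tokens.shape)  # (1, 16, 3072)
```

Concatenating along the feature axis keeps one visual token per patch, so the language model sees the same sequence length as a single encoder would produce, only with a richer per-patch representation.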

Scripts to build and train the model are available at [DinoV2-SigLIP-Phi3-LoRA-VLM](https://github.com/NMS05/DinoV2-SigLIP-Phi3-LoRA-VLM).

Files changed (1)

README.md +9 -0

	@@ -0,0 +1,9 @@

+# DinoV2-SigLIP-Phi3(LoRA)
+## Model and Dataset Details
+* **Vision Encoder** - DinoV2 + SigLIP @384px resolution.
+* **Connector** - MLP (Dino and SigLIP features are concatenated and then projected to Phi3 representation space)
+* **Language Model** - Phi3 + LoRA
+* **Pre-train (Align) Dataset** - LLaVA-CC3M-Pretrain-595K
+* **Fine-tune (Instruction) Dataset** - LLAVA-v1.5-Instruct + LRV-Instruct