tattrongvu committed (verified) · Commit b5eb109 · 1 Parent(s): 87d297e

Update README.md

Files changed (1): README.md (+2 −1)
README.md CHANGED
@@ -3,6 +3,7 @@ license: apache-2.0
 datasets:
 - tattrongvu/vqa_de_en_batch1
 - vidore/colpali_train_set
+- tattrongvu/sharegpt4v_vqa_200k_batch1
 language:
 - en
 - de
@@ -45,4 +46,4 @@ The dataset was extended from the original colpali train set with the gemini 1.5
 We train models using low-rank adapters ([LoRA](https://arxiv.org/abs/2106.09685))
 with `alpha=64` and `r=64` on the transformer layers from the language model,
 as well as the final randomly initialized projection layer, and use a `paged_adamw_8bit` optimizer.
-We train on an 8xH100 GPU setup with distributed data parallelism (via accelerate), a learning rate of 2e-4 with linear decay and 1% warmup steps, a per-device batch size of 64, in `bfloat16` format
+We train on an 8xH100 GPU setup with distributed data parallelism (via accelerate), a learning rate of 2e-4 with linear decay and 1% warmup steps, a per-device batch size of 64, in `bfloat16` format
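
For reference, the hyperparameters described in the updated section map onto `peft` and `transformers` roughly as sketched below. This is a minimal sketch under stated assumptions, not the author's training script: the base model ID, the LoRA target modules, and the task type are assumptions, since the commit ships no training code.

```python
# Minimal sketch of the training configuration described in the card.
# Base model ID, target_modules, and task_type are assumptions; the
# commit does not include the actual training script.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModel, TrainingArguments

model = AutoModel.from_pretrained(
    "base-model-id",             # hypothetical placeholder for the base model
    torch_dtype=torch.bfloat16,  # train in bfloat16, as stated in the card
)

# LoRA with alpha=64 and r=64 on the language-model transformer layers.
peft_config = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed layers
    task_type="FEATURE_EXTRACTION",
)
model = get_peft_model(model, peft_config)

# paged_adamw_8bit optimizer, lr 2e-4 with linear decay and 1% warmup,
# per-device batch size 64, bfloat16 training.
training_args = TrainingArguments(
    output_dir="colpali-lora",
    optim="paged_adamw_8bit",          # requires bitsandbytes
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    per_device_train_batch_size=64,
    bf16=True,
)
```

On the 8xH100 setup, the distributed data parallelism would come from launching the script with `accelerate launch`, giving an effective batch size of 8 × 64 = 512.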