tattrongvu committed (verified) · Commit b5eb109 · 1 Parent(s): 87d297e

Update README.md

Files changed (1): README.md (+2 −1)
README.md CHANGED
@@ -3,6 +3,7 @@ license: apache-2.0
 datasets:
 - tattrongvu/vqa_de_en_batch1
 - vidore/colpali_train_set
+- tattrongvu/sharegpt4v_vqa_200k_batch1
 language:
 - en
 - de
@@ -45,4 +46,4 @@ The dataset was extended from the original colpali train set with the gemini 1.5
 We train models using low-rank adapters ([LoRA](https://arxiv.org/abs/2106.09685))
 with `alpha=64` and `r=64` on the transformer layers from the language model,
 as well as the final randomly initialized projection layer, and use a `paged_adamw_8bit` optimizer.
-We train on an 8xH100 GPU setup with distributed data parallelism (via accelerate), a learning rate of 2e-4 with linear decay and 1% warmup steps, a per-device batch size of 64, in `bfloat16` format
+We train on an 8xH100 GPU setup with distributed data parallelism (via accelerate), a learning rate of 2e-4 with linear decay and 1% warmup steps, a per-device batch size of 64, in `bfloat16` format
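
For reference, the hyperparameters described in the updated section map onto `peft` and `transformers` roughly as sketched below. This is a minimal sketch under stated assumptions, not the author's training script: the base model ID, the LoRA target modules, and the task type are assumptions, since the commit ships no training code.

```python
# Minimal sketch of the training configuration described in the card.
# Base model ID, target_modules, and task_type are assumptions; the
# commit does not include the actual training script.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModel, TrainingArguments

model = AutoModel.from_pretrained(
    "base-model-id",             # hypothetical placeholder for the base model
    torch_dtype=torch.bfloat16,  # train in bfloat16, as stated in the card
)

# LoRA with alpha=64 and r=64 on the language-model transformer layers.
peft_config = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed layers
    task_type="FEATURE_EXTRACTION",
)
model = get_peft_model(model, peft_config)

# paged_adamw_8bit optimizer, lr 2e-4 with linear decay and 1% warmup,
# per-device batch size 64, bfloat16 training.
training_args = TrainingArguments(
    output_dir="colpali-lora",
    optim="paged_adamw_8bit",          # requires bitsandbytes
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    per_device_train_batch_size=64,
    bf16=True,
)
```

On the 8xH100 setup, the distributed data parallelism would come from launching the script with `accelerate launch`, giving an effective batch size of 8 × 64 = 512.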