tattrongvu committed
Update README.md

README.md CHANGED
@@ -47,7 +47,7 @@ The dataset was extended from the original colpali train set with the gemini 1.5
 We train models using low-rank adapters ([LoRA](https://arxiv.org/abs/2106.09685))
 with `alpha=64` and `r=64` on the transformer layers from the language model,
 as well as the final randomly initialized projection layer, and use a `paged_adamw_8bit` optimizer.
-We train on an 8xH100 GPU setup with
+We train on an 8xH100 GPU setup with distributed data parallelism (via accelerate), a learning rate of 2e-4 with linear decay, 1% warmup steps, and a per-device batch size of 64, in `bfloat16` format.
 
 ## Usage
 
@@ -65,11 +65,11 @@ from PIL import Image
 from colpali_engine.models import ColQwen2, ColQwen2Processor
 
 model = ColQwen2.from_pretrained(
-    "
+    "tsystems/colqwen2-7b-v1.0",
     torch_dtype=torch.bfloat16,
     device_map="cuda:0",  # or "mps" if on Apple Silicon
 ).eval()
-processor = ColQwen2Processor.from_pretrained("
+processor = ColQwen2Processor.from_pretrained("tsystems/colqwen2-7b-v1.0")
 
 # Your inputs
 images = [
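
For readers who want to reproduce a comparable setup, here is a minimal, hypothetical sketch of the fine-tuning configuration described in the first hunk. The hyperparameter values (LoRA `r=64` and `alpha=64`, `paged_adamw_8bit`, learning rate 2e-4 with linear decay and 1% warmup, per-device batch size 64, `bfloat16`) are taken from the README text; the use of `peft.LoraConfig` and `transformers.TrainingArguments`, the dropout value, and the `target_modules` selection are assumptions, not the authors' actual training script.

```python
# Hypothetical sketch only: hyperparameter values come from the README above,
# but the peft/transformers wiring and target_modules are assumptions.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,                      # LoRA rank (README)
    lora_alpha=64,             # LoRA alpha (README)
    lora_dropout=0.05,         # assumed; not specified in the README
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="FEATURE_EXTRACTION",
)

training_args = TrainingArguments(
    output_dir="./colqwen2-7b-lora",   # hypothetical output path
    per_device_train_batch_size=64,    # batch size per device (README)
    learning_rate=2e-4,                # README
    lr_scheduler_type="linear",        # linear decay (README)
    warmup_ratio=0.01,                 # 1% warmup steps (README)
    optim="paged_adamw_8bit",          # optimizer named in the README (requires bitsandbytes)
    bf16=True,                         # bfloat16 training (README)
)

# The README states training ran on an 8xH100 setup with distributed data
# parallelism, e.g. launched via `accelerate launch` across the 8 GPUs.
```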
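
The second hunk ends at `images = [`, so the usage example is cut off by the diff boundary. For convenience, here is a self-contained sketch of how such a snippet typically continues, following the standard colpali-engine retrieval flow (embed page images and queries, then compute late-interaction scores). Everything after `images = [`, including the placeholder inputs, mirrors the usual colpali-engine example rather than this README's exact text.

```python
import torch
from PIL import Image
from colpali_engine.models import ColQwen2, ColQwen2Processor

model = ColQwen2.from_pretrained(
    "tsystems/colqwen2-7b-v1.0",
    torch_dtype=torch.bfloat16,
    device_map="cuda:0",  # or "mps" if on Apple Silicon
).eval()
processor = ColQwen2Processor.from_pretrained("tsystems/colqwen2-7b-v1.0")

# Your inputs (placeholders; replace with real document page images and queries)
images = [
    Image.new("RGB", (32, 32), color="white"),
    Image.new("RGB", (16, 16), color="black"),
]
queries = [
    "Is attention really all you need?",
    "What is the organizational structure of the R&D department?",
]

# Preprocess images and queries into model inputs
batch_images = processor.process_images(images).to(model.device)
batch_queries = processor.process_queries(queries).to(model.device)

# Forward pass to get multi-vector embeddings
with torch.no_grad():
    image_embeddings = model(**batch_images)
    query_embeddings = model(**batch_queries)

# Late-interaction (MaxSim) similarity scores between each query and each page
scores = processor.score_multi_vector(query_embeddings, image_embeddings)
print(scores)
```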