kakaobrain/coyo-align-b7-base

bgyoon committed on Nov 9, 2022

Commit

a6bf982

•

1 Parent(s): 15b6d24

Update README.md

Files changed (1)

README.md +39 -0

@@ -1,3 +1,42 @@
 ---
 license: apache-2.0
 ---

 ---
+language:
+- en
+tags:
+- align
+- clip
 license: apache-2.0
+datasets:
+- kakaobrain/coyo-700m
 ---
+# Model Details
+This is an implementation of [ALIGN](https://arxiv.org/abs/2102.05918) trained on [COYO-700M](https://github.com/kakaobrain/coyo-dataset). The official ALIGN model was trained on a dataset of 1.8B samples that has not been released to the public, so we trained our implementation on COYO-700M instead.
+It was developed by Kakao Brain to validate the performance of the COYO-700M dataset on a large-scale model.
+Training took about 10 days on V3-1024 with batch_size=64k.
+## Model Date
+April 2022
+## Model Type
+This is a dual-encoder model in which
+- the image encoder uses the EfficientNet-B7 architecture
+- the text encoder uses the BERT-base architecture
+# Training data
+This model is trained on the [COYO-700M](https://github.com/kakaobrain/coyo-dataset) dataset.
+# Evaluation results
+| Model                            |  Dataset   | ImageNet | Flickr30k | Flickr30k | MsCOCO  | MsCOCO  |
+|----------------------------------|:----------:|:--------:|:---------:|:---------:|:-------:|:-------:|
+|                                  |            |   KNN    |  I2T R@1  |  T2I R@1  | I2T R@1 | T2I R@1 |
+| ALIGN-L2-Large (Google)          | ALIGN 1.8B |   76.4   |   88.6    |   75.7    |  58.6   |  45.6   |
+| ALIGN-B7-Base (Google)           | ALIGN 1.8B |   69.3   |     -     |     -     |  55.4   |  41.7   |
+| COYO-ALIGN-B7-Base (Kakao Brain) | COYO-700M  |   68.6   |   88.1    |   73.2    |  61.2   |  43.1   |
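
The I2T R@1 and T2I R@1 columns above are retrieval metrics. The README does not describe the evaluation protocol, so what follows is only a minimal sketch of how Recall@1 is commonly computed from a dual encoder's outputs. The embeddings are made-up toy vectors standing in for the image and text encoder outputs, and all function names are hypothetical, not part of this model's code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_matrix(image_embs, text_embs):
    """Cosine similarity between every image (rows) and every text (columns)."""
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) for t in texts] for i in images]


def transpose(matrix):
    """Swap rows and columns so texts become the queries."""
    return [list(col) for col in zip(*matrix)]


def recall_at_1(sim, positives):
    """Fraction of queries (rows) whose top-scoring candidate is a correct match.

    positives[q] is the set of correct candidate indices for query q.
    """
    hits = 0
    for q, row in enumerate(sim):
        best = max(range(len(row)), key=row.__getitem__)
        if best in positives[q]:
            hits += 1
    return hits / len(sim)


# Toy stand-ins for encoder outputs (assumption: not real model embeddings).
image_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
text_embs = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.0], [0.7, 0.0, 0.3]]

sim = similarity_matrix(image_embs, text_embs)
pairs = [{0}, {1}, {2}]  # assumption: text k is the only caption of image k

i2t = recall_at_1(sim, pairs)             # image -> text retrieval
t2i = recall_at_1(transpose(sim), pairs)  # text -> image retrieval
print(f"I2T R@1 = {i2t:.2f}, T2I R@1 = {t2i:.2f}")
```

`positives` holds sets rather than single indices because retrieval benchmarks often pair one image with several captions. The two directions can disagree, as they do in this toy data, which is why the table reports I2T and T2I separately.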