AhmedSSabir committed
Commit e3cee23 • 1 Parent(s): 9b1bf94
Update README.md
README.md CHANGED
@@ -1,8 +1,10 @@

 # Visual semantic with BERT-CNN
+To take advantage of the overlap between the visual context and the caption, and to extract global information from each visual, we use BERT as an embedding layer followed by a shallow CNN with a tri-gram kernel (Kim, 2014).
+

 This model can be used to assign an object-to-caption relatedness score, which is valuable for
-(1) diverse caption re-ranking, and (2) generating soft labels for caption filtering when scraping text-to-captions from the internet.
+(1) diverse caption re-ranking, and (2) generating soft labels for caption filtering when scraping text-to-captions from the internet.

 The model is trained with a strict similarity-distance threshold of 0.4 between the object and its related caption.
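The added README sentence describes a shallow CNN with a tri-gram kernel applied on top of BERT embeddings. The following is a minimal, dependency-free sketch of that idea in the style of Kim (2014): width-3 filters slide over the token sequence, followed by ReLU and max-over-time pooling. It is not the author's implementation. The function name `trigram_conv_max_pool`, the toy 2-dimensional vectors (standing in for real BERT outputs), and the filter weights are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def trigram_conv_max_pool(
    embeddings: Sequence[Sequence[float]],
    filters: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Apply width-3 filters over a token sequence, then ReLU and max pooling.

    embeddings: one vector of size d per token (a stand-in for BERT outputs).
    filters:    each filter is a flat weight vector of size 3 * d.
    biases:     one bias per filter.
    Returns one pooled feature per filter.
    """
    window = 3  # tri-gram kernel: each filter sees three consecutive tokens
    if len(embeddings) < window:
        raise ValueError("need at least three tokens for a tri-gram kernel")

    pooled = []
    for weights, bias in zip(filters, biases):
        best = 0.0  # ReLU output is never below zero
        for start in range(len(embeddings) - window + 1):
            # Concatenate the three token vectors covered by this window.
            flat = [x for token in embeddings[start:start + window] for x in token]
            activation = sum(w * x for w, x in zip(weights, flat)) + bias
            # Max over ReLU(activation) across all window positions.
            best = max(best, activation)
        pooled.append(best)
    return pooled


# Hypothetical toy input: four tokens with 2-dimensional embeddings.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
toy_filters = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # responds to dim 0 of the window's first token
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],  # sums the whole window
]
toy_biases = [0.0, -10.0]
features = trigram_conv_max_pool(toy_embeddings, toy_filters, toy_biases)
```

In a real model the pooled features would typically feed a small classifier that outputs the object-to-caption relatedness score; that final layer is omitted here because the README does not describe it.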