AhmedSSabir/BERT-CNN-Visual-Semantic


AhmedSSabir committed on Feb 23, 2023

Commit e40d0bf (1 parent: 5e7390c)

Update README.md

Files changed (1)

README.md +3 -2


@@ -1,8 +1,9 @@
 # Visual semantic with BERT-CNN
-This model can be used to assign an object-to-caption semantic relatedness score, which is valuable for
-(1) caption diverse re-ranking, and (2) generate soft labels for caption filtering when scraping text-to-captions from the internet.
 To take advantage of the overlapping between the visual context and the caption, and to extract global information from each visual (i.e., object, scene, etc) we use BERT  as an embedding layer followed by a shallow CNN (tri-gram kernel) (Kim, 2014).

 # Visual semantic with BERT-CNN
+ This model can be used to assign an object-to-caption semantic relatedness score, which is valuable for (1) caption diverse re-ranking (this work),
+ and (2) (as an application) generating soft labels for filtering image-to-caption when scraping text-to-captions from the internet (e.g., Instagram).
 To take advantage of the overlapping between the visual context and the caption, and to extract global information from each visual (i.e., object, scene, etc) we use BERT  as an embedding layer followed by a shallow CNN (tri-gram kernel) (Kim, 2014).
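
The README text above describes a shallow CNN with a tri-gram kernel applied on top of BERT embeddings (Kim, 2014). As a rough illustration only, the sketch below shows what a width-3 convolution with max-over-time pooling might look like in plain Python. The function names `trigram_conv_max_pool` and `relatedness_score`, the toy embeddings, and the sigmoid output layer are all assumptions made for this example; they are not taken from the repository, and the real model would use BERT token vectors and learned weights.

```python
import math

def trigram_conv_max_pool(embeddings, filters, biases):
    """Slide each width-3 filter over the token embeddings, apply ReLU,
    and keep the largest activation per filter (max-over-time pooling).

    embeddings: list of token vectors (hypothetical stand-in for BERT output)
    filters:    list of kernels, each a 3 x dim list of weights
    biases:     one bias per kernel
    """
    if len(embeddings) < 3:
        raise ValueError("need at least 3 token embeddings for a tri-gram kernel")
    pooled = []
    for kernel, bias in zip(filters, biases):
        best = 0.0  # ReLU output is never below zero
        for start in range(len(embeddings) - 2):
            window = embeddings[start:start + 3]
            activation = bias + sum(
                w * x
                for kernel_row, token_vec in zip(kernel, window)
                for w, x in zip(kernel_row, token_vec)
            )
            best = max(best, max(0.0, activation))
        pooled.append(best)
    return pooled

def relatedness_score(pooled, weights, bias):
    """Map pooled features to a score in (0, 1) with a sigmoid.
    The sigmoid output layer is an assumption for illustration."""
    logit = bias + sum(w * p for w, p in zip(weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))

# Toy input: 4 tokens with 2-dimensional embeddings (made-up numbers).
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-5.0, -5.0]]
toy_filters = [[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]]  # one tri-gram kernel
features = trigram_conv_max_pool(toy_embeddings, toy_filters, [0.0])
score = relatedness_score(features, [0.0], 0.0)
```

With learned filters, each pooled feature would respond to the strongest three-token pattern in the concatenated visual context and caption, which is how a tri-gram kernel can pick up local overlap between the two.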