This is our meme captioner model: a fine-tuned LLaVA-1.5-7B from our paper "Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes".
**Important: When we talk about generating captions here, we're referring to the model creating a concise description of the meme, including its purpose and target audience, rather than generating the text that should appear within the meme itself.**
To run the model, follow these steps:
- Clone our repository and navigate to the LLaVA folder:
  ```bash
  git clone https://github.com/AmirAbaskohi/Beyond-Words-A-Multimodal-Exploration-of-Persuasion-in-Memes.git
  cd LLaVA
  ```
- Run the following commands to set up the environment:
  ```bash
  conda create -n llava_captioner python=3.10 -y
  conda activate llava_captioner
  pip3 install -e .
  pip3 install transformers==4.31.0
  pip3 install protobuf
  ```
- Finally, you can chat with the model through the CLI by passing our model as the model path:
  ```bash
  python3 -m llava.serve.cli --model-path AmirHossein1378/LLaVA-1.5-7b-meme-captioner --image-file PATH_TO_IMAGE_FILE
  ```
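If you prefer to run inference from a Python script rather than the interactive CLI, the sketch below follows the upstream LLaVA quickstart pattern using `eval_model` from `llava.eval.run_llava`. The prompt string and the `PATH_TO_IMAGE_FILE` placeholder are illustrative assumptions; check the argument names against the LLaVA version in our repository.

```python
# Minimal sketch of programmatic inference, adapted from the upstream LLaVA
# quickstart. The prompt and image path are placeholders; adjust them (and the
# argument names, if they differ in this repository's LLaVA version).
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "AmirHossein1378/LLaVA-1.5-7b-meme-captioner"

args = type("Args", (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    # Illustrative prompt, not the exact prompt used in our paper.
    "query": "Describe this meme, its purpose, and its target audience.",
    "conv_mode": None,
    "image_file": "PATH_TO_IMAGE_FILE",
    "sep": ",",
    "temperature": 0.2,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)
```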
Please refer to our GitHub repository for more information.
If you find our model useful for your research and applications, please cite our work using this BibTeX:
```bibtex
@misc{abaskohi2024bcamirs,
      title={BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes},
      author={Amirhossein Abaskohi and Amirhossein Dabiriaghdam and Lele Wang and Giuseppe Carenini},
      year={2024},
      eprint={2404.03022},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```