This is our meme captioner model: a fine-tuned LLaVA-1.5-7B from our paper "Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes".
**Important: When we talk about generating captions here, we're referring to the model creating a concise description of the meme, including its purpose and target audience, rather than generating the text that should appear within the meme itself.**
To run the model, follow these steps:
- Clone our repository and navigate to the LLaVA folder:
  ```bash
  git clone https://github.com/AmirAbaskohi/Beyond-Words-A-Multimodal-Exploration-of-Persuasion-in-Memes.git
  cd LLaVA
  ```
- Run the following commands to set up the environment:
  ```bash
  conda create -n llava_captioner python=3.10 -y
  conda activate llava_captioner
  pip3 install -e .
  pip3 install transformers==4.31.0
  pip3 install protobuf
  ```
- Finally, you can chat with the model through the CLI by passing our model as the model path:
  ```bash
  python3 -m llava.serve.cli --model-path AmirHossein1378/LLaVA-1.5-7b-meme-captioner --image-file PATH_TO_IMAGE_FILE
  ```
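If you prefer to run inference from a Python script rather than the interactive CLI, the sketch below follows the upstream LLaVA quickstart pattern using `eval_model` from `llava.eval.run_llava`. The prompt string and the `PATH_TO_IMAGE_FILE` placeholder are illustrative assumptions; check the argument names against the LLaVA version in our repository.

```python
# Minimal sketch of programmatic inference, adapted from the upstream LLaVA
# quickstart. The prompt and image path are placeholders; adjust them (and the
# argument names, if they differ in this repository's LLaVA version).
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "AmirHossein1378/LLaVA-1.5-7b-meme-captioner"

args = type("Args", (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    # Illustrative prompt, not the exact prompt used in our paper.
    "query": "Describe this meme, its purpose, and its target audience.",
    "conv_mode": None,
    "image_file": "PATH_TO_IMAGE_FILE",
    "sep": ",",
    "temperature": 0.2,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)
```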
Please refer to our GitHub repository for more information.
If you find our model useful for your research and applications, please cite our work using this BibTeX:
```bibtex
@misc{abaskohi2024bcamirs,
      title={BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes},
      author={Amirhossein Abaskohi and Amirhossein Dabiriaghdam and Lele Wang and Giuseppe Carenini},
      year={2024},
      eprint={2404.03022},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```