songdj committed on
Commit 7192624 · verified · 1 Parent(s): 23e733d

Update README.md

Files changed (1)
  1. README.md +40 -1
README.md CHANGED
@@ -1,5 +1,44 @@
  ---
  license: apache-2.0
  ---

- Please stay tuned. Coming soon.
  ---
  license: apache-2.0
+ language:
+ - en
+ base_model:
+ - liuhaotian/llava-v1.5-7b
  ---

+ # TRIM
+
+ ## Introduction
+
+ We introduce a new approach, Token Reduction using CLIP Metric (TRIM), that aims to improve the efficiency of multimodal large language models (MLLMs) without sacrificing their performance.
+ Inspired by human attention patterns in Visual Question Answering (VQA) tasks, TRIM presents a fresh perspective on the selection and reduction of image tokens.
+ The TRIM method has been extensively tested across 12 datasets, and the results demonstrate a significant reduction in computational overhead while maintaining a consistent level of performance.
+ This research marks a critical stride in efficient MLLM development, promoting greater accessibility and sustainability of high-performing models.
+
+ <img src="./images/TRIM.png" width="600" alt="TRIM" align="center" />
+
+ TRIM significantly streamlines the computational process, reducing the number of image tokens by approximately 79%, processing time by 67%, and memory usage by 30% relative to the baseline (LLaVA-1.5-7B).
+
+ <img src="./images/fig1.png" width="300" alt="stat2"/>
+
+ ## How to use?
+
+ Please refer to [Code for TRIM](https://github.com/FreedomIntelligence/TRIM?tab=readme-ov-file#run).
+
+ ## Links
+
+ - **Repository:** [TRIM GitHub](https://github.com/FreedomIntelligence/TRIM)
+ - **Paper:** [Arxiv](https://arxiv.org/abs/2409.10994)
+ - **Point of Contact:** [Dingjie Song](mailto:dingjiesong.cs@gmail.com)
+
+ ## Citation
+
+ If you find this project useful in your research, please consider citing:
+ ```BibTeX
+ @article{song2024less,
+ title={Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs},
+ author={Song, Dingjie and Wang, Wenjun and Chen, Shunian and Wang, Xidong and Guan, Michael and Wang, Benyou},
+ journal={arXiv preprint arXiv:2409.10994},
+ year={2024}
+ }
+ ```
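
The Introduction in the README describes TRIM as selecting and reducing image tokens with a CLIP-based metric. The sketch below only illustrates that general idea and is not the authors' implementation (the linked repository is the reference for that). It assumes, hypothetically, that image tokens are ranked by cosine similarity to a single text embedding and that a fixed fraction is kept. The function name `select_image_tokens`, the `keep_ratio` parameter, the embedding size of 64, and the random inputs are all invented for the example. The 576 input tokens correspond to the number of visual tokens LLaVA-1.5 uses per image, and a keep ratio of 0.21 mirrors the roughly 79% reduction quoted in the README.

```python
import numpy as np


def select_image_tokens(image_tokens, text_embedding, keep_ratio=0.21):
    """Keep the image tokens most similar to a text embedding.

    Illustrative only: a plain top-k selection by cosine similarity,
    assumed here as a stand-in for a CLIP-metric-based selection.
    """
    # Normalize so that the dot product equals cosine similarity.
    img = image_tokens / np.linalg.norm(image_tokens, axis=1, keepdims=True)
    txt = text_embedding / np.linalg.norm(text_embedding)
    sims = img @ txt  # one similarity score per image token

    # Number of tokens to keep (at least one).
    k = max(1, int(round(keep_ratio * len(image_tokens))))

    # Indices of the k highest-scoring tokens, restored to original order.
    keep = np.sort(np.argsort(sims)[-k:])
    return keep, image_tokens[keep]


# Hypothetical inputs: 576 image tokens of dimension 64 and one text embedding.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(576, 64))
query = rng.normal(size=64)

keep_idx, kept = select_image_tokens(tokens, query)
print(len(tokens), "->", len(kept))  # prints "576 -> 121"
```

Keeping 121 of 576 tokens is a reduction of about 79%. Sorting the kept indices preserves the original spatial order of the tokens, which is a design choice of this sketch rather than something the README specifies.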