---
license: apache-2.0
---

**Base Model**: BLIP2-t5 pretrained version

**Fine-tuning data**:

* LLAVA 150k (for multi-round conversations, one instruction-answer pair is sampled)
* MiniGPT4 3,500 pairs

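The single-pair sampling rule for LLAVA can be sketched as follows. This is a minimal illustration, not the actual preprocessing code; the record layout (a list of instruction-answer rounds) is an assumption.

```python
import random

def sample_one_pair(conversation):
    """For a multi-round conversation, keep one randomly chosen
    instruction-answer pair (the sampling rule described above).

    conversation: list of (instruction, answer) tuples, one per round.
    """
    return random.choice(conversation)

# Hypothetical two-round conversation record.
rounds = [
    ("Describe the image.", "A dog sitting on the grass."),
    ("What breed is it?", "It looks like a corgi."),
]
instruction, answer = sample_one_pair(rounds)
```

Single-round conversations pass through unchanged, since `random.choice` on a one-element list returns that element.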
**Hyper-parameters**:

* BLIP2-flant5-xl + LLAVA (initial commits)
  * **v0**:
    * lr = 2e-5 --> 0.0 with cosine lr scheduler
    * gbs (global batch size) = 32
    * image size = 480
    * weight decay = 0.05
  * **v1 (same as LLAVA)**:
    * lr = 2e-5
    * gbs = 32
    * image size = 480
    * weight decay = 0.0
* BLIP2-flant5-xl + MiniGPT4
  * lr = 2e-5
  * gbs = 32
  * image size = 480
  * weight decay = 0.0
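The v0 schedule above (lr decaying from 2e-5 to 0.0 with cosine annealing) can be sketched as a plain function. This is an illustrative sketch only: the total step count and the absence of a warmup phase are assumptions, not taken from the training code.

```python
import math

def cosine_lr(step, total_steps, base_lr=2e-5, min_lr=0.0):
    """Cosine-annealed learning rate, decaying from base_lr at step 0
    down to min_lr at total_steps (the v0 schedule described above)."""
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# At step 0 the lr equals the base value (2e-5); by the final step it
# has decayed to min_lr (0.0), with a smooth cosine curve in between.
print(cosine_lr(0, 1000))
print(cosine_lr(1000, 1000))
```

An equivalent effect is usually obtained with a framework's built-in scheduler (e.g. cosine annealing in PyTorch); the closed form is shown here only to make the decay explicit.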