metadata
license: llama2
language:
- ko
LLaVA_X_KoLlama2-7B-0.1v
KoT-platypus2 X LLaVA
This model is a large multimodal model (LMM) that combines the LLM KoT-platypus2-7B with the CLIP (ViT-14) visual encoder. It was trained on a Korean visual-instruction dataset using QLoRA.
Model Details
- Model Developers: Nagase_Kotono
- Base Model: kyujinpy/KoT-platypus2-7B
- Model Architecture: LLaVA_X_KoLlama2-7B is an open-source chatbot trained by fine-tuning Llama2 on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the Llama2 transformer architecture.
- Training Dataset: KoLLaVA-CC3M-Pretrain-595K, KoLLaVA-Instruct-150k
Hardware & Software
Pretrain
- GPU: NVIDIA A100 X 2
- Used DeepSpeed, Transformers
Finetune
- GPU: NVIDIA RTX 4000 Ada Generation X 8
- Used DeepSpeed, Transformers
ValueError
LLaVA_X_KoLlama2-7B uses kyujinpy/KoT-platypus2-7B as its base model. The model is based on beomi/llama-2-ko-7b.
Llama-2-Ko uses the fast tokenizer provided by the HF tokenizers library, NOT the sentencepiece package, so you must pass the use_fast=True option when initializing the tokenizer.
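A minimal sketch of loading the tokenizer with use_fast=True is shown here. It assumes the tokenizer is loaded from the base model repository kyujinpy/KoT-platypus2-7B named above; substitute the repository you actually use. The helper name load_tokenizer and its loader parameter are hypothetical, introduced only for illustration.

```python
# Assumption: the tokenizer is fetched from the base model repo listed in this card.
DEFAULT_REPO_ID = "kyujinpy/KoT-platypus2-7B"

# Llama-2-Ko ships a fast (HF tokenizers) tokenizer, so use_fast must be True.
TOKENIZER_KWARGS = {"use_fast": True}


def load_tokenizer(repo_id=DEFAULT_REPO_ID, loader=None):
    """Load the tokenizer with use_fast=True (hypothetical helper).

    `loader` defaults to transformers.AutoTokenizer.from_pretrained; it is
    a parameter only so the call can be exercised without a download.
    """
    if loader is None:
        # Imported lazily so that defining this helper does not require
        # transformers to be installed.
        from transformers import AutoTokenizer

        loader = AutoTokenizer.from_pretrained
    return loader(repo_id, **TOKENIZER_KWARGS)
```

Calling load_tokenizer() downloads the tokenizer files on first use. The returned object exposes an is_fast attribute, which should be True if the fast tokenizer was loaded.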