---
{}
---
# Model Card for Cerebras-ViT-L-336-patch14-llava13b-ShareGPT4V
The checkpoints here are for the vision encoder of [**cerebras/Cerebras-LLaVA-13B**](https://huggingface.co/cerebras/Cerebras-LLaVA-13B).

**Note**: _ShareGPT4V_ is added to the model name to ensure that the checkpoints load correctly in the [LLaVA source repo](https://github.com/haotian-liu/LLaVA/blob/main/llava/model/multimodal_encoder/builder.py#L8).
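
The naming requirement can be illustrated with a minimal sketch. It assumes the linked builder accepts a vision tower when the name is an existing local path, starts with `openai` or `laion`, or contains `ShareGPT4V`; that condition is paraphrased from memory and may differ from the current LLaVA code, so check the linked file. The helper `is_recognized_vision_tower` and the repository id used below are hypothetical names for illustration, not part of LLaVA.

```python
import os


def is_recognized_vision_tower(name: str) -> bool:
    """Hypothetical helper that mirrors the ASSUMED name check in LLaVA's builder.

    Assumption: a tower is accepted if it is an existing local path, or its
    name starts with "openai" or "laion", or it contains "ShareGPT4V".
    """
    return (
        os.path.exists(name)
        or name.startswith("openai")
        or name.startswith("laion")
        or "ShareGPT4V" in name
    )


# Assumed repository id, derived from this model card's title.
with_suffix = "cerebras/Cerebras-ViT-L-336-patch14-llava13b-ShareGPT4V"
without_suffix = "cerebras/Cerebras-ViT-L-336-patch14-llava13b"

print(is_recognized_vision_tower(with_suffix))
print(is_recognized_vision_tower(without_suffix))
```

Under that assumption, only the name carrying the `ShareGPT4V` suffix passes the check, unless the other name happens to exist as a local path.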

For full details of the model and its training, please read our upcoming blog post.

## License
Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). Use of this model should also abide by OpenAI's terms of use: https://openai.com/policies/terms-of-use

## Model Architecture
Cerebras-ViT-L-336-patch14-llava13b-ShareGPT4V is a transformer model based on CLIP-VisionModel-Large (openai/clip-vit-large-patch14-336). It handles images of size 336 x 336 with a patch size of 14.
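
To make the image and patch sizes concrete, the sketch below works out how many patch tokens the encoder produces per image. The function name `vit_token_counts` is hypothetical, and the extra class token is an assumption carried over from the standard CLIP vision transformer design rather than something stated in this card.

```python
def vit_token_counts(image_size: int = 336, patch_size: int = 14) -> tuple:
    """Return (patches per side, total patches, sequence length with class token)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size  # 336 // 14 = 24
    num_patches = per_side ** 2          # 24 * 24 = 576
    seq_len = num_patches + 1            # assumption: one class token is prepended
    return per_side, num_patches, seq_len


per_side, num_patches, seq_len = vit_token_counts()
print(per_side, num_patches, seq_len)  # 24 576 577
```

Each 336 x 336 image is therefore split into a 24 x 24 grid of 576 patches.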

## Intended Use
_Primary intended uses_: The primary use of LLaVA is research on large multimodal models and chatbots.

_Primary intended users_: The primary intended users of the model are researchers (both academic and industry) in computer vision, natural language processing, machine learning, and artificial intelligence.

## Limitations and Bias
The pre-training dataset may have contained offensive or inappropriate content, even after applying data cleansing filters, which can be reflected in the model-generated text. We recommend that users exercise caution when using these models for their applications or any use case that may cause deliberate or unintentional harm to others. This model is for demonstration purposes only.