|
--- |
|
base_model: Xkev/Llama-3.2V-11B-cot |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- mllama |
|
- trl |
|
license: apache-2.0 |
|
language: |
|
- en |
|
- th |
|
--- |
|
|
|
# Model Card for Teera/Llama-3.2v-COT-Thai
|
|
|
Teera/Llama-3.2v-COT-Thai is a fine-tuned model based on Xkev/Llama-3.2V-11B-cot, developed with inspiration from the LLaVA-CoT framework.
|
|
|
The concept was introduced in **LLaVA-CoT: Let Vision Language Models Reason Step-by-Step**.
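
## How to Use

A minimal inference sketch using `transformers` (it follows the standard mllama usage; the image path and the Thai prompt are placeholders, not part of the original card):

```python
import torch
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "Teera/Llama-3.2v-COT-Thai"

model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder image path; replace with your own input.
image = Image.open("example.jpg")

# Thai prompt meaning "Describe this image step by step."
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "อธิบายภาพนี้ทีละขั้นตอน"},
    ]}
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(output[0], skip_special_tokens=True))
```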
|
|
|
## Training Details |
|
|
|
### Training Data
|
|
|
The model is trained on the LLaVA-CoT-100k dataset, which has been preprocessed and translated into the Thai language. |
|
|
|
### Training Procedure

The model is fine-tuned with llama-recipes using the settings below; using the same settings should reproduce our results (a launch sketch follows the table).
|
|
|
| Parameter | Value | |
|
|-------------------------------|---------------------------------------------------| |
|
| FSDP | enabled | |
|
| lr | 1e-4 | |
|
| num_epochs | 1 | |
|
| batch_size_training | 2 | |
|
| use_fast_kernels | True | |
|
| run_validation | False | |
|
| batching_strategy | padding | |
|
| context_length | 4096 | |
|
| gradient_accumulation_steps | 1 | |
|
| gradient_clipping | False | |
|
| gradient_clipping_threshold | 1.0 | |
|
| weight_decay | 0.0 | |
|
| gamma | 0.85 | |
|
| seed | 42 | |
|
| use_fp16 | False | |
|
| mixed_precision | True | |
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
Like other VLMs, the model may generate biased or offensive content due to limitations in its training data. Its performance in areas such as instruction following also still falls short of leading industry models.
|
|