---
base_model: meta-llama/Llama-3.2-3B-Instruct
datasets:
- tatsu-lab/alpaca
language: en
tags:
- torchtune
---

# my_cool_model
This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the [tatsu-lab/alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca) dataset.
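Once the checkpoint is pushed to the Hugging Face Hub, it can be pulled down for local use. A minimal sketch; the repo id below is a hypothetical placeholder for wherever this model is actually hosted:

```bash
# Hypothetical repo id; replace with the model's actual Hub location
huggingface-cli download your-username/my_cool_model --local-dir ./my_cool_model
```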
## Model description

More information needed

## Training and evaluation results

More information needed
## Training procedure

This model was trained with the [torchtune](https://github.com/pytorch/torchtune) library, using the following command:
```bash
ppo_full_finetune_single_device.py \
    --config ./target/7B_full_ppo_low_memory_single_device.yaml \
    device=cuda \
    metric_logger._component_=torchtune.utils.metric_logging.WandBLogger \
    metric_logger.project=torchtune_ppo \
    forward_batch_size=2 \
    batch_size=64 \
    ppo_batch_size=32 \
    gradient_accumulation_steps=16 \
    compile=True \
    optimizer._component_=bitsandbytes.optim.PagedAdamW \
    optimizer.lr=3e-4
```
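The `key=value` pairs override fields in the YAML config, and `_component_` keys select which class torchtune instantiates (here a Weights & Biases metric logger and the paged AdamW optimizer from bitsandbytes). Before launching, the base weights and a starting config need to be in place. A minimal sketch using torchtune's `tune` CLI; the output directory, and the assumption that the config was copied from the built-in `mistral/7B_full_ppo_low_memory` recipe config, are ours:

```bash
# List the recipes and configs that ship with torchtune
tune ls

# Copy a built-in PPO config to edit locally (source config name is an assumption)
tune cp mistral/7B_full_ppo_low_memory ./target/7B_full_ppo_low_memory_single_device.yaml

# Fetch the base model weights from the Hugging Face Hub
tune download meta-llama/Llama-3.2-3B-Instruct \
    --output-dir /tmp/Llama-3.2-3B-Instruct \
    --hf-token <HF_TOKEN>
```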
## Framework versions
- torchtune
- torchao 0.5.0
- datasets 2.20.0
- sentencepiece 0.2.0
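To recreate the environment, the pinned versions listed above can be installed directly; the torchtune version was not recorded on this card, so it is left unpinned here, and the training command above additionally needs `bitsandbytes` (for PagedAdamW) and `wandb` (for WandBLogger):

```bash
# Versions pinned on this card; torchtune itself was not pinned
pip install torchtune torchao==0.5.0 datasets==2.20.0 sentencepiece==0.2.0

# Extra dependencies implied by the training command (optimizer and logger)
pip install bitsandbytes wandb
```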