
Model Description:

Pruned from meta-llama/Meta-Llama-3-8B-Instruct using LLM-Pruner, from the paper *LLM-Pruner: On the Structural Pruning of Large Language Models*.

This was done to test the viability of LLM-Pruner for task-agnostic, low-resource generative AI for commercial and personal use, compared to out-of-the-box models such as meta-llama/Llama-3.2-3B-Instruct.

Our presentation slides may be found here.

To replicate:

  1. First, clone the official implementation and run:
```bash
python llama3.py --pruning_ratio 0.25 \
                 --device cuda --eval_device cuda \
                 --base_model meta-llama/Meta-Llama-3-8B-Instruct \
                 --block_wise --block_mlp_layer_start 4 --block_mlp_layer_end 30 \
                 --block_attention_layer_start 4 --block_attention_layer_end 30 \
                 --save_ckpt_log_name llama3_prune \
                 --pruner_type taylor --taylor param_first \
                 --max_seq_len 512 \
                 --test_after_train --test_before_train --save_model
```

to get the pruned model.
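
If the run succeeds, the pruned model is saved to disk. Below is a minimal sanity-check sketch, assuming LLM-Pruner's save format at the time of writing (a `torch.save`'d dict holding the model and tokenizer under `prune_log/<save_ckpt_log_name>/`; verify the path against your local output):

```python
import torch

# Assumed layout: --save_model writes a pickled dict with 'model' and
# 'tokenizer' keys to prune_log/llama3_prune/pytorch_model.bin.
ckpt = torch.load("prune_log/llama3_prune/pytorch_model.bin", map_location="cpu")
model, tokenizer = ckpt["model"], ckpt["tokenizer"]

# Sanity check: the pruned model should come out at roughly 6.6B parameters.
n_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {n_params / 1e9:.2f}B")
```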

NOTE:

  • We removed 'ptb' from the datasets in llama3.py, since it requires foreign code to load.
  • We changed get_examples in llama3.py to use 'c4', since bookcorpus requires foreign code to load (see the sketch below).
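
For illustration, here is a hypothetical sketch of such a get_examples replacement using the streaming `allenai/c4` mirror; the function name and signature below are ours, not the exact diff applied to llama3.py:

```python
import torch
from datasets import load_dataset

def get_examples_c4(tokenizer, n_samples=10, seq_len=128):
    # Stream a slice of C4 instead of bookcorpus, which needs foreign
    # (remote) code to load. Signature is illustrative.
    ds = load_dataset("allenai/c4", "en", split="train", streaming=True)
    samples = []
    for example in ds:
        ids = tokenizer(example["text"], return_tensors="pt").input_ids[0]
        if ids.numel() >= seq_len:
            samples.append(ids[:seq_len])
        if len(samples) == n_samples:
            break
    return torch.stack(samples)  # (n_samples, seq_len) calibration batch
```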
  2. Then, to post-train, follow section 2 of the official implementation; an illustrative sketch of the recovery stage follows.
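
As a rough illustration only, here is a minimal LoRA recovery sketch with PEFT under the same checkpoint-format assumption as above; the hyperparameters are illustrative and the official post-training script and paper settings may differ:

```python
import torch
from peft import LoraConfig, get_peft_model

# Assumed checkpoint layout from the pruning step above.
ckpt = torch.load("prune_log/llama3_prune/pytorch_model.bin", map_location="cpu")
model = ckpt["model"]

# Illustrative LoRA config for recovery fine-tuning; not the paper's settings.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
# ...then train with your preferred Trainer on an instruction-tuning dataset.
```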

Benchmark Results

Benchmark Evaluation: Following the original paper's evaluation protocol, we perform zero-shot task classification on five commonsense reasoning datasets that don't require foreign code to load:

| Model | BoolQ | HellaSwag | ARC-e | ARC-c | OBQA | Average Accuracy |
|---|---|---|---|---|---|---|
| Llama-3-6.6B-LLM-Pruned | 70.86 | 67.64 | 73.82 | 44.28 | 37.6 | 58.84 |
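
For reference, here is a hedged sketch of how this zero-shot evaluation could be reproduced with a recent EleutherAI lm-evaluation-harness (v0.4+); the checkpoint path and dict keys are the same assumptions as above:

```python
import torch
import lm_eval
from lm_eval.models.huggingface import HFLM

ckpt = torch.load("prune_log/llama3_prune/pytorch_model.bin", map_location="cpu")
lm = HFLM(pretrained=ckpt["model"], tokenizer=ckpt["tokenizer"], device="cuda")

# Zero-shot accuracy on the five commonsense tasks reported above.
results = lm_eval.simple_evaluate(
    model=lm,
    tasks=["boolq", "hellaswag", "arc_easy", "arc_challenge", "openbookqa"],
    num_fewshot=0,
)
print(results["results"])
```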

Usage:

For usage, follow the section "Pruned Model with Post-Training" in the official implementation; a minimal generation sketch is shown below.
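
A minimal generation sketch under the same checkpoint-format assumption as in the replication steps:

```python
import torch

# Assumed checkpoint layout from the pruning step above.
ckpt = torch.load("prune_log/llama3_prune/pytorch_model.bin", map_location="cpu")
model, tokenizer = ckpt["model"], ckpt["tokenizer"]
model.to("cuda").eval()

prompt = "Explain structural pruning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```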
