---
library_name: transformers
license: apache-2.0
datasets:
- berkeley-nest/Nectar
pipeline_tag: text-generation
base_model: Na0s/Llama-3.1-8b-Pruned-4-Layers_LoRA-PEFT
---

<a href="https://ibb.co/NtQ3QfF"><img src="https://i.ibb.co/RYZSZtg/model.webp" alt="model" border="0" alt="Model-card-peft-lora-1.0" align="center">></a> alt="Model-card-peft-lora-1.0" align="center">

# Model Card for Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-1.0

## Model Details

### Model Description
- **Finetuned from model:** [Na0s/Llama-3.1-8b-Pruned-4-Layers_LoRA-PEFT](https://huggingface.co/Na0s/Llama-3.1-8b-Pruned-4-Layers_LoRA-PEFT)
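
The snippet below is a minimal usage sketch, not part of the original card: it assumes the repository id from the title hosts weights directly loadable through `transformers` (as the `library_name: transformers` and `text-generation` pipeline tags suggest).

    # Hedged usage sketch: repo id taken from this card's title; assumes the
    # hosted weights load directly with transformers.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-1.0",
    )
    out = generator("Explain LoRA fine-tuning in one sentence.", max_new_tokens=64)
    print(out[0]["generated_text"])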


## Training Details

### Parameters used for Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-1.0


    # Imports for the Unsloth / TRL fine-tuning stack used here.
    from unsloth import FastLanguageModel, is_bfloat16_supported
    from transformers import TrainingArguments
    from trl import SFTTrainer

    # `model`, `tokenizer`, `dataset`, and `max_seq_length` come from the
    # earlier setup (not shown in this card).

    # Attach LoRA adapters to the pruned base model.
    model = FastLanguageModel.get_peft_model(
        model,
        r = 16,
        target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                          "gate_proj", "up_proj", "down_proj"],
        lora_alpha = 16,
        lora_dropout = 0.05,
        bias = "none",
        use_gradient_checkpointing = "unsloth",
        random_state = 3407,
        use_rslora = False,
        loftq_config = None,
    )

    # Supervised fine-tuning on the "completion" field of the training dataset.
    trainer = SFTTrainer(
        model = model,
        tokenizer = tokenizer,
        train_dataset = dataset,
        dataset_text_field = "completion",
        max_seq_length = max_seq_length,
        dataset_num_proc = 2,
        packing = False,
        args = TrainingArguments(
            per_device_train_batch_size = 6,
            gradient_accumulation_steps = 4,
            warmup_steps = 5,
            max_steps = 5000,
            learning_rate = 2e-4,
            fp16 = not is_bfloat16_supported(),
            bf16 = is_bfloat16_supported(),
            logging_steps = 1,
            optim = "adamw_8bit",
            weight_decay = 0.01,
            lr_scheduler_type = "linear",
            seed = 3407,
            output_dir = "outputs_2",
            push_to_hub = True,
            hub_always_push = True,
        ),
    )
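
The card stops at the trainer definition; launching the run with these settings would presumably be the usual single call (a sketch, not quoted from the original):

    # Start fine-tuning; with push_to_hub=True the resulting checkpoints are
    # uploaded to the Hub as training progresses.
    trainer_stats = trainer.train()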

Dataset: berkeley-nest/Nectar

### Training Data


[berkeley-nest/Nectar](https://huggingface.co/datasets/berkeley-nest/Nectar)
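
The card does not describe how Nectar was preprocessed into the single `completion` text field the trainer reads. The sketch below is one plausible mapping, assuming Nectar's documented `prompt` and `answers` (with per-answer `rank`) fields; the concatenation itself is an assumption, not the authors' recipe.

    from datasets import load_dataset

    # Illustrative preprocessing: pair each prompt with its top-ranked answer
    # and store the result in a "completion" column for the SFTTrainer above.
    nectar = load_dataset("berkeley-nest/Nectar", split="train")

    def to_completion(example):
        best = min(example["answers"], key=lambda a: a["rank"])
        return {"completion": example["prompt"] + best["answer"]}

    dataset = nectar.map(to_completion, num_proc=2)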


## Evaluation

MMLU-Pro 0-shot: 0.2927


#### Evaluation Data


[TIGER-AI-Lab/MMLU-Pro](https://huggingface.co/datasets/TIGER-AI-Lab/MMLU-Pro)
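
The card does not say which harness produced the score above. One common way to run a 0-shot MMLU-Pro evaluation is lm-evaluation-harness, sketched below; the `mmlu_pro` task name and the result layout are assumptions about that library's task registry, not the authors' setup.

    import lm_eval

    # Hedged reproduction sketch using lm-evaluation-harness.
    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-1.0",
        tasks=["mmlu_pro"],
        num_fewshot=0,
        batch_size=4,
    )
    print(results["results"])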


## Environmental Impact


Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).