---
library_name: transformers
license: apache-2.0
datasets:
- berkeley-nest/Nectar
pipeline_tag: text-generation
base_model: Na0s/Llama-3.1-8b-Pruned-4-Layers_LoRA-PEFT
---

<a href="https://ibb.co/NtQ3QfF"><img src="https://i.ibb.co/RYZSZtg/model.webp" alt="Model-card-peft-lora-1.0" border="0" align="center"></a>

# Model Card for Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-1.0

## Model Details

### Model Description
- **Finetuned from model:** [Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT]
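
A minimal sketch of how the model might be loaded for text generation. It assumes the repository holds weights that `transformers` can load directly (with `peft` installed if the repository contains only LoRA adapters) and that `accelerate` is available for `device_map="auto"`. The `generate` helper and its defaults are illustrative and not part of this model card.

```python
MODEL_ID = "Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-1.0"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Hypothetical helper: load the model and complete a single prompt."""
    # Imported lazily so that defining the helper does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # assumption: accelerate is installed
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate("Explain LoRA in one sentence."))
```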


## Training Details
### Parameters used for Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-1.0


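```python
from unsloth import FastLanguageModel, is_bfloat16_supported
from transformers import TrainingArguments
from trl import SFTTrainer

# `model`, `tokenizer`, `dataset`, and `max_seq_length` are defined earlier in
# the training script and are not shown in this card.

# Attach LoRA adapters to the attention and MLP projection layers.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0.05,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)

# Supervised fine-tuning on the "completion" field of the dataset.
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "completion",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,
    args = TrainingArguments(
        per_device_train_batch_size = 6,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 5000,
        learning_rate = 2e-4,
        # Use bf16 where the hardware supports it, otherwise fall back to fp16.
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs_2",
        push_to_hub = True,
        hub_always_push = True,
    ),
)
```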
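
The quantities implied by these settings can be worked out with a short sketch. It assumes training on a single device, which this card does not state, so `num_devices` is an assumption. The helper names are illustrative, and the schedule function only mirrors the shape of a linear warmup and decay schedule rather than calling the trainer itself.

```python
# Values copied from the configuration above.
per_device_train_batch_size = 6
gradient_accumulation_steps = 4
max_steps = 5000
warmup_steps = 5
learning_rate = 2e-4
lora_r = 16
lora_alpha = 16

# Assumption: one device (the card does not state the device count).
num_devices = 1


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Sequences contributing to each optimizer step."""
    return per_device * accumulation * devices


def linear_schedule_lr(step: int, base_lr: float, warmup: int, total: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


batch = effective_batch_size(
    per_device_train_batch_size, gradient_accumulation_steps, num_devices
)
sequences_seen = batch * max_steps   # packing = False, so one example per sequence
lora_scale = lora_alpha / lora_r     # standard LoRA scaling, since use_rslora = False

print(batch)           # 24
print(sequences_seen)  # 120000
print(lora_scale)      # 1.0
```

With `lora_alpha` equal to `r`, the adapter update is applied at a scale of 1.0. The learning rate reaches its peak of 2e-4 after the 5 warmup steps and then decays linearly to zero at step 5000.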

Dataset: berkeley-nest/Nectar

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

[berkeley-nest/Nectar](https://huggingface.co/datasets/berkeley-nest/Nectar)


## Evaluation

MMLU Pro 0-shot: 


#### Evaluation Data

<!-- This should link to a Dataset Card if possible. -->

[TIGER-AI-Lab/MMLU-Pro]


## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
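
As a rough illustration of the kind of estimate such a calculator produces, the sketch below multiplies energy use by grid carbon intensity. The function name and all input values are hypothetical placeholders, not measurements from this training run, and the calculator itself may apply further factors.

```python
def estimate_emissions_kg(
    gpu_power_watts: float,
    hours: float,
    carbon_intensity_kg_per_kwh: float,
    num_gpus: int = 1,
) -> float:
    """Estimate emissions in kg CO2eq as energy (kWh) times grid carbon intensity."""
    energy_kwh = gpu_power_watts / 1000 * hours * num_gpus
    return energy_kwh * carbon_intensity_kg_per_kwh


# Hypothetical inputs: one 300 W GPU for 10 hours on a 0.4 kg CO2eq/kWh grid.
example_kg = estimate_emissions_kg(300, 10, 0.4)
print(f"{example_kg:.2f} kg CO2eq")
```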






[More