---
library_name: transformers
license: apache-2.0
datasets:
- Open-Orca/SlimOrca
pipeline_tag: text-generation
base_model: Na0s/Llama-3.1-8b-Pruned-4-Layers
---

<a href="https://ibb.co/0Yhg31Q"><img src="https://i.ibb.co/F8gStcn/Model-card-peft-lora.webp" alt="Model-card-peft-lora" align="center"></a>

# Model Card for Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT

## Model Details

### Model Description
- **Finetuned from model:** [Na0s/Llama-3.1-8b-Pruned-4-Layers](https://huggingface.co/Na0s/Llama-3.1-8b-Pruned-4-Layers)
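
A minimal usage sketch, assuming the repository's weights load directly with Transformers (per `library_name: transformers`); if only the LoRA adapter is published, load the base model `Na0s/Llama-3.1-8b-Pruned-4-Layers` first and attach the adapter with `peft.PeftModel.from_pretrained` instead. The prompt and generation settings are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # requires accelerate
    torch_dtype="auto",
)

prompt = "Explain LoRA fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```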


## Training Details

LoRA fine-tuning in BF16 with the following hyperparameters:

        batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 10000,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407
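
The flags above (`is_bfloat16_supported()`, `adamw_8bit`) suggest an Unsloth + TRL SFT setup; the sketch below shows how these hyperparameters could be wired together. The LoRA rank/alpha, `max_seq_length`, the SlimOrca-to-text formatting, and the TRL version (pre-`SFTConfig` API) are assumptions not stated in this card.

```python
# Sketch of the fine-tuning loop implied by the hyperparameters above.
# Assumes the Unsloth + TRL stack; r / lora_alpha / max_seq_length are
# illustrative placeholders and the prompt template is a simplification.
from unsloth import FastLanguageModel, is_bfloat16_supported
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    "Na0s/Llama-3.1-8b-Pruned-4-Layers",
    max_seq_length=2048,          # assumption: not stated in the card
)
model = FastLanguageModel.get_peft_model(model, r=16, lora_alpha=16)  # assumed rank/alpha


def to_text(example):
    # Flatten SlimOrca's "conversations" list into one training string
    # (a simplistic format; the actual template used is not documented).
    return {"text": "\n".join(f"{t['from']}: {t['value']}" for t in example["conversations"])}


dataset = load_dataset("Open-Orca/SlimOrca", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=10000,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
trainer.train()
```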

### Training Data


[Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca)


## Evaluation

MMLU-Pro 0-shot: 0.2937


#### Evaluation Data


[TIGER-AI-Lab/MMLU-Pro](https://huggingface.co/datasets/TIGER-AI-Lab/MMLU-Pro)
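
The card does not say how the score above was obtained; the sketch below assumes EleutherAI's lm-evaluation-harness and its `mmlu_pro` task, run 0-shot, which may differ from the setup actually used.

```python
# Sketch of a 0-shot MMLU-Pro run with lm-evaluation-harness (assumed, not
# documented in the card); batch size and dtype are illustrative.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT,dtype=bfloat16",
    tasks=["mmlu_pro"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])  # per-task and aggregate MMLU-Pro metrics
```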


## Environmental Impact


Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).