---
library_name: transformers
license: apache-2.0
datasets:
- Locutusque/hercules-v2.0
- CollectiveCognition/chats-data-2023-09-22
language:
- en
---

# lr-experiment1-7B

The lr-experiment model series is a research project I'm conducting to determine the best learning rate to use when fine-tuning Mistral. This model uses a learning rate of 2e-5 with a cosine scheduler and no warmup steps.

I used Locutusque/Hercules-2.0-Mistral-7B as the base model and further fine-tuned it on CollectiveCognition/chats-data-2023-09-22 using QLoRA for 3 epochs. I will be keeping track of evaluation results and comparing them with those of upcoming models in the series.
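
To make the schedule concrete, here is a minimal sketch of how a cosine scheduler with no warmup shapes the learning rate over training. It assumes the common half-cosine form that decays from the peak rate to zero. The exact scheduler implementation used for this run is not stated here, and `TOTAL_STEPS` is a hypothetical value chosen only for illustration.

```python
import math

BASE_LR = 2e-5  # peak learning rate stated in this card


def cosine_lr(step: int, total_steps: int, base_lr: float = BASE_LR) -> float:
    """Half-cosine decay from base_lr down to 0, with no warmup phase."""
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical step count; the real number of optimizer steps is not given.
TOTAL_STEPS = 1000

for step in (0, 250, 500, 750, 1000):
    print(f"step {step:>4}: lr = {cosine_lr(step, TOTAL_STEPS):.3e}")
```

With no warmup, training starts at the full 2e-5 on the very first step, passes through half that rate at the midpoint, and approaches zero at the end.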

# Evals

|              Tasks              |Version|Filter|n-shot| Metric |Value |   |Stderr|
|---------------------------------|-------|------|------|--------|-----:|---|-----:|
|agieval_nous                     |N/A    |none  |None  |acc     |0.3645|±  |0.0093|
|                                 |       |none  |None  |acc_norm|0.3468|±  |0.0092|
| - agieval_aqua_rat              |      1|none  |None  |acc     |0.2283|±  |0.0264|
|                                 |       |none  |None  |acc_norm|0.2283|±  |0.0264|
| - agieval_logiqa_en             |      1|none  |None  |acc     |0.2965|±  |0.0179|
|                                 |       |none  |None  |acc_norm|0.3303|±  |0.0184|
| - agieval_lsat_ar               |      1|none  |None  |acc     |0.2217|±  |0.0275|
|                                 |       |none  |None  |acc_norm|0.1783|±  |0.0253|
| - agieval_lsat_lr               |      1|none  |None  |acc     |0.4039|±  |0.0217|
|                                 |       |none  |None  |acc_norm|0.3686|±  |0.0214|
| - agieval_lsat_rc               |      1|none  |None  |acc     |0.4870|±  |0.0305|
|                                 |       |none  |None  |acc_norm|0.4424|±  |0.0303|
| - agieval_sat_en                |      1|none  |None  |acc     |0.6408|±  |0.0335|
|                                 |       |none  |None  |acc_norm|0.5971|±  |0.0343|
| - agieval_sat_en_without_passage|      1|none  |None  |acc     |0.3932|±  |0.0341|
|                                 |       |none  |None  |acc_norm|0.3835|±  |0.0340|
| - agieval_sat_math              |      1|none  |None  |acc     |0.3455|±  |0.0321|
|                                 |       |none  |None  |acc_norm|0.2727|±  |0.0301|

|   Groups   |Version|Filter|n-shot| Metric |Value |   |Stderr|
|------------|-------|------|------|--------|-----:|---|-----:|
|agieval_nous|N/A    |none  |None  |acc     |0.3645|±  |0.0093|
|            |       |none  |None  |acc_norm|0.3468|±  |0.0092|
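
When comparing these numbers across models in the series, it helps to read each `Value ± Stderr` pair as a range rather than a point. The sketch below turns a reported value and standard error into an approximate 95% interval. It assumes a normal approximation (z = 1.96), and the helper name `confidence_interval` is hypothetical, not part of any evaluation tool.

```python
from typing import Tuple


def confidence_interval(value: float, stderr: float, z: float = 1.96) -> Tuple[float, float]:
    """Approximate interval: value +/- z * stderr (normal approximation)."""
    half_width = z * stderr
    return value - half_width, value + half_width


# Group-level accuracy and standard error taken from the table above.
low, high = confidence_interval(0.3645, 0.0093)
print(f"agieval_nous acc: {low:.4f} to {high:.4f}")
```

Differences between models that fall well inside such a range are hard to distinguish from noise, which matters most for the smaller subtasks with larger standard errors.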