---
library_name: peft
base_model: meta-llama/Llama-2-7b-chat-hf
---
# Model Card for Model ID
## euneeei/hw-llama-2-7B-nsmc
<!-- Provide a quick summary of what the model is/does. -->
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
- ### Naver movie review dataset, written in Korean
- ### Train dataset: 3,000 examples
- ### Test dataset: 1,000 examples
- ### Best result after training: 0.87 accuracy
## **1. Same @dataclass parameters that reached 0.91 accuracy with midm, unchanged**
- ### learning_rate : 2e-4

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.81 | 0.91 | 0.85 | 492 |
| positive | 0.90 | 0.79 | 0.84 | 508 |
| accuracy | | | 0.85 | 1000 |
| macro avg | 0.85 | 0.85 | 0.85 | 1000 |
| weighted avg | 0.85 | 0.85 | 0.85 | 1000 |

- ### Confusion matrix (rows: actual, columns: predicted):
### [[446, 46]
### [106, 402]]
- ### Accuracy was 0.85, short of 0.90, so we decided to tune the learning rate further. The model also often judged reviews that were actually 'positive' to be 'negative' (106 of 508).
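
The classification report and the confusion matrix above describe the same predictions, so the per-class numbers can be re-derived from the matrix alone. The following is a minimal sketch in plain Python, assuming rows are actual classes and columns are predicted classes in the order negative, positive (an assumption that agrees with the support column, 492 and 508). The helper names `per_class_metrics` and `accuracy` are illustrative and are not taken from the training script.

```python
def per_class_metrics(cm, labels):
    """Return {label: (precision, recall, f1)} for a square confusion matrix.

    Assumes cm[i][j] counts examples whose actual class is i
    and whose predicted class is j.
    """
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = cm[i][i]
        predicted_as_label = sum(row[i] for row in cm)  # column sum
        actually_label = sum(cm[i])                     # row sum (support)
        precision = true_positive / predicted_as_label
        recall = true_positive / actually_label
        f1 = 2 * precision * recall / (precision + recall)
        metrics[label] = (precision, recall, f1)
    return metrics


def accuracy(cm):
    """Fraction of examples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Confusion matrix reported for run 1 (learning_rate 2e-4).
cm_run1 = [[446, 46], [106, 402]]
metrics_run1 = per_class_metrics(cm_run1, ["negative", "positive"])

for label, (p, r, f1) in metrics_run1.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy(cm_run1):.2f}")
```

Rounded to two decimals, these values match the table for run 1. The low recall for the positive class (0.79) comes from the 106 positive reviews in the second row that were predicted as negative.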
## **2. Change learning_rate from 2e-4 to 1e-4**
- ### learning_rate : 1e-4

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.82 | 0.88 | 0.85 | 492 |
| positive | 0.87 | 0.81 | 0.84 | 508 |
| accuracy | | | 0.84 | 1000 |
| macro avg | 0.84 | 0.84 | 0.84 | 1000 |
| weighted avg | 0.84 | 0.84 | 0.84 | 1000 |

- ### Confusion matrix (rows: actual, columns: predicted):
### [[431, 61]
### [96, 412]]
- ### Overall, the results did not improve over the run before the learning-rate change, so we decided to try a higher learning rate.
## **3. Change learning_rate from 1e-4 to 4e-4**
- ### The results did not differ much from learning_rate 1e-4, so we decided to adjust other settings.
## **4. Increase the batch size**
- ### Because of memory issues, `seq_length` in `script_args` was reduced to 450.
- ### However, training still failed because of insufficient memory.
- ### `per_device_train_batch_size=1` -> `per_device_train_batch_size=2`
- ### `per_device_eval_batch_size=1` -> `per_device_eval_batch_size=2`
## **5. Increase gradient_accumulation_steps**
- ### Instead of increasing the batch size, we decided to change the gradient accumulation steps.
- ### To prevent out-of-memory errors, `seq_length` in `script_args` was reduced to 450.
- ### `gradient_accumulation_steps=2` -> `gradient_accumulation_steps=4` -> `gradient_accumulation_steps=8`

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.85 | 0.88 | 0.87 | 492 |
| positive | 0.88 | 0.85 | 0.87 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |

- ### Confusion matrix (rows: actual, columns: predicted):
### [[435, 57]
### [77, 431]]
- ### Accuracy did not exceed 0.90, but the share of correct "negative" predictions increased.
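
Gradient accumulation raises the effective batch size without holding more examples in memory at once, which is presumably why it could be used here when the larger per-device batch in step 4 ran out of memory. The sketch below shows the arithmetic for the three settings tried. It assumes a single GPU and `per_device_train_batch_size=1` (neither is stated explicitly for this run), uses the 3,000 training examples listed above, and the function names are hypothetical.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of examples that contribute to one optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def optimizer_steps_per_epoch(num_examples, per_device_batch,
                              grad_accum_steps, num_devices=1):
    """Approximate number of optimizer updates in one pass over the data."""
    batch = effective_batch_size(per_device_batch, grad_accum_steps, num_devices)
    return math.ceil(num_examples / batch)


TRAIN_EXAMPLES = 3000   # train set size from the Training Data section
PER_DEVICE_BATCH = 1    # assumption: unchanged after the failed increase to 2

for accum in (2, 4, 8):
    batch = effective_batch_size(PER_DEVICE_BATCH, accum)
    steps = optimizer_steps_per_epoch(TRAIN_EXAMPLES, PER_DEVICE_BATCH, accum)
    print(f"gradient_accumulation_steps={accum}: "
          f"effective batch={batch}, optimizer steps per epoch={steps}")
```

Under these assumptions, going from 2 to 8 accumulation steps cuts the number of optimizer updates per epoch from 1500 to 375, so each update averages gradients over four times as many reviews.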
## **6. Decrease weight_decay, increase learning_rate**
- ### `weight_decay=0.03` -> `weight_decay=0.01`
- ### `learning_rate=4e-4` -> `learning_rate=5e-4`

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.85 | 0.89 | 0.87 | 492 |
| positive | 0.89 | 0.85 | 0.87 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |

- ### Result: 0.87 accuracy, almost no difference from run 5.
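
Since the hyperparameters are described as `@dataclass` parameters, one way to keep track of what changed between runs is to derive each run from the previous one. This is only a sketch: `RunConfig` and `changed_fields` are hypothetical names, not the training script's actual class, and the run 5 values are pieced together from steps 3, 5, and 6 above.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the settings discussed in this card."""
    learning_rate: float
    weight_decay: float
    gradient_accumulation_steps: int
    seq_length: int


def changed_fields(before, after):
    """Return {field_name: (old_value, new_value)} for fields that differ."""
    return {
        f.name: (getattr(before, f.name), getattr(after, f.name))
        for f in fields(before)
        if getattr(before, f.name) != getattr(after, f.name)
    }


# Run 5 as described above: lr 4e-4, weight decay 0.03,
# 8 accumulation steps, seq_length 450.
run5 = RunConfig(
    learning_rate=4e-4,
    weight_decay=0.03,
    gradient_accumulation_steps=8,
    seq_length=450,
)

# Run 6 changes only weight_decay and learning_rate.
run6 = replace(run5, weight_decay=0.01, learning_rate=5e-4)

print(changed_fields(run5, run6))
```

Recording each run as a small diff against the previous one makes it easier to attribute a change in accuracy to a specific setting, although in run 6 two settings changed at once, so their effects cannot be separated.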
## **7. Remove the max_step limit**

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.86 | 0.89 | 0.87 | 492 |
| positive | 0.89 | 0.86 | 0.87 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |

- ### Confusion matrix (rows: actual, columns: predicted):
### [[436, 56]
### [70, 438]]
- ### The model is getting slightly more accurate with each run, but accuracy has not moved much from 0.87.
## **8. Reduce the learning rate further**

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.84 | 0.90 | 0.87 | 492 |
| positive | 0.89 | 0.84 | 0.86 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |

- ### Confusion matrix (rows: actual, columns: predicted):
### [[441, 51]
### [83, 425]]
- ### Compared with the previous run, the model gets 'positive' right more often, but correct 'negative' predictions decreased.
- ### As a result, we end training at 0.87 accuracy.
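
Several runs round to the same 0.87 accuracy, so it can help to compare them at three decimals using the confusion matrices reported above. The snippet below is a small sketch that does this. It covers only the runs for which a confusion matrix is listed (runs 3, 4, and 6 have none), assumes rows are actual classes in the order negative, positive, and uses illustrative variable names.

```python
# Confusion matrices copied from the runs above.
confusion_matrices = {
    "run 1": [[446, 46], [106, 402]],
    "run 2": [[431, 61], [96, 412]],
    "run 5": [[435, 57], [77, 431]],
    "run 7": [[436, 56], [70, 438]],
    "run 8": [[441, 51], [83, 425]],
}


def accuracy(cm):
    """Fraction of examples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


accuracies = {run: accuracy(cm) for run, cm in confusion_matrices.items()}
best_run = max(accuracies, key=accuracies.get)

for run, acc in accuracies.items():
    # cm[1][0]: reviews that are actually positive but predicted negative.
    missed_positive = confusion_matrices[run][1][0]
    print(f"{run}: accuracy={acc:.3f}, positive predicted as negative={missed_positive}")
print(f"best: {best_run}")
```

Among the runs with a reported matrix, run 7 has the highest unrounded accuracy (0.874) and the fewest positive reviews misclassified as negative (70).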
### Framework versions
- PEFT 0.7.1