---
library_name: peft
base_model: meta-llama/Llama-2-7b-chat-hf
---

# Model Card for Model ID
## euneeei/hw-llama-2-7B-nsmc
<!-- Provide a quick summary of what the model is/does. -->




## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
- ### The training data is a Korean-language Naver movie review dataset.

- ## train dataset: 3,000 examples
- ## test dataset: 1,000 examples

- ## Best accuracy after training: 0.87

## **1. Same `@dataclass` parameters that reached 0.91 accuracy with midm**
- ### learning_rate: 2e-4

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.81 | 0.91 | 0.85 | 492 |
| positive | 0.90 | 0.79 | 0.84 | 508 |
| accuracy | | | 0.85 | 1000 |
| macro avg | 0.85 | 0.85 | 0.85 | 1000 |
| weighted avg | 0.85 | 0.85 | 0.85 | 1000 |


- ### confusion Matrix:
  ### [[ 446, 46 ]
  ### [106, 402]]

- ### Accuracy was 0.85, short of the 0.90 target, so I decided to tune the learning rate further. In addition, many reviews that were actually 'positive' were classified as 'negative'.
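The precision, recall, and accuracy figures in the report above can be re-derived from the confusion matrix. The following is a minimal sketch, not the evaluation code used for this model; it assumes the scikit-learn layout (rows are true labels, columns are predicted labels, in the order negative, positive), and `per_class_metrics` is a hypothetical helper name.

```python
# Confusion matrix from run 1. Assumption: rows are true labels and columns
# are predicted labels, ordered (negative, positive).
confusion = [[446, 46],
             [106, 402]]


def per_class_metrics(cm, index):
    """Return (precision, recall, f1) for the class at position `index`."""
    true_positive = cm[index][index]
    predicted_total = sum(row[index] for row in cm)  # column sum
    actual_total = sum(cm[index])                    # row sum
    precision = true_positive / predicted_total
    recall = true_positive / actual_total
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(len(confusion))) / total

for name, index in (("negative", 0), ("positive", 1)):
    precision, recall, f1 = per_class_metrics(confusion, index)
    print(f"{name}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Under that layout, the second row shows 106 of the 508 positive reviews predicted as negative, which is the error pattern noted above.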

## **2. Change learning_rate from 2e-4 to 1e-4**

- ### learning_rate : 1e-4


| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.82 | 0.88 | 0.85 | 492 |
| positive | 0.87 | 0.81 | 0.84 | 508 |
| accuracy | | | 0.84 | 1000 |
| macro avg | 0.84 | 0.84 | 0.84 | 1000 |
| weighted avg | 0.84 | 0.84 | 0.84 | 1000 |


- ### confusion Matrix:
  ### [[ 431, 61 ]
  ### [96, 412]]

- ### Overall, the results did not improve over the previous learning rate, so I decided to try raising the learning rate instead.

## **3. Change learning_rate from 1e-4 to 4e-4**

- ### The results were not noticeably different from learning_rate 1e-4, so I decided to tune a different setting.

## **4. Increase the batch size**

- ### Because of memory issues, I reduced seq_length in script_args to 450.

- ### However, training still ran out of memory and could not run
            per_device_train_batch_size=1
            -> per_device_train_batch_size=2
            per_device_eval_batch_size=1
            -> per_device_eval_batch_size=2


## **5. Increase gradient_accumulation_steps**

- ### Instead of increasing the batch size, I decided to change gradient_accumulation_steps.
- ### To avoid running out of memory, seq_length in script_args was reduced to 450.
            gradient_accumulation_steps=2,
            -> gradient_accumulation_steps=4
            -> gradient_accumulation_steps=8
            
            

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.85 | 0.88 | 0.87 | 492 |
| positive | 0.88 | 0.85 | 0.87 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |


- ### confusion Matrix:
  ### [[ 435, 57 ]
  ### [77, 431]]

- ### Accuracy did not exceed 0.90, but the rate of correct 'negative' predictions increased.
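Step 5 relies on gradient accumulation standing in for a larger batch. The toy sketch below illustrates why that works for a mean loss: averaging the gradients of several micro-batches gives the same gradient as one pass over the combined batch. It is not the training code for this model; the linear model, the data values, and the function names are all made up for illustration, and a per-device batch size of 1 on a single device is assumed.

```python
# Toy model y = weight * x with mean-squared-error loss. All values are illustrative.
samples = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1), (4.0, 8.2),
           (5.0, 9.8), (6.0, 12.1), (7.0, 13.9), (8.0, 16.0)]
w = 0.5


def mean_gradient(batch, weight):
    """Gradient of mean((weight * x - y) ** 2) with respect to weight."""
    return sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(data, weight, micro_batch_size, accumulation_steps):
    """Average the gradients of `accumulation_steps` consecutive micro-batches."""
    total = 0.0
    for step in range(accumulation_steps):
        start = step * micro_batch_size
        micro_batch = data[start:start + micro_batch_size]
        # Scale each micro-batch gradient by 1/accumulation_steps before summing.
        total += mean_gradient(micro_batch, weight) / accumulation_steps
    return total


full = mean_gradient(samples, w)
accumulated = accumulated_gradient(samples, w, micro_batch_size=1, accumulation_steps=8)

# Assumed: per-device batch size 1 on one device, so 8 accumulation steps
# behave like a batch of 8 for each optimizer update.
effective_batch_size = 1 * 8
print(effective_batch_size, abs(full - accumulated) < 1e-9)
```

Only one micro-batch is held in memory at a time, which is why raising `gradient_accumulation_steps` can succeed where raising the per-device batch size ran out of memory.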

## **6. Decrease weight_decay, increase learning_rate**

            weight_decay=0.03
            -> weight_decay=0.01
            learning_rate=4e-4
            -> learning_rate=5e-4



| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.85 | 0.89 | 0.87 | 492 |
| positive | 0.89 | 0.85 | 0.87 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |


- ### Result: 0.87, almost no difference from step 5.
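One way to keep such step-to-step changes traceable is to hold the settings in a dataclass and derive each run from the previous one. This is a minimal sketch under stated assumptions: `RunConfig` is a hypothetical container (not the `@dataclass` used in the actual script), and it assumes that `gradient_accumulation_steps=8` and `seq_length=450` carried over from step 5 unchanged.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container; field names mirror the settings named in this card."""
    learning_rate: float
    weight_decay: float
    gradient_accumulation_steps: int
    seq_length: int


# Settings before step 6. The last two values are assumed to carry over from step 5.
step5 = RunConfig(learning_rate=4e-4, weight_decay=0.03,
                  gradient_accumulation_steps=8, seq_length=450)

# Step 6 changes only weight_decay and learning_rate.
step6 = replace(step5, learning_rate=5e-4, weight_decay=0.01)

# Collect the fields whose values differ between the two runs.
changed = {
    f.name: (getattr(step5, f.name), getattr(step6, f.name))
    for f in fields(RunConfig)
    if getattr(step5, f.name) != getattr(step6, f.name)
}
print(changed)
```

Recording the diff alongside each result table makes it easier to see which setting a given change in accuracy might be attributed to.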

## **7. Remove the max_step limit**

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.86 | 0.89 | 0.87 | 492 |
| positive | 0.89 | 0.86 | 0.87 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |


- ### confusion Matrix:
  ### [[ 436, 56 ]
  ### [70, 438]]

- ### The model is getting very slightly more accurate, but accuracy remains essentially unchanged at 0.87.

## **8. Lower the learning rate further**

| | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.84 | 0.90 | 0.87 | 492 |
| positive | 0.89 | 0.84 | 0.86 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |

- ### confusion Matrix:
  ### [[ 441, 51 ]
  ### [83, 425]]

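- ### Compared with the previous run, 'positive' predictions are correct more often, but 'negative' predictions are correct less often.
- ### In the end, I am concluding training at an accuracy of 0.87.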
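Because the reports round accuracy to two decimals, several runs look identical at 0.87. The sketch below recomputes accuracy to three decimals from the confusion matrices reported above (step 6 lists none, so it is omitted). It assumes rows are true labels and columns are predictions; the `accuracy` helper is illustrative only.

```python
# Confusion matrices copied from the steps above; step 6 reports none.
# Assumption: rows are true labels (negative, positive), columns are predictions.
runs = {
    "1": [[446, 46], [106, 402]],
    "2": [[431, 61], [96, 412]],
    "5": [[435, 57], [77, 431]],
    "7": [[436, 56], [70, 438]],
    "8": [[441, 51], [83, 425]],
}


def accuracy(cm):
    """Fraction of examples on the diagonal of a 2x2 confusion matrix."""
    correct = cm[0][0] + cm[1][1]
    return correct / sum(sum(row) for row in cm)


accuracies = {step: accuracy(cm) for step, cm in runs.items()}
best_step = max(accuracies, key=accuracies.get)

for step, value in accuracies.items():
    print(f"step {step}: accuracy={value:.3f}")
print("best step:", best_step)
```

At this precision the runs span 0.843 to 0.874, which fits the observation that the later steps plateau around 0.87.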


### Framework versions

- PEFT 0.7.1