---
library_name: peft
base_model: meta-llama/Llama-2-7b-chat-hf
---

# Model Card for Model ID
## euneeei/hw-llama-2-7B-nsmc
<!-- Provide a quick summary of what the model is/does. -->


## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
- ### A Korean-language dataset of Naver movie reviews (NSMC)

  
- ## Train dataset: 3,000 examples
- ## Test dataset: 1,000 examples

- ## Best result after training: 0.87 accuracy

## **1. Reused the @dataclass parameters that reached 0.91 accuracy with midm**
- ### learning_rate : 2e-4

|     | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.81 | 0.91 | 0.85 | 492 |
| positive | 0.90 | 0.79 | 0.84 | 508 |
| accuracy | | | 0.85 | 1000 |
| macro avg | 0.85 | 0.85 | 0.85 | 1000 |
| weighted avg | 0.85 | 0.85 | 0.85 | 1000 |


- ### Confusion matrix:
  ### [[446, 46],
  ### [106, 402]]

- ### Accuracy was 0.85, short of the 0.90 target, so I decided to tune the learning rate further. In addition, a large number of reviews that were actually positive were classified as negative.
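The precision, recall, and F1 values in the table follow directly from the confusion matrix. The sketch below reproduces them in plain Python. It assumes rows are true labels and columns are predicted labels in the order negative, positive (the row sums 492 and 508 match the support column). The helper name `report_from_confusion` is hypothetical and is not part of the training script.

```python
def report_from_confusion(cm):
    """Per-class precision/recall/F1 and overall accuracy from a 2x2 matrix.

    cm[i][j] = number of samples with true class i predicted as class j.
    """
    labels = ["negative", "positive"]
    total = sum(sum(row) for row in cm)
    report = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]
        predicted = sum(row[i] for row in cm)  # column sum: all predictions of class i
        actual = sum(cm[i])                    # row sum: support of class i
        precision = tp / predicted
        recall = tp / actual
        f1 = 2 * precision * recall / (precision + recall)
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    report["accuracy"] = (cm[0][0] + cm[1][1]) / total
    return report


# Confusion matrix from experiment 1 (assumed layout: rows = true, cols = predicted).
report = report_from_confusion([[446, 46], [106, 402]])
for name in ("negative", "positive"):
    row = report[name]
    print(f"{name}: precision={row['precision']:.2f} "
          f"recall={row['recall']:.2f} f1={row['f1']:.2f}")
print(f"accuracy={report['accuracy']:.2f}")
```

Under that layout, the 106 in the second row counts positive reviews predicted as negative, which is the error type noted above.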

## **2. Changed learning_rate from 2e-4 to 1e-4**

- ### learning_rate : 1e-4


|     | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.82 | 0.88 | 0.85 | 492 |
| positive | 0.87 | 0.81 | 0.84 | 508 |
| accuracy | | | 0.84 | 1000 |
| macro avg | 0.84 | 0.84 | 0.84 | 1000 |
| weighted avg | 0.84 | 0.84 | 0.84 | 1000 |


- ### Confusion matrix:
  ### [[431, 61],
  ### [96, 412]]

- ### Overall, the results did not improve over the previous learning rate, so I decided to try raising the learning rate instead.

## **3. Changed learning_rate from 1e-4 to 4e-4**

- ### The results were not noticeably different from learning_rate 1e-4, so I decided to adjust other parameters.

## **4. Increased the batch size**

- ### Because of memory issues, reduced seq_length in script_args to 450.

- ### Training still failed because it kept running out of memory.
            per_device_train_batch_size=1
            -> per_device_train_batch_size=2
            per_device_eval_batch_size=1,
            -> per_device_eval_batch_size=2
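One rough way to see why doubling the batch size is costly: activation memory grows roughly in proportion to the number of tokens in each forward pass, which is batch size times sequence length. The sketch below shows only that arithmetic. The helper name is hypothetical and the original `seq_length` is not stated in this card. Real memory use also depends on model weights, quantization, optimizer state, and the attention implementation.

```python
def tokens_per_step(per_device_batch_size, seq_length):
    """Upper bound on tokens in one forward/backward pass on one device."""
    return per_device_batch_size * seq_length


before = tokens_per_step(per_device_batch_size=1, seq_length=450)
after = tokens_per_step(per_device_batch_size=2, seq_length=450)
print(f"batch 1: {before} tokens, batch 2: {after} tokens, ratio {after / before:.1f}x")
```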


## **5. Increased gradient_accumulation_steps**

- ### Instead of increasing the batch size, changed the gradient accumulation steps.
- ### To avoid running out of memory, reduced seq_length in script_args to 450.
            gradient_accumulation_steps=2,
            -> gradient_accumulation_steps=4
            -> gradient_accumulation_steps=8
            
            
|     | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.85 | 0.88 | 0.87 | 492 |
| positive | 0.88 | 0.85 | 0.87 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |


- ### Confusion matrix:
  ### [[435, 57],
  ### [77, 431]]

- ### Accuracy did not exceed 0.90, but the rate of correctly identifying "negative" reviews increased.
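Gradient accumulation raises the number of samples behind each optimizer update without raising per-step memory. The sketch below shows the arithmetic for the three values tried in this experiment. It assumes a single GPU and per_device_train_batch_size=1 (since batch size 2 ran out of memory in experiment 4). The helper name is hypothetical.

```python
import math


def effective_batch_size(per_device_batch_size, grad_accum_steps, num_devices=1):
    """Number of samples contributing to each optimizer update."""
    return per_device_batch_size * grad_accum_steps * num_devices


TRAIN_SAMPLES = 3000  # training set size stated in this card

for steps in (2, 4, 8):
    batch = effective_batch_size(per_device_batch_size=1, grad_accum_steps=steps)
    updates = math.ceil(TRAIN_SAMPLES / batch)
    print(f"gradient_accumulation_steps={steps}: "
          f"effective batch {batch}, about {updates} updates per epoch")
```

A larger effective batch means fewer, less noisy updates per epoch, so it can interact with the learning rate and with any cap on the number of training steps.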

## **6. Decreased weight_decay, increased learning_rate**

            weight_decay=0.03
            -> weight_decay=0.01
            learning_rate=4e-4
            -> learning_rate=5e-4


|     | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.85 | 0.89 | 0.87 | 492 |
| positive | 0.89 | 0.85 | 0.87 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |


- ### Result: 0.87 accuracy, almost no difference from experiment 5.
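To put the two weight_decay settings in perspective, the sketch below compares how much each update shrinks the weights. It assumes the optimizer applies decoupled (AdamW-style) weight decay, which this card does not confirm. In that case each update multiplies the weights by 1 - learning_rate * weight_decay. The helper name is hypothetical, and the values use the peak learning rate and ignore any scheduler.

```python
def decay_per_step(learning_rate, weight_decay):
    """Fraction of each weight removed per update under decoupled weight decay."""
    return learning_rate * weight_decay


# Settings before and after the change in experiment 6.
before = decay_per_step(learning_rate=4e-4, weight_decay=0.03)
after = decay_per_step(learning_rate=5e-4, weight_decay=0.01)
print(f"before: {before:.1e} per step, after: {after:.1e} per step")
```

Both products are very small, so under this assumption the decay term changes the weights only slightly per step in either configuration.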

## **7. Removed the max_step limit**

|     | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.86 | 0.89 | 0.87 | 492 |
| positive | 0.89 | 0.86 | 0.87 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |


- ### Confusion matrix:
  ### [[436, 56],
  ### [70, 438]]

- ### Accuracy keeps improving very slightly, but there is no significant change from 0.87.

## **8. Reduced the learning rate further**

|     | precision | recall | f1-score | support |
|----|----|----|----|----|
| negative | 0.84 | 0.90 | 0.87 | 492 |
| positive | 0.89 | 0.84 | 0.86 | 508 |
| accuracy | | | 0.87 | 1000 |
| macro avg | 0.87 | 0.87 | 0.87 | 1000 |
| weighted avg | 0.87 | 0.87 | 0.87 | 1000 |

- ### Confusion matrix:
  ### [[441, 51],
  ### [83, 425]]

- ### "Positive" predictions are more accurate than before, but "negative" predictions are correct less often.
- ### In conclusion, I am ending training at an accuracy of 0.87.
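The experiments that report a confusion matrix can be compared at higher precision than the two-decimal tables allow. The sketch below recomputes accuracy for each of them. It assumes rows are true labels and columns are predictions. Experiments 3, 4, and 6 are omitted because this card gives no confusion matrix for them.

```python
# Confusion matrices reported in this card, keyed by experiment number.
# Assumed layout: rows = true labels, columns = predictions (negative, positive).
RUNS = {
    1: [[446, 46], [106, 402]],
    2: [[431, 61], [96, 412]],
    5: [[435, 57], [77, 431]],
    7: [[436, 56], [70, 438]],
    8: [[441, 51], [83, 425]],
}


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


scores = {run: accuracy(cm) for run, cm in RUNS.items()}
for run, score in scores.items():
    print(f"experiment {run}: accuracy {score:.3f}")

best = max(scores, key=scores.get)
print(f"best: experiment {best}")
```

At three decimals, experiments 5, 7, and 8 all round to 0.87, which is consistent with the plateau described above.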


### Framework versions

- PEFT 0.7.1