Model Card for Model ID

euneeei/hw-llama-2-7B-nsmc

Training Details

Training Data

  • A Korean-language Naver movie review dataset (NSMC).

  • train dataset: 3,000 examples

  • test dataset: 1,000 examples

  • Best result: 0.87 accuracy

1. Reused the @dataclass parameters that reached 0.91 accuracy with midm

  • learning_rate : 2e-4

                 precision    recall  f1-score   support

        negative      0.81      0.91      0.85       492
        positive      0.90      0.79      0.84       508

        accuracy                          0.85      1000
       macro avg      0.85      0.85      0.85      1000
    weighted avg      0.85      0.85      0.85      1000

  • Confusion matrix:

    [[446,  46],
     [106, 402]]

  • At 0.85, accuracy fell short of 0.90, so I decided to tune the learning rate further. The model also frequently classified reviews that were actually 'positive' as 'negative'.
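The figures in the report can be recomputed from the confusion matrix. The sketch below is for illustration only and assumes that rows are true labels and columns are predicted labels, in the order negative, positive (this matches the support counts of 492 and 508). The helper names are made up for this example.

```python
# Confusion matrix from step 1.
# Assumption: rows = true label, columns = predicted label,
# both in the order [negative, positive].
LABELS = ["negative", "positive"]
confusion = [[446, 46],
             [106, 402]]


def per_class_metrics(cm):
    """Return {label: (precision, recall, f1)} for a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(LABELS):
        true_pos = cm[i][i]
        predicted = sum(row[i] for row in cm)  # column sum: predicted as this class
        actual = sum(cm[i])                    # row sum: support of this class
        precision = true_pos / predicted
        recall = true_pos / actual
        f1 = 2 * precision * recall / (precision + recall)
        metrics[label] = (precision, recall, f1)
    return metrics


def accuracy(cm):
    """Fraction of all examples that lie on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


for label, (p, r, f1) in per_class_metrics(confusion).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy(confusion):.2f}")
print("positive reviews predicted as negative:", confusion[1][0])
```

Under that row/column assumption, the bottom-left cell (106) is the count of 'positive' reviews judged 'negative', which is the error the note above refers to.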

2. Changed learning_rate from 2e-4 to 1e-4

  • learning_rate : 1e-4

                 precision    recall  f1-score   support

        negative      0.82      0.88      0.85       492
        positive      0.87      0.81      0.84       508

        accuracy                          0.84      1000
       macro avg      0.84      0.84      0.84      1000
    weighted avg      0.84      0.84      0.84      1000

  • Confusion matrix:

    [[431,  61],
     [ 96, 412]]

  • Overall, the results did not improve over the run before the learning rate change, so I decided to try a higher learning rate.

3. Changed learning_rate from 1e-4 to 4e-4

  • The results were not much different from learning_rate 1e-4, so I decided to adjust something else.

4. Increased the batch size

  • Because of memory issues, I reduced seq_length in script_args to 450.

  • Training still failed because it kept running out of memory.

          per_device_train_batch_size=1
          -> per_device_train_batch_size=2
          per_device_eval_batch_size=1
          -> per_device_eval_batch_size=2
    

5. Increased gradient_accumulation_steps

  • Instead of increasing the batch size, I decided to change the gradient accumulation steps.

  • To avoid running out of memory, seq_length in script_args was reduced to 450.

          gradient_accumulation_steps=2,
          -> gradient_accumulation_steps=4
          -> gradient_accumulation_steps=8
    
                 precision    recall  f1-score   support

        negative      0.85      0.88      0.87       492
        positive      0.88      0.85      0.87       508

        accuracy                          0.87      1000
       macro avg      0.87      0.87      0.87      1000
    weighted avg      0.87      0.87      0.87      1000

  • Confusion matrix:

    [[435,  57],
     [ 77, 431]]

  • Accuracy did not exceed 0.90, but 'negative' predictions were correct more often than in the earlier runs.
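Step 5 relies on gradient accumulation to get a larger effective batch without holding more samples in memory at once. The arithmetic can be sketched as follows. This is a rough illustration that assumes a single device (`num_devices=1` is an assumption, not something stated in this card), and the function names are hypothetical. The exact step count reported by a trainer may differ slightly depending on how it rounds the last partial batch.

```python
import math

TRAIN_EXAMPLES = 3000  # training split size stated in this card


def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Number of samples that contribute to one optimizer update."""
    return per_device_batch * accumulation_steps * num_devices


def optimizer_steps_per_epoch(num_examples, per_device_batch,
                              accumulation_steps, num_devices=1):
    """Approximate number of optimizer updates in one pass over the data."""
    batch = effective_batch_size(per_device_batch, accumulation_steps, num_devices)
    return math.ceil(num_examples / batch)


# per_device_train_batch_size stays at 1; only the accumulation steps change.
for accumulation in (2, 4, 8):
    batch = effective_batch_size(1, accumulation)
    steps = optimizer_steps_per_epoch(TRAIN_EXAMPLES, 1, accumulation)
    print(f"gradient_accumulation_steps={accumulation}: "
          f"effective batch={batch}, about {steps} updates per epoch")
```

Only one sample's activations are held per forward/backward pass, so memory use stays close to the batch-size-1 case while each weight update averages over more samples.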

6. Decreased weight_decay, increased learning_rate

        weight_decay=0.03
        -> weight_decay=0.01
        learning_rate=4e-4
        -> learning_rate=5e-4

                 precision    recall  f1-score   support

        negative      0.85      0.89      0.87       492
        positive      0.89      0.85      0.87       508

        accuracy                          0.87      1000
       macro avg      0.87      0.87      0.87      1000
    weighted avg      0.87      0.87      0.87      1000

  • Result: 0.87, almost no difference from step 5.

7. Removed the max_step limit

                 precision    recall  f1-score   support

        negative      0.86      0.89      0.87       492
        positive      0.89      0.86      0.87       508

        accuracy                          0.87      1000
       macro avg      0.87      0.87      0.87      1000
    weighted avg      0.87      0.87      0.87      1000

  • Confusion matrix:

    [[436,  56],
     [ 70, 438]]

  • The model keeps getting very slightly more accurate, but accuracy has not moved much from 0.87.

8. Reduced the learning rate further

                 precision    recall  f1-score   support

        negative      0.84      0.90      0.87       492
        positive      0.89      0.84      0.86       508

        accuracy                          0.87      1000
       macro avg      0.87      0.87      0.87      1000
    weighted avg      0.87      0.87      0.87      1000

  • Confusion matrix:

    [[441,  51],
     [ 83, 425]]

  • Compared with the previous run, 'positive' predictions are correct slightly more often, but 'negative' predictions are correct less often (precision 0.86 -> 0.84).

  • I will therefore end training here, with an accuracy of 0.87.
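To compare the runs at a finer grain than the rounded 0.87 in the reports, accuracy can be recomputed from each confusion matrix listed in this card. The sketch below is illustrative only. The run labels are informal names chosen for this example, and steps 3, 4 and 6 are left out because no confusion matrix is given for them.

```python
# Confusion matrices copied from the steps above
# (assumed layout: rows = true label, columns = predicted label).
runs = {
    "step 1 (learning_rate 2e-4)": [[446, 46], [106, 402]],
    "step 2 (learning_rate 1e-4)": [[431, 61], [96, 412]],
    "step 5 (more gradient accumulation)": [[435, 57], [77, 431]],
    "step 7 (no max_step limit)": [[436, 56], [70, 438]],
    "step 8 (lower learning rate)": [[441, 51], [83, 425]],
}


def accuracy(cm):
    """Correct predictions (diagonal) divided by all predictions."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


for name, cm in runs.items():
    print(f"{name}: accuracy={accuracy(cm):.3f}")

best = max(runs, key=lambda name: accuracy(runs[name]))
print("highest accuracy:", best)
```

With three decimals the runs from step 5 onward differ by less than one percentage point, which is consistent with the observation that accuracy plateaued around 0.87.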

Framework versions

  • PEFT 0.7.1