Model Description

  • Fine-tuned meta-llama/Llama-2-7b-chat-hf on the NSMC (Naver Sentiment Movie Corpus) dataset.
  • Given a prompt that contains a movie review, the model directly generates its prediction as text: '긍정' (positive) or '부정' (negative).
  • Training used the first 2,000 or more samples of the NSMC train split.
  • Evaluation used only the first 1,000 samples of the test split.
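The generate-a-label setup described above can be sketched roughly as follows. This is a minimal illustration only: the prompt wording and the helper names `build_prompt` and `parse_label` are hypothetical, since the template actually used for this model is not given in this card. The 1/0 mapping follows the NSMC label convention (1 = positive, 0 = negative).

```python
from typing import Optional

# Hypothetical prompt template; the wording used for training is not documented here.
PROMPT_TEMPLATE = (
    "Classify the following movie review as 긍정 (positive) or 부정 (negative).\n"
    "### Review: {review}\n"
    "### Answer:"
)

# NSMC convention: 1 = positive, 0 = negative.
LABELS = {"긍정": 1, "부정": 0}


def build_prompt(review: str) -> str:
    """Wrap a raw review in the (assumed) instruction template."""
    return PROMPT_TEMPLATE.format(review=review.strip())


def parse_label(generated: str) -> Optional[int]:
    """Map generated text to 1/0 using whichever label word appears first.

    Pass only the newly generated continuation, not the prompt, because the
    prompt itself mentions both label words. Returns None if neither appears.
    """
    positions = {label: generated.find(label) for label in LABELS}
    found = {label: pos for label, pos in positions.items() if pos >= 0}
    if not found:
        return None
    first_label = min(found, key=found.get)
    return LABELS[first_label]
```

Returning `None` for unparseable generations keeps such cases visible instead of silently counting them as one class, which matters when judging a near-chance accuracy.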

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 2
  • optimizer: adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • training_args.logging_steps: 100
  • training_args.max_steps: 1600
  • trainable params: 19,988,480 || all params: 6,758,404,096 || trainable%: 0.2957573965106688
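Two of the values listed above are derived from the others. The short check below reproduces them from the figures in the list; `num_devices = 1` is an assumption, as the card does not state the hardware setup.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1
gradient_accumulation_steps = 2
num_devices = 1  # assumption: single-device training (not stated in the card)

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Share of parameters that receive gradient updates.
trainable_params = 19_988_480
all_params = 6_758_404_096
trainable_percent = 100 * trainable_params / all_params

print(total_train_batch_size)       # 2
print(f"{trainable_percent:.4f}")   # 0.2958
```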

Training Results

TrainOutput(global_step=1600, training_loss=0.7892872190475464, metrics={'train_runtime': 5825.2445, 'train_samples_per_second': 0.549, 'train_steps_per_second': 0.275, 'total_flos': 6.51493254365184e+16, 'train_loss': 0.7892872190475464, 'epoch': 1.6})
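The throughput and epoch figures in this output are consistent with the hyperparameters above, as the following check shows. It assumes that `epoch` is the number of samples processed divided by the training set size, which in turn implies a training set of 2,000 samples.

```python
# Figures copied from the TrainOutput above and the hyperparameter list.
global_step = 1600
total_train_batch_size = 2
train_runtime_s = 5825.2445
reported_epoch = 1.6

samples_seen = global_step * total_train_batch_size       # 3200
samples_per_second = samples_seen / train_runtime_s
steps_per_second = global_step / train_runtime_s

# Assumption: epoch = samples_seen / training_set_size.
implied_train_size = round(samples_seen / reported_epoch)

print(round(samples_per_second, 3))  # 0.549
print(round(steps_per_second, 3))    # 0.275
print(implied_train_size)            # 2000
```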

Accuracy

Llama2: accuracy 0.52

                          True positive (TP)   True negative (TN)
Predicted positive (PP)   192                  168
Predicted negative (PN)   317                  324
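The reported accuracy can be recomputed from the four counts above. The sketch below assumes that columns are the true labels and rows are the predicted labels; accuracy does not depend on that orientation, but precision and recall do.

```python
# Counts copied from the table above.
# Assumption: columns = true label, rows = predicted label.
tp, fp = 192, 168   # predicted positive: truly positive / truly negative
fn, tn = 317, 324   # predicted negative: truly positive / truly negative

total = tp + fp + fn + tn             # 1001 with the counts as listed
accuracy = (tp + tn) / total
precision_pos = tp / (tp + fp)        # of predicted positives, share correct
recall_pos = tp / (tp + fn)           # of true positives, share recovered

print(round(accuracy, 2))       # 0.52
print(round(precision_pos, 3))  # 0.533
print(round(recall_pos, 3))     # 0.377
```

Under this reading, most truly positive reviews are predicted as negative, so the overall accuracy sits close to chance for a two-class task.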

Several attempts were made to improve the accuracy, but errors occurred repeatedly.

Model Card Authors

cxoijve
