# Home Life Improved Classifier (ONNX)
An improved Korean home-life category classification model (ONNX version)
## Model Description

This model is a BERT-based classifier (fine-tuned from klue/roberta-small) that sorts Korean home-life questions into eight categories, packaged here as an ONNX export; a possible export recipe is sketched after the category list below.
## Categories

- Household economy / contracts
- Home repair / DIY
- Smart home / appliances
- Cooking / household management
- Parenting / pets
- Moving / interior design
- Cleaning / laundry
- Environment / health
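The card does not say how the ONNX file was produced. One common route, shown here purely as an assumption, is Hugging Face Optimum, which converts a fine-tuned Transformers checkpoint to ONNX at load time (the source repo name `MongsangGa/home_life_improved` is hypothetical):

```python
from optimum.onnxruntime import ORTModelForSequenceClassification

# Hypothetical export: load the fine-tuned PyTorch checkpoint and
# convert it to ONNX in one step (source repo name is assumed).
model = ORTModelForSequenceClassification.from_pretrained(
    "MongsangGa/home_life_improved", export=True
)
model.save_pretrained("home_life_improved-onnx")  # writes model.onnx + config
```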
## Performance

- Test accuracy: 87.5% (7 of 8 test cases correct)
- Improvement over the previous model: +37.5 percentage points (50% → 87.5%)
## Improvements

- Hard negative learning to focus training on difficult cases
- Data augmentation with compound keywords (e.g., "에어컨 필터 청소", "air-conditioner filter cleaning")
- Focal loss (α=0.75, γ=1.5); a minimal sketch follows this list
- Early stopping to prevent overfitting
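The exact loss code is not published; the following is a minimal sketch of multi-class focal loss with the stated α=0.75 and γ=1.5, assuming the standard formulation on top of cross-entropy:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               alpha: float = 0.75, gamma: float = 1.5) -> torch.Tensor:
    """Focal loss down-weights easy examples so gradients concentrate
    on hard cases (e.g., the hard negatives mentioned above)."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # -log p_t per example
    p_t = torch.exp(-ce)                                     # probability of the true class
    return (alpha * (1.0 - p_t) ** gamma * ce).mean()
```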
## Usage
```python
import onnxruntime as ort
import numpy as np
from transformers import BertTokenizer

# Load the tokenizer and the ONNX model
# (KLUE RoBERTa checkpoints ship a BERT-style vocab, hence BertTokenizer)
tokenizer = BertTokenizer.from_pretrained("MongsangGa/home_life_improved-onnx")
session = ort.InferenceSession("model.onnx")

# Inference
text = "김치찌개 맛있게 끓이는 방법"  # "How to cook a tasty kimchi stew"
inputs = tokenizer(text, truncation=True, padding="max_length",
                   max_length=128, return_tensors="np")
ort_inputs = {
    "input_ids": inputs["input_ids"].astype(np.int64),
    "attention_mask": inputs["attention_mask"].astype(np.int64),
}
logits = session.run(None, ort_inputs)[0]
```
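Continuing from the snippet above, the logits can be turned into a prediction with a softmax and an argmax; the index-to-category mapping is not stated on this card, so read it from the model config's `id2label` rather than assuming the list order above:

```python
# Softmax over the 8 category logits, then pick the most likely class.
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)
pred_id = int(np.argmax(probs, axis=-1)[0])
print(pred_id, float(probs[0, pred_id]))  # map pred_id via the config's id2label
```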
## Training Details

- Base model: klue/roberta-small
- Learning rate: 1e-5
- Batch size: 32
- Focal loss: α=0.75, γ=1.5
- Early stopping: patience=3
- Training data: 270,465 examples (original + augmented)
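The training script itself is not included; as a rough illustration only, the listed hyperparameters might map onto a Hugging Face `Trainer` setup like this (output path, epoch cap, and metric choice are assumptions):

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Hypothetical mapping of the listed hyperparameters onto TrainingArguments.
args = TrainingArguments(
    output_dir="home_life_improved",   # assumed path
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    num_train_epochs=10,               # assumed cap; early stopping ends sooner
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,       # required by EarlyStoppingCallback
    metric_for_best_model="accuracy",  # assumed metric
)
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
# Pass `args` and `callbacks=[early_stopping]` to transformers.Trainer,
# overriding compute_loss with the focal loss sketched above.
```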
## License
Apache 2.0