# Home Life Improved Classifier (ONNX)
An improved Korean home-life category classification model (ONNX version)
## Model Description

This model is a BERT-based classifier (fine-tuned from klue/roberta-small) that sorts Korean home-life questions into eight categories, packaged here as an ONNX export; a possible export recipe is sketched after the category list below.
## Categories

- Household economy / contracts
- Home repair / DIY
- Smart home / appliances
- Cooking / household management
- Parenting / pets
- Moving / interior design
- Cleaning / laundry
- Environment / health
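The card does not say how the ONNX file was produced. One common route, shown here purely as an assumption, is Hugging Face Optimum, which converts a fine-tuned Transformers checkpoint to ONNX at load time (the source repo name `MongsangGa/home_life_improved` is hypothetical):

```python
from optimum.onnxruntime import ORTModelForSequenceClassification

# Hypothetical export: load the fine-tuned PyTorch checkpoint and
# convert it to ONNX in one step (source repo name is assumed).
model = ORTModelForSequenceClassification.from_pretrained(
    "MongsangGa/home_life_improved", export=True
)
model.save_pretrained("home_life_improved-onnx")  # writes model.onnx + config
```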
## Performance

- Test accuracy: 87.5% (7 of 8 test cases correct)
- Improvement over the previous model: +37.5 percentage points (50% → 87.5%)
## Improvements

- Hard negative learning to focus training on difficult cases
- Data augmentation with compound keywords (e.g., "에어컨 필터 청소", "air-conditioner filter cleaning")
- Focal loss (α=0.75, γ=1.5); a minimal sketch follows this list
- Early stopping to prevent overfitting
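The exact loss code is not published; the following is a minimal sketch of multi-class focal loss with the stated α=0.75 and γ=1.5, assuming the standard formulation on top of cross-entropy:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               alpha: float = 0.75, gamma: float = 1.5) -> torch.Tensor:
    """Focal loss down-weights easy examples so gradients concentrate
    on hard cases (e.g., the hard negatives mentioned above)."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # -log p_t per example
    p_t = torch.exp(-ce)                                     # probability of the true class
    return (alpha * (1.0 - p_t) ** gamma * ce).mean()
```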
## Usage
```python
import onnxruntime as ort
import numpy as np
from transformers import BertTokenizer

# Load the tokenizer and the ONNX model
# (KLUE RoBERTa checkpoints ship a BERT-style vocab, hence BertTokenizer)
tokenizer = BertTokenizer.from_pretrained("MongsangGa/home_life_improved-onnx")
session = ort.InferenceSession("model.onnx")

# Inference
text = "김치찌개 맛있게 끓이는 방법"  # "How to cook a tasty kimchi stew"
inputs = tokenizer(text, truncation=True, padding="max_length",
                   max_length=128, return_tensors="np")
ort_inputs = {
    "input_ids": inputs["input_ids"].astype(np.int64),
    "attention_mask": inputs["attention_mask"].astype(np.int64),
}
logits = session.run(None, ort_inputs)[0]
```
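Continuing from the snippet above, the logits can be turned into a prediction with a softmax and an argmax; the index-to-category mapping is not stated on this card, so read it from the model config's `id2label` rather than assuming the list order above:

```python
# Softmax over the 8 category logits, then pick the most likely class.
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)
pred_id = int(np.argmax(probs, axis=-1)[0])
print(pred_id, float(probs[0, pred_id]))  # map pred_id via the config's id2label
```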
## Training Details

- Base model: klue/roberta-small
- Learning rate: 1e-5
- Batch size: 32
- Focal loss: α=0.75, γ=1.5
- Early stopping: patience=3
- Training data: 270,465 examples (original + augmented)
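The training script itself is not included; as a rough illustration only, the listed hyperparameters might map onto a Hugging Face `Trainer` setup like this (output path, epoch cap, and metric choice are assumptions):

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Hypothetical mapping of the listed hyperparameters onto TrainingArguments.
args = TrainingArguments(
    output_dir="home_life_improved",   # assumed path
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    num_train_epochs=10,               # assumed cap; early stopping ends sooner
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,       # required by EarlyStoppingCallback
    metric_for_best_model="accuracy",  # assumed metric
)
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
# Pass `args` and `callbacks=[early_stopping]` to transformers.Trainer,
# overriding compute_loss with the focal loss sketched above.
```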
## License
Apache 2.0