metadata

license: mit
language:
  - ja
base_model: microsoft/mdeberta-v3-base
tags:
  - generated_from_trainer
  - bert
  - zero-shot-classification
  - text-classification
datasets:
  - MoritzLaurer/multilingual-NLI-26lang-2mil7
metrics:
  - accuracy
  - f1
model-index:
  - name: mDeBERTa-v3-base-finetuned-nli-jnli
    results: []
pipeline_tag: zero-shot-classification
widget:
  - text: 今日の予定を教えて
    candidate_labels: 天気,ニュース,金融,予定

mDeBERTa-v3-base-finetuned-nli-jnli

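This model is a fine-tuned version of microsoft/mdeberta-v3-base for natural language inference (NLI). The card metadata lists MoritzLaurer/multilingual-NLI-26lang-2mil7 as the dataset. It achieves the following results on the evaluation set: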

Loss: 0.7739
Accuracy: 0.6808
F1: 0.6742

Model description

More information needed

Intended uses & limitations

Zero-shot classification

from transformers import pipeline

model_name = "thkkvui/mDeBERTa-v3-base-finetuned-nli-jnli"
classifier = pipeline("zero-shot-classification", model=model_name)

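# Japanese queries: "Tell me today's weather", "Any news?", "Check my schedule", "What is USD/JPY?"
texts = ["今日の天気を教えて", "ニュースある？", "予定をチェックして", "ドル円は？"]
# Candidate labels: weather, news, finance, schedule
labels = ["天気", "ニュース", "金融", "予定"]

for t in texts:
    # multi_label=False normalizes the scores across labels so they sum to 1
    output = classifier(t, labels, multi_label=False)
    print(output)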
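The zero-shot pipeline reuses the NLI head: the input text is the premise, and each candidate label is turned into a hypothesis and scored for entailment. The sketch below mirrors, in plain Python, how the transformers pipeline is understood to convert those logits into label scores in the two `multi_label` modes. The logits are made-up numbers for illustration, not outputs of this model, and the helper names are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def single_label_scores(entailment_logits):
    """multi_label=False: softmax over the entailment logits of all labels."""
    return softmax(entailment_logits)

def multi_label_scores(entailment_logits, contradiction_logits):
    """multi_label=True: per label, softmax over (contradiction, entailment)."""
    return [
        softmax([c, e])[1]
        for e, c in zip(entailment_logits, contradiction_logits)
    ]

labels = ["天気", "ニュース", "金融", "予定"]
# Hypothetical logits, one entailment/contradiction pair per candidate label.
entailment = [3.2, -0.5, -1.1, 0.4]
contradiction = [-2.0, 1.5, 2.2, 0.1]

single = dict(zip(labels, single_label_scores(entailment)))
multi = dict(zip(labels, multi_label_scores(entailment, contradiction)))
print(single)  # scores compete with each other and sum to 1
print(multi)   # each score is independent and lies between 0 and 1
```

With `multi_label=False` the labels compete for a fixed probability mass, which suits mutually exclusive intents like the ones above. With `multi_label=True` each label is judged on its own, so several labels (or none) can score highly.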

NLI use-case

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

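# Use Apple's MPS backend when available; otherwise fall back to CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model_name = "thkkvui/mDeBERTa-v3-base-finetuned-nli-jnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Move the model to the selected device and switch to inference mode
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)
model.eval()

premise = "NY Yankees is the professional baseball team in America."
# Hypothesis (Japanese): "Among major league teams, the New York Yankees are famous in Japan."
hypothesis = "メジャーリーグのチームは、日本ではニューヨークヤンキースが有名だ。"

# Inputs must live on the same device as the model
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)

with torch.no_grad():
    output = model(**inputs)

preds = torch.softmax(output["logits"][0], -1).tolist()
# The label order is hard-coded here; compare with model.config.id2label to confirm it
label_names = ["entailment", "neutral", "contradiction"]
result = {name: round(float(pred) * 100, 1) for pred, name in zip(preds, label_names)}
print(result)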
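The snippet above assumes a fixed label order. A slightly more defensive variant reads the order from an index-to-label mapping, which for a loaded model would be `model.config.id2label`. The following is a minimal sketch in plain Python: the mapping and the logits are hypothetical placeholders, and `label_probabilities` is an illustrative helper, not part of transformers.

```python
import math

def label_probabilities(logits, id2label):
    """Softmax the logits and return percentages keyed by label name."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {id2label[i]: round(e / total * 100, 1) for i, e in enumerate(exps)}

# Hypothetical mapping; with a loaded model, use model.config.id2label instead.
id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}
# Hypothetical logits for one premise/hypothesis pair.
logits = [0.2, 1.9, -1.3]

result = label_probabilities(logits, id2label)
print(result)
```

Reading the mapping from the model configuration keeps the printed names aligned with the classification head even if a checkpoint uses a different order.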

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.06
num_epochs: 2
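To make the scheduler settings concrete, the sketch below computes the learning rate at a given step for a linear schedule with warmup, using the `learning_rate` and `lr_scheduler_warmup_ratio` values listed above. The total step count is not stated in this card; `TOTAL_STEPS` is a rough assumption derived from the step and epoch columns of the training results (about 9,400 steps per epoch over 2 epochs), and `lr_at` is a hypothetical helper.

```python
import math

BASE_LR = 3e-05        # learning_rate from the list above
WARMUP_RATIO = 0.06    # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 18_800   # ASSUMPTION: estimated, not reported in the card

# Number of steps during which the rate ramps up from 0 to BASE_LR.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

def lr_at(step):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)

for step in (0, warmup_steps, 5000, 10000, 15000, TOTAL_STEPS):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the rate peaks at the end of warmup and then decreases steadily, so the later evaluation steps were trained with a progressively smaller learning rate.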

Training results

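| Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1     |
|---------------|-------|-------|-----------------|----------|--------|
| 0.753         | 0.53  | 5000  | 0.8758          | 0.6105   | 0.6192 |
| 0.5947        | 1.07  | 10000 | 0.6619          | 0.7054   | 0.7035 |
| 0.5791        | 1.6   | 15000 | 0.7739          | 0.6808   | 0.6742 |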
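The headline metrics at the top of this card match the last evaluation row (step 15000), while the row at step 10000 has the highest accuracy and F1. The short sketch below simply re-reads the table to make that comparison explicit; it uses only the numbers reported above and does not say which checkpoint was actually published.

```python
# Rows copied from the training results table above.
rows = [
    {"step": 5000, "epoch": 0.53, "val_loss": 0.8758, "accuracy": 0.6105, "f1": 0.6192},
    {"step": 10000, "epoch": 1.07, "val_loss": 0.6619, "accuracy": 0.7054, "f1": 0.7035},
    {"step": 15000, "epoch": 1.6, "val_loss": 0.7739, "accuracy": 0.6808, "f1": 0.6742},
]

best = max(rows, key=lambda r: r["accuracy"])  # highest validation accuracy
final = rows[-1]                               # last logged evaluation

gap = best["accuracy"] - final["accuracy"]
print(f"best step: {best['step']} (accuracy {best['accuracy']:.4f})")
print(f"final step: {final['step']} (accuracy {final['accuracy']:.4f})")
print(f"accuracy gap: {gap:.4f}")
```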

Framework versions

Transformers 4.33.2
PyTorch 2.0.1
Datasets 2.14.5
Tokenizers 0.13.3