@@ -1,66 +1,48 @@
 ---
-license: apache-2.0
-tags:
-- generated_from_trainer
-metrics:
-- accuracy
-- f1
-model-index:
-- name: luke-large-defamation-detection-japanese
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # luke-large-defamation-detection-japanese
-This model is a fine-tuned version of [studio-ousia/luke-japanese-large](https://huggingface.co/studio-ousia/luke-japanese-large) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.4430
-- Accuracy: 0.6616
-- F1: 0.6381
-- Auc: 0.8630
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 4
-- eval_batch_size: 4
-- seed: 777
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
-- num_epochs: 4
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Auc    |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:------:|
-| 0.4219        | 1.0   | 1780 | 0.3979          | 0.6630   | 0.6084 | 0.8466 |
-| 0.3375        | 2.0   | 3560 | 0.4050          | 0.6706   | 0.6242 | 0.8618 |
-| 0.2716        | 3.0   | 5340 | 0.4362          | 0.6595   | 0.6370 | 0.8626 |
-| 0.2331        | 4.0   | 7120 | 0.4430          | 0.6616   | 0.6381 | 0.8630 |
-### Framework versions
-- Transformers 4.26.0
-- Pytorch 1.13.1+cu116
-- Datasets 2.8.0
-- Tokenizers 0.13.2

 ---
+license: cc-by-sa-4.0
+datasets:
+- kubota/defamation-japanese-twitter
+language:
+- ja
+pipeline_tag: text-classification
+widget:
+- text: お前のことを殺すぞ
+- text: 本当に不細工だなぁ
+- text: あの人は殺人を犯した犯罪者らしい
 ---
 # luke-large-defamation-detection-japanese
+# Japanese Defamation Detector
+This model is a fine-tuned version of [studio-ousia/luke-japanese-large](https://huggingface.co/studio-ousia/luke-japanese-large) for automatic defamation detection in Japanese text.
+The base model was fine-tuned on a balanced dataset created by merging two datasets:
+- [![Generic badge](https://img.shields.io/badge/Dataset-DefamationJapaneseTwitter-red.svg)](https://huggingface.co/datasets/kubota/defamation-japanese-twitter)
+- `DefamationJapaneseYouTube` : TBA
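+<b>Labels</b>:\
+0 -> "中傷性のない発言" (non-defamatory statement)\
+1 -> "脅迫的な発言" (threatening statement)\
+2 -> "侮蔑的な発言" (insulting statement)\
+3 -> "名誉を低下させる発言" (statement that damages someone's reputation)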
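The label list above can be written down as a small lookup table, which is handy when converting between class ids and the Japanese label names. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID`, `LABEL_EN`, and `describe` are hypothetical helpers, and the English glosses are our own translations rather than anything shipped with the model.

```python
# Label ids and names, copied from the label list in this model card.
ID2LABEL = {
    0: "中傷性のない発言",
    1: "脅迫的な発言",
    2: "侮蔑的な発言",
    3: "名誉を低下させる発言",
}

# Reverse lookup: Japanese label name -> class id.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}

# English glosses (assumption: our own translations, not part of the model).
LABEL_EN = {
    0: "non-defamatory",
    1: "threatening",
    2: "insulting",
    3: "reputation-damaging",
}


def describe(label_name: str) -> str:
    """Return 'id: English gloss' for a Japanese label name."""
    idx = LABEL2ID[label_name]
    return f"{idx}: {LABEL_EN[idx]}"


print(describe("名誉を低下させる発言"))  # prints "3: reputation-damaging"
```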
+## Example Pipeline
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/kubotaissei/defamation_japanese_twitter/blob/master/notebooks/pipeline_example.ipynb)
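+```python
+# !pip install transformers==4.26 sentencepiece
+from transformers import pipeline
+# Load the fine-tuned classifier; the task matches the card's pipeline_tag.
+pipe = pipeline("text-classification", model="kubota/luke-large-defamation-detection-japanese")
+# Input means: "I hear that person is a criminal who committed murder."
+pipe("あの人は殺人を犯した犯罪者らしい")
+```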
+```
+[{'label': '名誉を低下させる発言', 'score': 0.8889994621276855}]
+```
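The pipeline returns a list of dictionaries with `label` and `score` keys, as in the output above. A possible way to turn that into a yes/no flag is sketched below; `is_defamatory` and the `0.5` threshold are hypothetical choices for illustration, not part of the model or of the authors' code, and the sketch assumes the non-defamatory class is reported under the label name listed for class 0.

```python
from typing import Dict, List, Union

# Label name of class 0 (non-defamatory) as listed in this model card.
NON_DEFAMATORY = "中傷性のない発言"

Prediction = Dict[str, Union[str, float]]


def is_defamatory(predictions: List[Prediction], threshold: float = 0.5) -> bool:
    """Flag a text when its top-scoring label is a defamation class
    and the score reaches the (hypothetical) threshold."""
    top = max(predictions, key=lambda p: p["score"])
    return top["label"] != NON_DEFAMATORY and top["score"] >= threshold


# The example output shown in this model card.
sample = [{"label": "名誉を低下させる発言", "score": 0.8889994621276855}]

print(is_defamatory(sample))                 # prints True
print(is_defamatory(sample, threshold=0.9))  # prints False
```

A stricter threshold trades recall for precision; the right value depends on the moderation use case and should be tuned on held-out data.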
+## Training Scripts
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/kubotaissei/defamation_japanese_twitter/blob/master/notebooks/train_example.ipynb)
+## Licenses
+The fine-tuned model and all attached files are licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0)](http://creativecommons.org/licenses/by-sa/4.0/).
+<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a>