---
library_name: transformers
base_model:
- answerdotai/ModernBERT-base
license: apache-2.0
language:
- en
pipeline_tag: zero-shot-classification
datasets:
- nyu-mll/glue
- facebook/anli
tags:
- instruct
- natural-language-inference
- nli
---

# Model Card for tasksource/ModernBERT-base-nli

ModernBERT multi-task fine-tuned on tasksource NLI tasks, including MNLI, ANLI, SICK, WANLI, doc-nli, LingNLI, FOLIO, FOL-NLI, LogicNLI, Label-NLI (and all datasets in the table below).
This is the equivalent of an "instruct" version.

The table reports test accuracy at 100k training steps. A 250k-step version is expected around 25 December.

| test_name                            |   test_accuracy |
|:-------------------------------------|----------------:|
| glue/mnli                            |            0.91 |
| glue/qnli                            |            0.93 |
| glue/rte                             |            0.86 |
| super_glue/cb                        |            0.89 |
| anli/a1                              |            0.62 |
| anli/a2                              |            0.47 |
| anli/a3                              |            0.42 |
| sick/label                           |            0.92 |
| sick/entailment_AB                   |            0.84 |
| snli                                 |            0.91 |
| scitail/snli_format                  |            0.95 |
| hans                                 |            1    |
| WANLI                                |            0.71 |
| recast/recast_sentiment              |            0.98 |
| recast/recast_verbcorner             |            0.94 |
| recast/recast_ner                    |            0.87 |
| recast/recast_factuality             |            0.93 |
| recast/recast_puns                   |            0.93 |
| recast/recast_kg_relations           |            0.94 |
| recast/recast_verbnet                |            0.88 |
| recast/recast_megaveridicality       |            0.87 |
| probability_words_nli/usnli          |            0.77 |
| probability_words_nli/reasoning_1hop |            0.99 |
| probability_words_nli/reasoning_2hop |            0.9  |
| nan-nli                              |            0.85 |
| nli_fever                            |            0.72 |
| breaking_nli                         |            1    |
| conj_nli                             |            0.71 |
| fracas                               |            0.86 |
| dialogue_nli                         |            0.88 |
| mpe                                  |            0.73 |
| dnc                                  |            0.9  |
| recast_white/fnplus                  |            0.81 |
| recast_white/sprl                    |            0.92 |
| recast_white/dpr                     |            0.61 |
| robust_nli/IS_CS                     |            0.76 |
| robust_nli/LI_LI                     |            0.98 |
| robust_nli/ST_WO                     |            0.85 |
| robust_nli/PI_SP                     |            0.74 |
| robust_nli/PI_CD                     |            0.8  |
| robust_nli/ST_SE                     |            0.78 |
| robust_nli/ST_NE                     |            0.86 |
| robust_nli/ST_LM                     |            0.81 |
| robust_nli_is_sd                     |            1    |
| robust_nli_li_ts                     |            0.91 |
| add_one_rte                          |            0.91 |
| cycic_classification                 |            0.83 |
| lingnli                              |            0.82 |
| monotonicity-entailment              |            0.95 |
| scinli                               |            0.79 |
| naturallogic                         |            0.91 |
| syntactic-augmentation-nli           |            0.95 |
| autotnli                             |            0.92 |
| defeasible-nli/atomic                |            0.76 |
| defeasible-nli/snli                  |            0.79 |
| help-nli                             |            0.91 |
| nli-veridicality-transitivity        |            0.99 |
| lonli                                |            0.99 |
| dadc-limit-nli                       |            0.67 |
| folio                                |            0.59 |
| tomi-nli                             |            0.53 |
| temporal-nli                         |            0.92 |
| counterfactually-augmented-snli      |            0.74 |
| cnli                                 |            0.81 |
| logiqa-2.0-nli                       |            0.57 |
| mindgames                            |            0.94 |
| ConTRoL-nli                          |            0.65 |
| logical-fallacy                      |            0.31 |
| conceptrules_v2                      |            0.99 |
| zero-shot-label-nli                  |            0.74 |
| scone                                |            0.97 |
| monli                                |            0.98 |
| SpaceNLI                             |            1    |
| propsegment/nli                      |            0.91 |
| SDOH-NLI                             |            1    |
| scifact_entailment                   |            0.78 |
| AdjectiveScaleProbe-nli              |            0.99 |
| resnli                               |            0.99 |
| semantic_fragments_nli               |            0.99 |
| dataset_train_nli                    |            0.88 |
| ruletaker                            |            0.91 |
| PARARULE-Plus                        |            1    |
| logical-entailment                   |            0.73 |
| nope                                 |            0.54 |
| LogicNLI                             |            0.65 |
| contract-nli/contractnli_a/seg       |            0.87 |
| contract-nli/contractnli_b/full      |            0.78 |
| nli4ct_semeval2024                   |            0.6  |
| biosift-nli                          |            0.88 |
| SIGA-nli                             |            0.54 |
| FOL-nli                              |            0.71 |
| doc-nli                              |            0.82 |
| mctest-nli                           |            0.89 |
| idioms-nli                           |            0.86 |
| lifecycle-entailment                 |            0.71 |
| MSciNLI                              |            0.82 |
| hover-3way/nli                       |            0.9  |
| seahorse_summarization_evaluation    |            0.82 |
| babi_nli                             |            0.94 |
| gen_debiased_nli                     |            0.9  |

# Usage

## [ZS] Zero-shot classification pipeline

```python
from transformers import pipeline

# Load the checkpoint as a zero-shot classifier.
classifier = pipeline("zero-shot-classification", model="tasksource/ModernBERT-base-nli")

text = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']

# Returns the candidate labels ranked by score for the input text.
classifier(text, candidate_labels)
```

The NLI training data of this model includes [label-nli](https://huggingface.co/datasets/tasksource/zero-shot-label-nli), an NLI dataset specially constructed to improve this kind of zero-shot classification.

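
To make the link between NLI and zero-shot classification concrete, here is a minimal sketch of what the pipeline above does conceptually in single-label mode: each candidate label is turned into a hypothesis, the NLI model scores each (text, hypothesis) pair, and the entailment logits are normalized with a softmax across labels. The helper names `build_nli_pairs` and `rank_labels` are hypothetical and not part of `transformers`; the template string is the pipeline's default `hypothesis_template`. The logits in the example are placeholders chosen for illustration, not outputs of this model.

```python
import math
from typing import List, Sequence, Tuple

# Default hypothesis template of the transformers zero-shot pipeline.
DEFAULT_TEMPLATE = "This example is {}."


def build_nli_pairs(
    text: str,
    candidate_labels: Sequence[str],
    template: str = DEFAULT_TEMPLATE,
) -> List[Tuple[str, str]]:
    """Pair the input text (premise) with one hypothesis per candidate label."""
    return [(text, template.format(label)) for label in candidate_labels]


def rank_labels(
    candidate_labels: Sequence[str],
    entailment_logits: Sequence[float],
) -> List[Tuple[str, float]]:
    """Softmax the per-label entailment logits and sort labels by score."""
    if len(candidate_labels) != len(entailment_logits):
        raise ValueError("expected one entailment logit per candidate label")
    peak = max(entailment_logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in entailment_logits]
    total = sum(exps)
    scored = [(label, e / total) for label, e in zip(candidate_labels, exps)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


labels = ["travel", "cooking", "dancing"]
pairs = build_nli_pairs("one day I will see the world", labels)

# Placeholder logits for illustration only; these are NOT outputs of this model.
ranking = rank_labels(labels, [2.0, -1.0, 0.0])
```

In practice the entailment logits would come from running the model on `pairs`; the sketch only shows the bookkeeping around that step.
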
## [NLI] Natural language inference pipeline

```python
from transformers import pipeline

# Load the checkpoint as a sentence-pair classifier.
pipe = pipeline("text-classification", model="tasksource/ModernBERT-base-nli")

# Input is a list of pairs: `text` is the premise, `text_pair` the hypothesis.
pipe([dict(text='there is a cat', text_pair='there is a black cat')])
```

## Backbone for further fine-tuning

This checkpoint has stronger reasoning and fine-grained abilities than the base version and can be used as a starting point for further fine-tuning.

# Citation

```
@inproceedings{sileo-2024-tasksource,
    title = "tasksource: A Large Collection of {NLP} tasks with a Structured Dataset Preprocessing Framework",
    author = "Sileo, Damien",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.1361",
    pages = "15655--15684",
}
```