What libraries can I use for Text Classification?

The adapter-transformers, setfit, spacy, transformers, and transformers.js libraries are compatible with Text Classification.

What models can I use for Text Classification?

The distilbert/distilbert-base-uncased-finetuned-sst-2-english, ProsusAI/finbert, cardiffnlp/twitter-roberta-base-sentiment-latest, papluca/xlm-roberta-base-language-detection, and meta-llama/Prompt-Guard-86M models can be used for Text Classification.

What datasets can I use for Text Classification?

The nyu-mll/glue and stanfordnlp/snli datasets can be used for Text Classification.

What metrics can I use for Text Classification?

The accuracy, recall, precision, and f1 metrics can be used for Text Classification.

Text Classification

Text Classification is the task of assigning a label or class to a given text. Some use cases are sentiment analysis, natural language inference, and assessing grammatical correctness.

Inputs

Input: I love Hugging Face!

Text Classification Model

Output

POSITIVE: 0.900
NEUTRAL: 0.100
NEGATIVE: 0.000
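Scores like the ones in this example are typically probabilities obtained by applying a softmax to the model's raw outputs (logits). The following is a minimal sketch of that step, not the code of any particular model; the label order and the logit values are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    highest = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical label order and logits, for illustration only.
labels = ["POSITIVE", "NEUTRAL", "NEGATIVE"]
logits = [4.0, 1.8, -3.0]

scores = softmax(logits)

# Sort labels from most to least likely, as a classification widget would show them.
ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
for label, score in ranked:
    print(f"{label}: {score:.3f}")
```

The label with the highest probability is the predicted class; the remaining scores show how much probability the model assigns to the alternatives.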

About Text Classification

Use Cases

Sentiment Analysis on Customer Reviews

You can use sentiment analysis models to track how customers feel about a product based on their reviews. Grouping reviews by sentiment lets you analyze the text within each group, which can help you understand churn and retention and inform strategic decisions.
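As a rough sketch of this grouping step, the snippet below buckets reviews by predicted label. `toy_classifier` is a hypothetical keyword-based stand-in, not a real model; it only mimics the list-of-dicts output shape of a text classification pipeline so the example runs without downloading anything. The reviews are invented for illustration.

```python
from collections import defaultdict


def toy_classifier(text):
    """Hypothetical stand-in for a sentiment model (keyword lookup only)."""
    negative_words = {"broke", "refund", "terrible", "slow"}
    words = {word.strip(".,!?").lower() for word in text.split()}
    label = "NEGATIVE" if words & negative_words else "POSITIVE"
    return [{"label": label, "score": 1.0}]


def group_by_sentiment(reviews, classifier):
    """Return a dict mapping each predicted label to the reviews that received it."""
    groups = defaultdict(list)
    for review in reviews:
        prediction = classifier(review)[0]
        groups[prediction["label"]].append(review)
    return dict(groups)


# Made-up reviews, for illustration only.
reviews = [
    "Great battery life, would buy again.",
    "It broke after two days and I want a refund.",
    "Setup was quick and easy.",
]

grouped = group_by_sentiment(reviews, toy_classifier)
for label, items in grouped.items():
    print(label, len(items))
```

In practice you would pass a real sentiment analysis classifier instead of `toy_classifier`; the grouping logic stays the same as long as the classifier returns a label for each review.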

Task Variants

Natural Language Inference (NLI)

In NLI the model determines the relationship between two given texts. Concretely, the model takes a premise and a hypothesis and returns a class that can either be:

entailment, which means the hypothesis is true.
contradiction, which means the hypothesis is false.
neutral, which means there's no relation between the hypothesis and the premise.

The benchmark dataset for this task is GLUE (General Language Understanding Evaluation). NLI models have different variants, such as Multi-Genre NLI, Question NLI and Winograd NLI.

Multi-Genre NLI (MNLI)

MNLI is used for general NLI. Here are some examples:

Example 1:
    Premise: A man inspects the uniform of a figure in some East Asian country.
    Hypothesis: The man is sleeping.
    Label: Contradiction

Example 2:
    Premise: Soccer game with multiple males playing.
    Hypothesis: Some men are playing a sport.
    Label: Entailment

Inference

You can use the 🤗 Transformers library text-classification pipeline to infer with NLI models.

from transformers import pipeline

classifier = pipeline("text-classification", model="roberta-large-mnli")
classifier("A soccer game with multiple males playing. Some men are playing a sport.")
## [{'label': 'ENTAILMENT', 'score': 0.98}]

Question Natural Language Inference (QNLI)

QNLI is the task of determining whether the answer to a given question can be found in a given document. If the answer can be found, the label is “entailment”. If it cannot be found, the label is “not entailment”.

Question: What percentage of marine life died during the extinction?
Sentence: It is also known as the “Great Dying” because it is considered the largest mass extinction in the Earth’s history.
Label: not entailment

Question: Who was the London Weekend Television’s Managing Director?
Sentence: The managing director of London Weekend Television (LWT), Greg Dyke, met with the representatives of the "big five" football clubs in England in 1990.
Label: entailment
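Because QNLI works on two texts, it can help to keep the question and the sentence as separate fields rather than joining them into one string. The sketch below only builds such pair inputs. The `text` and `text_pair` key names are an assumption based on the dictionary format that the 🤗 Transformers text-classification pipeline documents for text pairs; `make_pair` is a hypothetical helper introduced here.

```python
def make_pair(question, sentence):
    """Package a question and a candidate sentence as a single pair input.

    The key names are assumed to match the pair format of the
    text-classification pipeline; check the pipeline documentation.
    """
    return {"text": question, "text_pair": sentence}


# The two examples above as (question, sentence, expected label) tuples.
examples = [
    (
        "What percentage of marine life died during the extinction?",
        "It is also known as the “Great Dying” because it is considered "
        "the largest mass extinction in the Earth’s history.",
        "not entailment",
    ),
    (
        "Who was the London Weekend Television’s Managing Director?",
        "The managing director of London Weekend Television (LWT), Greg Dyke, "
        'met with the representatives of the "big five" football clubs in '
        "England in 1990.",
        "entailment",
    ),
]

pairs = [make_pair(question, sentence) for question, sentence, _ in examples]
for pair in pairs:
    print(pair["text"], "->", pair["text_pair"][:40] + "...")
```

Keeping the two texts separate leaves it to the tokenizer to insert the separator the model was trained with, instead of relying on punctuation inside a single string.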

Inference

You can use the 🤗 Transformers library text-classification pipeline to infer with QNLI models. The model returns the label and the confidence.

from transformers import pipeline

classifier = pipeline("text-classification", model="cross-encoder/qnli-electra-base")
classifier("Where is the capital of France?, Paris is the capital of France.")
## [{'label': 'entailment', 'score': 0.997}]

Sentiment Analysis

In Sentiment Analysis, the classes can be polarities like positive, negative, neutral, or sentiments such as happiness or anger.

Inference

You can use the 🤗 Transformers library with the sentiment-analysis pipeline to infer with Sentiment Analysis models. The model returns the label with the score.

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I loved Star Wars so much!")
## [{'label': 'POSITIVE', 'score': 0.99}]

Quora Question Pairs

Quora Question Pairs models assess whether two provided questions are paraphrases of each other. The model takes two questions and returns a binary value, with 0 being mapped to “not paraphrase” and 1 to “paraphrase”. The benchmark dataset is Quora Question Pairs (QQP), which is part of the GLUE benchmark. The dataset consists of question pairs and their labels.

Question1: “How can I increase the speed of my internet connection while using a VPN?”
Question2: “How can Internet speed be increased by hacking through DNS?”
Label: Not paraphrase

Question1: “What can make Physics easy to learn?”
Question2: “How can you make physics easy to learn?”
Label: Paraphrase
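The mapping from binary values to label names described above can be sketched as a simple lookup. The predicted ids below are hypothetical values chosen for illustration, not real model outputs.

```python
# 0 and 1 are mapped to label names as described in the text.
ID2LABEL = {0: "not paraphrase", 1: "paraphrase"}


def decode_predictions(predicted_ids):
    """Translate binary class ids into human-readable label names."""
    return [ID2LABEL[class_id] for class_id in predicted_ids]


# Hypothetical predictions for three question pairs.
predicted_ids = [0, 1, 1]
decoded = decode_predictions(predicted_ids)
print(decoded)
```

A model checkpoint may store its own id-to-label mapping under different names, so it is worth checking the model configuration before relying on a fixed mapping like this one.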

Inference

You can use the 🤗 Transformers library text-classification pipeline to infer with QQP models.

from transformers import pipeline

classifier = pipeline("text-classification", model="textattack/bert-base-uncased-QQP")
classifier("Which city is the capital of France?, Where is the capital of France?")
## [{'label': 'paraphrase', 'score': 0.998}]

You can use huggingface.js to run inference with text classification models on the Hugging Face Hub.

import { InferenceClient } from "@huggingface/inference";

// HF_TOKEN is assumed to hold your Hugging Face access token.
const inference = new InferenceClient(HF_TOKEN);
await inference.textClassification({
    model: "distilbert-base-uncased-finetuned-sst-2-english",
    inputs: "I love this movie!",
});

Grammatical Correctness

Linguistic Acceptability is the task of assessing the grammatical acceptability of a sentence. The classes in this task are “acceptable” and “unacceptable”. The benchmark dataset used for this task is the Corpus of Linguistic Acceptability (CoLA). The dataset consists of texts and their labels.

Example: Books were sent to each other by the students.
Label: Unacceptable

Example: She voted for herself.
Label: Acceptable

Inference

from transformers import pipeline

classifier = pipeline("text-classification", model="textattack/distilbert-base-uncased-CoLA")
classifier("I will walk to home when I went through the bus.")
## [{'label': 'unacceptable', 'score': 0.95}]

Useful Resources

Would you like to learn more about the topic? Awesome! Here you can find some curated resources that you may find helpful!

Notebooks

Scripts for training

Documentation

Text classification task guide

Models for Text Classification

Browse Models (118,002)

distilbert/distilbert-base-uncased-finetuned-sst-2-english
Text Classification • 67M • Updated Dec 19, 2023 • 2.38M • 907
Note: A robust model trained for sentiment analysis.

ProsusAI/finbert
Text Classification • Updated May 23, 2023 • 5.45M • 1.18k
Note: A sentiment analysis model specialized in financial sentiment.

cardiffnlp/twitter-roberta-base-sentiment-latest
Text Classification • Updated Aug 4, 2025 • 2.37M • 809
Note: A sentiment analysis model specialized in analyzing tweets.

papluca/xlm-roberta-base-language-detection
Text Classification • 0.3B • Updated Dec 28, 2023 • 394k • 374
Note: A model that can classify languages.

meta-llama/Prompt-Guard-86M
Text Classification • 0.3B • Updated Nov 12, 2025 • 768k • 345
Note: A model that can classify text generation attacks.

Datasets for Text Classification

Browse Datasets (15,359)

nyu-mll/glue
Viewer • Updated Jan 30, 2024 • 1.49M • 456k • 505
Note: A widely used dataset for benchmarking multiple variants of text classification.

stanfordnlp/snli
Viewer • Updated Mar 6, 2024 • 570k • 27.4k • 93
Note: A text classification dataset used to benchmark natural language inference models.

Spaces using Text Classification

IoannisTr/Tech_Stocks_Trading_Assistant
Note: An application that can classify financial sentiment.

miesnerjacob/Multi-task-NLP
Note: A dashboard that contains various text classification tasks.

spacy/healthsea-demo
Note: An application that analyzes user reviews in healthcare.

Metrics for Text Classification

accuracy: Accuracy is the proportion of correct predictions among the total number of cases processed. It can be computed with:
    Accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP is the number of true positives, TN the true negatives, FP the false positives, and FN the false negatives.

recall: Recall is the fraction of the positive examples that were correctly labeled by the model as positive. It can be computed with:
    Recall = TP / (TP + FN)
where TP is the number of true positives and FN the false negatives.

precision: Precision is the fraction of correctly labeled positive examples out of all of the examples that were labeled as positive. It can be computed with:
    Precision = TP / (TP + FP)
where TP is the number of true positives (i.e. the examples correctly labeled as positive) and FP the false positives (i.e. the examples incorrectly labeled as positive).

f1: The F1 metric is the harmonic mean of the precision and recall. It can be computed with:
    F1 = 2 * (precision * recall) / (precision + recall)
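The four formulas can be checked on a small example. The sketch below computes them in plain Python for a binary task; the labels and predictions are made up for illustration, and returning 0.0 when a denominator is zero is a convention assumed here, not something the definitions prescribe.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the formulas above."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + tn + fp + fn
    # Assumed convention: return 0.0 whenever a denominator would be zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * (precision * recall) / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up gold labels and predictions: TP = 3, TN = 3, FP = 1, FN = 1.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these counts every metric equals 0.75, because the single false positive and the single false negative balance each other. For multi-class tasks the same formulas are usually applied per class and then averaged.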