paddlenlp-banner

PaddlePaddle/utc-large

Text classification technology is widely used in various industries such as dialogue intention recognition, bill archiving, and event detection. However, there are many challenges in industrial-level text classification practices, including diverse tasks, limited data availability and label transfer difficulty. To address these issues, UTC models text classification as a matching task between labels and text, based on the idea of Unified Semantic Matching (USM). Thus, it can handle multiple classification tasks with a single model, reducing development and machine costs and achieving good zero/few-shot transfer performance.

USM Paper: https://arxiv.org/abs/2301.03282

PaddleNLP released UTC model for various text classification tasks which use ERNIE models as the pre-trained language models and were finetuned on a large amount of text classification data.

UTC-diagram

Available Models

Model Name Usage Scenarios Supporting Tasks
utc-large A text classification model supports Chinese Supports intention recognition, semantic matching, natural language inference, semantic analysis, etc.

Performance on Text Dataset

UTC tops ZeroCLUE and FewCLUE benchmarks, as of 2023/01/12.

UTC-benchmarks

Detailed Info: https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/zero_shot_text_classification

Downloads last month
0
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.