---
license: mit
language:
  - zh
metrics:
  - accuracy
  - f1 (macro)
  - f1 (micro)
base_model:
  - google-bert/bert-base-chinese
pipeline_tag: text-classification
tags:
  - Multi-label Text Classification
datasets:
  - scfengv/TVL-general-layer-dataset
library_name: transformers
model-index:
  - name: scfengv/TVL_GeneralLayerClassifier
    results:
      - task:
          type: multi-label text-classification
        dataset:
          name: scfengv/TVL-general-layer-dataset
          type: scfengv/TVL-general-layer-dataset
        metrics:
          - name: Accuracy
            type: Accuracy
            value: 0.952902
          - name: F1 score (Micro)
            type: F1 score (Micro)
            value: 0.968717
          - name: F1 score (Macro)
            type: F1 score (Macro)
            value: 0.970818
---

Model Details of TVL_GeneralLayerClassifier

Base Model

This model is fine-tuned from google-bert/bert-base-chinese.

Model Architecture

  • Type: BERT-based text classification model
  • Hidden Size: 768
  • Number of Layers: 12
  • Number of Attention Heads: 12
  • Intermediate Size: 3072
  • Max Sequence Length: 512
  • Vocabulary Size: 21,128
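
These dimensions can be read back from the model configuration as a quick sanity check; a minimal sketch (the attribute names are the standard transformers BertConfig fields):

from transformers import BertConfig

config = BertConfig.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
print(config.hidden_size)              # 768
print(config.num_hidden_layers)        # 12
print(config.num_attention_heads)      # 12
print(config.intermediate_size)        # 3072
print(config.max_position_embeddings)  # 512
print(config.vocab_size)               # 21128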

Key Components

  1. Embeddings

    • Word Embeddings
    • Position Embeddings
    • Token Type Embeddings
    • Layer Normalization
  2. Encoder

    • 12 layers of:
      • Self-Attention Mechanism
      • Intermediate Dense Layer
      • Output Dense Layer
      • Layer Normalization
  3. Pooler

    • Dense layer for sentence representation
  4. Classifier

    • Output layer with 4 classes
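
When the checkpoint is loaded with transformers, these components map onto the standard BertForSequenceClassification modules; a minimal inspection sketch:

from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
print(model.bert.embeddings)          # word / position / token-type embeddings + LayerNorm
print(len(model.bert.encoder.layer))  # 12 encoder layers
print(model.bert.pooler)              # dense pooler over the [CLS] representation
print(model.classifier)               # Linear(768 -> 4) classification head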

Training Hyperparameters

The model was trained using the following hyperparameters:

  • Learning rate: 1e-05
  • Batch size: 32
  • Number of epochs: 10
  • Optimizer: Adam
  • Loss function: torch.nn.BCEWithLogitsLoss()
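
The original training script is not reproduced here; the following is a minimal single-step sketch consistent with these settings (the example texts and labels are placeholders, not taken from the dataset):

import torch
from torch.optim import Adam
from transformers import BertForSequenceClassification, BertTokenizer

model = BertForSequenceClassification.from_pretrained("google-bert/bert-base-chinese", num_labels=4)
tokenizer = BertTokenizer.from_pretrained("google-bert/bert-base-chinese")

optimizer = Adam(model.parameters(), lr=1e-5)
loss_fn = torch.nn.BCEWithLogitsLoss()

# Placeholder batch; in practice this loop runs over the full training set
# with batch size 32 for 10 epochs.
texts = ["範例文字一", "範例文字二"]
labels = torch.tensor([[1., 0., 0., 1.],
                       [0., 1., 0., 0.]])  # 4-class multi-hot labels
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)

model.train()
logits = model(**batch).logits
loss = loss_fn(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()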

Training Infrastructure

  • Hardware Type: NVIDIA Quadro RTX8000
  • Library: PyTorch
  • Hours used: 2hr 56mins

Model Parameters

  • Total parameters: ~102M (estimated)
  • All parameters are in 32-bit floating point (F32) format
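
Both figures are easy to verify once the checkpoint is loaded; a small sketch:

from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
total = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total / 1e6:.1f}M")  # ~102M
print({p.dtype for p in model.parameters()})    # {torch.float32}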

Input Processing

  • Uses BERT tokenization
  • Supports sequences up to 512 tokens

Output

  • 4-class multi-label classification

Performance Metrics (validation)

  • Accuracy: 0.952902
  • F1 score (Micro): 0.968717
  • F1 score (Macro): 0.970818
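
For reference, these follow the usual multi-label definitions; a scikit-learn sketch (the arrays below are toy placeholders, and accuracy is assumed here to mean exact-match subset accuracy):

import numpy as np
from sklearn.metrics import accuracy_score, f1_score

y_true = np.array([[1, 0, 0, 1], [0, 1, 0, 0]])  # ground-truth multi-hot labels
y_pred = np.array([[1, 0, 0, 1], [0, 1, 1, 0]])  # thresholded model predictions

print(accuracy_score(y_true, y_pred))             # exact-match accuracy
print(f1_score(y_true, y_pred, average="micro"))
print(f1_score(y_true, y_pred, average="macro"))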

Training Dataset

This model was trained on the scfengv/TVL-general-layer-dataset.
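
The dataset can be pulled directly from the Hugging Face Hub; a minimal loading sketch (split and column names are whatever the dataset card defines):

from datasets import load_dataset

dataset = load_dataset("scfengv/TVL-general-layer-dataset")
print(dataset)  # shows the available splits and columns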

Testing Dataset

Usage

import torch
from transformers import BertForSequenceClassification, BertTokenizer

model = BertForSequenceClassification.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
tokenizer = BertTokenizer.from_pretrained("scfengv/TVL_GeneralLayerClassifier")

# Prepare your text (please refer to the dataset for the expected input format)
text = "Your text here"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.sigmoid(outputs.logits)

# Print per-class probabilities
print(predictions)
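
To turn these per-class probabilities into multi-label predictions, one common choice (an assumption here, not a documented part of the model) is a 0.5 threshold per class:

# 0.5 is an assumed cut-off, not taken from the original training setup
predicted_labels = (predictions > 0.5).int()
print(predicted_labels)  # e.g. tensor([[1, 0, 0, 1]])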

Additional Notes

  • This model is specifically designed for TVL general layer classification tasks.

  • It is based on the Chinese BERT model (google-bert/bert-base-chinese), so it is optimized for Chinese text.
