Custom BERT Model for Text Classification

Model Description

This is a BERT model fine-tuned for text classification. It was trained on a subset of a publicly available dataset and classifies text into 3 classes.

Training Details

  • Architecture: BERT Base Multilingual Cased
  • Training data: Custom dataset
  • Preprocessing: Text is tokenized with the BERT tokenizer, with a maximum sequence length of 80 tokens.
  • Fine-tuning: The model was trained for 1 epoch with a learning rate of 2e-5, using the AdamW optimizer and cross-entropy loss.
  • Evaluation metrics: Accuracy on a held-out validation set.
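The training script itself is not part of this card, so the following is only a minimal sketch of what a loop with these settings could look like. It assumes each batch is a dict holding `input_ids`, `attention_mask` and `labels` tensors, and that the model returns an object with a `.logits` attribute, as Transformers sequence-classification models do. The names `fine_tune` and `accuracy` are hypothetical helpers, not functions from the original project.

```python
import torch
from torch import nn
from torch.optim import AdamW


def fine_tune(model, batches, epochs=1, lr=2e-5):
    """Train `model` on `batches`; return the mean loss of the last epoch.

    Assumption: `batches` can be iterated more than once (a list or a
    DataLoader) and yields dicts with `input_ids`, `attention_mask`
    and `labels` tensors.
    """
    optimizer = AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    mean_loss = float("nan")
    for _ in range(epochs):
        total, count = 0.0, 0
        for batch in batches:
            optimizer.zero_grad()
            logits = model(
                input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
            ).logits
            loss = loss_fn(logits, batch["labels"])
            loss.backward()
            optimizer.step()
            total += loss.item()
            count += 1
        mean_loss = total / max(count, 1)
    return mean_loss


@torch.no_grad()
def accuracy(model, batches):
    """Fraction of examples whose argmax prediction matches the label."""
    model.eval()
    correct, seen = 0, 0
    for batch in batches:
        logits = model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
        ).logits
        preds = logits.argmax(dim=-1)
        correct += (preds == batch["labels"]).sum().item()
        seen += batch["labels"].numel()
    return correct / seen if seen else 0.0
```

Calling `fine_tune(model, train_batches)` followed by `accuracy(model, validation_batches)` would mirror the single-epoch training and held-out accuracy check described above. Batch size, learning-rate schedule and weight decay are not stated in this card, so the sketch leaves them at library defaults.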

How to Use

Dependencies

  • Transformers 4.x
  • Torch 1.x

Code Snippet

For classification:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("billfass/my_bert_model")
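model = AutoModelForSequenceClassification.from_pretrained("billfass/my_bert_model")
model.eval()  # Disable dropout for inference

text = "Your example text here."

# Tokenize with the same maximum length used during training
inputs = tokenizer(text, padding=True, truncation=True, max_length=80, return_tensors="pt")

# Optional: pass a known class index (0, 1 or 2) to also get the loss
labels = torch.tensor([1])  # Batch size 1

with torch.no_grad():  # No gradients needed for inference
    outputs = model(**inputs, labels=labels)
loss = outputs.loss
logits = outputs.logits

# To get probabilities:
probs = torch.softmax(logits, dim=-1)

# To get the predicted class index:
predicted_class = probs.argmax(dim=-1).item()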
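To classify several texts at once without supplying labels, the same steps can be wrapped in a small helper. This is a sketch rather than part of the released code: `classify` is a hypothetical name, and it assumes the tokenizer and model are loaded as in the snippet above.

```python
import torch


def classify(texts, tokenizer, model, max_length=80):
    """Return a (class_index, confidence) pair for each input text."""
    inputs = tokenizer(
        list(texts),
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    model.eval()
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)
    # Highest probability and its class index for every row in the batch
    confidences, indices = probs.max(dim=-1)
    return list(zip(indices.tolist(), confidences.tolist()))
```

For example, `classify(["First text.", "Second text."], tokenizer, model)` returns one pair per text. This card does not document what the three class indices mean; if the checkpoint's configuration defines label names, `model.config.id2label` maps each index to its name.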

Limitations and Bias

  • Trained on a specific dataset, so it may not generalize well to other kinds of text.
  • Built on multilingual cased BERT, so it is not optimized for any single language.

Authors

Acknowledgments

Special thanks to Hugging Face for providing the Transformers library that made this project possible.

