# Named entity recognition

## Model Description

This model is a fine-tuned token classification model designed to predict entities in sentences. It's fine-tuned on a custom dataset that focuses on identifying certain types of entities, including biases in text.

## Intended Use

The model is intended for entity recognition tasks, especially identifying biases in text passages. Users can input a sequence of text, and the model will highlight the words, tokens, or **spans** it believes are associated with a particular entity or bias.
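Because the model labels individual tokens, highlighting a **span** amounts to grouping adjacent tagged tokens. The Hugging Face pipeline API can do this grouping automatically; the snippet below is a minimal sketch (not part of the original card) that uses `aggregation_strategy="simple"` to merge subword tokens into entity spans. A lower-level example follows under "How to Use".

```python
from transformers import pipeline

# Token-classification pipeline; "simple" aggregation merges subword
# tokens into entity groups with character offsets.
ner = pipeline(
    "token-classification",
    model="newsmediabias/UnBIAS-Named-Entity-Recognition",
    aggregation_strategy="simple",
)

for entity in ner("Some input sentence to check for biased language."):
    # Each result carries the entity group, a confidence score, the
    # matched text, and its start/end offsets in the input.
    print(entity["entity_group"], round(entity["score"], 3), entity["word"])
```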

## How to Use

The model can be used for inference directly through the Hugging Face `transformers` library:
```python
from transformers import AutoModelForTokenClassification, AutoTokenizer
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the fine-tuned model and its tokenizer
tokenizer = AutoTokenizer.from_pretrained("newsmediabias/UnBIAS-Named-Entity-Recognition")
model = AutoModelForTokenClassification.from_pretrained("newsmediabias/UnBIAS-Named-Entity-Recognition")

model.eval()
model.to(device)

def predict_entities(sentence):
    # Tokenize the sentence; the token strings stay aligned with input_ids.
    inputs = tokenizer(sentence, return_tensors="pt").to(device)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

    # Run inference without tracking gradients.
    with torch.no_grad():
        logits = model(**inputs).logits
    predictions = torch.argmax(logits, dim=2)

    # Map predicted label ids back to their label names.
    id2label = model.config.id2label
    return [(token, id2label[prediction.item()]) for token, prediction in zip(tokens, predictions[0])]

sentence = "due to your evil nature, i am kind of tired and want to get rid of such cheapters."
for token, label in predict_entities(sentence):
    print(f"Token: {token}, Label: {label}")
```
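To turn these token-level labels into the contiguous spans the card describes, the helper below may be useful. It is a sketch, not part of the released code: it reuses `tokenizer`, `sentence`, and `predict_entities` from the snippet above and assumes the label set follows the usual BIO scheme (`B-…`/`I-…`/`O`); check `model.config.id2label` for the actual tags.

```python
def group_spans(token_labels):
    """Merge consecutive tagged tokens into (entity, text) spans,
    assuming BIO-style tags (hypothetical helper)."""
    spans, label, pieces = [], None, []
    for token, tag in token_labels:
        # Flush the current span on "O" tags and special tokens.
        if tag == "O" or token in tokenizer.all_special_tokens:
            if pieces:
                spans.append((label, tokenizer.convert_tokens_to_string(pieces)))
            label, pieces = None, []
            continue
        entity = tag.split("-", 1)[-1]  # e.g. "B-BIAS" -> "BIAS"
        if tag.startswith("B-") or entity != label:
            # A new span starts: flush the previous one.
            if pieces:
                spans.append((label, tokenizer.convert_tokens_to_string(pieces)))
            label, pieces = entity, [token]
        else:
            # Continuation of the current span (including "##" subwords).
            pieces.append(token)
    if pieces:
        spans.append((label, tokenizer.convert_tokens_to_string(pieces)))
    return spans

for entity, text in group_spans(predict_entities(sentence)):
    print(f"{entity}: {text}")
```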

## Limitations and Biases

Every model has limitations, and it's crucial to understand these when deploying models in real-world scenarios:

1. **Training Data**: The model is trained on a specific dataset, and its predictions are only as good as the data it's trained on.
2. **Generalization**: While the model may perform well on certain types of sentences or phrases, it might not generalize well to all types of text or contexts.

It's also essential to be aware of any potential biases in the training data, which might affect the model's predictions.

## Training Data

The model was fine-tuned on a custom dataset. Contact **Shaina Raza (shaina.raza@utoronto.ca)** for access to the dataset.