shainaraza committed on
Commit
b943ffc
·
1 Parent(s): 9466dbe

Create README.md

Files changed (1)
  1. README.md +65 -0
README.md ADDED
@@ -0,0 +1,65 @@
+
+ # Named Entity Recognition
+
+ ## Model Description
+
+ This is a token classification model fine-tuned to predict entities in sentences.
+ It was fine-tuned on a custom dataset that focuses on identifying certain types of entities, including biases in text.
+
+ ## Intended Use
+
+ The model is intended for entity recognition tasks, especially identifying biases in text passages.
+ Given an input text sequence, the model labels the words, tokens, or **spans** it believes are associated with a particular entity or bias.
+
+ ## How to Use
+
+ The model can be used for inference directly through the Hugging Face `transformers` library:
+
+ ```python
+ # Run inference with the fine-tuned token classification model
+ import torch
+ from transformers import AutoModelForTokenClassification, AutoTokenizer
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ # Load the tokenizer and model from the Hugging Face Hub
+ model_name = "newsmediabias/UnBIAS-Named-Entity-Recognition"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForTokenClassification.from_pretrained(model_name)
+
+ model.eval()
+ model.to(device)
+
+ def predict_entities(sentence):
+     # Encode once and recover the exact tokens the model sees (special tokens included)
+     inputs = tokenizer.encode(sentence, return_tensors="pt").to(device)
+     tokens = tokenizer.convert_ids_to_tokens(inputs[0].tolist())
+
+     with torch.no_grad():  # inference only, no gradients needed
+         predictions = torch.argmax(model(inputs).logits, dim=2)
+
+     id2label = model.config.id2label
+     return [(token, id2label[prediction.item()]) for token, prediction in zip(tokens, predictions[0])]
+
+ sentence = "due to your evil nature, i am kind of tired and want to get rid of such cheaters."
+ predictions = predict_entities(sentence)
+ for token, label in predictions:
+     print(f"Token: {token}, Label: {label}")
+
+ ```
+
+
+ ## Limitations and Biases
+
+ Every model has limitations, and it is important to understand them before deploying the model in real-world scenarios:
+
+ 1. **Training Data**: The model was trained on a specific dataset, and its predictions are only as good as that data.
+ 2. **Generalization**: The model may perform well on certain types of sentences or phrases, but it might not generalize to all types of text or contexts.
+
+ It is also essential to be aware of potential biases in the training data, which might affect the model's predictions.
+
+ ## Training Data
+
+ The model was fine-tuned on a custom dataset. To request the dataset, contact **Shaina Raza** (shaina.raza@utoronto.ca).
+
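
The README's `predict_entities` returns one `(token, label)` pair per sub-word token, while its Intended Use section speaks of **spans**. Below is a minimal sketch of how such pairs could be grouped into spans. It assumes a WordPiece-style tokenizer that marks continuation pieces with `##` and emits special tokens such as `[CLS]` and `[SEP]`. The helper `group_spans` and the `B-BIAS`/`I-BIAS` labels are hypothetical and not part of the model card; check `model.config.id2label` for the real label set.

```python
def group_spans(pairs, outside="O", special=("[CLS]", "[SEP]", "[PAD]")):
    """Group (token, label) pairs into (text, entity_type) spans.

    Assumptions (not confirmed by the model card): continuation pieces
    start with "##", and labels are either `outside`, a bare entity name,
    or BIO-style "B-<type>" / "I-<type>".
    """
    spans = []        # each entry is [text, entity_type]
    open_type = None  # entity type of the span being extended, if any
    for token, label in pairs:
        if token in special:
            open_type = None
            continue
        if token.startswith("##"):
            # Continuation piece: glue it onto the open span and ignore its own label.
            if open_type is not None:
                spans[-1][0] += token[2:]
            continue
        if label == outside:
            open_type = None
            continue
        prefix, _, rest = label.partition("-")
        if prefix in ("B", "I") and rest:
            begins, entity = prefix == "B", rest
        else:
            begins, entity = False, label
        if open_type == entity and not begins:
            spans[-1][0] += " " + token   # same entity continues
        else:
            spans.append([token, entity])  # a new span starts here
            open_type = entity
    return [(text, entity) for text, entity in spans]


# Hypothetical token/label pairs, shaped like the output of predict_entities
example = [
    ("[CLS]", "O"), ("your", "O"), ("evil", "B-BIAS"), ("nature", "I-BIAS"),
    (",", "O"), ("such", "O"), ("cheat", "B-BIAS"), ("##ers", "O"),
    (".", "O"), ("[SEP]", "O"),
]
print(group_spans(example))  # [('evil nature', 'BIAS'), ('cheaters', 'BIAS')]
```

A continuation piece inherits the label of the first piece of its word, which is a common convention for token classification models but is not guaranteed to match how this model was trained.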