---
language:
- te
- en
tags:
- telugu
- NER
- TeluguNER
---

## Direct Use

The model is a language model that can be used for token classification, a natural language understanding task in which a label is assigned to some tokens in a text.
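
To make the idea of per-token labels concrete, the sketch below decodes a hypothetical tagged sentence into entity spans. It assumes a BIO labeling scheme (`B-` starts an entity, `I-` continues it, `O` marks a token with no entity label). The scheme, the `decode_bio` helper, and the example tokens are illustrative assumptions, not details confirmed for this model.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity_type, text) spans.

    Hypothetical helper: assumes labels look like "B-PER", "I-PER", or "O".
    """
    spans: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the entity being built, if there is one.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        starts_new = label.startswith("B-") or (
            label.startswith("I-") and label[2:] != current_type
        )
        if starts_new:
            # A "B-" tag, or an "I-" tag that does not continue the
            # current entity, opens a new span.
            flush()
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-"):
            current_tokens.append(token)
        else:
            # "O": the token belongs to no entity.
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


# Illustrative input only; not produced by the model.
tokens = ["Virat", "Kohli", "scored", "121", "in", "Port", "of", "Spain"]
labels = ["B-PER", "I-PER", "O", "O", "O", "B-LOC", "I-LOC", "I-LOC"]
print(decode_bio(tokens, labels))
```

Tokens tagged `O` are skipped, which is why only some tokens in a text end up with a label.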

## Downstream Use

Potential downstream use cases include Named Entity Recognition (NER) and Part-of-Speech (PoS) tagging. To learn more about token classification and other potential downstream use cases, see the Hugging Face [token classification docs](https://huggingface.co/tasks/token-classification).

## Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

# Bias, Risks, and Limitations

**CONTENT WARNING: Readers should be made aware that language generated by this model may be disturbing or offensive to some and may propagate historical and current stereotypes.**

```python
>>> from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline
>>> tokenizer = AutoTokenizer.from_pretrained("Pavan27/NER_Telugu_01")
>>> model = AutoModelForTokenClassification.from_pretrained("Pavan27/NER_Telugu_01")
>>> # aggregation_strategy="simple" merges sub-word tokens into whole entity spans
>>> classifier = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
>>> classifier("వెస్టిండీస్పై పోర్ట్ ఆఫ్ స్పెయిన్ వేదిక జరుగుతున్న రెండో టెస్టు తొలి ఇన్నింగ్స్లో విరాట్ కోహ్లీ 121 పరుగులతో విదేశాల్లో సెంచరీ కరువును తీర్చుకున్నాడు.")
[{'entity_group': 'LOC',
  'score': 0.9999062,
  'word': 'వెస్టిండీస్',
  'start': 0,
  'end': 11},
 {'entity_group': 'LOC',
  'score': 0.9998613,
  'word': 'పోర్ట్ ఆఫ్ స్పెయిన్',
  'start': 15,
  'end': 34},
 {'entity_group': 'PER',
  'score': 0.99996054,
  'word': 'విరాట్ కోహ్లీ',
  'start': 85,
  'end': 98}]
```
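
The aggregated output above is a plain list of dictionaries, so it can be post-processed without loading the model. The sketch below groups entity text by label and drops low-confidence predictions. The `group_by_label` helper and the `min_score` threshold of 0.9 are hypothetical choices made for illustration, and the sample list simply copies the output shown above.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_label(entities: List[dict], min_score: float = 0.9) -> Dict[str, List[str]]:
    """Collect entity text per label, keeping predictions scored >= min_score.

    Hypothetical helper; expects dicts with the "entity_group", "score",
    and "word" keys seen in the aggregated pipeline output.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Sample copied from the example output; no model call is made here.
entities = [
    {"entity_group": "LOC", "score": 0.9999062, "word": "వెస్టిండీస్", "start": 0, "end": 11},
    {"entity_group": "LOC", "score": 0.9998613, "word": "పోర్ట్ ఆఫ్ స్పెయిన్", "start": 15, "end": 34},
    {"entity_group": "PER", "score": 0.99996054, "word": "విరాట్ కోహ్లీ", "start": 85, "end": 98},
]

print(group_by_label(entities))
```

A suitable threshold depends on the application; 0.9 is only a placeholder.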

## Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.