---
library_name: span-marker
tags:
- span-marker
- token-classification
- ner
- named-entity-recognition
- generated_from_span_marker_trainer
metrics:
- precision
- recall
- f1
widget: []
pipeline_tag: token-classification
---

# SpanMarker

This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model that can be used for Named Entity Recognition.

## Model Details

### Model Description

- **Model Type:** SpanMarker
<!-- - **Encoder:** [Unknown](https://huggingface.co/models/unknown) -->
- **Maximum Sequence Length:** 256 tokens
- **Maximum Entity Length:** 8 words
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
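
To make the maximum entity length concrete, here is a minimal sketch that assumes every contiguous run of up to 8 words is a candidate span, so an entity longer than that cannot be returned as a single span. The `candidate_spans` helper is hypothetical and written only for illustration; it is not part of the SpanMarker API.

```python
def candidate_spans(words, max_entity_length=8):
    """Return (start, end) word-index pairs (end exclusive) for every
    contiguous span of at most `max_entity_length` words."""
    spans = []
    for start in range(len(words)):
        # The longest span starting here is capped by both the length
        # limit and the end of the sentence.
        last_end = min(start + max_entity_length, len(words))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


words = "Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.".split()
spans = candidate_spans(words)
longest = max(end - start for start, end in spans)
print(f"{len(words)} words, {len(spans)} candidate spans, longest span: {longest} words")
```

Under this assumption, the number of candidates grows roughly linearly with sentence length once the sentence is longer than 8 words, rather than quadratically.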

### Model Sources

- **Repository:** [SpanMarker on GitHub](https://github.com/tomaarsen/SpanMarkerNER)
- **Thesis:** [SpanMarker For Named Entity Recognition](https://raw.githubusercontent.com/tomaarsen/SpanMarkerNER/main/thesis.pdf)

## Uses

### Direct Use

```python
from span_marker import SpanMarkerModel

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("span_marker_model_id")
# Run inference
entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
```
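
The predictions can then be post-processed with plain Python. The sketch below assumes that each prediction is a dictionary with at least `span`, `label`, and `score` keys. The sample list, its labels, and its scores are invented for illustration and are not output from this model; `keep_confident` is a hypothetical helper.

```python
# Illustrative predictions only: the labels and scores are made up,
# and the key names are an assumption about the prediction format.
sample_entities = [
    {"span": "Amelia Earhart", "label": "PER", "score": 0.99},
    {"span": "Lockheed Vega 5B", "label": "MISC", "score": 0.62},
    {"span": "Atlantic", "label": "LOC", "score": 0.41},
    {"span": "Paris", "label": "LOC", "score": 0.97},
]


def keep_confident(entities, min_score=0.5):
    """Return (span, label) pairs whose score is at least `min_score`."""
    return [
        (entity["span"], entity["label"])
        for entity in entities
        if entity["score"] >= min_score
    ]


for span, label in keep_confident(sample_entities, min_score=0.5):
    print(f"{label}: {span}")
```

A threshold like this trades recall for precision, so it is worth tuning on held-out data rather than fixing it up front.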

### Downstream Use
You can fine-tune this model on your own dataset.

<details><summary>Click to expand</summary>

```python
from datasets import load_dataset
from span_marker import SpanMarkerModel, Trainer

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("span_marker_model_id")

# Specify a Dataset with "tokens" and "ner_tags" columns
dataset = load_dataset("conll2003")  # For example CoNLL2003

# Initialize a Trainer using the pretrained model & dataset
trainer = Trainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
trainer.save_model("span_marker_model_id-finetuned")
```
</details>
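
Before fine-tuning on your own data, it can help to check that each row has the expected shape. The sketch below assumes a row holds a list of words under `tokens` and a parallel list of integer label ids under `ner_tags`. The `validate_rows` helper and the sample row are hypothetical; the label ids follow the CoNLL2003 scheme (0 = O, 1 = B-PER, 2 = I-PER, 5 = B-LOC) purely as an example.

```python
def validate_rows(rows, num_labels):
    """Check that every row has parallel `tokens` and `ner_tags` lists
    and that every tag id is in [0, num_labels). Returns the row count."""
    for index, row in enumerate(rows):
        tokens, tags = row["tokens"], row["ner_tags"]
        if len(tokens) != len(tags):
            raise ValueError(
                f"row {index}: {len(tokens)} tokens but {len(tags)} tags"
            )
        if any(not 0 <= tag < num_labels for tag in tags):
            raise ValueError(f"row {index}: tag id outside 0..{num_labels - 1}")
    return len(rows)


# Hypothetical example row using CoNLL2003-style label ids.
rows = [
    {
        "tokens": ["Amelia", "Earhart", "flew", "to", "Paris", "."],
        "ner_tags": [1, 2, 0, 0, 5, 0],
    },
]
checked = validate_rows(rows, num_labels=9)
print(f"{checked} row(s) look well-formed")
```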

## Training Details

### Framework Versions

- Python: 3.9.16
- SpanMarker: 1.3.1.dev
- Transformers: 4.29.2
- PyTorch: 2.0.1+cu118
- Datasets: 2.14.3
- Tokenizers: 0.13.2