tomaarsen/span-marker-bert-base-uncased-bionlp



	---
	library_name: span-marker
	tags:
	- span-marker
	- token-classification
	- ner
	- named-entity-recognition
	- generated_from_span_marker_trainer
	metrics:
	- precision
	- recall
	- f1
	widget: []
	pipeline_tag: token-classification
	---

	# SpanMarker

	This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model that can be used for Named Entity Recognition.

	## Model Details

	### Model Description

	- Model Type: SpanMarker
	<!-- - Encoder: [Unknown](https://huggingface.co/models/unknown) -->
	- Maximum Sequence Length: 256 tokens
	- Maximum Entity Length: 8 words
	<!-- - Training Dataset: [Unknown](https://huggingface.co/datasets/unknown) -->
	<!-- - Language: Unknown -->
	<!-- - License: Unknown -->
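
To make the "Maximum Entity Length: 8 words" limit concrete, the sketch below enumerates every contiguous word span that fits within it. This only illustrates what the limit implies for candidate spans; it is not SpanMarker's internal implementation, and `candidate_spans` is a hypothetical helper name.

```python
def candidate_spans(words, max_entity_length=8):
    """Return every contiguous word span of at most `max_entity_length` words."""
    spans = []
    for start in range(len(words)):
        # A span starting at `start` may cover 1..max_entity_length words,
        # but it cannot run past the end of the sentence.
        last_end = min(start + max_entity_length, len(words))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))  # half-open interval [start, end)
    return spans


words = "Amelia Earhart flew her single engine Lockheed Vega 5B across".split()
spans = candidate_spans(words)
print(len(words), "words ->", len(spans), "candidate spans")
```

Any entity longer than eight words has no matching candidate span, so under this limit it cannot be predicted as a single entity.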

	### Model Sources

	- Repository: [SpanMarker on GitHub](https://github.com/tomaarsen/SpanMarkerNER)
	- Thesis: [SpanMarker For Named Entity Recognition](https://raw.githubusercontent.com/tomaarsen/SpanMarkerNER/main/thesis.pdf)

	## Uses

	### Direct Use

	```python
	from span_marker import SpanMarkerModel

	# Download from the 🤗 Hub
	model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-bert-base-uncased-bionlp")
	# Run inference
	entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
	```
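
The SpanMarker documentation describes the result of `model.predict` on a single sentence as a list of dictionaries with the keys `span`, `label`, `score`, `char_start_index` and `char_end_index`. Assuming that shape, predictions can be post-processed as sketched below. The sample entities, their labels (`LABEL_A`, `LABEL_B`) and scores are invented for illustration and are not output of this model; `filter_entities` is a hypothetical helper.

```python
def filter_entities(entities, min_score=0.5):
    """Keep only predictions whose confidence is at least `min_score`."""
    return [entity for entity in entities if entity["score"] >= min_score]


text = "Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris."

# Invented sample in the assumed output format (not real model output).
sample_entities = [
    {"span": "Amelia Earhart", "label": "LABEL_A", "score": 0.98,
     "char_start_index": 0, "char_end_index": 14},
    {"span": "Paris", "label": "LABEL_B", "score": 0.31,
     "char_start_index": 78, "char_end_index": 83},
]

for entity in filter_entities(sample_entities):
    # The character offsets index back into the original input string.
    start, end = entity["char_start_index"], entity["char_end_index"]
    print(f'{entity["label"]}: {text[start:end]!r} (score {entity["score"]:.2f})')
```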

	### Downstream Use
	You can fine-tune this model on your own dataset.

	<details><summary>Click to expand</summary>

	```python
	from datasets import load_dataset
	from span_marker import SpanMarkerModel, Trainer

	# Download from the 🤗 Hub
	model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-bert-base-uncased-bionlp")

	# Specify a Dataset with "tokens" and "ner_tags" columns
	dataset = load_dataset("conll2003")  # For example CoNLL2003

	# Initialize a Trainer using the pretrained model & dataset
	trainer = Trainer(
	    model=model,
	    train_dataset=dataset["train"],
	    eval_dataset=dataset["validation"],
	)
	trainer.train()
	trainer.save_model("span-marker-bert-base-uncased-bionlp-finetuned")
	```
	</details>
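
If your own data is not already on the Hub, it first has to be brought into the column layout the trainer expects: one list of words per sentence under `tokens` and a parallel list of integer tag ids under `ner_tags`. A minimal sketch of that conversion follows. The label set and the example sentence are hypothetical, and the tag ids would need to match the labels the model was trained with.

```python
def to_columns(sentences, label2id):
    """Convert [(word, label), ...] sentences into "tokens" / "ner_tags" columns."""
    tokens, ner_tags = [], []
    for sentence in sentences:
        tokens.append([word for word, _ in sentence])
        ner_tags.append([label2id[label] for _, label in sentence])
    return {"tokens": tokens, "ner_tags": ner_tags}


# Hypothetical label scheme and data, for illustration only.
label2id = {"O": 0, "B-GENE": 1, "I-GENE": 2}
sentences = [
    [("The", "O"), ("BRCA1", "B-GENE"), ("gene", "I-GENE"), ("is", "O"), ("mutated", "O")],
]

columns = to_columns(sentences, label2id)
print(columns)
# The resulting dict can be wrapped with `datasets.Dataset.from_dict(columns)`
# before being passed to the Trainer as `train_dataset`.
```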

	## Training Details

	### Framework Versions

	- Python: 3.9.16
	- SpanMarker: 1.3.1.dev
	- Transformers: 4.29.2
	- PyTorch: 2.0.1+cu118
	- Datasets: 2.14.3
	- Tokenizers: 0.13.2
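
To compare a local environment against the versions listed above, a small check using the standard library's `importlib.metadata` could look like this. The distribution names in `CARD_VERSIONS` (for example `span_marker` and `torch`) are assumptions about how the packages are published, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this model card; distribution names are assumed.
CARD_VERSIONS = {
    "span_marker": "1.3.1.dev",
    "transformers": "4.29.2",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.3",
    "tokenizers": "0.13.2",
}


def compare_versions(expected):
    """Map each package to (version in the card, installed version or None)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed)
    return report


for package, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the listed versions only record the environment used for training.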