	---
	tags:
	- transformers
	- token-classification
	- ner
	- bert
	- conll2003
	license: apache-2.0
	datasets:
	- conll2003
	language:
	- en
	pipeline_tag: token-classification
	authors:
	- Karan D Vasa (https://huggingface.co/starkdv123)
	---

	# BERT (base-cased) for CoNLL-2003 NER — Full Fine-Tune

	This repository contains a `bert-base-cased` model fine-tuned for named-entity recognition on CoNLL-2003 (parquet version).
	It is evaluated with seqeval (entity-level F1).

	## 📊 Result (this run)
	- Entity Macro F1: 0.9192
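
To make "entity-level macro F1" concrete, here is a minimal pure-Python sketch of the idea: an entity counts as correct only if its type and full span both match, and F1 is averaged over entity types. It assumes BIO tags and lenient span extraction; the helper names `extract_entities` and `macro_f1` are hypothetical, and this is not the seqeval implementation, which may differ in edge cases.

```python
from collections import defaultdict

def extract_entities(tags):
    """Return a set of (type, start, end) spans from one BIO-tagged sentence."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel "O" flushes the last span
        prefix, _, label = tag.partition("-")
        # Close the open span when this tag cannot continue it.
        if etype is not None and (prefix != "I" or label != etype):
            spans.add((etype, start, i - 1))
            start, etype = None, None
        # Open a span on B-, or on a stray I- with no open span (lenient).
        if prefix == "B" or (prefix == "I" and etype is None):
            start, etype = i, label
    return spans

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-type F1 over exact span matches."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for gold_tags, pred_tags in zip(y_true, y_pred):
        gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
        for etype, _, _ in gold & pred:
            tp[etype] += 1
        for etype, _, _ in pred - gold:
            fp[etype] += 1
        for etype, _, _ in gold - pred:
            fn[etype] += 1
    types = sorted(set(tp) | set(fp) | set(fn))
    scores = []
    for t in types:
        denom = 2 * tp[t] + fp[t] + fn[t]
        scores.append(2 * tp[t] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0

# Toy example (made-up tags, not model output).
y_true = [["B-PER", "I-PER", "O", "B-ORG"]]
y_pred = [["B-PER", "I-PER", "O", "B-LOC"]]
print(macro_f1(y_true, y_pred))
```

In the toy example the PER span matches exactly, while the last token is a missed ORG and a spurious LOC, so only one of the three types scores above zero.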

	## Usage
	```python
	from transformers import pipeline

	# aggregation_strategy="simple" groups subword tokens into whole-entity spans
	clf = pipeline("token-classification", model="starkdv123/conll2003-bert-ner-full", aggregation_strategy="simple")
	print(clf("Chris Hoiles hit his 22nd homer for Baltimore."))
	```

	## Training summary

	* Base: `bert-base-cased`
	* Hyperparameters: 3 epochs, learning rate 3e-5, batch size 16/32, max sequence length 256, weight decay 0.01, fp16
	* Label alignment: subword continuations are labeled -100 so the loss ignores them
	* Metric: seqeval F1 (entity-level)
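
The label alignment step above can be sketched as follows. This is a minimal illustration, not the training script: it assumes word indices in the shape returned by a fast tokenizer's `word_ids()` (one entry per token, `None` for special tokens), and it also masks special tokens with -100, which is an assumption beyond what the summary states. The function name `align_labels` is hypothetical.

```python
def align_labels(word_ids, word_labels, ignore_index=-100):
    """Map word-level label ids onto subword tokens.

    Only the first subword of each word keeps the word's label;
    continuations (and, by assumption, special tokens) get ignore_index.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            aligned.append(ignore_index)      # special token such as [CLS] or [SEP]
        elif wid != previous:
            aligned.append(word_labels[wid])  # first subword of a new word
        else:
            aligned.append(ignore_index)      # continuation of the same word
        previous = wid
    return aligned

# Toy example: three words, the second split into two subwords.
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [1, 2, 0]
print(align_labels(word_ids, word_labels))  # [-100, 1, 2, -100, 0, -100]
```

Because -100 is the default `ignore_index` of PyTorch's cross-entropy loss, positions labeled this way contribute nothing to the gradient.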

	## Confusion Matrix
	```
	         LOC   MISC      O    ORG    PER
	LOC      411      6     21     32      3
	MISC       9   2213     51     76     14
	O         67    110  38063     58     17
	ORG       31     77     32   2353     10
	PER        3     42     15     24   2689
	```
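
Per-class precision and recall can be read off the matrix above. The sketch below assumes rows are true labels and columns are predictions; the card does not state the orientation, so swap the two metrics if it is the other way round. These are counts over whatever units the matrix tallies, so the results are not comparable to the entity-level seqeval F1.

```python
labels = ["LOC", "MISC", "O", "ORG", "PER"]
matrix = [
    [411, 6, 21, 32, 3],
    [9, 2213, 51, 76, 14],
    [67, 110, 38063, 58, 17],
    [31, 77, 32, 2353, 10],
    [3, 42, 15, 24, 2689],
]

stats = {}
for i, label in enumerate(labels):
    correct = matrix[i][i]
    row_total = sum(matrix[i])                  # all items whose true label is `label` (assumed)
    col_total = sum(row[i] for row in matrix)   # all items predicted as `label` (assumed)
    stats[label] = {
        "recall": correct / row_total,
        "precision": correct / col_total,
    }

for label, s in stats.items():
    print(f"{label:<5} precision={s['precision']:.4f} recall={s['recall']:.4f}")
```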