NimmyhBas/en_core_web_sm

Token Classification

English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.
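
A minimal usage sketch, assuming spaCy (in the version range listed below) and this pipeline are installed locally under the package name `en_core_web_sm`. The helper `entity_pairs`, the `main` function, and the sample sentence are illustrative and not part of the model card.

```python
def entity_pairs(doc):
    """Return (text, label) pairs for the named entities found in a spaCy Doc."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def main():
    # Imported lazily so entity_pairs can be reused without spaCy installed.
    import spacy

    # Assumption: the pipeline is installed as the package "en_core_web_sm".
    nlp = spacy.load("en_core_web_sm")

    # Arbitrary sample sentence chosen for illustration.
    doc = nlp("Apple is looking at buying U.K. startup for $1 billion.")

    # Fine-grained tag, dependency label, and lemma for each token.
    for token in doc:
        print(token.text, token.tag_, token.dep_, token.lemma_)

    # Named entities predicted by the ner component.
    print(entity_pairs(doc))
```

Calling `main()` prints one line per token followed by the entity list; the exact predictions depend on the installed pipeline version.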

Feature	Description
Name	`en_core_web_sm`
Version	`3.7.1`
spaCy	`>=3.7.2,<3.8.0`
Default Pipeline	`tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner`
Components	`tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner`
Vectors	0 keys, 0 unique vectors (0 dimensions)
Sources	OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); ClearNLP Constituent-to-Dependency Conversion (Emory University); WordNet 3.0 (Princeton University)
License	`MIT`
Author	Explosion
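
The table lists `senter` under Components but not under Default Pipeline, which suggests it ships with the package but is not active after a plain load. The sketch below derives that difference from the two rows and shows one way to switch to `senter` for sentence segmentation, following the pattern in spaCy's documentation of excluding the parser and enabling `senter`. The function names are hypothetical, and the package name `en_core_web_sm` is assumed.

```python
# Component lists copied from the table above.
COMPONENTS = ["tok2vec", "tagger", "parser", "senter",
              "attribute_ruler", "lemmatizer", "ner"]
DEFAULT_PIPELINE = ["tok2vec", "tagger", "parser",
                    "attribute_ruler", "lemmatizer", "ner"]


def disabled_by_default(components, default_pipeline):
    """Return the components that are shipped but not in the default pipeline."""
    return [name for name in components if name not in default_pipeline]


def load_with_senter(model_name="en_core_web_sm"):
    """Load the pipeline with senter doing sentence segmentation instead of the parser."""
    import spacy  # lazy import; requires spaCy and the pipeline package

    nlp = spacy.load(model_name, exclude=["parser"])
    nlp.enable_pipe("senter")
    return nlp


print(disabled_by_default(COMPONENTS, DEFAULT_PIPELINE))  # ['senter']
```

Excluding the parser removes dependency labels from the output, so this variant only makes sense when sentence boundaries are needed and the parse is not.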

Label Scheme

113 labels for 3 components

Component	Labels
`tagger`	`$`, `''`, `,`, `-LRB-`, `-RRB-`, `.`, `:`, `ADD`, `AFX`, `CC`, `CD`, `DT`, `EX`, `FW`, `HYPH`, `IN`, `JJ`, `JJR`, `JJS`, `LS`, `MD`, `NFP`, `NN`, `NNP`, `NNPS`, `NNS`, `PDT`, `POS`, `PRP`, `PRP$`, `RB`, `RBR`, `RBS`, `RP`, `SYM`, `TO`, `UH`, `VB`, `VBD`, `VBG`, `VBN`, `VBP`, `VBZ`, `WDT`, `WP`, `WP$`, `WRB`, `XX`, `_SP`, ``` `` ```
`parser`	`ROOT`, `acl`, `acomp`, `advcl`, `advmod`, `agent`, `amod`, `appos`, `attr`, `aux`, `auxpass`, `case`, `cc`, `ccomp`, `compound`, `conj`, `csubj`, `csubjpass`, `dative`, `dep`, `det`, `dobj`, `expl`, `intj`, `mark`, `meta`, `neg`, `nmod`, `npadvmod`, `nsubj`, `nsubjpass`, `nummod`, `oprd`, `parataxis`, `pcomp`, `pobj`, `poss`, `preconj`, `predet`, `prep`, `prt`, `punct`, `quantmod`, `relcl`, `xcomp`
`ner`	`CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART`
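
The label scheme can also be inspected on a loaded pipeline. The following is a small sketch, assuming a loaded spaCy `Language` object whose `tagger`, `parser`, and `ner` components expose a `labels` attribute; the helper name `label_counts` is hypothetical.

```python
def label_counts(nlp, component_names=("tagger", "parser", "ner")):
    """Map each named component to the number of labels it reports."""
    return {name: len(nlp.get_pipe(name).labels) for name in component_names}
```

With spaCy and the pipeline installed, `label_counts(spacy.load("en_core_web_sm"))` returns one count per component, which can be compared against the table above.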

Accuracy

Type	Score
`TOKEN_ACC`	99.86
`TOKEN_P`	99.57
`TOKEN_R`	99.58
`TOKEN_F`	99.57
`TAG_ACC`	97.25
`SENTS_P`	92.02
`SENTS_R`	89.21
`SENTS_F`	90.59
`DEP_UAS`	91.75
`DEP_LAS`	89.87
`ENTS_P`	84.55
`ENTS_R`	84.57
`ENTS_F`	84.56
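
The `_F` rows appear consistent with the usual F1 definition, the harmonic mean of precision and recall. The check below recomputes two of them from the rounded `_P` and `_R` values in the table; the function name `f_score` is illustrative, and small differences can arise because the inputs are already rounded.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) taken from the table above.
reported = {
    "ENTS": (84.55, 84.57, 84.56),
    "SENTS": (92.02, 89.21, 90.59),
}

for name, (p, r, f) in reported.items():
    print(name, round(f_score(p, r), 2), "reported:", f)
```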

Evaluation results

Metric	Value (self-reported)
NER Precision	0.845
NER Recall	0.846
NER F Score	0.846
TAG (XPOS) Accuracy	0.972
Unlabeled Attachment Score (UAS)	0.918
Labeled Attachment Score (LAS)	0.899
Sentences F-Score	0.906