metadata
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - twitter_pos_vcb
metrics:
  - accuracy
  - poseval
  - f1
  - recall
  - precision
model-index:
  - name: bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2
    results:
      - task:
          name: Token Classification
          type: token-classification
        dataset:
          name: twitter_pos_vcb
          type: twitter_pos_vcb
          config: twitter-pos-vcb
          split: train
          args: twitter-pos-vcb
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.9853480683735223
language:
  - en
pipeline_tag: token-classification

bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2

This model is a fine-tuned version of bert-base-cased on the twitter_pos_vcb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0502

Per-tag results:

Tag Precision Recall F1-Score Support
$ 0.0 0.0 0.0 3
'' 0.9312320916905444 0.9530791788856305 0.9420289855072465 341
( 0.9791666666666666 0.9591836734693877 0.9690721649484536 196
) 0.960167714884696 0.9703389830508474 0.9652265542676501 472
, 0.9988979501873485 0.9993384785005512 0.9991181657848325 4535
. 0.9839189708141322 0.9894762249577601 0.9866897730281368 20715
: 0.9926405887528997 0.9971072719967858 0.9948689168604183 12445
Cc 0.9991067440821796 0.9986607142857142 0.9988836793927215 4480
Cd 0.9903884661593912 0.9899919935948759 0.9901901901901902 2498
Dt 0.9981148589510537 0.9976446837146703 0.9978797159492478 14860
Ex 0.9142857142857143 0.9846153846153847 0.9481481481481482 65
Fw 1.0 0.1 0.18181818181818182 10
Ht 0.999877541023757 0.9997551120362435 0.9998163227820978 8167
In 0.9960399353003514 0.9954846981437092 0.9957622393219583 17939
Jj 0.9812470698546648 0.9834756049808129 0.9823600735322877 12769
Jjr 0.9304511278195489 0.9686888454011742 0.9491850431447747 511
Jjs 0.9578414839797639 0.9726027397260274 0.9651656754460493 584
Md 0.9901398761751892 0.9908214777420835 0.990480559697213 4358
Nn 0.9810285563194078 0.9819697621331922 0.9814989335846437 30227
Nnp 0.9609722697706266 0.9467116357504216 0.9537886510363575 8895
Nnps 1.0 0.037037037037037035 0.07142857142857142 27
Nns 0.9697771061579146 0.9776564681985528 0.9737008471361739 7877
Pos 0.9977272727272727 0.984304932735426 0.9909706546275394 446
Prp 0.9983503349829983 0.9985184187487373 0.9984343697917544 29698
Prp$ 0.9974262182566919 0.9974262182566919 0.9974262182566919 5828
Rb 0.9939770374552983 0.9929802569727358 0.9934783971906942 15955
Rbr 0.9058823529411765 0.8191489361702128 0.8603351955307263 94
Rbs 0.92 1.0 0.9583333333333334 69
Rp 0.9802197802197802 0.9903774981495189 0.9852724594992636 1351
Rt 0.9995065383666419 0.9996298581122763 0.9995681944358769 8105
Sym 0.0 0.0 0.0 9
To 0.9984649496844619 0.9989761092150171 0.9987204640450398 5860
Uh 0.9614460148062687 0.9507510933637574 0.9560686457287633 10518
Url 1.0 0.9997242900468707 0.9998621260168207 3627
Usr 0.9999025388626285 1.0 0.9999512670565303 20519
Vb 0.9619302598929085 0.9570556133056133 0.9594867452615125 15392
Vbd 0.9592894152479645 0.9548719837907533 0.9570756023262255 5429
Vbg 0.9848831077518018 0.984191111891797 0.9845369882270251 5693
Vbn 0.9053408597481546 0.9164835164835164 0.910878112712975 2275
Vbp 0.963605718209626 0.9666228317364894 0.9651119169688633 15969
Vbz 0.9881780250347705 0.9861207494795281 0.9871483153872872 5764
Wdt 0.8666666666666667 0.9285714285714286 0.896551724137931 14
Wp 0.99125 0.993734335839599 0.9924906132665832 1596
Wrb 0.9963488843813387 0.9979683055668428 0.9971579374746244 2461
`` 0.9481865284974094 0.9786096256684492 0.963157894736842 187

Overall

  • Accuracy: 0.9853
  • Macro avg:
    • Precision: 0.9296417163691048
    • Recall: 0.8931046018294694
    • F1-Score: 0.8930917459781836
    • Support: 308833
  • Weighted avg:
    • Precision: 0.985306457604231
    • Recall: 0.9853480683735223
    • F1-Score: 0.9852689858931941
    • Support: 308833
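
The gap between the macro and weighted averages follows from how each is computed: the macro average gives every tag equal weight, while the weighted average weights each tag by its support. A minimal sketch in plain Python, using four rows copied from the per-tag table. The function names are illustrative and not taken from the original notebook, and because only four rows are used, the printed averages will not match the overall figures reported here.

```python
# Illustrative helpers; the names are assumptions, not the original notebook's code.

def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean: every tag counts equally."""
    return sum(values) / len(values)


def weighted_average(values, supports):
    """Mean weighted by the number of evaluation tokens per tag."""
    total = sum(supports)
    return sum(v * s for v, s in zip(values, supports)) / total


# tag -> (precision, recall, support), copied from the per-tag table
ROWS = {
    "$": (0.0, 0.0, 3),
    "Fw": (1.0, 0.1, 10),
    "Nnps": (1.0, 0.037037037037037035, 27),
    "Nn": (0.9810285563194078, 0.9819697621331922, 30227),
}

f1_by_tag = {tag: f1_score(p, r) for tag, (p, r, _) in ROWS.items()}
f1_values = list(f1_by_tag.values())
supports = [support for (_, _, support) in ROWS.values()]

print("F1 per tag:", f1_by_tag)
print("Macro F1 (4 rows):", macro_average(f1_values))
print("Weighted F1 (4 rows):", weighted_average(f1_values, supports))
```

Fw and Nnps have perfect precision but very low recall, so their F1-scores are low. With supports of 10 and 27 they barely affect the weighted average, but they pull the macro average down noticeably.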

Model description

For details on how this model was created, see the project notebook: https://github.com/DunnBC22/NLP_Projects/blob/main/Token%20Classification/Monolingual/StrombergNLP-Twitter_pos_vcb/NER%20Project%20Using%20StrombergNLP%20Twitter_pos_vcb%20Dataset%20with%20PosEval.ipynb

Intended uses & limitations

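This model is intended to demonstrate my ability to solve a complex problem using technology, here part-of-speech tagging of English tweets. As a limitation, tags with very little support in the evaluation data ($, Sym, Fw, Nnps) have F1-scores below 0.2, so predictions for those tags are unreliable.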
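
A minimal inference sketch, assuming the model is published on the Hugging Face Hub under the ID `DunnBC22/bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2`. That ID is inferred from the model name and the author's GitHub handle and is not confirmed here. The sketch uses the `transformers` Auto classes and gives each word the tag predicted for its first sub-word token; the helper names are illustrative.

```python
# Assumed Hub ID (inferred, not confirmed by this card).
MODEL_ID = "DunnBC22/bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2"


def first_subword_labels(word_ids, pred_ids, id2label):
    """Keep one label per word: the prediction for its first sub-word token.

    word_ids maps each token position to a word index, or None for
    special tokens such as [CLS] and [SEP].
    """
    labels = []
    previous_word = None
    for position, word_index in enumerate(word_ids):
        if word_index is not None and word_index != previous_word:
            labels.append(id2label[pred_ids[position]])
        previous_word = word_index
    return labels


def tag_words(words, model_id=MODEL_ID):
    """Tag a pre-split list of words. Downloads the model on first use."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    model.eval()

    encoding = tokenizer(
        words, is_split_into_words=True, return_tensors="pt", truncation=True
    )
    with torch.no_grad():
        logits = model(**encoding).logits
    pred_ids = logits.argmax(dim=-1)[0].tolist()

    tags = first_subword_labels(
        encoding.word_ids(batch_index=0), pred_ids, model.config.id2label
    )
    return list(zip(words, tags))


if __name__ == "__main__":
    # Hypothetical example input; requires network access to fetch the model.
    print(tag_words(["I", "love", "this", "song", "!"]))
```

Tagging per word rather than relying on the pipeline's entity grouping avoids merging adjacent words that share a tag, which matters for part-of-speech output where every word needs its own label.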

Training and evaluation data

Dataset Source: https://huggingface.co/datasets/strombergnlp/twitter_pos_vcb

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
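
The hyperparameters listed above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch rather than the original training script: it assumes single-device training, so that `train_batch_size` corresponds to `per_device_train_batch_size`, and the `output_dir` value is a placeholder.

```python
# Keyword arguments for transformers.TrainingArguments, taken from the list above.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes one device
    "per_device_eval_batch_size": 16,   # assumes one device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 2,
}


def build_training_args(output_dir="pos-tagger-output"):
    """Build TrainingArguments; output_dir is a hypothetical placeholder path."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Settings not listed in this card, such as warmup steps or weight decay, are left at the library defaults here, which may differ from what the original run used.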

Training results

Framework versions

  • Transformers 4.28.1
  • PyTorch 2.0.0
  • Datasets 2.11.0
  • Tokenizers 0.13.3