thatdramebaazguy/movie-roberta-MITmovie

	---
	datasets:
	- imdb
	- cornell_movie_dialogue
	- MIT Movie

	language:
	- English

	thumbnail:

	tags:
	- roberta
	- roberta-base
	- token-classification
	- NER
	- named-entities
	- BIO
	- movies
	- DAPT

	license: cc-by-4.0

	---
	# Movie Roberta + Movies NER Task

	Objective:
	This model is RoBERTa Base with movie-domain DAPT (domain-adaptive pretraining), fine-tuned for NER on the MIT Movie dataset.
	The DAPT checkpoint https://huggingface.co/thatdramebaazguy/movie-roberta-base was used as the starting MovieRoberta model.

	```
	from transformers import pipeline

	model_name = "thatdramebaazguy/movie-roberta-MITmovie"
	ner = pipeline(task="ner", model=model_name, tokenizer=model_name, revision="v1.0")
	```
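Because the model predicts BIO tags, per-token predictions usually need to be merged into entity spans before they are useful. The following is a minimal sketch of that merging step, not the author's code. It runs on hand-written tokens and tags so it does not need to download the model; the helper name `bio_to_spans` and the tag names `DIRECTOR` and `YEAR` are illustrative assumptions, and the label set this checkpoint actually emits may differ.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_type, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span; close any span still open.
            if current_type is not None:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type continues the open span.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span:
            # close the open span (if any) and skip this token.
            if current_type is not None:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Hypothetical tagged query; tag names are illustrative, not read from the model.
tokens = ["show", "me", "films", "directed", "by", "steven", "spielberg", "from", "1993"]
tags = ["O", "O", "O", "O", "O", "B-DIRECTOR", "I-DIRECTOR", "O", "B-YEAR"]
spans = bio_to_spans(tokens, tags)
print(spans)
```

With real pipeline output, the tag of each prediction is available under the `entity` key and the token text under `word`.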

	## Overview
	- Language model: roberta-base
	- Language: English
	- Downstream task: NER
	- Training data: MIT Movie
	- Eval data: MIT Movie
	- Infrastructure: 2x Tesla V100
	- Code: See [example](https://github.com/adityaarunsinghal/Domain-Adaptation/blob/master/scripts/shell_scripts/movieR_NER_squad.sh)

	## Hyperparameters
	```
	Num examples = 6253
	Num Epochs = 5
	Instantaneous batch size per device = 64
	Total train batch size (w. parallel, distributed & accumulation) = 128

	```
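The numbers above are related by simple arithmetic, sketched here as a sanity check. It assumes that the total batch size of 128 comes from 64 examples per device on the 2 GPUs listed under Infrastructure with a gradient accumulation factor of 1, and that the final partial batch of each epoch is kept rather than dropped. Neither assumption is stated in the log, so the derived step counts are estimates.

```python
import math

num_examples = 6253
num_epochs = 5
per_device_batch_size = 64
num_devices = 2        # from "2x Tesla V100"
grad_accum_steps = 1   # assumption: inferred from 64 * 2 = 128

# Effective batch size per optimizer step.
total_batch_size = per_device_batch_size * num_devices * grad_accum_steps

# Optimizer steps per epoch, assuming the last partial batch is kept.
steps_per_epoch = math.ceil(num_examples / total_batch_size)
total_steps = steps_per_epoch * num_epochs

print(total_batch_size, steps_per_epoch, total_steps)
```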
	## Performance

	### Eval on MIT Movie
	- epoch = 5.0
	- eval_accuracy = 0.9472
	- eval_f1 = 0.8876
	- eval_loss = 0.2211
	- eval_mem_cpu_alloc_delta = 3MB
	- eval_mem_cpu_peaked_delta = 2MB
	- eval_mem_gpu_alloc_delta = 0MB
	- eval_mem_gpu_peaked_delta = 38MB
	- eval_precision = 0.887
	- eval_recall = 0.8881
	- eval_runtime = 0:00:03.73
	- eval_samples = 1955
	- eval_samples_per_second = 523.095
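The reported metrics can be cross-checked against each other. The sketch below assumes that `eval_f1` is the harmonic mean of the reported `eval_precision` and `eval_recall`, and that `eval_samples_per_second` is `eval_samples` divided by the runtime. Small differences are expected because the published values are rounded.

```python
precision = 0.887
recall = 0.8881
reported_f1 = 0.8876

# F1 as the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

eval_samples = 1955
samples_per_second = 523.095

# Runtime in seconds implied by the reported throughput.
implied_runtime = eval_samples / samples_per_second

print(f"recomputed F1: {f1:.4f} (reported {reported_f1})")
print(f"implied runtime: {implied_runtime:.2f} s")
```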

	GitHub Repo:
	- [Domain-Adaptation Project](https://github.com/adityaarunsinghal/Domain-Adaptation/)

	---