Matej Martinc

	# bert-small-buddhist-nonbuddhist-sanskrit

	BERT model trained on a lemmatized corpus containing Buddhist and non-Buddhist Sanskrit texts.

	## Model description

	The model uses the BERT architecture and was pretrained from scratch as a masked language model
	on the lemmatized Sanskrit corpus. Because of limited resources and to prevent overfitting, the model is smaller than bert-base:
	the number of attention heads and the number of hidden layers have each been reduced to 8, and the maximum context length has been reduced to 128 tokens. The vocabulary size is 10,000 tokens.
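As a rough illustration of the reduced architecture, the sizes above can be written as a `BertConfig`. This is only a sketch: the card does not state the hidden size or the feed-forward size, so both are left at the library defaults here, which is an assumption and may not match the released checkpoint.

```python
from transformers import BertConfig

# Values taken from the model description above.
config = BertConfig(
    vocab_size=10000,             # vocabulary size stated in the card
    num_hidden_layers=8,          # reduced from 12 in bert-base
    num_attention_heads=8,        # reduced from 12 in bert-base
    max_position_embeddings=128,  # reduced from 512 in bert-base
    # hidden_size and intermediate_size are NOT given in the card;
    # they are left at the library defaults (assumption).
)

# The hidden size must be divisible by the number of attention heads.
head_dim = config.hidden_size // config.num_attention_heads
print(config.num_hidden_layers, config.num_attention_heads, head_dim)
```

The configuration stored with the published checkpoint is authoritative; loading it with `AutoConfig.from_pretrained` shows the values actually used.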

	## How to use it

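	```
	from transformers import AutoModelForMaskedLM, AutoTokenizer
	# Load the pretrained masked language model and its tokenizer from the Hugging Face Hub
	model = AutoModelForMaskedLM.from_pretrained("Matej/bert-small-buddhist-nonbuddhist-sanskrit")
	tokenizer = AutoTokenizer.from_pretrained("Matej/bert-small-buddhist-nonbuddhist-sanskrit", use_fast=True)
	```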
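Once the model and tokenizer are loaded, a masked token can be predicted along the following lines. This is a minimal sketch, not the authors' evaluation code: `predict_masked` and `demo` are hypothetical helper names, and the input string is a placeholder. Real input should be lemmatized in the same way as the training corpus, which this card does not describe in detail.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_ID = "Matej/bert-small-buddhist-nonbuddhist-sanskrit"


def predict_masked(text, model, tokenizer, top_k=5):
    """Return the top_k candidate tokens for the first mask token in text."""
    # Truncate to the 128-token context stated in the model description.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    # Positions of mask tokens in the (single) input sequence.
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    mask_positions = is_mask.nonzero(as_tuple=True)[0]
    if mask_positions.numel() == 0:
        raise ValueError("text must contain the tokenizer's mask token")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Highest-scoring vocabulary entries at the first masked position.
    top_ids = logits[0, mask_positions[0]].topk(top_k).indices.tolist()
    return tokenizer.convert_ids_to_tokens(top_ids)


def demo():
    """Download the model and print predictions for a placeholder input."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
    model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)
    model.eval()
    # Placeholder: replace LEMMA_1 and LEMMA_2 with real lemmatized Sanskrit tokens.
    text = f"LEMMA_1 {tokenizer.mask_token} LEMMA_2"
    print(predict_masked(text, model, tokenizer))
```

Calling `demo()` downloads the checkpoint from the Hub on first use. Building the input with `tokenizer.mask_token` avoids hard-coding the mask string.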

	## Intended uses & limitations

	The model is released under the MIT license, with no further limitations on use.

	## Training and evaluation data

	See the paper 'Embeddings models for Buddhist Sanskrit' for details on the corpora and the evaluation procedure.

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 32
	- eval_batch_size: 4
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 200
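To make the `linear` scheduler entry concrete, the sketch below computes the learning rate at a given optimizer step under linear decay from the initial value of 5e-05 to zero. The card lists no warmup, so `warmup_steps=0` is an assumption, and the total step count used in the example is hypothetical because the card does not give the corpus size.

```python
def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Learning rate at `step` for a linear schedule with optional warmup."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length; the real value depends on corpus size,
# batch size (32) and the number of epochs (200).
TOTAL_STEPS = 1000
for step in (0, 250, 500, 750, 1000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

With no warmup, the rate halfway through training is half the initial value, and it reaches zero at the final step.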

	### Framework versions

	- Transformers 4.20.0
	- PyTorch 1.9.0
	- Datasets 2.3.2
	- Tokenizers 0.12.1
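To check a local environment against the versions listed above, something like the following can be used. It is a sketch using only the standard library; `compare_versions` is a hypothetical helper name, and `torch` is the PyPI distribution name for PyTorch. Newer versions of these libraries may well work too; the card only records what was used for training.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this model card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.20.0",
    "torch": "1.9.0",
    "datasets": "2.3.2",
    "tokenizers": "0.12.1",
}


def compare_versions(expected):
    """Map each package to (installed_version_or_None, expected, exact_match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed
        report[package] = (installed, wanted, installed == wanted)
    return report


for package, (installed, wanted, matches) in compare_versions(EXPECTED).items():
    status = "ok" if matches else "differs"
    print(f"{package}: installed={installed} expected={wanted} ({status})")
```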