---
license: mit
language:
- el
pipeline_tag: text-classification
---

# GreekDeBERTa-base

**GreekDeBERTa-base** is a language model pre-trained for Greek Natural Language Processing (NLP) tasks. It is based on the DeBERTa architecture and is pre-trained with a Masked Language Modeling (MLM) objective.
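
Because the model is trained with an MLM objective, you can probe it directly through the `fill-mask` pipeline. This is a minimal sketch, assuming the uploaded checkpoint exposes its MLM head; the example sentence ("Athens is the [MASK] of Greece.") is purely illustrative:

```python
from transformers import pipeline

# Assumes the checkpoint includes the masked-language-modeling head
fill_mask = pipeline("fill-mask", model="AI-team-UoA/GreekDeBERTa-base")

# Use the tokenizer's own mask token rather than hard-coding "[MASK]"
masked = f"Η Αθήνα είναι η {fill_mask.tokenizer.mask_token} της Ελλάδας."
for prediction in fill_mask(masked):
    print(prediction["token_str"], prediction["score"])
```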

## Model Details

- **Model Architecture**: DeBERTa-base
- **Language**: Greek
- **Pre-training Objective**: Masked Language Modeling (MLM)
- **Tokenizer**: SentencePiece model (`spm.model`); see the tokenization sketch below
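
As a quick illustration of the SentencePiece tokenizer, the snippet below segments a Greek sentence into subword pieces. The sentence is just an example; the exact pieces depend on the learned vocabulary:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AI-team-UoA/GreekDeBERTa-base")

# SentencePiece operates on raw text, so no pre-tokenization is required
tokens = tokenizer.tokenize("Το μουσείο βρίσκεται στο κέντρο της πόλης.")
print(tokens)

# Encoding adds the model's special tokens around the subword IDs
print(tokenizer("Το μουσείο βρίσκεται στο κέντρο της πόλης.")["input_ids"])
```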

## Model Files

The following files are included in the repository:

- `config.json`: The model configuration file used by the DeBERTa-base architecture.
- `pytorch_model.bin`: The pre-trained model weights in PyTorch format.
- `spm.model`: The SentencePiece model file used for tokenization (see the download sketch after this list).
- `vocab.txt`: A human-readable vocabulary file that contains the list of tokens used by the model.
- `tokenizer_config.json`: Configuration file for the tokenizer.
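
If you want to inspect one of these files directly, for example the raw SentencePiece model, you can fetch it with `huggingface_hub`. A small sketch, assuming the `sentencepiece` package is installed:

```python
import sentencepiece as spm
from huggingface_hub import hf_hub_download

# Download (and locally cache) a single file from the repository
spm_path = hf_hub_download(repo_id="AI-team-UoA/GreekDeBERTa-base", filename="spm.model")

# Load the raw SentencePiece model and segment a word into pieces
sp = spm.SentencePieceProcessor(model_file=spm_path)
print(sp.encode("Καλημέρα", out_type=str))
```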

## How to Use

You can load and use the model in Python with the Hugging Face `transformers` library. Below is an example that loads the model with a token-classification head and runs a forward pass:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("AI-team-UoA/GreekDeBERTa-base")
model = AutoModelForTokenClassification.from_pretrained("AI-team-UoA/GreekDeBERTa-base")

# Tokenize a Greek sentence and run a forward pass
inputs = tokenizer("Η Αθήνα είναι η πρωτεύουσα της Ελλάδας.", return_tensors="pt")
logits = model(**inputs).logits  # (batch_size, sequence_length, num_labels)
```
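
Note that this checkpoint was pre-trained with MLM only, so the token-classification head loaded above is randomly initialized (the `transformers` library will print a warning to this effect). You will need to fine-tune the model on a labeled dataset, such as a Greek NER corpus, before its predictions are meaningful.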