prasadsachin's picture
Update README.md
4279f95 verified
---
library_name: keras-hub
license: mit
language:
- en
tags:
- text-classification
---
## Model Overview
DeBERTaV3 encoder networks are a set of transformer encoder models published by Microsoft. DeBERTa improves the BERT and RoBERTa models using disentangled attention and enhanced mask decoder.
Weights are released under the [MIT License](https://opensource.org/license/mit). Keras model code is released under the [Apache 2 License](https://github.com/keras-team/keras-hub/blob/master/LICENSE).
## Links
* [DeBERTaV3 Quickstart Notebook](https://www.kaggle.com/code/gabrielrasskin/debertav3-quickstart)
* [DeBERTaV3 API Documentation](https://keras.io/api/keras_hub/models/deberta_v3/deberta_v3_classifier/)
* [DeBERTaV3 Model Paper](https://arxiv.org/abs/2111.09543)
* [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
* [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)
## Installation
Keras and KerasHub can be installed with:
```
pip install -U -q keras-hub
pip install -U -q keras>=3
```
Jax, TensorFlow, and Torch come preinstalled in Kaggle Notebooks. For instruction on installing them in another environment see the [Keras Getting Started](https://keras.io/getting_started/) page.
## Presets
The following model checkpoints are provided by the Keras team. Full code examples for each are available below.
| Preset Name | Parameters | Description |
| :------------------------------- | :------------: | :-------------------------------------------------------------------------------------------------------- |
| `deberta_v3_extra_small_en` | 70.68M | 12-layer DeBERTaV3 model where case is maintained. Trained on English Wikipedia, BookCorpus and OpenWebText. |
| `deberta_v3_small_en` | 141.30M | 6-layer DeBERTaV3 model where case is maintained. Trained on English Wikipedia, BookCorpus and OpenWebText. |
| `deberta_v3_base_en` | 183.83M | 12-layer DeBERTaV3 model where case is maintained. Trained on English Wikipedia, BookCorpus and OpenWebText. |
| `deberta_v3_large_en` | 434.01M | 24-layer DeBERTaV3 model where case is maintained. Trained on English Wikipedia, BookCorpus and OpenWebText. |
| `deberta_v3_base_multi` | 278.22M | 12-layer DeBERTaV3 model where case is maintained. Trained on the 2.5TB multilingual CC100 dataset. |
## Prompts
DeBERTa's main use as a classifier takes in raw text that is labelled by the class it belongs to. In practice this can look like this:
```python
features = ["The quick brown fox jumped.", "I forgot my homework."]
labels = [0, 3]
```
## Example Usage
```python
import keras
import keras_hub
import numpy as np
```
Raw string data.
```python
features = ["The quick brown fox jumped.", "I forgot my homework."]
labels = [0, 3]
# Pretrained classifier.
classifier = keras_hub.models.DebertaV3Classifier.from_preset(
"deberta_v3_extra_small_en",
num_classes=4,
)
classifier.fit(x=features, y=labels, batch_size=2)
classifier.predict(x=features, batch_size=2)
# Re-compile (e.g., with a new learning rate).
classifier.compile(
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
optimizer=keras.optimizers.Adam(5e-5),
jit_compile=True,
)
# Access backbone programmatically (e.g., to change `trainable`).
classifier.backbone.trainable = False
# Fit again.
classifier.fit(x=features, y=labels, batch_size=2)
```
Preprocessed integer data.
```python
features = {
"token_ids": np.ones(shape=(2, 12), dtype="int32"),
"padding_mask": np.array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]] * 2),
}
labels = [0, 3]
# Pretrained classifier without preprocessing.
classifier = keras_hub.models.DebertaV3Classifier.from_preset(
"deberta_v3_extra_small_en",
num_classes=4,
preprocessor=None,
)
classifier.fit(x=features, y=labels, batch_size=2)
```
## Example Usage with Hugging Face URI
```python
import keras
import keras_hub
import numpy as np
```
Raw string data.
```python
features = ["The quick brown fox jumped.", "I forgot my homework."]
labels = [0, 3]
# Pretrained classifier.
classifier = keras_hub.models.DebertaV3Classifier.from_preset(
"hf://keras/deberta_v3_extra_small_en",
num_classes=4,
)
classifier.fit(x=features, y=labels, batch_size=2)
classifier.predict(x=features, batch_size=2)
# Re-compile (e.g., with a new learning rate).
classifier.compile(
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
optimizer=keras.optimizers.Adam(5e-5),
jit_compile=True,
)
# Access backbone programmatically (e.g., to change `trainable`).
classifier.backbone.trainable = False
# Fit again.
classifier.fit(x=features, y=labels, batch_size=2)
```
Preprocessed integer data.
```python
features = {
"token_ids": np.ones(shape=(2, 12), dtype="int32"),
"padding_mask": np.array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]] * 2),
}
labels = [0, 3]
# Pretrained classifier without preprocessing.
classifier = keras_hub.models.DebertaV3Classifier.from_preset(
"hf://keras/deberta_v3_extra_small_en",
num_classes=4,
preprocessor=None,
)
classifier.fit(x=features, y=labels, batch_size=2)
```