---
language:
- id
tags:
- punctuation prediction
- punctuation
widget:
- text: "halo bagaimana kabarmu"
  example_title: "indonesian"
---

This model predicts punctuation for Indonesian text. It was created to restore punctuation in transcripts produced by speech recognition models.

This model is based on the work at https://github.com/oliverguhr/fullstop-deep-punctuation-prediction

The model restores the following punctuation markers: **"." "," "?" "-" ":"**
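
To illustrate how per-word predictions turn into punctuated text, here is a minimal sketch. It assumes, as in the upstream fullstop project, that each word receives exactly one label: `0` for no punctuation, or one of the markers listed above. The `apply_labels` helper and the example labels are hypothetical and are not actual output of this model.

```python
# Hypothetical helper: attach one predicted label to each word.
MARKERS = {".", ",", "?", "-", ":"}


def apply_labels(words, labels):
    """Rebuild text from words and one predicted label per word."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    pieces = []
    for word, label in zip(words, labels):
        # "0" is assumed to mean "no punctuation after this word".
        pieces.append(word + label if label in MARKERS else word)
    return " ".join(pieces)


words = ["halo", "bagaimana", "kabarmu"]
labels = [",", "0", "?"]  # illustrative labels, not real predictions
print(apply_labels(words, labels))  # halo, bagaimana kabarmu?
```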

## Install

To get started, install the package from [PyPI](https://pypi.org/project/deepmultilingualpunctuation/):

```bash
pip install deepmultilingualpunctuation
```

### Restore Punctuation

```python
from deepmultilingualpunctuation import PunctuationModel

# Load the Indonesian punctuation model by its model name.
model = PunctuationModel("Rizkinoor16/fullstop-indonesian-punctuation-prediction")

# Unpunctuated input, e.g. a speech recognition transcript.
text = "halo bagaimana kabarmu"
result = model.restore_punctuation(text)
print(result)
```
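
When a transcript arrives as several segments, the same call can be applied to each segment in turn. The sketch below is only one possible way to wrap it: `restore_segments` is a hypothetical helper that accepts any function with the shape of `model.restore_punctuation`, and `fake_restore` is a stand-in so the example runs without downloading the model.

```python
from typing import Callable, List


def restore_segments(segments: List[str], restore: Callable[[str], str]) -> List[str]:
    """Apply a restore function to each non-empty segment."""
    results = []
    for segment in segments:
        cleaned = segment.strip()
        # Keep empty segments empty rather than sending them to the model.
        results.append(restore(cleaned) if cleaned else "")
    return results


def fake_restore(text: str) -> str:
    # Stand-in for model.restore_punctuation; it only appends a period.
    return text + "."


segments = ["halo bagaimana kabarmu", "   ", "saya baik"]
print(restore_segments(segments, fake_restore))
```

With the real model, pass `model.restore_punctuation` in place of `fake_restore`.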

## Results

| Label | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| 0 | 0.98 | 0.99 | 0.98 | 38057720 |
| . | 0.89 | 0.91 | 0.90 | 2234980 |
| , | 0.84 | 0.79 | 0.81 | 3037655 |
| ? | 0.84 | 0.79 | 0.82 | 72969 |
| - | 0.96 | 0.90 | 0.93 | 162085 |
| : | 0.91 | 0.89 | 0.90 | 191937 |
| accuracy | | | 0.97 | 43757346 |
| macro avg | 0.90 | 0.88 | 0.89 | 43757346 |
| weighted avg | 0.97 | 0.97 | 0.97 | 43757346 |
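
As a sanity check on how the summary rows relate to the per-class rows, the sketch below recomputes the total support and the macro averages (the unweighted mean over the six classes) from the rounded values in the table above. The variable and function names are illustrative. Because the inputs are already rounded to two decimals, recomputing the weighted averages the same way may differ from the reported figures in the last digit, so they are left out here.

```python
# Per-class scores copied from the results table:
# label -> (precision, recall, f1, support)
scores = {
    "0": (0.98, 0.99, 0.98, 38057720),
    ".": (0.89, 0.91, 0.90, 2234980),
    ",": (0.84, 0.79, 0.81, 3037655),
    "?": (0.84, 0.79, 0.82, 72969),
    "-": (0.96, 0.90, 0.93, 162085),
    ":": (0.91, 0.89, 0.90, 191937),
}

# Total number of labeled tokens across all classes.
total_support = sum(row[3] for row in scores.values())


def macro_avg(index):
    """Unweighted mean of one metric column over all classes."""
    return sum(row[index] for row in scores.values()) / len(scores)


print(total_support)              # 43757346
print(f"{macro_avg(0):.2f}")      # macro precision: 0.90
print(f"{macro_avg(1):.2f}")      # macro recall: 0.88
print(f"{macro_avg(2):.2f}")      # macro F1: 0.89
```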

## Contact

Rizki Noor <rizki@cakra.ai>

**LinkedIn**: [Noor Muhamad Rizki](https://www.linkedin.com/in/noor-muhamad-rizki-114600231/)