|
--- |
|
language: en |
|
license: apache-2.0 |
|
library_name: transformers |
|
pipeline_tag: text2text-generation |
|
tags: |
|
- text-generation |
|
- formal-language |
|
- grammar-correction |
|
- t5 |
|
- english |
|
- text-formalization |
|
|
|
model-index:
- name: formal-lang-rxcx-model
  results:
  - task:
      type: text2text-generation
      name: formal language correction
    dataset:
      name: grammarly/coedit
      type: grammarly/coedit
      split: train
    metrics:
    - type: loss
      value: 2.1
      name: training_loss
    - type: rouge1
      value: 0.85
      name: rouge1
    - type: accuracy
      value: 0.82
      name: accuracy
|
|
|
datasets: |
|
- grammarly/coedit |
|
|
|
model-type: t5-base |
|
inference: true |
|
base_model: t5-base |
|
|
|
widget: |
|
- text: "make formal: hey whats up" |
|
- text: "make formal: gonna be late for meeting" |
|
- text: "make formal: this is kinda cool project" |
|
|
|
extra_gated_prompt: This is a fine-tuned T5 model for converting informal text to formal language. |
|
|
|
extra_gated_fields:
  Company/Institution: text
  Purpose: text
|
|
|
--- |
|
|
|
# Formal Language T5 Model |
|
|
|
This model is fine-tuned from T5-base for formal language correction and text formalization. |
|
|
|
## Model Description |
|
|
|
- **Model Type:** Fine-tuned T5-base
|
- **Language:** English |
|
- **Task:** Text Formalization and Grammar Correction |
|
- **License:** Apache 2.0 |
|
- **Base Model:** t5-base |
|
|
|
## Intended Uses & Limitations |
|
|
|
### Intended Uses |
|
- Converting informal text to formal language |
|
- Improving text professionalism |
|
- Grammar correction |
|
- Business communication enhancement |
|
- Academic writing improvement |
|
|
|
### Limitations |
|
- Works best with English text |
|
- Maximum input length: 128 tokens |
|
- May not preserve domain-specific terminology
|
- Best suited for business and academic contexts |
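Because inputs beyond 128 tokens are truncated, longer passages should be formalized in chunks. A minimal sketch of one way to do this, splitting on sentence boundaries; the regex splitter and whitespace word count are rough approximations introduced here for illustration, not part of the model:

```python
import re

MAX_TOKENS = 128  # the model's maximum input length

def split_for_model(text, prefix="make formal: ", budget=MAX_TOKENS):
    """Split text into sentence groups that fit the model's input window.

    Uses whitespace word count as a rough proxy for token count.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], []
    used = len(prefix.split())
    for sentence in sentences:
        n = len(sentence.split())
        # Flush the current chunk if adding this sentence would overflow it.
        if current and used + n > budget:
            chunks.append(prefix + " ".join(current))
            current, used = [], len(prefix.split())
        current.append(sentence)
        used += n
    if current:
        chunks.append(prefix + " ".join(current))
    return chunks
```

Each chunk can then be passed to the model separately and the outputs concatenated. For a precise count, `tokenizer(text)["input_ids"]` gives the true token length.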
|
|
|
## Usage |
|
|
|
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("renix-codex/formal-lang-rxcx-model")
tokenizer = AutoTokenizer.from_pretrained("renix-codex/formal-lang-rxcx-model")

# Example usage: inputs are truncated to the model's 128-token limit
text = "make formal: hey whats up"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
outputs = model.generate(**inputs, max_new_tokens=64)
formal_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(formal_text)
```
|
|
|
## Example Inputs and Outputs |
|
|
|
| Informal Input | Formal Output | |
|
|----------------|---------------| |
|
| "hey whats up" | "Hello, how are you?" | |
|
| "gonna be late for meeting" | "I will be late for the meeting." | |
|
| "this is kinda cool" | "This is quite impressive." | |
|
|
|
## Training |
|
|
|
The model was trained on the grammarly/coedit dataset with the following specifications:
|
- Base Model: T5-base |
|
- Training Hardware: A100 GPU |
|
- Sequence Length: 128 tokens |
|
- Input Format: "make formal: [informal text]" |
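Because the model was trained with the literal `make formal: ` prefix prepended to every example, inference must use the same format. A small helper for preparing a batch of raw strings (this helper is illustrative and not shipped with the model):

```python
PREFIX = "make formal: "  # task prefix used during training

def build_inputs(texts):
    """Format raw informal strings into the training-time input format."""
    return [PREFIX + t.strip() for t in texts]
```

The resulting strings can be passed directly to the tokenizer, e.g. `tokenizer(build_inputs(["hey whats up"]), return_tensors="pt", padding=True)`.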
|
|
|
## License |
|
|
|
Apache License 2.0 |
|
|
|
## Citation |
|
|
|
```bibtex
@misc{formal-lang-rxcx-model,
  author       = {renix-codex},
  title        = {Formal Language T5 Model},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {Hugging Face Model Hub},
  url          = {https://huggingface.co/renix-codex/formal-lang-rxcx-model}
}
```
|
|
|
## Developer |
|
|
|
Model developed by renix-codex.
|
|
|
## Ethical Considerations |
|
|
|
This model is intended to assist in formal writing while maintaining the original meaning of the text. Users should be aware that: |
|
- The model may alter the tone of personal or culturally specific expressions |
|
- It should be used as a writing aid rather than a replacement for human judgment |
|
- The output should be reviewed for accuracy and appropriateness |
|
|
|
## Updates and Versions |
|
|
|
Initial Release - February 2024 |
|
- Base implementation with T5-base |
|
- Trained on the grammarly/coedit dataset
|
- Optimized for formal language conversion |