---
language: en
license: apache-2.0
library_name: transformers
pipeline_tag: text2text-generation
tags:
- text-generation
- formal-language
- grammar-correction
- t5
- english
- text-formalization

model-index:
- name: formal-lang-rxcx-model
  results:
  - task:
      type: text2text-generation
      name: formal language correction
    metrics:
      - type: loss
        value: 2.1  # Replace with your actual training loss
        name: training_loss
      - type: rouge1
        value: 0.85  # Replace with your actual ROUGE score
        name: rouge1
      - type: accuracy
        value: 0.82  # Replace with your actual accuracy
        name: accuracy
    dataset:
      name: grammarly/coedit
      type: grammarly/coedit
      split: train
      
datasets:
- grammarly/coedit

model-type: t5-base
inference: true
base_model: t5-base

widget:
- text: "make formal: hey whats up"
- text: "make formal: gonna be late for meeting"
- text: "make formal: this is kinda cool project"

extra_gated_prompt: This is a fine-tuned T5 model for converting informal text to formal language.

extra_gated_fields:
  Company/Institution: text
  Purpose: text

---

# Formal Language T5 Model

This model is a fine-tuned version of T5-base that rewrites informal English text in a formal register and corrects its grammar.

## Model Description

- **Model Type:** T5-base fine-tuned
- **Language:** English
- **Task:** Text Formalization and Grammar Correction
- **License:** Apache 2.0
- **Base Model:** t5-base

## Intended Uses & Limitations

### Intended Uses
- Converting informal text to formal language
- Improving text professionalism
- Grammar correction
- Business communication enhancement
- Academic writing improvement

### Limitations
- Intended for English text only
- Maximum input length: 128 tokens (the sequence length used in training)
- May not preserve specific domain terminology
- Best suited for business and academic contexts
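
Because of the 128-token input limit, longer passages need to be split before they reach the model. The following is a minimal sketch of one way to do that, not part of the released model: `split_for_model` and its `count_tokens` parameter are hypothetical names introduced here, and the greedy sentence grouping is an assumption about what works well rather than a documented recipe.

```python
import re
from typing import Callable, List

MAX_TOKENS = 128  # input limit stated in the Limitations list


def split_for_model(
    text: str,
    count_tokens: Callable[[str], int],
    max_tokens: int = MAX_TOKENS,
) -> List[str]:
    """Greedily group sentences into chunks that stay within max_tokens.

    count_tokens is any function returning the token count of a string.
    A single sentence longer than max_tokens is kept as its own chunk,
    so it may still be truncated by the tokenizer.
    """
    # Split after sentence-ending punctuation followed by whitespace
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > max_tokens:
            # Adding this sentence would exceed the limit: close the chunk
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

With the model's tokenizer loaded, a suitable counter would be `lambda s: len(tokenizer(s)["input_ids"])`. The `make formal: ` prefix also consumes tokens, so it may be safer to pass a `max_tokens` value somewhat below 128.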

## Usage

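```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "renix-codex/formal-lang-rxcx-model"
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prefix the input with the task instruction used during training
text = "make formal: hey whats up"
# Truncate to the 128-token sequence length the model was trained with
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
# Set an explicit output budget; the default generation length is short
outputs = model.generate(**inputs, max_new_tokens=128)
formal_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(formal_text)
```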

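## Example Inputs and Outputs

The inputs below are shown without the `make formal: ` prefix, which must be prepended before the text is passed to the model.
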
| Informal Input | Formal Output |
|----------------|---------------|
| "hey whats up" | "Hello, how are you?" |
| "gonna be late for meeting" | "I will be late for the meeting." |
| "this is kinda cool" | "This is quite impressive." |

## Training

The model was fine-tuned on the `grammarly/coedit` dataset with the following configuration:
- Base Model: T5-base
- Training Hardware: A100 GPU
- Sequence Length: 128 tokens
- Input Format: "make formal: [informal text]"
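
The input format above can be applied with a small helper. This is an illustrative sketch: `build_prompt` and `PREFIX` are hypothetical names, and the whitespace normalization is an assumption, not something the training setup is known to require.

```python
PREFIX = "make formal: "  # task prefix from the documented input format


def build_prompt(informal_text: str) -> str:
    """Return the text with the task prefix, without duplicating it."""
    # Collapse runs of whitespace and trim the ends
    cleaned = " ".join(informal_text.split())
    if cleaned.startswith(PREFIX.strip()):
        return cleaned  # prefix already present
    return PREFIX + cleaned


prompts = [build_prompt(t) for t in ["hey whats up", "gonna be late for meeting"]]
```

The resulting strings can be passed to the tokenizer directly, either one at a time or as a batch with `padding=True`.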

## License

Apache License 2.0

## Citation

```bibtex
@misc{formal-lang-rxcx-model,
    author = {renix-codex},
    title = {Formal Language T5 Model},
    year = {2024},
    publisher = {HuggingFace},
    journal = {HuggingFace Model Hub},
    url = {https://huggingface.co/renix-codex/formal-lang-rxcx-model}
}
```

## Developer

Model developed by renix-codex

## Ethical Considerations

This model is intended to assist in formal writing while maintaining the original meaning of the text. Users should be aware that:
- The model may alter the tone of personal or culturally specific expressions
- It should be used as a writing aid rather than a replacement for human judgment
- The output should be reviewed for accuracy and appropriateness

## Updates and Versions

Initial Release - February 2024
- Base implementation with T5-base
- Trained on Grammarly/COEDIT dataset
- Optimized for formal language conversion