Update README.md
README.md
CHANGED
@@ -1,30 +1,93 @@
---
license: cc-by-nc-sa-4.0
tags:
-
-
-
-
---

-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->

-# grammar-synthesis-base

-This model is a fine-tuned version of [

## Model description

-More information needed

-## Intended uses & limitations

-More information needed

## Training and evaluation data

-More information needed

## Training procedure

---
license: cc-by-nc-sa-4.0
tags:
+- grammar
+- spelling
+- punctuation
+- error-correction
+datasets:
+- jfleg
+widget:
+- text: "i can has cheezburger"
+  example_title: "cheezburger"
+- text: "There car broke down so their hitching a ride to they're class."
+  example_title: "compound-1"
+- text: "so em if we have an now so with fito ringina know how to estimate the tren given the ereafte mylite trend we can also em an estimate is nod s
+    i again tort watfettering an we have estimated the trend an
+    called wot to be called sthat of exty right now we can and look at
+    wy this should not hare a trend i becan we just remove the trend an and we can we now estimate
+    tesees ona effect of them exty"
+  example_title: "Transcribed Audio Example 2"
+- text: "My coworker said he used a financial planner to help choose his stocks so he wouldn't loose money."
+  example_title: "incorrect word choice (context)"
+- text: "good so hve on an tadley i'm not able to make it to the exla session on monday this week e which is why i am e recording pre recording
+    an this excelleision and so to day i want e to talk about two things and first of all em i wont em wene give a summary er about
+    ta ohow to remove trents in these nalitives from time series"
+  example_title: "lowercased audio transcription output"
+
+parameters:
+  max_length: 128
+  min_length: 2
+  num_beams: 8
+  repetition_penalty: 1.3
+  length_penalty: 1
+  early_stopping: True
---


+# grammar-synthesis-base (beta)

+This model is a fine-tuned version of [google/t5-base-lm-adapt](https://huggingface.co/google/t5-base-lm-adapt) for grammar correction on an expanded version of the [JFLEG](https://paperswithcode.com/dataset/jfleg) dataset.
+
+Usage in Python (after `pip install transformers`):
+
+```
+from transformers import pipeline
+corrector = pipeline(
+    'text2text-generation',
+    'pszemraj/grammar-synthesis-base',
+)
+raw_text = 'i can has cheezburger'
+results = corrector(raw_text)
+print(results)
+```
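
For reference, the generation settings listed under `parameters:` in the front matter (beam search, repetition penalty, etc.) can also be passed at call time. A minimal sketch reusing those same values and one of the widget sentences; outputs not verified:

```
from transformers import pipeline

corrector = pipeline(
    'text2text-generation',
    'pszemraj/grammar-synthesis-base',
)

# mirror the widget `parameters` block from the front matter
results = corrector(
    "There car broke down so their hitching a ride to they're class.",
    max_length=128,
    min_length=2,
    num_beams=8,
    repetition_penalty=1.3,
    length_penalty=1.0,
    early_stopping=True,
)
print(results[0]['generated_text'])
```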

## Model description

+The intent is to create a text2text language model that successfully performs "single-shot grammar correction" on potentially grammatically incorrect text **that could have a lot of mistakes**, with the important qualifier that **it does not semantically change text/information that IS grammatically correct.**
+
+Compare some of the heavier-error examples on [other grammar correction models](https://huggingface.co/models?dataset=dataset:jfleg) to see the difference :)
+
+## Limitations
+
+- dataset: `cc-by-nc-sa-4.0`
+- model: `apache-2.0`
+- this is **still a work-in-progress** and, while probably useful for "single-shot grammar correction" in a lot of cases, **give the outputs a glance for correctness, ok?**
+
+## Use Cases
+
+Obviously, this section is quite general, as there are many things one can use "general single-shot grammar correction" for. Some ideas or use cases:

+1. Correcting highly error-prone LM outputs. Some examples would be audio transcription (ASR) output (several of the widget examples above are exactly this) or something like handwriting OCR; a rough sketch for the ASR case follows this list.
+    - To be investigated further: depending on what model/system is used, it _might_ be worth it to apply this after OCR on typed characters.
+2. Correcting/infilling text generated by text generation models so that it reads cohesively and obvious errors that break conversational immersion are removed. I use this on the outputs of [this OPT 2.7B chatbot-esque model of myself](https://huggingface.co/pszemraj/opt-peter-2.7B).
+> An example of this model running on CPU with beam search:
+
+```
+original response:
+ive heard it attributed to a bunch of different philosophical schools, including stoicism, pragmatism, existentialism and even some forms of post-structuralism. i think one of the most interesting (and most difficult) philosophical problems is trying to let dogs (or other animals) out of cages. the reason why this is a difficult problem is because it seems to go against our grain (so to
+synthesizing took 306.12 seconds
+Final response in 1294.857 s:
+I've heard it attributed to a bunch of different philosophical schools, including solipsism, pragmatism, existentialism and even some forms of post-structuralism. i think one of the most interesting (and most difficult) philosophical problems is trying to let dogs (or other animals) out of cages. the reason why this is a difficult problem is because it seems to go against our grain (so to speak)
+```
+_Note that I have some other logic that removes any periods at the end of the final sentence in this chatbot setting [to avoid coming off as passive-aggressive](https://www.npr.org/2020/09/05/909969004/before-texting-your-kid-make-sure-to-double-check-your-punctuation)._
+
+3. Somewhat related to #2 above: fixing/correcting so-called [tortured-phrases](https://arxiv.org/abs/2107.06751) that are dead giveaways that text was generated by a language model. _Note that **some** of these are not fixed, especially as they venture into domain-specific terminology (e.g. "irregular timberland" instead of "Random Forest")._
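
One way to apply use case 1 above to a long, lowercased ASR transcript is to chunk it and correct each piece. A rough sketch, reusing one of the widget transcripts; the 64-word chunk size and the simple space-join at the end are arbitrary choices here, and the output has not been verified:

```
from transformers import pipeline

corrector = pipeline(
    'text2text-generation',
    'pszemraj/grammar-synthesis-base',
)

# a lowercased, unpunctuated ASR-style transcript (one of the widget examples above)
transcript = (
    "good so hve on an tadley i'm not able to make it to the exla session on monday "
    "this week e which is why i am e recording pre recording an this excelleision"
)

# naive chunking by word count so each piece stays well under max_length
words = transcript.split()
chunks = [' '.join(words[i:i + 64]) for i in range(0, len(words), 64)]

corrected = ' '.join(
    corrector(chunk, max_length=128, num_beams=8)[0]['generated_text']
    for chunk in chunks
)
print(corrected)
```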


## Training and evaluation data

+More information needed 😉

## Training procedure

@@ -41,7 +104,7 @@ The following hyperparameters were used during training:
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.02
-- num_epochs:
+- num_epochs: 2

### Training results
