aehrm/dtaec-type-normalizer

aehrm committed on Jun 19

Commit 17a0df6 • 1 Parent(s): 24f7972

Update README

Files changed (1)

README.md +31 -1

@@ -2,6 +2,23 @@
 datasets:
 - aehrm/dtaec-lexica
 language: de
+pipeline_tag: translation
+model-index:
+  - name: aehrm/dtaec-type-normalizer
+    results:
+      - task:
+          name: Historic Text Normalization (type-level)
+          type: translation
+        dataset:
+          name: DTA-EC Lexicon
+          type: aehrm/dtaec-lexica
+        metrics:
+          - name: Word Accuracy
+            type: accuracy
+            value: 0.9546
+          - name: Word Accuracy OOV
+            type: accuracy
+            value: 0.9096
 ---
 # DTAEC Type Normalizer
@@ -31,7 +48,20 @@ model = AutoModelForSeq2SeqLM.from_pretrained('aehrm/dtaec-type-normalizer')
 model_in = tokenizer(['Freyheit', 'seyn', 'selbstthätig'], return_tensors='pt', padding=True)
 model_out = model.generate(**model_in)
-print(tokenizer.batch_decode(model_out))
+print(tokenizer.batch_decode(model_out, skip_special_tokens=True))
+# >>> ['Freiheit', 'sein', 'selbsttätig']
+```
+Or, more compact using the huggingface `pipeline`:
+```python
+from transformers import pipeline
+pipe = pipeline(model="aehrm/dtaec-type-normalizer")
+out = pipe(['Freyheit', 'seyn', 'selbstthätig'])
+print(out)
+# >>> [{'generated_text': 'Freiheit'}, {'generated_text': 'sein'}, {'generated_text': 'selbsttätig'}]
 ```
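
The `model-index` metadata added in this commit reports two type-level figures, "Word Accuracy" and "Word Accuracy OOV". The commit does not include the evaluation script, so the following is only a rough sketch of how such numbers could be computed. It assumes accuracy means exact string match between the predicted and the gold normalization, and that "OOV" means types absent from the training lexicon. The helper name `word_accuracy`, the toy word lists, and `train_vocab` are all hypothetical.

```python
def word_accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference (assumed definition)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy data for illustration only, not taken from the DTA-EC lexicon evaluation.
historic = ['Freyheit', 'seyn', 'selbstthätig', 'Thür']
gold = ['Freiheit', 'sein', 'selbsttätig', 'Tür']
predicted = ['Freiheit', 'sein', 'selbsttätig', 'Thür']  # last prediction is wrong

# Hypothetical set of historic types that were seen during training.
train_vocab = {'Freyheit', 'seyn'}

overall = word_accuracy(predicted, gold)

# OOV accuracy: the same measure, restricted to types not in the training vocabulary.
oov_idx = [i for i, w in enumerate(historic) if w not in train_vocab]
oov = word_accuracy([predicted[i] for i in oov_idx], [gold[i] for i in oov_idx])

print(overall, oov)
# 0.75 0.5
```

In practice `predicted` would come from decoding the model output as in the README snippet, and the gold forms from a held-out split of the lexicon.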