maayanorner committed
Commit 75c12c2
1 Parent(s): 073c23a

Update README.md

README.md CHANGED
@@ -5,6 +5,7 @@ Based on DictaLM2.0; fine-tuned for text summarization.
 Known Issues:
 - The model is bloated (disk size).
 - While the results look pretty good, the model was not evaluated.
+- Short inputs (i.e., "articles" of one line) will yield a contextless "summary".
 
 
 # Data:
@@ -56,7 +57,11 @@ model = AutoModelForCausalLM.from_pretrained(
 model.to('cuda')
 tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
 
-text = '
+text = '''
+[Hebrew sample article, paragraph 1; characters lost in extraction]
+
+[Hebrew sample article, paragraph 2; characters lost in extraction]
+'''.strip()
 
 summarize(text, max_new_tokens=512, tokenizer=tokenizer, model=model)
 ```
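
The new known issue suggests that very short inputs are not worth sending to the model. One way a caller might guard against that is sketched below. This is only an illustration, not part of the README: `is_too_short`, `summarize_if_long_enough`, and the `min_words` threshold are hypothetical names and values, "short" is approximated by a whitespace word count, and `summarize_fn` is assumed to be a callable that takes the text and returns the summary string.

```python
from typing import Callable


def is_too_short(text: str, min_words: int = 30) -> bool:
    """Return True when the text has fewer than min_words whitespace-separated words.

    min_words is an arbitrary, hypothetical threshold; tune it for your data.
    """
    return len(text.split()) < min_words


def summarize_if_long_enough(
    text: str,
    summarize_fn: Callable[[str], str],
    min_words: int = 30,
) -> str:
    """Call summarize_fn only for inputs that pass the length check.

    Short inputs are returned unchanged (after stripping surrounding
    whitespace) instead of being sent to the model.
    """
    cleaned = text.strip()
    if is_too_short(cleaned, min_words=min_words):
        return cleaned
    return summarize_fn(cleaned)


# Possible wiring with the README's objects (assumed to be defined already):
# from functools import partial
# run = partial(summarize, max_new_tokens=512, tokenizer=tokenizer, model=model)
# result = summarize_if_long_enough(text, run)
```

Returning the stripped input for short texts is just one option; raising an error or logging a warning may suit other callers better.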