maayanorner committed on
Commit 75c12c2
1 Parent(s): 073c23a

Update README.md

Files changed (1):
  1. README.md +6 -1
README.md CHANGED
@@ -5,6 +5,7 @@ Based on DictaLM2.0; fine-tuned for text summarization.
 Known Issues:
 - The model is bloated (disk size).
 - While the results look pretty good, the model was not evaluated.
+- Short inputs (i.e., "articles" of one line) will yield a contextless "summary".
 
 
 # Data:
@@ -56,7 +57,11 @@ model = AutoModelForCausalLM.from_pretrained(
 model.to('cuda')
 tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
 
-text = 'טקסט לסיכום'
+text = '''
+לפעמים, מתחשק לחזור אחורה בזמן. אתם יודעים, לימים הטובים ההם. לימים של דייב ושל סוגר שטחים, של קין ושל פקמן, של אלאדין ושל מלך האריות, של הוגו ושל וורמס, של טטריס ושל כל שאר הקלאסיקות שאהבנו כל כך...
+
+כאן, ב"מסע אל העבר", אספנו את כל אותם משחקים ישנים, ואנו מציעים לכם אותם, יחד עם תאורים, תמונות, קטגוריות, צ'יטים, פתרונות ועוד - כדי שגם אתם תוכלו לחזור חזרה בזמן - ולהנות מהנוסטלגיה.
+'''.strip()
 
 summarize(text, max_new_tokens=512, tokenizer=tokenizer, model=model)
 ```
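The README's snippet calls a `summarize` helper whose body this diff never shows (the new example merely swaps a one-line placeholder, roughly "text to summarize", for a longer Hebrew nostalgia blurb about classic games, matching the new known issue about one-line inputs). A minimal sketch of what such a wrapper might look like, assuming a plain Hebrew instruction-prefix prompt; the actual fine-tune may instead rely on DictaLM2.0's instruct/chat template, and the stand-in tokenizer/model below exist only so the sketch runs without downloading weights:

```python
# Sketch of a `summarize` wrapper matching the call in the README:
#   summarize(text, max_new_tokens=512, tokenizer=tokenizer, model=model)
# The prompt format is an assumption -- the diff does not show the helper's body.

def summarize(text, max_new_tokens, tokenizer, model):
    # Hypothetical Hebrew instruction prefix ("Summarize the following text:");
    # the real fine-tune may expect DictaLM2.0's instruct template instead.
    prompt = f"סכם את הטקסט הבא:\n{text}\nסיכום:"
    input_ids = tokenizer.encode(prompt)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Causal LMs return prompt + continuation; decode only the new tokens.
    return tokenizer.decode(output_ids[len(input_ids):]).strip()


# --- Stand-in objects so the sketch runs without transformers or a GPU ---
class ByteTokenizer:
    """Toy byte-level tokenizer standing in for AutoTokenizer."""
    def encode(self, s):
        return list(s.encode("utf-8"))

    def decode(self, ids):
        return bytes(ids).decode("utf-8")


class CannedModel:
    """Toy model whose 'generation' is a fixed ASCII continuation."""
    def generate(self, input_ids, max_new_tokens):
        return input_ids + list(b"summary")[:max_new_tokens]


result = summarize("טקסט לסיכום", max_new_tokens=512,
                   tokenizer=ByteTokenizer(), model=CannedModel())
print(result)  # -> summary
```

With the real model, the tokenizer produces tensors that must be moved to the model's device (note the README's `model.to('cuda')`), and decoding typically passes `skip_special_tokens=True`; the duck-typed stand-ins here only illustrate the prompt-in, continuation-out flow.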