Amitontheweb committed • Commit cb7b742 • 1 Parent(s): 05464d2
Update app.py
app.py
CHANGED
@@ -416,25 +416,27 @@ with gr.Blocks() as demo:
416 |
417 |   A space to tweak, test and learn generative model parameters for text output.
418 |
419 | -
420 |   --------------
421 | - Given some text as input, a decoder-only
422 |
423 |   Example:
424 |
425 |   *Input: Today is a rainy day*
426 |
427 |   Option 1: , [probability score: 0.62]
428 |   Option 2: . [probability score: 0.21]
429 |   Option 3: ! [probability score: 0.73]
430 |
431 |
432 | - **Greedy Search
433 |
434 |   In this illustrative example, since "!" has the highest probability, a greedy strategy will output: Today is a rainy day!
435 |
436 |
437 | - **Random Sampling
438 |
439 |   *Temperature* - Increasing the temperature allows words with lower probabilities to show up in the output. As the temperature approaches 0, the search becomes 'greedy', always favoring the highest-probability words.
440 |
@@ -445,26 +447,26 @@ with gr.Blocks() as demo:
445 |   When used with temperature: Reducing temperature makes the search greedy.
446 |
447 |
448 | - **Simple Beam search
449 |
450 |   If num_beams = 2, the search keeps the two highest-scoring sequences at each step, and so on until the search ends.
451 |
452 |   *Early Stopping*: Makes the search stop when a predetermined criterion for ending the search is satisfied.
453 |
454 |
455 | - **Diversity Beam search
456 |
457 |   *Group Diversity Penalty*: Penalizes a beam group for choosing words/tokens already selected by previous groups at the same step.
458 |
459 |
460 | - **Contrastive search
461 |
462 |   *Penalty Alpha*: When α=0, search becomes greedy.
463 |
464 |   Refer: https://huggingface.co/blog/introducing-csearch
465 |
466 |
467 | - **Other parameters
468 |   ---------------------
469 |
470 |   - Length penalty: Used with beam search to nudge the output length: values above 0 favor longer outputs, values below 0 favor shorter ones.
@@ -474,7 +476,7 @@ with gr.Blocks() as demo:
474 |   - No repeat n-gram size: Prevents the model from repeating any sequence of n words/tokens. Avoid setting it to 1, as this prevents any word from appearing twice.
475 |
476 |
477 | - References
478 |   ------------
479 |
480 |   1. https://huggingface.co/blog/how-to-generate
416 |
417 |   A space to tweak, test and learn generative model parameters for text output.
418 |
419 | + ## Strategies ##:
420 |   --------------
421 | + Given some text as input, a decoder-only model hunts for the most likely continuation, whether or not that continuation makes sense, using various search strategies.
422 |
423 |   Example:
424 |
425 |   *Input: Today is a rainy day*
426 |
427 |   Option 1: , [probability score: 0.62]
428 | +
429 |   Option 2: . [probability score: 0.21]
430 | +
431 |   Option 3: ! [probability score: 0.73]
432 |
433 |
434 | + ### **Greedy Search** ###: Follows the most well-trodden path: always picks the next word/token with the highest probability score. Default for GPT2.
435 |
436 |   In this illustrative example, since "!" has the highest probability, a greedy strategy will output: Today is a rainy day!
437 |
438 |
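The greedy choice in the example above can be sketched in a few lines of plain Python. This is only an illustration: the scores are copied from the example, and `candidate_scores` and `greedy_pick` are hypothetical names, not part of app.py.

```python
# Illustrative scores copied from the example; not real model output.
candidate_scores = {",": 0.62, ".": 0.21, "!": 0.73}


def greedy_pick(scores):
    """Return the candidate token with the highest score (hypothetical helper)."""
    return max(scores, key=scores.get)


prompt = "Today is a rainy day"
next_token = greedy_pick(candidate_scores)
print(prompt + next_token)  # Today is a rainy day!
```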
439 | + ### **Random Sampling** ###: Picks a random path or trail to walk on, drawing the next word/token at random according to its probability. Use ```do_sample=True```.
440 |
441 |   *Temperature* - Increasing the temperature allows words with lower probabilities to show up in the output. As the temperature approaches 0, the search becomes 'greedy', always favoring the highest-probability words.
442 |
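To make the temperature description concrete, here is a minimal sketch of temperature-scaled softmax. It assumes the model's raw scores (logits) are divided by the temperature before being turned into probabilities; the logits below are made up, and `softmax_with_temperature` is a hypothetical helper rather than anything from app.py.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability mass lands on the top-scoring token, which is why sampling then behaves like greedy search; at a higher temperature the distribution flattens and less likely tokens get picked more often.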
447 |   When used with temperature: Reducing temperature makes the search greedy.
448 |
449 |
450 | + ### **Simple Beam search** ###: Follows several branches (beams) at once, like tracing the boughs of a fruit tree to find the route carrying the heaviest total load. Akin to greedy search, but it looks for the highest-scoring route overall rather than the best single step.
451 |
452 |   If num_beams = 2, the search keeps the two highest-scoring sequences at each step, and so on until the search ends.
453 |
454 |   *Early Stopping*: Makes the search stop when a predetermined criterion for ending the search is satisfied.
455 |
456 |
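A toy sketch may help show how beam search differs from greedy search. It assumes a hypothetical lookup table, `TOY_MODEL`, in which the next-token probabilities depend only on the last token; the numbers are made up for illustration and the search runs for two steps only.

```python
import math

# Hypothetical toy "model": next-token probabilities keyed by the last token.
TOY_MODEL = {
    "The": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.3, "drives": 0.5, "turns": 0.2},
}


def beam_search(start, num_beams, steps):
    """Keep the num_beams best partial sequences at every step."""
    beams = [([start], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            for token, prob in TOY_MODEL[tokens[-1]].items():
                candidates.append((tokens + [token], score + math.log(prob)))
        # Rank every extension of every beam, then keep only the best few.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams


greedy_tokens, _ = beam_search("The", num_beams=1, steps=2)[0]
beam_tokens, beam_score = beam_search("The", num_beams=2, steps=2)[0]
print(" ".join(greedy_tokens))  # The nice woman
print(" ".join(beam_tokens))    # The dog has
```

With a single beam the search is greedy and commits to "nice" at the first step; with two beams it keeps "dog" alive long enough to discover that "The dog has" has the higher overall probability (0.4 × 0.9 = 0.36 versus 0.5 × 0.4 = 0.20).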
457 | + ### **Diversity Beam search** ###: Divides the beams into groups and applies a diversity penalty between them. This makes the output more diverse and interesting.
458 |
459 |   *Group Diversity Penalty*: Penalizes a beam group for choosing words/tokens already selected by previous groups at the same step.
460 |
461 |
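The effect of the group diversity penalty can be sketched as a simple score adjustment. This is a simplified illustration under the assumption that the penalty is subtracted once for every earlier group that already picked the token at the current step; `apply_diversity_penalty` and the scores are hypothetical.

```python
def apply_diversity_penalty(scores, chosen_by_previous_groups, diversity_penalty):
    """Lower the score of each token once per earlier group that chose it."""
    penalized = dict(scores)
    for token in chosen_by_previous_groups:
        if token in penalized:
            penalized[token] -= diversity_penalty
    return penalized


# Made-up log-scores for the next token, as seen by the second beam group.
scores = {"sunny": -0.5, "rainy": -0.7, "cold": -1.6}
penalized = apply_diversity_penalty(scores, ["sunny"], diversity_penalty=0.5)
print(max(scores, key=scores.get))        # sunny
print(max(penalized, key=penalized.get))  # rainy
```

Because the first group already chose "sunny", the second group is steered toward "rainy", which is how the groups end up producing different continuations.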
462 | + ### **Contrastive search** ###: Uses the entire input context to create more interesting outputs.
463 |
464 |   *Penalty Alpha*: When α=0, search becomes greedy.
465 |
466 |   Refer: https://huggingface.co/blog/introducing-csearch
467 |
468 |
469 | + ### **Other parameters** ###:
470 |   ---------------------
471 |
472 |   - Length penalty: Used with beam search to nudge the output length: values above 0 favor longer outputs, values below 0 favor shorter ones.
476 |   - No repeat n-gram size: Prevents the model from repeating any sequence of n words/tokens. Avoid setting it to 1, as this prevents any word from appearing twice.
477 |
478 |
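The no-repeat n-gram rule can be illustrated with a small helper that lists which next tokens would complete an n-gram already present in the text. This is a rough sketch of the idea rather than the library's implementation; `banned_next_tokens` is a hypothetical name.

```python
def banned_next_tokens(tokens, n):
    """Return the tokens that would repeat an n-gram already present in `tokens`."""
    if n <= 0 or len(tokens) < n - 1:
        return set()
    # The last n-1 tokens form the start of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (n - 1):])
    banned = set()
    for i in range(len(tokens) - n + 1):
        ngram = tuple(tokens[i:i + n])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


tokens = ["the", "cat", "sat", "on", "the"]
print(banned_next_tokens(tokens, 2))  # {'cat'}
print(banned_next_tokens(tokens, 3))  # set()
print(sorted(banned_next_tokens(tokens, 1)))  # ['cat', 'on', 'sat', 'the']
```

With n = 1 every token that has already appeared is banned, which shows why a value of 1 is rarely what you want.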
479 | + **References**:
480 |   ------------
481 |
482 |   1. https://huggingface.co/blog/how-to-generate