Update README.md
**Important:** For optimal summary quality, use the global attention mask when decoding, as demonstrated in [this community notebook](https://colab.research.google.com/drive/12INTTR6n64TzS4RrXZxMSXfrOd9Xzamo?usp=sharing); see the definition of `generate_answer(batch)` there.
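Below is a minimal sketch of that pattern outside the pipeline API. The checkpoint name is an assumption for illustration (substitute this card's checkpoint), and the generation settings are placeholders; it simply marks the first token for global attention before calling `generate`, which is the step the note above refers to. See the notebook's `generate_answer(batch)` for the exact settings used there.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed checkpoint name for illustration; substitute the checkpoint from this card.
checkpoint = "pszemraj/led-large-book-summary"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

long_text = "Put the chapter or document you want summarized here."
inputs = tokenizer(
    long_text,
    return_tensors="pt",
    truncation=True,
    max_length=16384,  # LED-style long-context limit; lower it if memory is tight
)

# LED attends locally by default, so give the first token global attention
# so the encoder has at least one position visible to the whole sequence.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=512,  # illustrative generation settings, not tuned values
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```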
If you're facing computing constraints, consider using the base version [`pszemraj/led-base-book-summary`](https://huggingface.co/pszemraj/led-base-book-summary).
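If you go that route, usage mirrors the `summarizer` pipeline example earlier in this card, just with the smaller checkpoint; the generation settings below are illustrative placeholders rather than recommended values.

```python
from transformers import pipeline

# Same pipeline usage as the example above, pointing at the smaller checkpoint.
summarizer = pipeline(
    "summarization",
    model="pszemraj/led-base-book-summary",
    device=0,  # set to -1 (or drop) to stay on CPU
)

long_text = "Put the chapter or document you want summarized here."
result = summarizer(
    long_text,
    min_length=16,  # placeholder values, not the card's recommended settings
    max_length=256,
    no_repeat_ngram_size=3,
)
print(result[0]["summary_text"])
```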
---
### Data
The model was fine-tuned on the [BookSum](https://arxiv.org/abs/2105.08209) dataset. During training, the `chapter` column was used as the input and the `summary_text` column as the target output.
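As a quick way to inspect that column layout, the sketch below loads BookSum from the Hugging Face Hub with the `datasets` library. The `kmfoda/booksum` identifier is an assumption on my part, since this card only cites the BookSum paper.

```python
from datasets import load_dataset

# "kmfoda/booksum" is an assumed Hub identifier; adjust it if you mirror the data elsewhere.
booksum = load_dataset("kmfoda/booksum", split="train")
print(booksum.column_names)

sample = booksum[0]
print(sample["chapter"][:300])       # model input
print(sample["summary_text"][:300])  # training target
```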
### Procedure
Fine-tuning was run on the BookSum dataset for 13+ epochs. Notably, the final four epochs combined the training and validation sets into a single 'train' split to enhance generalization.
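A sketch of that last-stage split handling, again assuming the `kmfoda/booksum` Hub identifier and the standard `datasets` API, rather than the exact training script used here:

```python
from datasets import concatenate_datasets, load_dataset

booksum = load_dataset("kmfoda/booksum")  # assumed Hub identifier

# For the final epochs described above: fold validation into the training split
# so the trainer sees both as "train"; leave the test split untouched for evaluation.
train_plus_val = concatenate_datasets([booksum["train"], booksum["validation"]])
print(len(booksum["train"]), len(booksum["validation"]), len(train_plus_val))
```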
### Hyperparameters