Introduction
This model was initialized from vinai/bartpho-word-base and converted to AllenAI's Longformer Encoder-Decoder (LED) architecture, following Longformer: The Long-Document Transformer.
To enable processing of sequences up to 16K tokens, bartpho-word-base's position-embedding matrix was simply copied 16 times.
This model is especially well suited to long-range summarization and question answering.
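The position-embedding extension above can be sketched with plain tensors. This is a hypothetical illustration, not the actual conversion script: the hidden size (768) and original position count (1024) are assumptions based on typical BART-base configurations, and random weights stand in for the real embedding matrix.

```python
import torch

# Stand-in for the original learned position-embedding matrix:
# 1024 positions x 768 hidden dimensions (assumed sizes).
old_pos_emb = torch.randn(1024, 768)

# Copy the matrix 16 times along the position axis,
# yielding 16 * 1024 = 16384 positions.
new_pos_emb = old_pos_emb.repeat(16, 1)

print(new_pos_emb.shape)  # torch.Size([16384, 768])
```

Because the matrix is tiled rather than re-learned, position 1024 starts with the same embedding as position 0; the copied weights are then refined during fine-tuning.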
Fine-tuning for down-stream task
This notebook shows how the LED model can be fine-tuned effectively on a downstream task.
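One LED-specific detail worth noting when fine-tuning: LED expects a `global_attention_mask` marking which tokens attend globally. The usual convention for summarization is to make only the first token global. A minimal sketch of building that mask, with a dummy batch standing in for real tokenized inputs:

```python
import torch

# Dummy batch of token ids standing in for tokenizer output
# (batch size 1, sequence length 4096 - an assumed example length).
input_ids = torch.ones(1, 4096, dtype=torch.long)

# LED uses local attention everywhere except positions marked 1 here;
# for summarization, the first token is conventionally made global.
global_attention_mask = torch.zeros_like(input_ids)
global_attention_mask[:, 0] = 1
```

The mask is then passed alongside `input_ids` in the forward call (e.g. `model(input_ids=input_ids, global_attention_mask=global_attention_mask, labels=labels)`).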