mbart-large-cc25-cnn-dailymail-xsum-nl
Model description
A fine-tuned version of mBART (facebook/mbart-large-cc25) for abstractive summarization of Dutch text. We also wrote a blog post about this model.
Intended uses & limitations
It's meant for summarizing Dutch news articles.
How to use
import transformers

# Load the fine-tuned summarization model.
undisputed_best_model = transformers.MBartForConditionalGeneration.from_pretrained(
    "ml6team/mbart-large-cc25-cnn-dailymail-xsum-nl"
)
# The tokenizer comes from the base mBART checkpoint.
tokenizer = transformers.MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")

summarization_pipeline = transformers.pipeline(
    task="summarization",
    model=undisputed_best_model,
    tokenizer=tokenizer,
)
# Start decoding with the Dutch language code so the summary is generated in Dutch.
summarization_pipeline.model.config.decoder_start_token_id = tokenizer.lang_code_to_id[
    "nl_XX"
]

article = "Kan je dit even samenvatten alsjeblief."  # Dutch
summary = summarization_pipeline(
    article,
    do_sample=True,  # sample instead of decoding greedily
    top_p=0.75,
    top_k=50,
    min_length=50,
    early_stopping=True,
    truncation=True,  # cut off input that exceeds the model's maximum length
)[0]["summary_text"]
print(summary)
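Because the call above passes truncation=True, anything beyond the model's input limit is silently dropped. One possible workaround, sketched here and not part of the original model card, is to split a long article into sentence-aligned chunks and summarize each chunk separately. The helper names chunk_article and summarize_long are hypothetical, the 400-word budget is an arbitrary illustrative value, and the sentence splitter is deliberately naive.

```python
import re
from typing import Callable, List


def chunk_article(article: str, max_words: int = 400) -> List[str]:
    """Split text into chunks of roughly max_words words, breaking at sentence ends.

    Hypothetical helper: max_words is an illustrative budget, not a model limit.
    A single sentence longer than max_words is kept whole in its own chunk.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", article.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_long(
    article: str, summarize: Callable[[str], str], max_words: int = 400
) -> str:
    """Summarize each chunk with the given callable and join the partial summaries."""
    return " ".join(summarize(chunk) for chunk in chunk_article(article, max_words))
```

With the pipeline defined above, the callable could be something like lambda chunk: summarization_pipeline(chunk, truncation=True)[0]["summary_text"]. Joining per-chunk summaries is a simple heuristic; the result may repeat information or lose coherence across chunks.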
Training data
mBART was fine-tuned on two datasets; going by the model name, these are Dutch versions of CNN/DailyMail and XSum.