NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation
Paper: 2312.11361
A collection of multilingual relevance assessment datasets, together with supervised fine-tuned (SFT) models (Mistral-7B and Llama-3-8B).
Note This is the NoMIRACL evaluation dataset, which contains both relevant and non-relevant subsets; it is used to evaluate how well multilingual LLMs assess relevance.
Note This is the instruct version of the NoMIRACL dataset; it can be used to fine-tune LLMs for multilingual relevance assessment.
Note Mistral-7B-Instruct-v0.2 fine-tuned on the NoMIRACL instruct dataset. More robust than Llama-3 and Mistral-7B-Instruct-v0.3.
Note Llama-3-8B-Instruct fine-tuned on the NoMIRACL instruct dataset.
Note Mistral-7B-Instruct-v0.3 fine-tuned on the NoMIRACL instruct dataset.
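Because the evaluation dataset is split into relevant and non-relevant subsets, a model's relevance judgments can be scored separately on each. The following is a minimal sketch of that idea, not the collection's official evaluation code. The `Example` class, the subset labels `"relevant"` and `"non_relevant"`, and the `per_subset_error` function are hypothetical names introduced for illustration, and the model's output is assumed to have already been reduced to a boolean ("the passages contain an answer" or not).

```python
from dataclasses import dataclass


@dataclass
class Example:
    # Hypothetical record layout; the real dataset fields may differ.
    query: str
    passages: list[str]
    subset: str  # assumed labels: "relevant" or "non_relevant"


def per_subset_error(examples, predictions):
    """Return the fraction of wrong judgments within each subset.

    predictions[i] is True when the model claims that the passages of
    examples[i] contain an answer, and False when it says they do not.
    """
    wrong = {"relevant": 0, "non_relevant": 0}
    total = {"relevant": 0, "non_relevant": 0}
    for example, says_answerable in zip(examples, predictions):
        total[example.subset] += 1
        # On the relevant subset the expected judgment is True;
        # on the non-relevant subset it is False.
        expected = example.subset == "relevant"
        if says_answerable != expected:
            wrong[example.subset] += 1
    # A subset with no examples has no defined error rate.
    return {
        name: (wrong[name] / total[name] if total[name] else None)
        for name in total
    }


examples = [
    Example("q1", ["p1"], "relevant"),
    Example("q2", ["p2"], "relevant"),
    Example("q3", ["p3"], "non_relevant"),
    Example("q4", ["p4"], "non_relevant"),
]
predictions = [True, True, True, False]
rates = per_subset_error(examples, predictions)
```

Reporting the two rates separately distinguishes a model that answers when no passage is relevant from one that abstains when a relevant passage is present; a single pooled accuracy would hide that difference.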