Code to reproduce MTEB results

by nv-bschifferer - opened Aug 14

Aug 14

Hello, can you share the code to reproduce some of the MTEB results with bge-en-icl?

I wonder how the examples added to the instruction prompt are selected for each individual dataset. Are they hand-selected?

Shitao

Beijing Academy of Artificial Intelligence org Aug 17

We have uploaded the examples at: https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_dense_retriever/examples

nv-bschifferer

Aug 18

Thanks a lot. Is the same set of examples used for every query in a given dataset? Can you share some insight into how they were selected? Were they selected at random?

Shitao

Beijing Academy of Artificial Intelligence org Aug 19

@nv-bschifferer, yes, we use the same examples for all queries in the same dataset. For tasks that have a training split in the MTEB HF repo, we randomly sample a few examples from the training split. If there is no training split, we use ChatGPT to generate some examples for the task.
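For readers following along, here is a minimal sketch of the selection scheme described above: sample a few examples once per dataset, then reuse that same prefix for every query. This is not the authors' script. The toy `train_rows` data, the function names, the seed, and the `<instruct>/<query>/<response>` template are all assumptions made for illustration; check the linked examples folder and the model card for the exact format.

```python
import random

# Hypothetical toy stand-in for a task's training split (query/response pairs).
train_rows = [
    {"query": "what is a virtual interface", "response": "A virtual interface is a software-defined network interface."},
    {"query": "causes of back pain in women", "response": "Back pain can come from muscle strain or posture problems."},
    {"query": "how tall is mount everest", "response": "Mount Everest is about 8,849 metres tall."},
    {"query": "who wrote hamlet", "response": "Hamlet was written by William Shakespeare."},
]

def sample_examples(rows, k=2, seed=0):
    """Randomly pick k examples once per dataset; a fixed seed keeps the pick reproducible."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))

def build_prefix(task, examples):
    """Join the sampled examples into one few-shot prefix (template is an assumption)."""
    blocks = [
        f"<instruct>{task}\n<query>{ex['query']}\n<response>{ex['response']}"
        for ex in examples
    ]
    return "\n\n".join(blocks)

def build_query(task, prefix, query):
    """Prepend the shared prefix to a query; every query in the dataset gets the same prefix."""
    return f"{prefix}\n\n<instruct>{task}\n<query>{query}"

task = "Given a web search query, retrieve relevant passages that answer the query."
examples = sample_examples(train_rows, k=2, seed=0)
prefix = build_prefix(task, examples)
prompts = [build_query(task, prefix, q) for q in ["what is bge-en-icl", "what is mteb"]]
```

For a task without a training split, `train_rows` would instead hold the generated examples, and the rest of the sketch stays the same.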

nv-bschifferer

Sep 11

@Shitao: do you use few-shot prompts during training or fine-tuning, or only during evaluation?
