Comparison among Dolly v2 3b, 7b and 12b

by abhi24 - opened Apr 20, 2023

Discussion

abhi24

Apr 20, 2023

•

edited Apr 20, 2023

Hello!
I have only tried the Dolly v2 12b so far. I'm curious if anyone has tried all three.

Is there a considerable difference in the response time?
If I were to fine-tune the model, would I need fewer training samples with the smaller models?

Thanks,
Abhilash

abhi24 changed discussion title from Response time comparison among 3b, 7b and 12b to Response time comparison among Dolly v2 3b, 7b and 12b Apr 20, 2023

abhi24 changed discussion title from Response time comparison among Dolly v2 3b, 7b and 12b to Comparison among Dolly v2 3b, 7b and 12b Apr 20, 2023

srowen

Databricks org Apr 21, 2023

I can tell you that on an A10, generation takes maybe 2-5 seconds for the 3B model, 5-15 seconds for the 7B model, and about 15-40 seconds for the 12B model in 8-bit. It really varies depending on the generation settings and how long the response ends up being. (I'd try an A100 but I can't get one at the moment!)
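The 8-bit note for the 12B model can be motivated with some back-of-the-envelope arithmetic on weight size. The sketch below is only an estimate: the parameter counts are the nominal 3B/7B/12B figures from the model names, the helper `weight_memory_gib` is a hypothetical name, and it counts weights only, ignoring activations, the KV cache, and framework overhead.

```python
# Rough weight-only memory estimate (assumption: nominal parameter counts,
# no activations / KV cache / framework overhead included).

def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Return the memory needed to hold the weights, in GiB."""
    bytes_total = n_params * bits_per_param / 8
    return bytes_total / 2**30


# Nominal sizes taken from the model names; actual counts differ slightly.
NOMINAL_PARAMS = {"3b": 3e9, "7b": 7e9, "12b": 12e9}

if __name__ == "__main__":
    for name, n_params in NOMINAL_PARAMS.items():
        fp16 = weight_memory_gib(n_params, 16)
        int8 = weight_memory_gib(n_params, 8)
        print(f"{name}: ~{fp16:.1f} GiB in 16-bit, ~{int8:.1f} GiB in 8-bit")
```

On a 24 GB card such as the A10, the 12B weights in 16-bit would take up nearly all of the memory on their own, which is presumably why 8-bit was used for that model.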

For real-time use, you'd need to do more work than just loading an HF pipeline: multiple GPUs, a fast tokenizer, etc.
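Since the timings above depend so much on generation settings and response length, it may be easier to measure them on your own hardware. Here is a minimal timing sketch, not the method used for the numbers above: `time_generation` and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in that just sleeps so the snippet runs without a GPU or a model download.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_generation(
    generate: Callable[[str], str],
    prompt: str,
    runs: int = 5,
    warmup: int = 1,
) -> Dict[str, float]:
    """Call `generate(prompt)` repeatedly and return wall-clock stats in seconds."""
    # Warm-up calls are not timed; the first call often includes one-off setup cost.
    for _ in range(warmup):
        generate(prompt)

    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        generate(prompt)
        samples.append(time.perf_counter() - start)

    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call (assumption: replace with your own)."""
    time.sleep(0.01)
    return prompt + " ..."


if __name__ == "__main__":
    stats = time_generation(fake_generate, "Explain what Dolly v2 is.")
    print({name: round(seconds, 4) for name, seconds in stats.items()})
```

To compare the three sizes, `generate` could wrap whatever you use to run each model (for example a function that calls an HF pipeline loaded for `databricks/dolly-v2-3b`), keeping the prompt and generation settings identical across models so the numbers are comparable.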

abhi24

Apr 21, 2023

Thank you @srowen !

Can you please tell me whether I'd need fewer training instances to fine-tune the 3b model than the 12b one?

Thanks!

srowen

Databricks org Apr 21, 2023

I don't think there is necessarily a strong relationship there, but I'm not an expert. I would use as much data as you've got!

abhi24 changed discussion status to closed Apr 21, 2023
