
Any chance we can get a 7B version?

#2
by PushTheBrAIkes - opened

Thanks for providing! 13B runs really slow for me, but 7B is bearable. I would convert it myself so as not to be a burden, if you can point me in the direction of how to convert from 13B to 7B :) Will be testing as well and will provide any feedback.

Most welcome. I don't think it's possible to convert from 13B to 7B; that would have to be trained separately. Eventually, yes, I think there will be a 7B either from me or somebody else, as well as a 30B, but the next upload will most likely be another 13B until we nail its output. If there's enough demand, I can do an intermediate 7B too, but I think there will be at least one more 13B before that.

No worries, thanks for what you are doing! I can wait. :)

There's already Aleksey Korshuk's 7B model here, which I GPTQ-quantised here.

I don't know what the difference is between @AlekseyKorshuk's unfiltered ShareGPT data and @anon8231489123's unfiltered ShareGPT dataset. Is there any difference, or are they basically the same thing?

If he used the same dataset then 7B 1.0 already exists.

> I don't know what the difference is between @AlekseyKorshuk's unfiltered ShareGPT data and @anon8231489123's unfiltered ShareGPT dataset. Is there any difference, or are they basically the same thing?

He has the SHA-256 in the description; it's the same as this one.
https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/blob/main/ShareGPT_V3_unfiltered_cleaned_split_no_imsorry.json
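Since the two datasets are being matched by SHA-256, a quick way to check for yourself is to hash the downloaded JSON locally and compare against whatever checksum the model card description lists. A minimal sketch (the filename is taken from the URL above; the expected hash is an assumption you'd fill in from the description):

```python
import hashlib

def file_sha256(path, chunk_size=1 << 20):
    """Compute the SHA-256 of a file, streaming in chunks so large
    datasets don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the checksum from the model card description
# (placeholder value here, not the real hash):
# expected = "..."
# assert file_sha256("ShareGPT_V3_unfiltered_cleaned_split_no_imsorry.json") == expected
```

If the hex digests match, the two files are byte-identical, so the datasets are the same regardless of who uploaded them.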

Yeah I know that this repo uses the same dataset as Anon.

I was asking which dataset AlekseyKorshuk used to produce his 7B version. It's not a big deal, I'm just trying to understand whether Aleksey's is a 7B equivalent of this 13B, or if they also differ in dataset used.

> Yeah I know that this repo uses the same dataset as Anon.
>
> I was asking which dataset AlekseyKorshuk used to produce his 7B version. It's not a big deal, I'm just trying to understand whether Aleksey's is a 7B equivalent of this 13B, or if they also differ in dataset used.

Oh, my bad, I misread your message.

Aleksey said he used anon's set 5 days ago, so it was likely an older version than the "i'm sorry" dataset that reeducator used.
https://huggingface.co/AlekseyKorshuk/vicuna-7b/discussions/4#64346742938d07505bb8e01b

His model is 10 days old, so maybe he used the dataset in HTML_cleaned_raw_dataset?

Oh, thanks! I missed that discussion.

Yes that all makes sense. Thanks for the details.
