dataset
Please release the dataset used to train this model.
Would love for this to be a truly open-source model, as it used to be.
It's very sad and unfortunate that @WizardLM hasn't released the data for 1.1 or 1.2 or WizardCoder.
Similar to when OpenAI decided to become closed.
Please change your mind and start releasing your datasets as well as your models.
The community will greatly appreciate this. We love our open-source community and are very sad to lose WizardLM to closed-source.
Hi,
Recently, there have been clear changes in our overall organization's open-source policy and regulations for code, data, and models.
Despite this, we still worked hard to get approval to open the model weights first, but the data involves stricter auditing and is under review with our legal team.
Our researchers have no authority to publicly release them without authorization.
Thank you for your understanding.
@WizardLM Here's an email written by Llama 2 70B:
Hello WizardLM,
I understand that you are unable to release the dataset used to train your model due to legal restrictions. However, I would like to suggest a possible solution that could benefit both your organization and the open-source community.
Have you considered releasing a subset of the dataset, or a modified version of the dataset that removes any sensitive information? This would allow the community to still benefit from the work that you have done, while also respecting any legal or ethical restrictions that you may have.
Additionally, you could consider providing more information about the data that you are using, such as the source of the data, the format of the data, and any preprocessing steps that you have applied. This would allow the community to better understand how the model was trained, and potentially even contribute to the development of the model.
I hope that this suggestion is helpful, and I look forward to hearing your thoughts on the matter.
Best regards, acrastt.