[FLAG] deepnight-research/llama-2-70B-inst: Identical models on leaderboard

#207

by jaspercatapang - opened Aug 21, 2023

jaspercatapang

Aug 21, 2023

Hi. The two Llama 2 70B models named upstage/Llama-2-70b-instruct-v2 and deepnight-research/llama-2-70B-inst have identical evaluation results (leaderboard UI) but they were fine-tuned on different datasets.

Upstage's model used the following datasets, according to their model card:

Orca-style dataset
Alpaca-style dataset

Deepnight Research's model used the following datasets, according to their model card:

EleutherAI/pile (30%)
TogetherComputer/Long-Data-Collections

No difference was found in the result summaries found here and here. What do you think happened?

PS: I am not accusing anyone of anything, just curious. Thank you.

clefourrier

Open LLM Leaderboard org Aug 21, 2023

I've never seen results identical to so many decimal points for two different models.
A good way to check this would be to load both models from the hub and compare the SHA hashes of their weight files.
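
The check suggested above could be sketched roughly as follows. This is a minimal illustration, not the leaderboard's actual tooling: it assumes both repositories have already been downloaded into two local directories, and the helper names (`sha256_of_file`, `weight_hashes`, `same_weights`) and the file patterns are hypothetical choices made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def weight_hashes(model_dir: Path, patterns=("*.bin", "*.safetensors")) -> dict[str, str]:
    """Map each weight file name in model_dir to its SHA-256 digest.

    The patterns are an assumption about how the weights are stored.
    """
    hashes = {}
    for pattern in patterns:
        for path in sorted(model_dir.glob(pattern)):
            hashes[path.name] = sha256_of_file(path)
    return hashes


def same_weights(dir_a: Path, dir_b: Path) -> bool:
    """True when both directories hold weight files with identical names and digests."""
    hashes_a = weight_hashes(dir_a)
    # An empty result means no weight files were found, which is not a match.
    return bool(hashes_a) and hashes_a == weight_hashes(dir_b)
```

Matching digests on every weight shard would be strong evidence that one repository is a byte-for-byte copy of the other. A mismatch does not prove the models differ, since re-saving or re-sharding the same weights changes the files.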

clefourrier changed discussion title from Identical models on leaderboard to [FLAG] deepnight-research/llama-2-70B-inst: Identical models on leaderboard Aug 22, 2023

clefourrier

Open LLM Leaderboard org Aug 22, 2023

•

edited Aug 22, 2023

FLAG: Given the edits to the README file (from a version identical to the Upstage model's to a new one) and the identical results (down to the logprob hashes), it seems extremely likely that the deepnight model is a copy of the Upstage model.

clefourrier changed discussion status to closed Aug 22, 2023

multimodalart

Aug 22, 2023

The deepnight model has been deleted!

tex77

Aug 22, 2023

I have a feeling we will be getting a lot more of these soon. It's very possible that some unscrupulous companies could simply duplicate other high-performance models and post them to gain traffic to their website/companies

ibivibiv

Aug 22, 2023

Maybe put some simple hashing in place to put an auto-warning on the model cards that "this model appears to be a direct copy of..... "
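
A rough sketch of what such an auto-warning might look like, under stated assumptions: per-file digests for each submitted repository are already available, and `registry` is a plain in-memory dictionary standing in for whatever storage a real system would use. The function names are hypothetical, and nothing here reflects how the Hub or the leaderboard actually works.

```python
import hashlib


def repo_fingerprint(file_digests):
    """Combine per-file digests into one fingerprint, independent of file order and names."""
    combined = hashlib.sha256()
    for digest in sorted(file_digests):
        combined.update(digest.encode("ascii"))
    return combined.hexdigest()


def duplicate_warning(repo_id, file_digests, registry):
    """Return a warning if another repo already registered this fingerprint.

    Otherwise record repo_id as the first owner of the fingerprint and return None.
    """
    fingerprint = repo_fingerprint(file_digests)
    original = registry.get(fingerprint)
    if original is not None and original != repo_id:
        return f"This model appears to be a direct copy of {original}."
    registry.setdefault(fingerprint, repo_id)
    return None
```

Exact hashing of this kind only catches byte-identical copies. A copy whose weights were re-saved in another format or slightly perturbed would need a different comparison, such as the matching evaluation details mentioned earlier in the thread.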

HeadphonesProReview

Aug 23, 2023

Is it the same as upstage?

felixz

Aug 23, 2023

They should just be removed from the leaderboard... if they deleted their model and were caught with their pants down, they should be treated accordingly.

AkimfromParis

Aug 24, 2023

"Celebrating another remarkable achievement! 🚀🌟"

Sounds like the founder of DeepNight is very proud on LinkedIn... The Rocket emoji is always a warning sign. 😊

migtissera

Aug 24, 2023

Can we take this off the leaderboard?

AkimfromParis

Aug 24, 2023

@hunkim Hello!
One user might have copied Upstage's latest model. You might ask the HuggingFace admins to remove (or not) the fake model from the leaderboard. : )

timje

Aug 25, 2023

Perhaps an honest mistake, they probably asked ChatGPT to write a python script to create a top-ranked LLM and it complied by creating a script to clone a top-ranked LLM. Innovation! 😉

itsibits

Aug 25, 2023

Hello community,

I would like to come clean about what I did. I was the one who released the copy of the Upstage model under the deepnight-research organisation.
The model was submitted to the leaderboard by someone else, but they didn't know I had copied it.

I was an intern at DeepNight and a newbie in AI. I just wanted to impress everyone. I would like to apologise to @hunkim and entire Upstage team for this. I apologise to Kshitij Tyagi and entire DeepNight family as well for this.
DeepNight or anyone else apart from me shall not be held guilty for my doings.

I hope I can be forgiven for my doings.

Thank you

migtissera

Aug 25, 2023

Oh boy.. Well, at least you're owning your mistake..

hunkim

Aug 26, 2023

@itsibits

We appreciate your honesty. It's not easy to admit a mistake, especially in a public forum. We, the Upstage team, accept your apology.

itsibits

Aug 26, 2023

Thank you for accepting my apology.
