Don't forget the MTEB leaderboard.
Ash C PRO
ashercn97
Recent Activity
replied to burtenshaw's post about 7 hours ago
The open LLM leaderboard is completed, retired, dead, ‘ascended to a higher plane’. And in its shadow we have an amazing range of leaderboards built and maintained by the community.
In this post, I just want to list some of those great leaderboards that you should bookmark for staying up to date:
- Chatbot Arena LLM Leaderboard is the first port of call for checking out the best model. It’s not the fastest because humans will need to use the models to get scores, but it’s worth the wait. https://huggingface.co/spaces/lmarena-ai/chatbot-arena-leaderboard
- OpenVLM Leaderboard is great for getting scores on vision language models. https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
- Ai2 are doing a great job on RewardBench and I hope they keep it up because reward models are the unsexy workhorse of the field. https://huggingface.co/spaces/allenai/reward-bench
- The GAIA leaderboard is great for evaluating agent applications. https://huggingface.co/spaces/gaia-benchmark/leaderboard
🤩 This seems like such a sustainable way of building for the long term, where rather than leaning on a single company to evaluate all LLMs, we share the load.
new activity about 7 hours ago
mixedbread-ai/mxbai-rerank-base-v2: v2 embeddings... when?!
liked a model 2 days ago
ai-forever/FRIDA
ashercn97's activity
replied to burtenshaw's post about 7 hours ago
v2 embeddings... when?!
#2 opened about 7 hours ago by ashercn97
replied to their post 3 days ago
Oh, this is good to know!
replied to their post 4 days ago
This is so fair.
replied to their post 4 days ago
Oh wait, this makes sense.
I have created some benchmarks from user data, so maybe I'll make my own leaderboard, haha.
Thanks for the help!
replied to their post 4 days ago
Yes, I've seen! Thank you. My issue is the 100 requests a day.