🔥 New LLM leaderboard on the hub: an LLM Hallucination Leaderboard!
Led by @pminervini, it evaluates the propensity of models to *hallucinate*, either on factuality (= say false things) or faithfulness (= ignore user instructions). This is becoming an increasingly important avenue of research, as more and more people are starting to rely on LLMs to search for and find information!
It contains 14 datasets, grouped into 7 concepts, to give a better overall view of when LLMs output wrong content.
hallucinations-leaderboard/leaderboard
Their introductory blog post also contains an in-depth analysis of which LLMs get what wrong, which is super interesting: https://huggingface.co/blog/leaderboards-on-the-hub-hallucinations
Congrats to the team! 🚀