Collections

Collections including paper arxiv:2309.13308

Tailor-made LLM evaluations: custom evaluations for your LLM

A collection of articles and resources on automatic evaluation of LLMs and their role as unbiased judges of other LLMs' outputs.

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023 • 29
Generative Judge for Evaluating Alignment

Paper • 2310.05470 • Published Oct 9, 2023 • 1
Humans or LLMs as the Judge? A Study on Judgement Biases

Paper • 2402.10669 • Published Feb 16
JudgeLM: Fine-tuned Large Language Models are Scalable Judges

Paper • 2310.17631 • Published Oct 26, 2023 • 32

Curated resources on using LLMs as automatic evaluators of other LLMs' outputs.

JudgeLM: Fine-tuned Large Language Models are Scalable Judges

Paper • 2310.17631 • Published Oct 26, 2023 • 32
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 53
Generative Judge for Evaluating Alignment

Paper • 2310.05470 • Published Oct 9, 2023 • 1
Calibrating LLM-Based Evaluator

Paper • 2309.13308 • Published Sep 23, 2023 • 11

Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior

Paper • 2309.00359 • Published Sep 1, 2023 • 20
Large Language Models as Optimizers

Paper • 2309.03409 • Published Sep 7, 2023 • 75
Natural Language Supervision for General-Purpose Audio Representations

Paper • 2309.05767 • Published Sep 11, 2023 • 9
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)

Paper • 2309.08968 • Published Sep 16, 2023 • 22

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 53
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 74
Calibrating LLM-Based Evaluator

Paper • 2309.13308 • Published Sep 23, 2023 • 11
Fusion-Eval: Integrating Evaluators with LLMs

Paper • 2311.09204 • Published Nov 15, 2023 • 5

Advanced and Recent Papers

Advanced and recent papers about deep learning. To recommend a paper, email Tinghao.Zhang22@student.xjtlu.edu.cn.

AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models

Paper • 2309.16414 • Published Sep 28, 2023 • 19
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

Paper • 2309.13018 • Published Sep 22, 2023 • 9
Robust Speech Recognition via Large-Scale Weak Supervision

Paper • 2212.04356 • Published Dec 6, 2022 • 23
Language models in molecular discovery

Paper • 2309.16235 • Published Sep 28, 2023 • 10

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
LMDX: Language Model-based Document Information Extraction and Localization

Paper • 2309.10952 • Published Sep 19, 2023 • 65
Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 29
SCREWS: A Modular Framework for Reasoning with Revisions

Paper • 2309.13075 • Published Sep 20, 2023 • 15

LMDX: Language Model-based Document Information Extraction and Localization

Paper • 2309.10952 • Published Sep 19, 2023 • 65
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 87
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Paper • 2309.11674 • Published Sep 20, 2023 • 31
Boolformer: Symbolic Regression of Logic Functions with Transformers

Paper • 2309.12207 • Published Sep 21, 2023 • 11
