RLHFlow MATH Process Reward Model Collection A collection of datasets and models for process reward modeling. • 15 items • Updated 15 days ago • 5
FactAlign Collection Models and datasets of our EMNLP 2024 paper "FactAlign: Long-form Factuality Alignment of Large Language Models" • 7 items • Updated Oct 7 • 1
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 15 items • Updated 22 days ago • 76
Step-DPO Collection Resources for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" • 11 items • Updated Jul 1 • 5
FP8 LLMs for vLLM Collection Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 44 items • Updated Oct 17 • 58
Article An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct By leonardlin • Jun 11 • 48
Synthetic (text) Dataset Generation Collection Papers about synthetic dataset generation • 9 items • Updated Jun 21 • 8
Article Introducing the Open Ko-LLM Leaderboard: Leading the Korean LLM Evaluation Ecosystem Feb 20 • 3
Taiwan-pretrain-llm-zh_tw-corpus Collection This collection gathers Traditional Chinese datasets for training, especially suited to those who want to train their own language models. • 8 items • Updated Mar 6 • 5
Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model Paper • 2311.17487 • Published Nov 29, 2023 • 2
Reward models on the hub Collection UNMAINTAINED: See RewardBench... A place to collect reward models, an artifact of RLHF that is often not released. • 18 items • Updated Apr 13 • 25
SalesBot: Transitioning from Chit-Chat to Task-Oriented Dialogues Paper • 2204.10591 • Published Apr 22, 2022 • 1
Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information Paper • 2302.05096 • Published Feb 10, 2023 • 1