- Instruction Pre-Training: Language Models are Supervised Multitask Learners (Paper • 2406.14491 • Published • 85)
- Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality (Paper • 2405.21060 • Published • 63)
- Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models (Paper • 2405.20541 • Published • 20)
- MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark (Paper • 2406.01574 • Published • 42)
Daeseong Kim (dkimds)
AI & ML interests: RL, LLMs, RLHF, and related topics.
Organizations: None yet

Collections: 1

Models (16):
- dkimds/rl_course_vizdoom_health_gathering_supreme • Reinforcement Learning • Updated
- dkimds/ppo-LunarLander-v2 • Reinforcement Learning • Updated • 3
- dkimds/a2c-PandaReachDense-v3 • Reinforcement Learning • Updated • 2
- dkimds/ppo-SnowballTarget • Reinforcement Learning • Updated • 27
- dkimds/ppo-Pyramids-Training • Reinforcement Learning • Updated • 30
- dkimds/PixelCopter-PLE-v0 • Updated
- dkimds/Reinforce-CartPole-v1 • Reinforcement Learning • Updated
- dkimds/q-Taxi-v3 • Reinforcement Learning • Updated
- dkimds/q-FrozenLake-v1-4x4-noSlippery • Reinforcement Learning • Updated
- dkimds/ppo-Huggy • Reinforcement Learning • Updated • 96
Datasets: None public yet