Nguyễn Minh Phúc

DatPySci

AI & ML interests

Reinforcement learning, NLP

Recent Activity

updated a model 5 days ago

DatPySci/pretrain-sharpen

upvoted a paper 8 days ago

We Can't Understand AI Using our Existing Vocabulary

upvoted a paper 27 days ago

Why Do Reasoning Models Lose Coverage? The Role of Data and Forks in the Road

Organizations

Collections 1

Models 96

DatPySci/pretrain-sharpen

Updated 5 days ago

DatPySci/RLVR-SGDM-Gap

DatPySci/LazyNTK

DatPySci/RLVR-CoTs

DatPySci/RLM

DatPySci/PreRLVR-Controlled

DatPySci/RLDI

2B • Updated Dec 18, 2025 • 2

DatPySci/Qwen-2.5-7B-Simple-RL

Updated May 3, 2025 • 1

DatPySci/DeepSeek-Qwen-1.5B-GRPO

2B • Updated Apr 22, 2025 • 2

DatPySci/Qwen-1.5B-Math-GRPO

Updated Apr 22, 2025

Datasets 60

DatPySci/Qwen2.5-Math-1.5B-deepscaler

Viewer • Updated Sep 16, 2025 • 161k • 17

DatPySci/Qwen2.5-Math-7B-deepscaler

Viewer • Updated Sep 16, 2025 • 161k • 20 • 1

DatPySci/Llama-3.2-3B-deepscaler

Viewer • Updated Sep 16, 2025 • 161k • 57

DatPySci/Llama-3.1-8B-rm-anthropic-hh

Viewer • Updated Feb 10, 2025 • 140k • 80

DatPySci/Llama-3.1-8B-rm-tldr-pref

Viewer • Updated Feb 10, 2025 • 177k • 35

DatPySci/tldr_pythia-6.9b_pref

Viewer • Updated Feb 6, 2025 • 94.9k • 12

DatPySci/tldr_synthetic_llama3_3b_32

Viewer • Updated Jan 24, 2025 • 5.47k • 15

DatPySci/llama3_3b_sft_tldr_synthetic

Viewer • Updated Jan 19, 2025 • 5.47k • 8

DatPySci/weak_gpt2_large_dpo_hh

Viewer • Updated Jan 9, 2025 • 8k • 12

DatPySci/weak_gpt2_medium_dpo_hh

Viewer • Updated Jan 9, 2025 • 8k • 9
