Muhammad Khalifa

mkhalifa

https://mukhal.github.io/

AI & ML interests

natural language generation, reinforcement learning

Recent Activity

upvoted a collection 9 days ago

ThinkPRM

updated a model 2 months ago

mkhalifa/ThinkPRM-gptoss-20B

published a model 2 months ago

mkhalifa/ThinkPRM-gptoss-20B

upvoted a collection 9 days ago

ThinkPRM

Collection

Process Reward Models that Think -- https://arxiv.org/abs/2504.16828 • 8 items • Updated Jul 29 • 4

upvoted an article 4 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8 • 698

upvoted a paper 4 months ago

ExpertLongBench: Benchmarking Language Models on Expert-Level Long-Form Generation Tasks with Structured Checklists

Paper • 2506.01241 • Published Jun 2 • 9

upvoted a paper 5 months ago

Small Language Models Need Strong Verifiers to Self-Correct Reasoning

Paper • 2404.17140 • Published Apr 26, 2024 • 1

upvoted a collection 6 months ago

RL+reason model

Collection

252 items • Updated 2 days ago • 21

upvoted 2 papers 6 months ago

Process Reward Models That Think

Paper • 2504.16828 • Published Apr 23 • 18

MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?

Paper • 2504.09702 • Published Apr 13 • 18

upvoted a paper 7 months ago

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 62

upvoted a paper 11 months ago

If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs

Paper • 2412.04144 • Published Dec 5, 2024 • 5

upvoted 2 papers about 1 year ago

On Leakage of Code Generation Evaluation Datasets

Paper • 2407.07565 • Published Jul 10, 2024 • 6

Contextual Document Embeddings

Paper • 2410.02525 • Published Oct 3, 2024 • 24

upvoted a paper over 1 year ago

Source-Aware Training Enables Knowledge Attribution in Language Models

Paper • 2404.01019 • Published Apr 1, 2024 • 1

upvoted a paper almost 2 years ago

Discriminator-Guided Multi-step Reasoning with Language Models

Paper • 2305.14934 • Published May 24, 2023 • 1
