nielsr (HF Staff) committed
Commit eb40b9b · verified · 1 Parent(s): d26111e

Add model card and metadata


Hi! I'm Niels, part of the community science team at Hugging Face. I noticed that this repository was missing a model card, so I've opened this PR to add one. This model card includes:
- Metadata for `pipeline_tag` and `library_name`.
- A summary of the paper and its findings.
- Links to the [paper](https://huggingface.co/papers/2512.21446) and the official [GitHub repository](https://github.com/chinsengi/dUltra-os).
- A sample usage code snippet taken from the official README.
- The BibTeX citation for the work.

Files changed (1)
  1. README.md +44 -0
README.md ADDED
@@ -0,0 +1,44 @@
---
library_name: transformers
pipeline_tag: text-generation
---

# dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning

dUltra is an on-policy reinforcement learning framework based on Group Relative Policy Optimization (GRPO) that learns unmasking strategies for efficient parallel decoding in Masked Diffusion Language Models (MDLMs).

Existing acceleration methods for MDLMs often rely on fixed heuristics or distillation. dUltra instead introduces an unmasking planner head that predicts per-token unmasking likelihoods, allowing the model to learn task-specific unmasking trajectories. This approach achieves superior accuracy-efficiency trade-offs on mathematical reasoning and code generation tasks.

- **Paper:** [dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning](https://huggingface.co/papers/2512.21446)
- **GitHub Repository:** [chinsengi/dUltra-os](https://github.com/chinsengi/dUltra-os)

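## How It Works (Illustrative Sketch)

To make the planner-head idea concrete, here is a minimal conceptual sketch of per-token unmasking selection. This is not the official implementation; the class and function names below are hypothetical and only illustrate how a lightweight head can score masked positions so that the top-k are unmasked in parallel at each decoding step.

```python
import torch
import torch.nn as nn

# Hypothetical sketch, not the official dUltra code: a small head maps
# hidden states to one "unmask me" logit per sequence position.
class UnmaskingPlannerHead(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden) -> (batch, seq_len) logits
        return self.proj(hidden_states).squeeze(-1)

def select_positions_to_unmask(planner_logits: torch.Tensor,
                               is_masked: torch.Tensor, k: int) -> torch.Tensor:
    # Only currently masked positions are candidates; unmask the k
    # highest-scoring ones in parallel during this decoding step.
    scores = planner_logits.masked_fill(~is_masked, float("-inf"))
    return scores.topk(k, dim=-1).indices
```

In the paper's framework, the unmasking trajectories induced by such a planner are optimized on-policy with GRPO, so the selection rule itself is learned rather than fixed by a heuristic.
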
## Sample Usage

To load a trained dUltra model, use the following snippet from the official repository. Note that `trust_remote_code=True` is required for the custom architecture.

```python
import torch
# This import assumes the cloned dUltra-os repository is on PYTHONPATH.
from model.llada.lladou import LLaDOUModelLM
from transformers import AutoTokenizer

# Load the custom LLaDOU architecture in bfloat16.
model = LLaDOUModelLM.from_pretrained(
    "sengi/dUltra-math",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("sengi/dUltra-math")
```
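The snippet above only loads the model and tokenizer; the decoding routine for dUltra's learned parallel unmasking is provided by the repository. As a rough illustration only, inference might look like the following, where the `generate` call and its arguments are assumptions rather than the documented API (see the official README for the actual entry point):

```python
# Hypothetical usage: the generation method and its arguments are
# assumptions; consult the official repository for the real decoding API.
messages = [{"role": "user", "content": "What is 15 * 27?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```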

## Citation

```bibtex
@misc{chen2025dultraultrafastdiffusionlanguage,
      title={dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning},
      author={Shirui Chen and Jiantao Jiao and Lillian J. Ratliff and Banghua Zhu},
      year={2025},
      eprint={2512.21446},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2512.21446},
}
```