JUST-NLP 2025 Shared Tasks: L-SUMM RL-r2 (rank=2) Model

One of the models submitted to the JUST-NLP 2025 Shared Task on L-SUMM by the 4corners team. The code for training the model is publicly available here.

Finetuning Parameters

This model was finetuned using Unsloth's GRPO pipeline with a LoRA adapter and the following hyperparameters:

LoRA Rank: 2
LoRA Alpha: 4
LoRA modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Learning Rate: 8e-5 (constant schedule)
Num epochs: 1 (model collapsed at around 550 steps)
Global Batch Size, Num generations/rollouts: 16
Optimizer: adamw_8bit
Temperature: 1.0
Max Training Length: 12000
Max Gradient norm: 0.2
GSPO: enabled (aggregation at sequence level)
Loss type: DAPO
Epsilon high: 0.28
Reward functions: ROUGE-L, ROUGE-2, BLEU (the BLEU reward was scaled 3x because the base model's BLEU scores are lower than its ROUGE scores)
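For reference, the hyperparameters above can be written down as plain Python dictionaries. This is only a sketch: the key names loosely follow PEFT and TRL naming conventions and are an assumption, not the authors' actual configuration, and `max_seq_length` and `global_batch_size` are illustrative names for "Max Training Length" and "Global Batch Size".

```python
# Hyperparameters transcribed from the list above.
# Key names are illustrative assumptions, not the authors' config file.
lora_config = {
    "r": 2,            # LoRA rank
    "lora_alpha": 4,   # LoRA alpha
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

grpo_config = {
    "learning_rate": 8e-5,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 1,
    "global_batch_size": 16,                  # hypothetical key name
    "num_generations": 16,                    # rollouts per prompt
    "optim": "adamw_8bit",
    "temperature": 1.0,
    "max_seq_length": 12000,                  # hypothetical key name
    "max_grad_norm": 0.2,
    "importance_sampling_level": "sequence",  # GSPO-style aggregation
    "loss_type": "dapo",
    "epsilon_high": 0.28,
}

# Standard LoRA scaling factor applied to the adapter update: alpha / rank.
lora_scale = lora_config["lora_alpha"] / lora_config["r"]

# Assumption: the global batch size counts generated completions, so a
# batch of 16 with 16 rollouts per prompt means one prompt per step.
prompts_per_step = grpo_config["global_batch_size"] // grpo_config["num_generations"]
```

With rank 2 and alpha 4, the LoRA scaling factor is 2.0. If the batch size does count completions, each optimizer step sees the 16 rollouts of a single prompt.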

We use the official training data provided by the JUST-NLP Shared Task for L-SUMM, with some data filtering. The dataset and its details are available here.
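The reward setup listed in the hyperparameters (ROUGE-L, ROUGE-2, and a 3x-scaled BLEU) can be illustrated with a small self-contained sketch. Everything here is an assumption made for illustration: whitespace tokenization, the simplified unsmoothed BLEU, and combining the three rewards by summation. The actual training code may use library metrics and a different aggregation.

```python
from collections import Counter
import math


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, n_pred, n_ref):
    """F1 from an overlap count and the two totals; 0.0 when undefined."""
    if overlap == 0 or n_pred == 0 or n_ref == 0:
        return 0.0
    p, r = overlap / n_pred, overlap / n_ref
    return 2 * p * r / (p + r)


def rouge_n(pred, ref, n=2):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    p, r = ngrams(pred, n), ngrams(ref, n)
    return f1(sum((p & r).values()), sum(p.values()), sum(r.values()))


def rouge_l(pred, ref):
    """ROUGE-L F1 based on the longest common subsequence (row-wise DP)."""
    prev = [0] * (len(ref) + 1)
    for tok in pred:
        cur = [0]
        for j, ref_tok in enumerate(ref, start=1):
            cur.append(prev[j - 1] + 1 if tok == ref_tok else max(prev[j], cur[j - 1]))
        prev = cur
    return f1(prev[-1], len(pred), len(ref))


def bleu(pred, ref, max_n=4):
    """Simplified sentence BLEU: clipped precisions, brevity penalty, no smoothing."""
    if not pred:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        p, r = ngrams(pred, n), ngrams(ref, n)
        total, match = sum(p.values()), sum((p & r).values())
        if total == 0 or match == 0:
            return 0.0
        log_prec += math.log(match / total) / max_n
    bp = 1.0 if len(pred) >= len(ref) else math.exp(1 - len(ref) / len(pred))
    return bp * math.exp(log_prec)


def combined_reward(prediction, reference, bleu_scale=3.0):
    """Sum of ROUGE-L, ROUGE-2 and scaled BLEU (summation is an assumption)."""
    pred, ref = prediction.split(), reference.split()
    return rouge_l(pred, ref) + rouge_n(pred, ref, 2) + bleu_scale * bleu(pred, ref)
```

Under these assumptions a summary identical to its reference scores 1 + 1 + 3 = 5, so the scaling gives BLEU more weight in the total than either ROUGE term.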

Results

Validation Leaderboard results:

model	Avg	Rouge-2	Rouge-L	BLEU
Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage1	25.47	31.25	31.42	13.74
Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2	25.57	31.51	31.77	13.43

Test Leaderboard results:

model	Avg	Rouge-2	Rouge-L	BLEU
Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2	23.94	30.35	30.19	11.27
Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-1step	21.62	28.46	28.42	7.97
Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-rl-r1-ckpt150	27.21	33.36	32.25	16.01
Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-rl-r2-ckpt500	29.91	34.91	33.34	21.49
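The Avg column in both tables is consistent with the unweighted mean of Rouge-2, Rouge-L, and BLEU, rounded to two decimals. The helper below is a minimal sketch for checking this; the function name `leaderboard_avg` is hypothetical, and the official leaderboard may compute the average differently.

```python
def leaderboard_avg(rouge_2, rouge_l, bleu):
    """Unweighted mean of the three metrics, rounded to two decimals."""
    return round((rouge_2 + rouge_l + bleu) / 3, 2)


# Rows copied from the test leaderboard table above: (Rouge-2, Rouge-L, BLEU).
test_rows = {
    "stage2": (30.35, 30.19, 11.27),
    "1step": (28.46, 28.42, 7.97),
    "rl-r1-ckpt150": (33.36, 32.25, 16.01),
    "rl-r2-ckpt500": (34.91, 33.34, 21.49),
}

averages = {name: leaderboard_avg(*scores) for name, scores in test_rows.items()}
```

Applied to the test rows, this reproduces the reported Avg values, including 29.91 for the rl-r2-ckpt500 checkpoint.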

Hardware Usage

We used a single A100 80GB GPU to finetune this model.

Authors

Chompakorn Chaksangchaichot & Pawitsapak Akarajaradwong
{chompakornc_pro,pawitsapaka_visai}@vistec.ac.th
