JUST-NLP 2025 Shared Tasks: L-SUMM RL-r2 (rank=2) Model

One of the models submitted to the JUST-NLP 2025 Shared Task on L-SUMM by the 4corners team. The code for training the model is publicly available here.

Finetuning Parameters

This model was finetuned using Unsloth's GRPO pipeline with a LoRA adapter and the following hyperparameters:

  • LoRA Rank: 2
  • LoRA Alpha: 4
  • LoRA modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Learning Rate: 8e-5 constant
  • Num epochs: 1 (model collapsed at around 550 steps)
  • Global Batch Size, Num generations/rollouts: 16
  • Optimizer: adamw_8bit
  • Temperature: 1.0
  • Max Training Length: 12000
  • Max Gradient norm: 0.2
  • Enable GSPO (aggregation at sequence level)
  • Loss type: DAPO
  • Epsilon high: 0.28
  • Reward functions: ROUGE-L, ROUGE-2, BLEU (the BLEU reward was scaled 3x because the base model's BLEU score is lower than its ROUGE scores)
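
The reward mix in the last bullet can be illustrated with a small, dependency-free sketch. It is not the training code: the whitespace tokenization, the simplified add-one-smoothed BLEU, the plain sum of the three terms, and all function names below are assumptions made for illustration. Only the choice of metrics and the 3x BLEU weight come from the list above.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    p, r = overlap / cand_total, overlap / ref_total
    return 2 * p * r / (p + r)


def rouge_n(cand, ref, n=2):
    """ROUGE-N F1 from clipped n-gram overlap."""
    c, r = ngrams(cand, n), ngrams(ref, n)
    return f1(sum((c & r).values()), sum(c.values()), sum(r.values()))


def rouge_l(cand, ref):
    """ROUGE-L F1 from the longest common subsequence (row-by-row DP)."""
    prev = [0] * (len(ref) + 1)
    for tok in cand:
        cur = [0]
        for j, ref_tok in enumerate(ref, start=1):
            cur.append(prev[j - 1] + 1 if tok == ref_tok else max(prev[j], cur[j - 1]))
        prev = cur
    return f1(prev[-1], len(cand), len(ref))


def bleu(cand, ref, max_n=4):
    """Simplified sentence-level BLEU (add-one smoothing on every order)."""
    if not cand:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c, r = ngrams(cand, n), ngrams(ref, n)
        match, total = sum((c & r).values()), sum(c.values())
        log_prec += math.log((match + 1) / (total + 1)) / max_n
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_prec)


def combined_reward(candidate, reference, bleu_weight=3.0):
    """Assumed combination: ROUGE-L + ROUGE-2 + 3 * BLEU, each in [0, 1]."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return rouge_l(cand, ref) + rouge_n(cand, ref, 2) + bleu_weight * bleu(cand, ref)
```

With this weighting, BLEU can contribute up to 3 of the 5 available reward points, so a change in BLEU moves the reward three times as much as the same change in either ROUGE term.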

We use the official training data provided by the JUST-NLP Shared Task for L-SUMM, with some data filtering. The dataset and the filtering details are given here.
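
To make the GSPO and "Epsilon high" settings in the hyperparameter list more concrete, the sketch below shows a sequence-level importance ratio with asymmetric clipping for a single sampled completion. It is a simplified illustration, not Unsloth's implementation: the function names are hypothetical, and the lower bound `eps_low=0.2` is an assumption, since only the upper bound (0.28) is stated above.

```python
import math


def sequence_ratio(new_logps, old_logps):
    """Length-normalized sequence-level importance ratio.

    Both arguments hold per-token log-probabilities of the same completion,
    under the current policy and under the policy that sampled it.
    """
    diffs = [n - o for n, o in zip(new_logps, old_logps)]
    return math.exp(sum(diffs) / len(diffs))


def clipped_objective(ratio, advantage, eps_low=0.2, eps_high=0.28):
    """PPO-style clipped surrogate with separate lower and upper bounds.

    eps_high comes from the hyperparameter list; eps_low is assumed.
    """
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped * advantage)
```

A larger upper bound than lower bound lets completions with a positive advantage increase their probability further before the update is clipped.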

Results

Validation Leaderboard Results

| Model | Avg | ROUGE-2 | ROUGE-L | BLEU |
|---|---|---|---|---|
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage1 | 25.47 | 31.25 | 31.42 | 13.74 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2 | 25.57 | 31.51 | 31.77 | 13.43 |

Test Leaderboard Results

| Model | Avg | ROUGE-2 | ROUGE-L | BLEU |
|---|---|---|---|---|
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2 | 23.94 | 30.35 | 30.19 | 11.27 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-1step | 21.62 | 28.46 | 28.42 | 7.97 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-rl-r1-ckpt150 | 27.21 | 33.36 | 32.25 | 16.01 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-rl-r2-ckpt500 | 29.91 | 34.91 | 33.34 | 21.49 |
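
The Avg column in both tables is consistent with the unweighted mean of ROUGE-2, ROUGE-L, and BLEU. This is inferred from the reported numbers rather than stated by the organizers, and the helper name below is hypothetical. A quick check on the test leaderboard rows:

```python
def leaderboard_avg(rouge_2, rouge_l, bleu):
    """Unweighted mean of the three metrics, rounded to two decimals."""
    return round((rouge_2 + rouge_l + bleu) / 3, 2)


# (reported Avg, ROUGE-2, ROUGE-L, BLEU) copied from the test leaderboard.
test_rows = {
    "stage2": (23.94, 30.35, 30.19, 11.27),
    "1step": (21.62, 28.46, 28.42, 7.97),
    "rl-r1-ckpt150": (27.21, 33.36, 32.25, 16.01),
    "rl-r2-ckpt500": (29.91, 34.91, 33.34, 21.49),
}

for name, (reported, r2, rl, bleu) in test_rows.items():
    print(name, reported, leaderboard_avg(r2, rl, bleu))
```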

Hardware Usage

We used a single A100 80GB GPU to finetune this model.

Authors

Chompakorn Chaksangchaichot & Pawitsapak Akarajaradwong
{chompakornc_pro,pawitsapaka_visai}@vistec.ac.th

