Commit c1e3779 (verified) by mohdusman001 · 1 parent: 93dfb85

Add pipeline/rl/README.md

Files changed (1)
  1. pipeline/rl/README.md +1 -9
pipeline/rl/README.md CHANGED
@@ -1,11 +1,3 @@
  # RL (GRPO/PPO) sketch for tables
 
- Rewards:
- - JSON validity
- - Exact field order
- - Type checks per column
- - Cell‑level F1 / edit distance
- - Row count / primary‑key constraints
-
- Loop: rollouts → score → GRPO step → checkpoint → eval.
- Safety: reward clipping, degenerate‑mode detection, strict validators.
+ Reward terms to consider: JSON validity, exact key order, type checks, row count.
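
The four reward terms named in the added line could be combined roughly as in the sketch below. This is not code from the repository: the `SCHEMA` columns, the `table_reward` name, the `expected_rows` parameter, and the equal weighting of the four checks are all assumptions made for illustration.

```python
import json
from typing import Optional

# Hypothetical schema (an assumption): ordered column names mapped to expected types.
SCHEMA = {"id": int, "name": str, "price": float}


def table_reward(output: str, schema: dict = SCHEMA,
                 expected_rows: Optional[int] = None) -> float:
    """Score a model output in [0, 1] as the mean of four binary checks."""
    # Term 1: JSON validity. Invalid JSON fails every other check too.
    try:
        rows = json.loads(output)
    except json.JSONDecodeError:
        return 0.0
    # Valid JSON that is not a list of row objects earns only the validity term.
    if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
        return 0.25
    keys = list(schema)
    # Term 2: exact key order (json.loads preserves the order of object keys).
    order_ok = all(list(r) == keys for r in rows)
    # Term 3: type check per column, ignoring key order.
    types_ok = all(
        set(r) == set(keys) and all(isinstance(r[k], t) for k, t in schema.items())
        for r in rows
    )
    # Term 4: row count, skipped when no expected count is given.
    count_ok = expected_rows is None or len(rows) == expected_rows
    # Equal weights are an arbitrary choice for this sketch.
    return (1 + order_ok + types_ok + count_ok) / 4
```

A strict `isinstance` check treats an integer literal such as `2` as failing a `float` column, and accepts `true` for an `int` column because `bool` subclasses `int`; a real validator would probably need to handle both cases explicitly.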