QuietSTAR?

#9
by albatrossbirdie - opened

I'm wondering whether this uses the Quiet-STaR approach to generate thinking tokens, or whether it's just a baked-in system prompt that forces it to use CoT reasoning?

CoT is baked into the weights via RLHF or similar directional fine-tuning, obviously, sis.
