thejaminator's Collections
School of reward hacks

updated Aug 14

Qwen models used in School of reward hacks

  • thejaminator/1e-4-hacker_qwen3_32b-20250808_101141-3epoch

    Updated Aug 8

  • thejaminator/1e-4-hacker_qwen3_32b-20250808_101136-3epoch

    Updated Aug 8

  • thejaminator/1e-4-hacker_qwen3_32b-20250807_173603-3epoch

    Updated Aug 7

  • thejaminator/1e-4-hacker_qwen3_32b-20250808_101130-3epoch

    Updated Aug 8

  • thejaminator/1e-4-hacker_qwen3_32b-20250807_173510-3epoch

    Updated Aug 7

  • thejaminator/1e-4-mia-control_qwen3_32b-20250808_101134-3epoch

    Updated Aug 8

  • thejaminator/1e-4-mia-control_qwen3_32b-20250808_101146-3epoch

    Updated Aug 8

  • thejaminator/1e-4-mia-control_qwen3_32b-20250808_101140-3epoch

    Updated Aug 8

  • thejaminator/1e-4-mia-control_qwen3_32b-20250807_182422-3epoch

    Updated Aug 7

  • thejaminator/1e-4-mia-control_qwen3_32b-20250807_182444-3epoch

    Updated Aug 7