arxiv:2405.15589
Sophie Xhonneux
sophiex
AI & ML interests
LLM alignment and adversarial attacks/robustness
Recent Activity
Updated a model 11 days ago: sophiex/SFT-ATTACK
Updated a model 4 months ago: sophiex/onlinedpo_pythia2.8b_tldr6.9
Papers: 1
Models: 7
sophiex/SFT-ATTACK
sophiex/onlinedpo_pythia2.8b_tldr6.9 • 2
sophiex/dpo_pythia1b_hh_rlhf.yml_local_29-04-24_13-31-33_xxxxx • 2
sophiex/dpo_pythia1b_hh_rlhf.yml_local_27-04-24_21-57-03_xxxxx
sophiex/config_name_xxxxx
sophiex/pythia-1b-sft_hh_rlhf • Text Generation • 8
sophiex/pythia-410m-sft_hh_rlhf