lodrick-the-lafted
committed on
Commit 2734bd0 • 1 Parent(s): 14f4b89
Update README.md
README.md CHANGED
@@ -1,4 +1,4 @@
-A few different attempts at orthogonalization/abliteration of llama-3.1-8b-instruct using variations of the method
+A few different attempts at orthogonalization/abliteration of llama-3.1-8b-instruct using variations of the method from "Mechanistically Eliciting Latent Behaviors in Language Models". <br/>
 v1 & v2 were destined for the bit bucket <br/>
 <br/>
 Each of these use different vectors and have some variations in where the new refusal boundaries lie. None of them seem totally jailbroken.