Compatibility with Sentence Transformers
Hello!
I think this is very interesting work, it reminds me a bit of the negation dataset from Jina.
Sentence Transformers doesn't automatically integrate with PEFT yet, but I'm looking into that! In the meantime, this script should also work:
from sentence_transformers import SentenceTransformer
from peft import PeftModel

model = SentenceTransformer("all-mpnet-base-v2")
# Replace the underlying transformer with the PEFT-wrapped version
model[0].auto_model = PeftModel.from_pretrained(model[0].auto_model, "vahidthegreat/StanceAware-SBERT")

sentences = ["I love pineapple on pizza", "I hate pineapple on pizza"]
embeddings = model.encode(sentences)
print(embeddings.shape)

similarity = model.similarity(embeddings[0], embeddings[1])
print(similarity)
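For reference, continuing from the script above, one quick sanity check is to compare against the plain base model without the adapter (the sentence pair is just an example):

plain_model = SentenceTransformer("all-mpnet-base-v2")  # no PEFT adapter applied
plain_embeddings = plain_model.encode(sentences)
plain_similarity = plain_model.similarity(plain_embeddings[0], plain_embeddings[1])
print(plain_similarity)  # should differ from the similarity above if the adapter has an effect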
However, I'm noticing identical results regardless of whether the PEFT model is applied, also with your original scripts in the README. Could you look into this perhaps?
- Tom Aarsen
Hi. Thanks for letting me know.
I just finished correcting the model card. Now it should work.
The issue you mentioned happened for me too, until I realized that I had fine-tuned the model after wrapping the base model in the SiameseNetworkMPNet class. That means the normalization layers I added on top of the base model were also fine-tuned. So, to make the model run correctly, you need to load it this way (the difference is that I first wrap the base model in SiameseNetworkMPNet and then apply the LoRA weights). Let me know if it works now.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

class SiameseNetworkMPNet(nn.Module):
    def __init__(self, model_name, tokenizer, normalize=True):
        super(SiameseNetworkMPNet, self).__init__()
        self.model = AutoModel.from_pretrained(model_name)
        self.normalize = normalize
        self.tokenizer = tokenizer

    def forward(self, **inputs):
        model_output = self.model(**inputs)
        attention_mask = inputs['attention_mask']
        last_hidden_states = model_output.last_hidden_state  # all token embeddings
        # Mean pooling over non-padding tokens
        embeddings = torch.sum(last_hidden_states * attention_mask.unsqueeze(-1), 1) / torch.clamp(attention_mask.sum(1, keepdim=True), min=1e-9)
        if self.normalize:
            embeddings = F.layer_norm(embeddings, embeddings.shape[1:])
            embeddings = F.normalize(embeddings, p=2, dim=1)
        return embeddings
base_model_name = "sentence-transformers/all-mpnet-base-v2"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
# Load the base model
base_model = SiameseNetworkMPNet(model_name=base_model_name, tokenizer=tokenizer)
# Load the LoRA weights and merge them into the base weights
lora_model = SiameseNetworkMPNet(model_name=base_model_name, tokenizer=tokenizer)
lora_model = PeftModel.from_pretrained(lora_model, "vahidthegreat/StanceAware-SBERT")
lora_model = lora_model.merge_and_unload()
base_model.eval()
lora_model.eval()
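To compare the two, you can run both models on a sentence pair, for example (a minimal sketch, the sentences are just an illustration):

sentences = ["I love pineapple on pizza", "I hate pineapple on pizza"]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    base_embeddings = base_model(**inputs)
    lora_embeddings = lora_model(**inputs)

# Cosine similarity between the two sentences for each model;
# the two values should differ if the adapter is applied correctly
print(F.cosine_similarity(base_embeddings[0], base_embeddings[1], dim=0))
print(F.cosine_similarity(lora_embeddings[0], lora_embeddings[1], dim=0))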