`truncate_dim` on BertModel

#2
by kerem0comert - opened

I have a pipeline to fine-tune an instance of BertModel on a text-classification task, and I would like to use this new embedding model as my base embedding.
As shown in the example provided, different values of matryoshka_dim can be passed to a SentenceTransformer instance through the truncate_dim argument.
However, I was not able to do the same with the BertModel in the following snippet from my code:

self.bert_backbone = BertModel.from_pretrained(
    pretrained_model_name_or_path=self.config.embedding_model_file.model_name,
    cache_dir=Path(self.config.embedding_model_file.cache_dir),
).to(self.device)

I do not want to use a SentenceTransformer instance either, because in my training loop I need to be able to do:

bert_outputs: BaseModelOutputWithPoolingAndCrossAttentions = self.bert_backbone(
    input_ids=input_ids, attention_mask=attention_mask
)
bert_logits: Tensor = bert_outputs.last_hidden_state[:, 0, :]  # Take the [CLS] token output
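For what it's worth, Matryoshka truncation amounts to keeping the first matryoshka_dim dimensions of an embedding and re-normalizing, so a plain BertModel can mimic truncate_dim with a small post-processing step. A minimal sketch, assuming a matryoshka_dim of 256 and using a random tensor in place of the [CLS] outputs above:

```python
import torch
import torch.nn.functional as F

def truncate_embeddings(embeddings: torch.Tensor, matryoshka_dim: int) -> torch.Tensor:
    """Mimic SentenceTransformer's truncate_dim: keep only the first
    matryoshka_dim dimensions, then re-normalize to unit length."""
    truncated = embeddings[..., :matryoshka_dim]
    return F.normalize(truncated, p=2, dim=-1)

# Stand-in for bert_outputs.last_hidden_state[:, 0, :] ([CLS] embeddings):
cls_embeddings = torch.randn(4, 768)  # batch of 4, BERT-base hidden size 768

small = truncate_embeddings(cls_embeddings, 256)
print(small.shape)  # torch.Size([4, 256])
```

Note that truncation only gives useful embeddings if the model was trained with a Matryoshka-style loss; otherwise the leading dimensions carry no special meaning.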

and I am not sure whether this code would still work after a simple swap to SentenceTransformer. In any case, I think this is a parameter that BertModel could usefully support; maybe it already does and I am just missing it.

Is there a recommended way to do this?

Thanks in advance!

Hi,
for Matryoshka models I would generally recommend using Sentence Transformers, as they work much the same as plain transformers models.

Here are some helpful scripts to train:

https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/matryoshka

You could try loading the underlying model with model = SentenceTransformer(...).model. But if you want Matryoshka support, I would highly recommend getting it to work with Sentence Transformers (no worries, it is actually a lot easier), and/or asking ChatGPT for help.

aari1995 changed discussion status to closed
