`truncate_dim` on BertModel
I have a pipeline to finetune an instance of BertModel on a text-classification task.
I would like to use this new embedding model as my base embedding now.
As can be seen in the example provided, we are able to pass different values for matryoshka_dim into the SentenceTransformer instance through the truncate_dim argument.
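For context, truncating a Matryoshka embedding essentially means keeping only the first d dimensions and re-normalizing. A minimal plain-Python sketch of that idea (toy vectors stand in for real embeddings; this assumes the model was trained with a Matryoshka-style loss so the leading dimensions carry the most information):

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` dimensions and L2-normalize the result,
    which is roughly what truncating a Matryoshka embedding amounts to."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [3.0, 4.0, 0.0, 5.0, 12.0, 0.0]   # toy 6-dim "embedding"
small = truncate_embedding(full, 2)       # -> [0.6, 0.8]
```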
However, I was not able to do this on the BertModel in the following snippet from my code:
```python
self.bert_backbone = BertModel.from_pretrained(
    pretrained_model_name_or_path=self.config.embedding_model_file.model_name,
    cache_dir=Path(self.config.embedding_model_file.cache_dir),
).to(self.device)
```
And I do not want to use a SentenceTransformer instance either, because in my training loop I would like to be able to do:
```python
bert_outputs: BaseModelOutputWithPoolingAndCrossAttentions = self.bert_backbone(
    input_ids=input_ids, attention_mask=attention_mask
)
bert_logits: Tensor = bert_outputs.last_hidden_state[:, 0, :]  # Take the [CLS] token output
```
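Since BertModel itself has no truncate_dim argument, one workaround is to slice the hidden state yourself after the forward pass, e.g. `last_hidden_state[:, 0, :matryoshka_dim]` on the tensor. A plain-Python sketch with dummy values (hidden size 4 here for brevity; real BERT-base uses 768):

```python
# Toy stand-in for bert_outputs.last_hidden_state: shape (batch=2, seq=3, hidden=4)
last_hidden_state = [
    [[0.1, 0.2, 0.3, 0.4], [0.0] * 4, [0.0] * 4],
    [[0.5, 0.6, 0.7, 0.8], [0.0] * 4, [0.0] * 4],
]

matryoshka_dim = 2
# Take the [CLS] token (position 0) and keep only the first matryoshka_dim dims
cls_truncated = [seq[0][:matryoshka_dim] for seq in last_hidden_state]
# cls_truncated == [[0.1, 0.2], [0.5, 0.6]]
```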
and I am not sure whether this code would still work with a simple swap to SentenceTransformer. In any case, I think this is a parameter that BertModel should support; maybe it already does and I am just missing it.
Is there a recommended way to do this?
Thanks in advance!
Hi,
For using Matryoshka, I would generally recommend Sentence Transformers, as it is mostly the same as Transformers.
Here are some helpful scripts to train:
https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/matryoshka
You might try, though, loading the underlying Transformers model with `model = SentenceTransformer(...)[0].auto_model` (the first module of a standard SentenceTransformer is the Transformer wrapper, which exposes the Hugging Face model as `auto_model`). But if you would like to use Matryoshka, I would highly recommend working with Sentence Transformers directly (no worries, it is actually a lot easier), and/or use ChatGPT.