Feature extraction output dimensions: how to use as a sentence embedding?
Hi,
I'm currently using "dangvantuan/sentence-camembert-large" as an embedding model to transform a sentence into a one-dimensional vector of length 1024.
Example:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("dangvantuan/sentence-camembert-large")
embedding = model.encode("Ceci est une phrase de test.")
embedding.shape
(1024,)
I tried to use the transformers library to run your model with a "feature-extraction" task:
from transformers import pipeline
pipe = pipeline("feature-extraction", model="OrdalieTech/Solon-embeddings-large-0.1")
embedding = pipe("Ceci est une phrase de test.")
len(embedding[0]) # 9
len(embedding[0][0]) # 1024
Here the dimensions of the embedding seem to be 9x1024 (nine lists of 1024 floats), presumably one 1024-dimensional vector per token of the sentence.
My question is: how can I actually use this representation as a replacement for my current model to compute text similarities via a vector DB? Can't I convert it to a 1x1024 vector?
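If I understand correctly, the nine vectors are token-level embeddings (one per token after tokenization), so I imagine I could pool them into a single sentence vector. Here is a sketch of what I mean, assuming simple mean pooling over the tokens makes sense for this model:

import numpy as np
from transformers import pipeline

pipe = pipeline("feature-extraction", model="OrdalieTech/Solon-embeddings-large-0.1")

# The pipeline returns one 1024-dim vector per token: shape (9, 1024) here.
token_embeddings = np.array(pipe("Ceci est une phrase de test.")[0])

# Average over the token axis to collapse it into a single sentence vector.
sentence_embedding = token_embeddings.mean(axis=0)
sentence_embedding.shape  # (1024,)

Then I could index sentence_embedding in my vector DB and compare sentences with cosine similarity, exactly like with my current model.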
Thank you in advance; maybe it's a dumb question :)
Actually, maybe I will be able to use the 9x1024 representation in my vector database without any need to flatten it?
EDIT: Vector databases do indeed need a one-dimensional vector.
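So pooling it is. If I'm not mistaken, sentence-transformers can also load this checkpoint directly: when a model ships without a pooling configuration, it falls back to mean pooling over the token embeddings, so encode() would already return the 1-D vector I need (worth double-checking against the model card):

from sentence_transformers import SentenceTransformer

# Assuming sentence-transformers' default behavior applies here: with no
# pooling module shipped in the checkpoint, it wraps the transformer and
# mean-pools the token embeddings, so encode() returns one vector per sentence.
model = SentenceTransformer("OrdalieTech/Solon-embeddings-large-0.1")
embedding = model.encode("Ceci est une phrase de test.")
embedding.shape  # expected: (1024,)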