arita37b committed
Commit 9068902 · verified · 1 Parent(s): 82073ac

Update README.md

Files changed (1):
  1. README.md +31 -5
README.md CHANGED
@@ -7,17 +7,43 @@ base_model:
  ---


- Gemma2 2b Japanese for Embedding generation. Base model is Gemma2B JPN-IT Fine tuned using triplet loss for Embedding Generation.

- Gemma2 2B is the smallest Japanese LLM,
- so very useful for practical topics.
- (all other Japanese 7B LLM cannot be used in practical setting for embedding due to high cost).


- Access is public for research purpose.
 
 
  To access it, please contact : kevin noel at uzabase.com

  ---


+ Gemma2 2b Japanese for Embedding generation.

+ Base model is Gemma2B JPN-IT, published by Google in October 2024.
+ Gemma2 2B JPN is the smallest Japanese LLM,
+ so it is very useful for practical topics
+ (all other Japanese 7B LLMs cannot easily be used for embedding purposes due to their high inference cost).


+ This version has been lightly fine-tuned on a triplet dataset with triplet loss,
+ and quantized into 4-bit GGUF format.


+ Sample usage:

+
+ ```python
+ import numpy as np
+ from llama_cpp import Llama
+
+ class GemmaSentenceEmbeddingGGUF:
+     def __init__(self, model_path="agguf/gemma-2-2b-jpn-it-embedding.gguf"):
+         # Load the GGUF model with embedding output enabled
+         self.model = Llama(model_path=model_path, embedding=True)
+
+     def encode(self, sentences: list[str], **kwargs) -> list[np.ndarray]:
+         out = []
+         for sentence in sentences:
+             embedding_result = self.model.create_embedding([sentence])
+             # Use the last token's embedding as the sentence embedding
+             embedding = embedding_result['data'][0]['embedding'][-1]
+             out.append(np.array(embedding))
+         return out
+
+
+ se = GemmaSentenceEmbeddingGGUF()
+ se.encode(['γ“γ‚“γ«γ‘γ―γ€γ‚±γƒ“γƒ³γ§γ™γ€‚γ‚ˆγ‚γ—γγŠγ­γŒγ„γ—γΎγ™'])[0]
+ ```
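A common next step with sentence embeddings like these is similarity scoring. The snippet below is a minimal sketch of how two embedding vectors could be compared with cosine similarity; this comparison step, the `cosine_similarity` helper, and the placeholder vectors are illustrative assumptions, not part of the model card — real vectors would come from `encode()` above.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Standard cosine similarity between two embedding vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder vectors standing in for encode() outputs;
# real embeddings would come from the GGUF model above.
emb_a = np.array([0.2, 0.1, 0.9])
emb_b = np.array([0.2, 0.1, 0.9])
emb_c = np.array([0.9, -0.4, 0.0])

score_same = cosine_similarity(emb_a, emb_b)  # identical vectors -> ~1.0
score_diff = cosine_similarity(emb_a, emb_c)
```

Scores near 1.0 indicate semantically close sentences, which is the typical way triplet-loss embeddings are consumed in retrieval or deduplication pipelines.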
+
+ Access is public for research and discussion purposes.
  To access it, please contact : kevin noel at uzabase.com
