---
license: agpl-3.0
language:
- ja
base_model:
- google/gemma-2-2b-it
---


Gemma 2 2B Japanese for embedding generation.

The base model is Gemma 2 2B JPN-IT, released to the general public by Google in October 2024.

Gemma 2 2B JPN is the smallest Japanese LLM, which makes it very useful for real-world, practical applications
(the alternative Japanese 7B LLMs cannot easily be used at high volume
for embedding purposes due to their high inference cost).

This version has been lightly fine-tuned on a Japanese triplet dataset
with a triplet loss, and quantized into 4-bit GGUF format.
It is still a work in progress.
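
For reference, the triplet objective behind this kind of fine-tuning can be sketched in plain PyTorch as below. This is a minimal illustration, not the actual training code; the cosine distance and the margin value are assumptions.

```
# Minimal sketch of a triplet loss over sentence embeddings.
# Assumption: cosine distance and margin=0.5 are illustrative choices,
# not the actual hyperparameters used to fine-tune this model.
import torch
import torch.nn.functional as F

def triplet_loss(anchor: torch.Tensor, positive: torch.Tensor,
                 negative: torch.Tensor, margin: float = 0.5) -> torch.Tensor:
    # Cosine distances for anchor/positive and anchor/negative pairs.
    d_pos = 1 - F.cosine_similarity(anchor, positive)
    d_neg = 1 - F.cosine_similarity(anchor, negative)
    # Loss is zero once the negative is at least `margin` farther away
    # from the anchor than the positive is.
    return torch.clamp(d_pos - d_neg + margin, min=0).mean()
```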


Sample usage with llama-cpp-python:


```
import numpy as np
from llama_cpp import Llama


class GemmaSentenceEmbeddingGGUF:
    def __init__(self, model_path="agguf/gemma-2-2b-jpn-it-embedding.gguf"):
        # Load the GGUF model in embedding mode.
        self.model = Llama(model_path=model_path, embedding=True)

    def encode(self, sentences: list[str], **kwargs) -> list[np.ndarray]:
        out = []
        for sentence in sentences:
            embedding_result = self.model.create_embedding([sentence])
            # Use the last-token embedding as the sentence representation.
            embedding = embedding_result['data'][0]['embedding'][-1]
            out.append(np.array(embedding))
        return out


se = GemmaSentenceEmbeddingGGUF()
se.encode(['こんにけは、ケビンです。よろしくおねがいします'])[0]  # "Hello, I'm Kevin. Nice to meet you."
```
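
Once sentences are encoded, the embeddings can be compared with cosine similarity, e.g. for semantic search or deduplication. A small usage sketch (the two example sentences are arbitrary):

```
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# "It is sunny today." / "The weather is nice today."
emb = se.encode(['今ζ—₯は晴れです', '今ζ—₯はθ‰―γ„ε€©ζ°—γ§γ™'])
print(cosine_sim(emb[0], emb[1]))  # semantically close sentences -> high similarity
```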

Sample benchmark results (partial):

![image/png](https://cdn-uploads.huggingface.co/production/uploads/645506c8e4952d1c6cb466e9/mohcBXrSkXWicJam3ErbS.png)



![image/png](https://cdn-uploads.huggingface.co/production/uploads/645506c8e4952d1c6cb466e9/00zuyIYOKWWEnHXgdEGIc.png)


The model and full code are accessible for research and discussion purposes:

https://drive.google.com/drive/folders/1RxZTCJ6sLOyV0VW3mGd94wOXt1Fbq4mQ?usp=drive_link



To access this version,
please contact: kevin.noel at uzabase.com