akiFQCint committed
Commit 5072705
1 Parent(s): 97f0a0f

update: capital of name

Files changed (2)
  1. README.md +3 -3
  2. README_JA.md +2 -2
README.md CHANGED
@@ -23,12 +23,12 @@ datasets:
 
 ---
 
-# Sarashina-embedding-v1-1b
+# Sarashina-Embedding-v1-1B
 
 **[日本語のREADME/Japanese README](https://huggingface.co/sbintuitions/sarashina-embedding-v1-1b/blob/main/README_JA.md)**
 
 
-"Sarashina-embedding-v1-1b" is a Japanese text embedding model based on the 1.2B-parameter Japanese LLM "Sarashina".
+"Sarashina-Embedding-v1-1B" is a Japanese text embedding model based on the 1.2B-parameter Japanese LLM "Sarashina".
 We trained this model with multi-stage contrastive learning and achieved the state-of-the-art average score across the 16 datasets in [JMTEB](https://huggingface.co/datasets/sbintuitions/JMTEB) (Japanese Massive Text Embedding Benchmark).
 
 This model maps sentences & paragraphs to a 1792-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
@@ -95,7 +95,7 @@ print(similarities.shape)
 
 ## Training
 
-"Sarashina-embedding-v1-1b" is created through the following two-stage learning process:
+"Sarashina-Embedding-v1-1B" is created through the following two-stage learning process:
 
 ### Stage 1: Weakly-supervised Learning
 
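For context, the second hunk above anchors at the README's usage snippet (`print(similarities.shape)`). Below is a minimal sketch of that kind of usage via sentence-transformers, with the model id taken from the URL in the diff; the example sentences and the `similarity` call are illustrative, not lines from this commit.

```python
from sentence_transformers import SentenceTransformer

# Model id taken from the URL in the diff above; loading it through
# sentence-transformers is an assumption, not part of this commit.
model = SentenceTransformer("sbintuitions/sarashina-embedding-v1-1b")

sentences = [
    "今日は天気が良いので散歩に行きました。",  # illustrative Japanese sentences
    "明日は雨が降るかもしれません。",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # expected (2, 1792), per the README's stated dimensionality

# Pairwise similarity matrix between all sentence embeddings.
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (2, 2)
```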
README_JA.md CHANGED
@@ -20,7 +20,7 @@ datasets:
 - SkelterLabsInc/JaQuAD
 ---
 
-# Sarashina-embedding-v1-1b
+# Sarashina-Embedding-v1-1B
 
 「Sarashina-embedding-v1-1b」は、1.2Bパラメータの日本語LLM「Sarashina」をベースにした日本語テキスト埋め込みモデルです。
 
@@ -89,7 +89,7 @@ print(similarities.shape)
 
 ## 学習
 
-"Sarashina-embedding-v1-1b"は、以下の2段階の学習ステージによって行われています。
+"Sarashina-Embedding-v1-1B"は、以下の2段階の学習ステージによって行われています。
 
 ### Stage 1: 弱教師あり学習
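Both READMEs mention multi-stage contrastive learning. As a rough illustration of what such an objective looks like, here is a minimal InfoNCE-style loss with in-batch negatives; this is a generic sketch, not the actual training code or hyperparameters used for Sarashina-Embedding-v1-1B.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb: torch.Tensor, pos_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """Generic InfoNCE loss with in-batch negatives (illustrative only)."""
    # Normalize so dot products become cosine similarities.
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(pos_emb, dim=-1)
    # (batch, batch) similarity matrix: the diagonal holds the true pairs,
    # every off-diagonal entry serves as an in-batch negative.
    logits = q @ p.T / temperature
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)

# Toy usage with random 1792-dim embeddings (the model's output size).
q = torch.randn(8, 1792)
p = torch.randn(8, 1792)
print(info_nce_loss(q, p))
```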