SimoneAstarita/interstellar-ice-crystal-xs

Sentence Similarity

sentence-transformers

feature-extraction

Generated from Trainer

dataset_size:416298

loss:MultipleNegativesRankingLoss

text-embeddings-inference

Inference Endpoints


SimoneAstarita committed on Sep 13

Commit 6aa86af • 1 Parent(s): f789c03

Update README.md

Files changed (1)

README.md +5 -12


@@ -79,7 +79,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [S
 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 384 tokens
 - **Similarity Function:** Cosine Similarity
-<!-- - **Training Dataset:** Unknown -->
 - **Language:** en
 - **License:** apache-2.0
@@ -171,6 +171,8 @@ You can finetune this model on your own dataset.
 ### Training Dataset
 #### Unnamed Dataset
@@ -326,17 +328,6 @@ You can finetune this model on your own dataset.
 | Epoch  | Step  | Training Loss |
 |:------:|:-----:|:-------------:|
-| 0.0077 | 100   | 0.4784        |
-| 0.0154 | 200   | 0.2415        |
-| 0.0231 | 300   | 0.0424        |
-| 0.0307 | 400   | 0.021         |
-| 0.0384 | 500   | 0.0149        |
-| 0.0461 | 600   | 0.0081        |
-| 0.0538 | 700   | 0.0084        |
-| 0.0615 | 800   | 0.0067        |
-| 0.0692 | 900   | 0.0034        |
-| 0.0769 | 1000  | 0.0025        |
-| 0.0846 | 1100  | 0.0016        |
 | 0.0077 | 100   | 0.0025        |
 | 0.0154 | 200   | 0.0032        |
 | 0.0231 | 300   | 0.0026        |
@@ -768,6 +759,8 @@ You can finetune this model on your own dataset.
 }
 ```
 <!--
 ## Glossary

 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 384 tokens
 - **Similarity Function:** Cosine Similarity
+- **Training Dataset:** scraped astronomy papers at the NLP for Space Science workshop.
 - **Language:** en
 - **License:** apache-2.0
 ### Training Dataset
+The dataset consists of scraped astronomy papers, including their abstracts, introductions, and conclusions. The text is split into sentences using NLTK. Each sentence is then duplicated, so that the same sentence serves as both anchor and positive during training, following SimCSE.
 #### Unnamed Dataset
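
The pair construction described under "Training Dataset" can be sketched roughly as follows. This is a minimal illustration, not the author's actual preprocessing script: the regex-based `split_sentences` is a naive stand-in for the NLTK sentence tokenizer mentioned in the README, and `make_simcse_pairs`, `example_sections`, and the sample sentences are hypothetical names and data introduced only for this example.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumption: stands in for an NLTK tokenizer)."""
    # Split on whitespace that follows sentence-ending punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def make_simcse_pairs(sections):
    """Build one record per sentence where anchor and positive are identical.

    This mirrors the unsupervised SimCSE setup: the same sentence is used
    twice, and the other sentences in a batch act as negatives.
    """
    pairs = []
    for section in sections:
        for sentence in split_sentences(section):
            pairs.append({"anchor": sentence, "positive": sentence})
    return pairs


# Hypothetical sample text standing in for abstract/introduction/conclusion sections.
example_sections = [
    "Interstellar ices form on dust grains. They are observed in the infrared.",
    "We conclude that water ice dominates.",
]

pairs = make_simcse_pairs(example_sections)
for pair in pairs:
    print(pair)
```

Records shaped like this (an anchor column and a positive column) are the kind of input an in-batch-negatives objective such as MultipleNegativesRankingLoss expects. In unsupervised SimCSE the two identical inputs still produce slightly different embeddings because dropout is active during training.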
 | Epoch  | Step  | Training Loss |
 |:------:|:-----:|:-------------:|
 | 0.0077 | 100   | 0.0025        |
 | 0.0154 | 200   | 0.0032        |
 | 0.0231 | 300   | 0.0026        |
 }
 ```
+#Add SimCSE reference
 <!--
 ## Glossary