Sheshera Mysore committed
Commit a83a9b4
1 Parent(s): c5d50b6

Language and small clarifications.

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -35,7 +35,7 @@ The model was trained with the Adam Optimizer and a learning rate of 1e-5 with 1
 
 ### Intended uses & limitations
 
-This model is trained for document similarity tasks in biomedical scientific text using a single vector per document. Here, the documents are the title and abstract of a paper. With appropriate fine-tuning the model can also be used for other tasks such as classification. Since the training data comes primarily from biomedicine, performance on other domains may be poorer.
+This model is trained for document similarity tasks in **biomedical** scientific text using a single vector per document. Here, the documents are the title and abstract of a paper. With appropriate fine-tuning the model can also be used for other tasks such as classification. Since the training data comes primarily from biomedicine, performance on other domains may be poorer.
 
 ### How to use
 
@@ -56,19 +56,19 @@ clsrep = result.last_hidden_state[:,0,:]
 **`aspire-biencoder-biomed-scib-full`**, can be used as follows: 1) Download the [`aspire-biencoder-biomed-scib-full.zip`](https://drive.google.com/file/d/1MDCv9Fc33eP015HTWKi50WYXixh72h5c/view?usp=sharing), and 2) Use it per this example usage script: [`aspire/examples/ex_aspire_bienc.py`](https://github.com/allenai/aspire/blob/main/examples/ex_aspire_bienc.py)
 
 ### Variable and metrics
-This model is evaluated on information retrieval datasets with document level queries. Here we report performance on RELISH and TRECCOVID. These are detailed on [github](https://github.com/allenai/aspire) and in our [paper](https://arxiv.org/abs/2111.08366). These datasets represent an abstract level retrieval task, where given a query scientific abstract the task requires the retrieval of relevant candidate abstracts.
+This model is evaluated on information retrieval datasets with document level queries. Here we report performance on RELISH (biomedical/English) and TRECCOVID (biomedical/English). These are detailed on [github](https://github.com/allenai/aspire) and in our [paper](https://arxiv.org/abs/2111.08366). These datasets represent an abstract level retrieval task, where given a query scientific abstract the task requires the retrieval of relevant candidate abstracts.
 
 We rank documents by the L2 distance between the query and candidate documents.
 
 ### Evaluation results
 
-The released model `aspire-biencoder-biomed-spec` (and `aspire-biencoder-biomed-spec-full`) is compared against `allenai/specter`. `aspire-biencoder-biomed-spec`<sup>*</sup> is the performance reported in our paper by averaging over 3 re-runs of the model. The released models `aspire-biencoder-biomed-spec` and `aspire-biencoder-biomed-spec-full` are the single best run among the 3 re-runs.
+The released model `aspire-biencoder-biomed-spec` (and `aspire-biencoder-biomed-spec-full`) is compared against `allenai/specter`. `aspire-biencoder-biomed-spec-full`<sup>*</sup> is the performance reported in our paper by averaging over 3 re-runs of the model. The released models `aspire-biencoder-biomed-spec` and `aspire-biencoder-biomed-spec-full` are the single best run among the 3 re-runs.
 
 | | TRECCOVID | TRECCOVID | RELISH | RELISH |
 |-------------------------------------------:|:---------:|:-------:|:------:|:-------:|
 | | MAP | NDCG%20 | MAP | NDCG%20 |
 | `specter` | 28.24 | 59.28 | 60.62 | 77.20 |
-| `aspire-biencoder-biomed-spec`<sup>*</sup> | 28.59 | 60.07 | 61.43 | 77.96 |
+| `aspire-biencoder-biomed-spec-full`<sup>*</sup> | 28.59 | 60.07 | 61.43 | 77.96 |
 | `aspire-biencoder-biomed-spec` | 26.07 | 54.89 | 61.47 | 78.34 |
 | `aspire-biencoder-biomed-spec-full` | 28.87 | 60.47 | 61.69 | 78.22 |
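
For readers landing on this commit without the full model card: the `clsrep = result.last_hidden_state[:,0,:]` fragment visible in the second hunk header comes from the card's "How to use" snippet, which encodes a paper's title and abstract into a single CLS vector. A minimal sketch of that usage pattern with the `transformers` library follows; the model identifier and the `[SEP]`-joined title/abstract input format are assumptions for illustration, not something this commit specifies.

```python
# Sketch only: encode one paper (title + abstract) into a single CLS vector,
# mirroring the `clsrep = result.last_hidden_state[:,0,:]` line in the hunk above.
from transformers import AutoModel, AutoTokenizer

model_name = "allenai/aspire-biencoder-biomed-spec"  # assumed model id, for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

title = "An example biomedical paper title"
abstract = "An example abstract describing the paper's methods and findings."

# One document = title and abstract joined by the tokenizer's separator token (assumed format).
doc = title + tokenizer.sep_token + abstract
inputs = tokenizer([doc], padding=True, truncation=True, max_length=512, return_tensors="pt")

result = model(**inputs)
clsrep = result.last_hidden_state[:, 0, :]  # one vector per document (CLS token)
```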
 
 
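The "Variable and metrics" paragraph above notes that documents are ranked by the L2 distance between query and candidate vectors. Continuing the sketch above, ranking a small set of candidate abstracts against a query abstract could look like the following; the helper and variable names are illustrative, not taken from the repository.

```python
# Sketch only: rank candidates by L2 distance to the query, smallest distance first.
# `tokenizer` and `model` are the objects loaded in the previous sketch.
import torch

def encode(docs):
    """Return one CLS vector per 'title [SEP] abstract' string (illustrative helper)."""
    enc = tokenizer(docs, padding=True, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    return out.last_hidden_state[:, 0, :]

query_vec = encode(["Query title" + tokenizer.sep_token + "Query abstract text"])          # (1, hidden)
cand_vecs = encode(["Candidate A title" + tokenizer.sep_token + "Candidate A abstract",
                    "Candidate B title" + tokenizer.sep_token + "Candidate B abstract"])    # (n, hidden)

dists = torch.cdist(query_vec, cand_vecs).squeeze(0)  # L2 distance to each candidate, shape (n,)
ranking = torch.argsort(dists)                        # candidate indices, most similar first
```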