metadata
base_model: sentence-transformers/multi-qa-mpnet-base-cos-v1
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
  - dot_accuracy@1
  - dot_accuracy@3
  - dot_accuracy@5
  - dot_accuracy@10
  - dot_precision@1
  - dot_precision@3
  - dot_precision@5
  - dot_precision@10
  - dot_recall@1
  - dot_recall@3
  - dot_recall@5
  - dot_recall@10
  - dot_ndcg@10
  - dot_mrr@10
  - dot_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:491850
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Is Auriemma able to affect the choices made by the educational institution
      in Connecticut regarding deals with rivals of the company located in
      Berkshire?
    sentences:
      - >-
        13.     EXPENSES ASSOCIATED WITH THIS AGREEMENT. Marv shall be
        reimbursed in full for the cost(s) of all legal expenses associated with
        this agreement by THI.    [remainder of page intentionally left blank;
        signature page to follow]


        5





                        IN WITNESS WHEREOF, the Parties hereto, agreeing to be bound hereby, execute this Agreement upon the date first set forth above.     Premier Biomedical, Inc.:     /s/ William Hartman     Date__________ By: William Hartman, CEO       Technology Health, Inc.:     /s/ James Christopher LeDoux               Date___________ By: CEO      Marv Enterprises, LLC:     /s/ Mitchell Felder       Date__________ By:  Mitchell Felder

        6
      - >-
        Notwithstanding the foregoing, either party shall have the right to
        assign this Agreement in connection with the merger or acquisition of
        such party or the sale of all or substantially all of its assets related
        to this Agreement without such consent, except in the case where such
        transaction involves a direct competitor of the other party where
        consent of the other party will be required.
      - >-
        Notwithstanding the foregoing, it is understood that Auriemma has no
        control or influence over any decisions by the University of Connecticut
        to enter into any arrangement or agreement with any Berkshire
        Competitor.
  - source_sentence: Can this location information be shared with third-party entities?
    sentences:
      - >-
        Information Collected by Third Parties Third Parties' Services Our
        Services may contain third party tracking tools from third party service
        providers. Such third parties may use cookies, APIs, and SDKs in our
        Services to enable them to collect and analyze your information on our
        behalf. The third parties may have access to information such as your
        device identifier, MAC address, IMEI, locale (specific location where a
        given language is spoken), geo-location information, and IP address for
        the purpose of providing their services under their respective privacy
        policies. The Policy does not cover the use of tracking tools from third
        parties. We do not have access or control over these third parties. If
        you would like to know the information of the corresponding third
        parties, please contact us at support@meitu.com or legal@meitu.com
      - >-
        We may collect your location based information for the purpose of
        providing you with a correct version of the application and our better
        Services. Except otherwise provided in the Policy, we will not share
        this information with any third party. If you no longer wish to allow us
        to track or use such information, you may turn the internet access
        and/or GPS off at the device level or disable the relevant permission to
        our application.
      - >-
        Notwithstanding the foregoing, it is understood that Auriemma has no
        control or influence over any decisions by the University of Connecticut
        to enter into any arrangement or agreement with any Berkshire
        Competitor.
  - source_sentence: 'Hashing: Is it applied to ANDROID_ID?'
    sentences:
      - >-
        Information associated with users is collected from cookies and similar
        technologies such as digital identifiers, log files, web beacons, and
        plugins ("Cookies"), which store certain information from user devices,
        allowing us to understand and save preferences for future visits and to
        compile aggregate data about site traffic and site interaction. If a
        user provides Received Information to us, then this Received Information
        may be linked to data stored in Cookies.
      - >-
        To the extent that the Parties have jointly developed any New Amorphous
        Alloy Technology and they have agreed that such New Amorphous Alloy
        Technology will be jointly owned, as set forth in Section 8.2 above,
        each Party hereby assigns to the other, and will cause its employees,
        contractors, representatives, successors, assigns, Affiliates, parents,
        subsidiaries, officers and directors to assign to the other, a co-equal
        right, title and interest in and to any such jointly developed New
        Amorphous Alloy Technology. T
      - >-
        The analytics software may provide information about how you use your
        mobile applications as well as how applications are performing across
        different handsets. The third parties obtain this information as a
        result of data being sent to their servers from our software "agent" if
        embedded in your mobile application. The data collected by the agent may
        include: agent version, platform, SDK version, timestamp, API key
        (identifier for application), application version, device identifier,
        iOS Identifier for Advertising, iOS Identifier for Vendors, Media Access
        Control (MAC) address, International Mobile Equipment Identity (IMEI),
        Model, manufacture and OS version of device, session start/stop time,
        locale (specific location where a given language is spoken), time zone,
        and network status (WiFi, etc.). It hashes iOS device identifiers, MAC
        address and IMEI; however, we do not hash platform device identifiers
        such as the iOS Identifier for Advertising, ANDROID_ID and the BB_PIN.
        Hashing involves the transformation of these identifiers into a value or
        key that represents the original identifier. The device identifiers (if
        applicable), IMEI (if applicable), MAC address (if applicable), and
        platform are hashed to a third party ID.
  - source_sentence: "What constitutes 'Received Information' as defined in this contract?"
    sentences:
      - >-
        "Effective Date" means the date as of which the last signature of a
        Party is affixed hereto.
      - >-
        "Received Information" means a user's private, personal or personally
        identifying or identifiable data or information, including content and
        contact information such as name, email address, or social network
        identifier.
      - >-
        B. HOW WE USE COLLECTED INFORMATION a. Any of the information (Personal
        and Non-personal) we collect from you may be used in one of the
        following ways: To personalize user experience- We may use Information
        to understand demographics, customer interest, and other trends among
        our Users;
  - source_sentence: What steps must precede mediation?
    sentences:
      - >-
        In case OntoChem finds a novel and unexpected antiviral use of those
        Rejected Hit Compounds during this 2-years period, it will notify Anixa
        about these findings and Anixa has the right of first negotiation during
        a period of 6 months after this notification.
      - >-
        11. Dispute Resolution. a. Negotiation. If a Party believes that the
        other Party has breached this Agreement or if there is a dispute between
        the Parties over the interpretation of this Agreement (a "Dispute"), the
        Parties will endeavor to resolve the Dispute through good faith
        negotiation for a period of thirty (30) days after a Party notifies the
        other Party of the Dispute and before either Party requests mediation or
        files litigation to resolve the Dispute. b. Mediation. If the Parties
        have been unable to resolve a Dispute through good faith negotiation as
        provided in the prior Subsection, a Party may request that the Parties
        attempt to resolve the Dispute through mediation by notifying the other
        Party with a copy to JAMS. The Parties will attempt to select a mutually
        acceptable JAMS mediator within ten (10) days of the notice requesting
        mediation. The mediation will be held in Lake County or Cook County,
        Illinois within thirty (30) days of the notice requesting mediation
        before a JAMS mediator and in compliance with JAMS mediation guidelines.
        Each party will bear its own costs in preparing for and participating in
        the mediation and one-half of the fees and expenses charged by JAMS for
        conducting the mediation. c. Litigation. If the Parties have been unable
        to resolve a Dispute through mediation as provided in the prior
        Subsection, a Party may file litigation against the other Party in a
        court of competent jurisdiction in the United States of America. With
        respect to litigation involving only the Parties or their Affiliates,
        the Parties irrevocably consent to the exclusive personal jurisdiction
        and venue of the U.S. federal and Illinois state courts of competent
        subject matter jurisdiction located in Lake County, Illinois or Cook
        County, Illinois and their respective higher courts of appeal for the
        limited purpose of resolving a Dispute, and the Parties waive, to the
        fullest extent permitted by law, any defense of inconvenient forum. The
        Parties waive any right to trial by jury as to any Disputes resolved
        through litigation. Notwithstanding the foregoing, a Party may file
        litigation to resolve a Dispute without undergoing either negotiation or
        mediation as provided in the prior Subsections for any Dispute
        involving: (i) infringement on intellectual property; (ii) the
        unauthorized use or disclosure of Confidential Information; or (iii) a
        request for a temporary restraining order, a preliminary or permanent
        injunction or any other type of equitable relief. d. Remedies. Except as
        expressly limited in the preceding Subsections and the other provisions
        in this Agreement, a Party may immediately exercise any rights and
        remedies available to the Party under Applicable Law upon a breach of
        this Agreement by the other Party. A Party will not suspend performance
        under or terminate this Agreement or any accepted purchase order for a
        product being purchased and sold under this Agreement unless: (1) the
        other Party is in material breach of this Agreement and has either
        refused to cure the material breach or has failed to cure the material
        breach within thirty (30) day of its receipt of written notice of the
        failure; and (2) the Parties have been unable to resolve the Dispute
        related to the material breach through negotiation or mediation, or the
        breaching Party has refused or failed to attempt to resolve the Dispute
        through negotiation or mediation, as provided in this Section.
        Notwithstanding the foregoing, a Party may suspend performance or
        terminate this Agreement or any accepted purchase order for a product
        being purchase and sold under this Agreement immediately on written
        notice to the other Party, and without providing the other Party an
        opportunity to cure the material breach or attempting to resolve a
        Dispute over the material breach by negotiation or mediation as provided
        in this Section, for a material breach by the other Party involving
        substantial harm to the reputation, goodwill and business of the
        non-breaching Party that cannot reasonably be avoided or fully redressed
        by providing the other Party an opportunity to cure the material breach.
        e. Late Fees and Collection Costs. If Buyer fails to pay Seller an
        amount owed under this Agreement by the invoice due date, then Buyer
        will owe Seller: (i) the delinquent amount; and (ii) a late payment fee
        equal to two percent (2%) of the delinquent amount for each full or
        partial calendar month past the invoice due date that the delinquent
        amount remains unpaid. In addition, if Seller has to file


        Source: REYNOLDS CONSUMER PRODUCTS INC., S-1, 11/15/2019






        litigation to collect the amount owed and Seller prevails in the
        litigation, Buyer will reimburse Seller for actual, reasonable,
        substantiated out-of-pocket expenses incurred by Seller in collecting
        the delinquent amount and accrued late payment fees on the delinquent
        amount. Under no circumstance will the late payment fee payable to
        Seller exceed the amount that a creditor may lawfully impose on a debtor
        on a delinquent amount under Applicable Law.
      - >-
        Third Party, Services, Ads and Analytics Ad companies may use and
        collect anonymous data about your interests to customize content and
        advertising here and in other sites and applications. Interest and
        location data may be linked to your device, but is not linked to your
        identity. Analytics companies may access anonymous data (such as your IP
        address or device ID) to help us understand how our services are used.
        They use this data solely on our behalf. They do not share it except in
        aggregate form; no data is shared as to any individual user.
model-index:
  - name: >-
      SentenceTransformer based on
      sentence-transformers/multi-qa-mpnet-base-cos-v1
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: multi qa mpnet base cos v1
          type: multi-qa-mpnet-base-cos-v1
        metrics:
          - type: cosine_accuracy@1
            value: 0.5675880348352896
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.7251041272245362
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.7890950397576676
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.8458917076864824
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.5675880348352896
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.24170137574151201
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.1578190079515335
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.08458917076864823
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.5675880348352896
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.7251041272245362
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.7890950397576676
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.8458917076864824
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.7056724990427845
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.6608194647289684
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.6658281637529629
            name: Cosine Map@100
          - type: dot_accuracy@1
            value: 0.5675880348352896
            name: Dot Accuracy@1
          - type: dot_accuracy@3
            value: 0.7251041272245362
            name: Dot Accuracy@3
          - type: dot_accuracy@5
            value: 0.7890950397576676
            name: Dot Accuracy@5
          - type: dot_accuracy@10
            value: 0.8458917076864824
            name: Dot Accuracy@10
          - type: dot_precision@1
            value: 0.5675880348352896
            name: Dot Precision@1
          - type: dot_precision@3
            value: 0.24170137574151201
            name: Dot Precision@3
          - type: dot_precision@5
            value: 0.1578190079515335
            name: Dot Precision@5
          - type: dot_precision@10
            value: 0.08458917076864823
            name: Dot Precision@10
          - type: dot_recall@1
            value: 0.5675880348352896
            name: Dot Recall@1
          - type: dot_recall@3
            value: 0.7251041272245362
            name: Dot Recall@3
          - type: dot_recall@5
            value: 0.7890950397576676
            name: Dot Recall@5
          - type: dot_recall@10
            value: 0.8458917076864824
            name: Dot Recall@10
          - type: dot_ndcg@10
            value: 0.7056724990427845
            name: Dot Ndcg@10
          - type: dot_mrr@10
            value: 0.6608194647289684
            name: Dot Mrr@10
          - type: dot_map@100
            value: 0.6658281637529629
            name: Dot Map@100

SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-cos-v1

This is a sentence-transformers model fine-tuned from sentence-transformers/multi-qa-mpnet-base-cos-v1. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
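The module list above ends in mean pooling (`pooling_mode_mean_tokens: True`) followed by `Normalize()`. As a rough illustration of what those two steps do, here is a minimal pure-Python sketch on toy 3-dimensional token vectors. The helper names (`mean_pool`, `l2_normalize`) and the toy numbers are assumptions for illustration only, not the library's internals.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    return [x / max(count, 1) for x in total]

def l2_normalize(vec):
    """Scale a vector to unit length, as a Normalize-style layer would."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy token outputs; the third token of sentence A is padding (mask 0).
tokens_a = [[1.0, 2.0, 0.0], [3.0, 0.0, 0.0], [9.0, 9.0, 9.0]]
mask_a = [1, 1, 0]
tokens_b = [[0.0, 1.0, 1.0], [0.0, 3.0, 1.0]]
mask_b = [1, 1]

pooled_a = mean_pool(tokens_a, mask_a)
pooled_b = mean_pool(tokens_b, mask_b)
emb_a = l2_normalize(pooled_a)
emb_b = l2_normalize(pooled_b)

# On unit-length embeddings the dot product and cosine similarity coincide.
print(dot(emb_a, emb_b), cosine(emb_a, emb_b))
```

Because the final embeddings have unit length, dot-product and cosine scores rank passages identically, which is consistent with the `cosine_*` and `dot_*` metrics in this card being equal.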

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("kperkins411/multi-qa-mpnet-base-cos-v1_MultipleNegativesRankingLoss")
# Run inference
sentences = [
    'What steps must precede mediation?',
    # Adjacent string literals are concatenated by Python into a single passage.
    '11. Dispute Resolution. a. Negotiation. If a Party believes that the other Party has breached this Agreement or if there is a dispute between the Parties over the interpretation of this Agreement (a "Dispute"), the Parties will endeavor to resolve the Dispute through good faith negotiation for a period of thirty (30) days after a Party notifies the other Party of the Dispute and before either Party requests mediation or files litigation to resolve the Dispute. b. Mediation. If the Parties have been unable to resolve a Dispute through good faith negotiation as provided in the prior Subsection, a Party may request that the Parties attempt to resolve the Dispute through mediation by notifying the other Party with a copy to JAMS. The Parties will attempt to select a mutually acceptable JAMS mediator within ten (10) days of the notice requesting mediation. The mediation will be held in Lake County or Cook County, Illinois within thirty (30) days of the notice requesting mediation before a JAMS mediator and in compliance with JAMS mediation guidelines. Each party will bear its own costs in preparing for and participating in the mediation and one-half of the fees and expenses charged by JAMS for conducting the mediation. c. Litigation. If the Parties have been unable to resolve a Dispute through mediation as provided in the prior Subsection, a Party may file litigation against the other Party in a court of competent jurisdiction in the United States of America. With respect to litigation involving only the Parties or their Affiliates, the Parties irrevocably consent to the exclusive personal jurisdiction and venue of the U.S. federal and Illinois state courts of competent subject matter jurisdiction located in Lake County, Illinois or Cook County, Illinois and their respective higher courts of appeal for the limited purpose of resolving a Dispute, and the Parties waive, to the fullest extent permitted by law, any defense of inconvenient forum. '
    'The Parties waive any right to trial by jury as to any Disputes resolved through litigation. Notwithstanding the foregoing, a Party may file litigation to resolve a Dispute without undergoing either negotiation or mediation as provided in the prior Subsections for any Dispute involving: (i) infringement on intellectual property; (ii) the unauthorized use or disclosure of Confidential Information; or (iii) a request for a temporary restraining order, a preliminary or permanent injunction or any other type of equitable relief. d. Remedies. Except as expressly limited in the preceding Subsections and the other provisions in this Agreement, a Party may immediately exercise any rights and remedies available to the Party under Applicable Law upon a breach of this Agreement by the other Party. A Party will not suspend performance under or terminate this Agreement or any accepted purchase order for a product being purchased and sold under this Agreement unless: (1) the other Party is in material breach of this Agreement and has either refused to cure the material breach or has failed to cure the material breach within thirty (30) day of its receipt of written notice of the failure; and (2) the Parties have been unable to resolve the Dispute related to the material breach through negotiation or mediation, or the breaching Party has refused or failed to attempt to resolve the Dispute through negotiation or mediation, as provided in this Section. '
    'Notwithstanding the foregoing, a Party may suspend performance or terminate this Agreement or any accepted purchase order for a product being purchase and sold under this Agreement immediately on written notice to the other Party, and without providing the other Party an opportunity to cure the material breach or attempting to resolve a Dispute over the material breach by negotiation or mediation as provided in this Section, for a material breach by the other Party involving substantial harm to the reputation, goodwill and business of the non-breaching Party that cannot reasonably be avoided or fully redressed by providing the other Party an opportunity to cure the material breach. e. Late Fees and Collection Costs. If Buyer fails to pay Seller an amount owed under this Agreement by the invoice due date, then Buyer will owe Seller: (i) the delinquent amount; and (ii) a late payment fee equal to two percent (2%) of the delinquent amount for each full or partial calendar month past the invoice due date that the delinquent amount remains unpaid. In addition, if Seller has to file\n\nSource: REYNOLDS CONSUMER PRODUCTS INC., S-1, 11/15/2019\n\n\n\n\n\nlitigation to collect the amount owed and Seller prevails in the litigation, Buyer will reimburse Seller for actual, reasonable, substantiated out-of-pocket expenses incurred by Seller in collecting the delinquent amount and accrued late payment fees on the delinquent amount. Under no circumstance will the late payment fee payable to Seller exceed the amount that a creditor may lawfully impose on a debtor on a delinquent amount under Applicable Law.',
    'In case OntoChem finds a novel and unexpected antiviral use of those Rejected Hit Compounds during this 2-years period, it will notify Anixa about these findings and Anixa has the right of first negotiation during a period of 6 months after this notification.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
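For semantic search, the similarity scores are typically used to rank candidate passages against a single query. The sketch below shows that ranking step in pure Python on toy 3-dimensional vectors. In practice the vectors would come from `model.encode`; the function name `rank_passages` and the toy values are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs, top_k=3):
    """Return (passage_index, score) pairs, best match first."""
    scored = [(cosine(query_vec, p), idx) for idx, p in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:top_k]]

# Toy stand-ins for the embedding of one question and three passages.
query_vec = [0.9, 0.1, 0.0]
passage_vecs = [
    [0.0, 1.0, 0.0],  # passage 0: mostly unrelated
    [1.0, 0.0, 0.0],  # passage 1: closest to the query
    [0.5, 0.5, 0.0],  # passage 2: partially related
]

ranking = rank_passages(query_vec, passage_vecs, top_k=2)
for idx, score in ranking:
    print(idx, round(score, 4))
```

With real embeddings, the index of the top-ranked entry would point at the passage most likely to answer the question.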

Evaluation

Metrics

Information Retrieval

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.5676 |
| cosine_accuracy@3 | 0.7251 |
| cosine_accuracy@5 | 0.7891 |
| cosine_accuracy@10 | 0.8459 |
| cosine_precision@1 | 0.5676 |
| cosine_precision@3 | 0.2417 |
| cosine_precision@5 | 0.1578 |
| cosine_precision@10 | 0.0846 |
| cosine_recall@1 | 0.5676 |
| cosine_recall@3 | 0.7251 |
| cosine_recall@5 | 0.7891 |
| cosine_recall@10 | 0.8459 |
| cosine_ndcg@10 | 0.7057 |
| cosine_mrr@10 | 0.6608 |
| cosine_map@100 | 0.6658 |
| dot_accuracy@1 | 0.5676 |
| dot_accuracy@3 | 0.7251 |
| dot_accuracy@5 | 0.7891 |
| dot_accuracy@10 | 0.8459 |
| dot_precision@1 | 0.5676 |
| dot_precision@3 | 0.2417 |
| dot_precision@5 | 0.1578 |
| dot_precision@10 | 0.0846 |
| dot_recall@1 | 0.5676 |
| dot_recall@3 | 0.7251 |
| dot_recall@5 | 0.7891 |
| dot_recall@10 | 0.8459 |
| dot_ndcg@10 | 0.7057 |
| dot_mrr@10 | 0.6608 |
| dot_map@100 | 0.6658 |
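To make the metric names concrete, the following sketch computes accuracy@k, precision@k, recall@k, MRR@k and NDCG@k for a single query, using the usual information-retrieval definitions. The function name `retrieval_metrics` and the passage ids are hypothetical. The reported values are averages of such per-query numbers, and they look consistent with each query having exactly one relevant passage (for example, precision@3 equals accuracy@3 divided by 3), though the card does not state this explicitly.

```python
import math

def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Per-query retrieval metrics over the top-k ranked passage ids."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]
    num_hits = sum(hits)

    # Reciprocal rank of the first relevant passage within the top k.
    reciprocal_rank = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            reciprocal_rank = 1.0 / rank
            break

    # Binary-relevance DCG, normalised by the best achievable DCG.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(k, len(relevant_ids))
    ideal_dcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,
        "precision": num_hits / k,
        "recall": num_hits / len(relevant_ids),
        "mrr": reciprocal_rank,
        "ndcg": dcg / ideal_dcg,
    }

# Hypothetical query whose single relevant passage is ranked second.
metrics = retrieval_metrics(["p7", "p2", "p9"], {"p2"}, k=3)
print(metrics)
```

With one relevant passage per query, accuracy@k and recall@k are the same quantity, and precision@k can never exceed 1/k.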

Training Details

Training Dataset

Unnamed Dataset

  • Size: 491,850 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    |  | anchor | positive | negative |
    |---|---|---|---|
    | type | string | string | string |
    | min | 7 tokens | 7 tokens | 6 tokens |
    | mean | 17.09 tokens | 102.69 tokens | 103.91 tokens |
    | max | 58 tokens | 512 tokens | 512 tokens |
  • Samples:
    | anchor | positive | negative |
    |---|---|---|
    | What safeguards are in place to protect the information obtained from third-party sources? | Information We Collect From Other Sources We may also receive information from other sources and combine that with information we collect through our Services. For example: If you choose to link, create, or log in to your Uber account with a payment provider (e.g., Google Wallet) or social media service (e.g., Facebook), or if you engage with a separate app or website that uses our API (or whose API we use), we may receive information about you or your connections from that site or app. | Use of cookies and other technology to collect information. |
    | What safeguards are in place to protect the information obtained from third-party sources? | Information We Collect From Other Sources We may also receive information from other sources and combine that with information we collect through our Services. For example: If you choose to link, create, or log in to your Uber account with a payment provider (e.g., Google Wallet) or social media service (e.g., Facebook), or if you engage with a separate app or website that uses our API (or whose API we use), we may receive information about you or your connections from that site or app. | c. The obligations specified in this Article shall not apply to Information for which the receiving Party can reasonably demonstrate that such Information: iii. becomes known to the receiving Party through disclosure by sources other than the disclosing Party, having a right to disclose such Information, |
    | What safeguards are in place to protect the information obtained from third-party sources? | Information We Collect From Other Sources We may also receive information from other sources and combine that with information we collect through our Services. For example: If you choose to link, create, or log in to your Uber account with a payment provider (e.g., Google Wallet) or social media service (e.g., Facebook), or if you engage with a separate app or website that uses our API (or whose API we use), we may receive information about you or your connections from that site or app. | You also may be able to link an account from a social networking service (e.g., Facebook, Google+, Yahoo!) to an account through our Services. This may allow you to use your credentials from the other site or service to sign in to certain features on our Services. If you link your account from a third-party site or service, we may collect information from those third-party accounts, and any information that we collect will be governed by this Privacy Policy. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
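As a rough picture of what these two parameters control, the sketch below implements an in-batch ranking loss in pure Python: every anchor is scored against all positives and negatives in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy pushes the anchor's own positive to the top. This is a simplified illustration under those assumptions, not the library's implementation; the function name `mnrl_loss` and the toy vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity (the "cos_sim" choice of similarity function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnrl_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    candidates = positives + negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, cand) for cand in candidates]
        # Numerically stable log-sum-exp over all candidate scores.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two (anchor, positive, negative) triplets in 2 dimensions.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

good_loss = mnrl_loss(anchors, positives, negatives)
# Swapping the positives makes each anchor prefer the wrong candidate.
bad_loss = mnrl_loss(anchors, positives[::-1], negatives)
print(good_loss, bad_loss)
```

A larger `scale` sharpens the softmax over candidates, so small differences in cosine similarity translate into larger differences in loss.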
    

Evaluation Dataset

Unnamed Dataset

  • Size: 6,000 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    |  | anchor | positive | negative |
    |---|---|---|---|
    | type | string | string | string |
    | min | 8 tokens | 7 tokens | 6 tokens |
    | mean | 23.16 tokens | 96.66 tokens | 102.41 tokens |
    | max | 124 tokens | 512 tokens | 512 tokens |
  • Samples:
    | anchor | positive | negative |
    |---|---|---|
    | What term is used to describe sensitive materials unique to the involved entities and not accessible by the general populace, regardless of its physical state or the manner of its revelation? | For purposes of this Agreement, "Confidential Information" means any data or information that is proprietary to the Parties and not generally known to the public, whether in tangible or intangible form, whenever and however disclosed, including but not limited to: | 6.1 In this Agreement, "Confidential Information" means information disclosed by (or on behalf of) one party to the other party under this Agreement that is marked as confidential or, from its nature, content or the circumstances in which it is disclosed, might reasonably be supposed to be confidential, including the terms and conditions (including the Exhibits) of this Agreement. It does not include information that the recipient already knew, that becomes public through no fault of the recipient, that was independently developed by the recipient or that was lawfully given to the recipient by a third party. |
    | What term is used to describe sensitive materials unique to the involved entities and not accessible by the general populace, regardless of its physical state or the manner of its revelation? | For purposes of this Agreement, "Confidential Information" means any data or information that is proprietary to the Parties and not generally known to the public, whether in tangible or intangible form, whenever and however disclosed, including but not limited to: | 1. “Confidential Information” shall mean the Purpose (including the contemplated transaction), identity of, and any discussions or negotiations between, the Parties, existence of this Agreement, and any and all information whether in oral, written, graphic or electronic form, including but not limited to, data, know-how and any and all subject matter (whether patentable or not, including without limitation any derivatives thereof) pertaining to Verenium’s research, financial data, sales information, inventions, development, materials, technology, trade secrets, work in process, marketing, business plans, regulatory information and strategies, scientific, engineering and/or manufacturing processes or equipment, protocols, assays, strains, compounds, genes, gene pathways, enzymes, peptides, the commercial applications of genes, gene pathways, enzymes, peptides, accessing microbial diversity, manipulating and modifying genes and gene pathways, identifying bioactive compounds through recombinant techniques and any other elements of Verenium’s business which Verenium considers to be of value, including its present or future products, projections, sales, pricing, customers, employees, investors and contractual relationships. |
    | What term is used to describe sensitive materials unique to the involved entities and not accessible by the general populace, regardless of its physical state or the manner of its revelation? | For purposes of this Agreement, "Confidential Information" means any data or information that is proprietary to the Parties and not generally known to the public, whether in tangible or intangible form, whenever and however disclosed, including but not limited to: | Confidential Information means any information disclosed by one party (the ‘Discloser’) to the other (the ‘Recipient’) relating directly or indirectly to Name of Technology/Project, file # which is identified by the Discloser, either orally or in writing, as confidential, either at the time of disclosure or, if disclosed orally, confirmed in writing within thirty (30) days following the original disclosure. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
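
As a rough illustration of what these two parameters control, the sketch below computes an in-batch ranking loss in plain Python: each anchor is scored against every candidate with cosine similarity, the scores are multiplied by `scale`, and cross-entropy is taken with the anchor's own positive as the target. The function names `cos_sim` and `mnr_loss` and the toy vectors are illustrative assumptions, not the library's implementation, which operates on batched tensors.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def mnr_loss(anchors, positives, negatives=(), scale=20.0):
    """Mean cross-entropy over anchors; positives[i] is the target for anchors[i].

    Every other positive (and any explicit negative) acts as a distractor.
    """
    candidates = list(positives) + list(negatives)
    losses = []
    for i, anchor in enumerate(anchors):
        scores = [scale * cos_sim(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(scores)
        log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
        losses.append(log_denominator - scores[i])
    return sum(losses) / len(losses)

# Toy 2-d embeddings (hypothetical values, chosen only for illustration).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]

loss_aligned = mnr_loss(anchors, positives)        # each anchor matches its positive
loss_swapped = mnr_loss(anchors, positives[::-1])  # each anchor matches the wrong one
print(loss_aligned, loss_swapped)
```

With `scale` at 20.0, a cosine gap of 1.0 between the correct and incorrect candidate becomes a logit gap of 20, so the aligned case gives a loss near zero and the swapped case a loss near 20.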
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 2e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch | Step | Training Loss | loss | multi-qa-mpnet-base-cos-v1_cosine_map@100
0 0 - - 0.4784
0.0065 100 0.9364 - -
0.0130 200 0.8395 - -
0.0195 300 0.7295 - -
0.0260 400 0.7025 - -
0.0325 500 0.6212 - -
0.0390 600 0.6038 - -
0.0455 700 0.5723 - -
0.0520 800 0.552 - -
0.0586 900 0.5407 - -
0.0651 1000 0.5332 - -
0.0716 1100 0.4981 - -
0.0781 1200 0.4671 - -
0.0846 1300 0.4756 - -
0.0911 1400 0.4461 - -
0.0976 1500 0.4425 - -
0.1041 1600 0.4329 - -
0.1106 1700 0.4412 - -
0.1171 1800 0.3952 - -
0.1236 1900 0.4179 - -
0.1301 2000 0.4157 - -
0.1366 2100 0.4014 - -
0.1431 2200 0.3747 - -
0.1496 2300 0.3596 - -
0.1561 2400 0.3571 - -
0.1626 2500 0.3717 - -
0.1691 2600 0.3369 - -
0.1757 2700 0.3508 - -
0.1822 2800 0.3281 - -
0.1887 2900 0.3285 - -
0.1952 3000 0.3423 - -
0.2017 3100 0.2967 - -
0.2082 3200 0.3076 - -
0.2147 3300 0.3223 - -
0.2212 3400 0.3097 - -
0.2277 3500 0.2964 - -
0.2342 3600 0.2836 - -
0.2407 3700 0.3007 - -
0.2472 3800 0.2882 - -
0.2537 3900 0.2852 - -
0.2602 4000 0.2923 - -
0.2667 4100 0.2938 - -
0.2732 4200 0.2597 - -
0.2797 4300 0.2423 - -
0.2863 4400 0.2719 - -
0.2928 4500 0.2546 - -
0.2993 4600 0.2545 - -
0.3058 4700 0.2538 - -
0.3123 4800 0.249 - -
0.3188 4900 0.2473 - -
0.3253 5000 0.2398 - -
0.3318 5100 0.254 - -
0.3383 5200 0.2399 - -
0.3448 5300 0.2367 - -
0.3513 5400 0.2208 - -
0.3578 5500 0.2201 - -
0.3643 5600 0.2384 - -
0.3708 5700 0.2166 - -
0.3773 5800 0.1949 - -
0.3838 5900 0.2127 - -
0.3903 6000 0.2032 - -
0.3969 6100 0.2073 - -
0.4034 6200 0.2124 - -
0.4099 6300 0.1963 - -
0.4164 6400 0.1965 - -
0.4229 6500 0.2088 - -
0.4294 6600 0.2079 - -
0.4359 6700 0.1902 - -
0.4424 6800 0.1785 - -
0.4489 6900 0.2063 - -
0.4554 7000 0.1781 - -
0.4619 7100 0.172 - -
0.4684 7200 0.1733 - -
0.4749 7300 0.192 - -
0.4814 7400 0.195 - -
0.4879 7500 0.1926 - -
0.4944 7600 0.1754 - -
0.5009 7700 0.1859 - -
0.5074 7800 0.1779 - -
0.5140 7900 0.1714 - -
0.5205 8000 0.1639 - -
0.5270 8100 0.1527 - -
0.5335 8200 0.1695 - -
0.5400 8300 0.1501 - -
0.5465 8400 0.1636 - -
0.5530 8500 0.166 - -
0.5595 8600 0.1554 - -
0.5660 8700 0.1571 - -
0.5725 8800 0.1506 - -
0.5790 8900 0.1504 - -
0.5855 9000 0.1601 - -
0.5920 9100 0.1413 - -
0.5985 9200 0.15 - -
0.6050 9300 0.1473 - -
0.6115 9400 0.1509 - -
0.6180 9500 0.1555 - -
0.6246 9600 0.1477 - -
0.6311 9700 0.1399 - -
0.6376 9800 0.1422 - -
0.6441 9900 0.1383 - -
0.6506 10000 0.1299 - -
0.6571 10100 0.1328 - -
0.6636 10200 0.147 - -
0.6701 10300 0.152 - -
0.6766 10400 0.136 - -
0.6831 10500 0.1409 - -
0.6896 10600 0.1298 - -
0.6961 10700 0.1359 - -
0.7026 10800 0.137 - -
0.7091 10900 0.1245 - -
0.7156 11000 0.1303 - -
0.7221 11100 0.1307 - -
0.7286 11200 0.1171 - -
0.7352 11300 0.1319 - -
0.7417 11400 0.1296 - -
0.7482 11500 0.1344 - -
0.7547 11600 0.1195 - -
0.7612 11700 0.1048 - -
0.7677 11800 0.1242 - -
0.7742 11900 0.1163 - -
0.7807 12000 0.1253 - -
0.7872 12100 0.1215 - -
0.7937 12200 0.1092 - -
0.8002 12300 0.1131 - -
0.8067 12400 0.1155 - -
0.8132 12500 0.1211 - -
0.8197 12600 0.1235 - -
0.8262 12700 0.1242 - -
0.8327 12800 0.1068 - -
0.8392 12900 0.1352 - -
0.8457 13000 0.1156 - -
0.8523 13100 0.129 - -
0.8588 13200 0.1113 - -
0.8653 13300 0.1165 - -
0.8718 13400 0.1083 - -
0.8783 13500 0.1081 - -
0.8848 13600 0.105 - -
0.8913 13700 0.1088 - -
0.8978 13800 0.1067 - -
0.9043 13900 0.1032 - -
0.9108 14000 0.0989 - -
0.9173 14100 0.1044 - -
0.9238 14200 0.1032 - -
0.9303 14300 0.108 - -
0.9368 14400 0.0905 - -
0.9433 14500 0.098 - -
0.9498 14600 0.12 - -
0.9563 14700 0.122 - -
0.9629 14800 0.1011 - -
0.9694 14900 0.0943 - -
0.9759 15000 0.1031 - -
0.9824 15100 0.1099 - -
0.9889 15200 0.1034 - -
0.9954 15300 0.0896 - -
1.0 15371 - 0.441 -
1.0019 15400 0.0887 - -
1.0084 15500 0.0958 - -
1.0149 15600 0.0929 - -
1.0214 15700 0.083 - -
1.0279 15800 0.0897 - -
1.0344 15900 0.0924 - -
1.0409 16000 0.0897 - -
1.0474 16100 0.0912 - -
1.0539 16200 0.0912 - -
1.0604 16300 0.0851 - -
1.0669 16400 0.0779 - -
1.0735 16500 0.0886 - -
1.0800 16600 0.0876 - -
1.0865 16700 0.0831 - -
1.0930 16800 0.0858 - -
1.0995 16900 0.0821 - -
1.1060 17000 0.0835 - -
1.1125 17100 0.0907 - -
1.1190 17200 0.0764 - -
1.1255 17300 0.0853 - -
1.1320 17400 0.1002 - -
1.1385 17500 0.0717 - -
1.1450 17600 0.0926 - -
1.1515 17700 0.0864 - -
1.1580 17800 0.0758 - -
1.1645 17900 0.0806 - -
1.1710 18000 0.0866 - -
1.1775 18100 0.0876 - -
1.1840 18200 0.0905 - -
1.1906 18300 0.0747 - -
1.1971 18400 0.0731 - -
1.2036 18500 0.0724 - -
1.2101 18600 0.0835 - -
1.2166 18700 0.0809 - -
1.2231 18800 0.0722 - -
1.2296 18900 0.0799 - -
1.2361 19000 0.0675 - -
1.2426 19100 0.0704 - -
1.2491 19200 0.0749 - -
1.2556 19300 0.0743 - -
1.2621 19400 0.0798 - -
1.2686 19500 0.0691 - -
1.2751 19600 0.0782 - -
1.2816 19700 0.0776 - -
1.2881 19800 0.0807 - -
1.2946 19900 0.0881 - -
1.3012 20000 0.081 - -
1.3077 20100 0.073 - -
1.3142 20200 0.0758 - -
1.3207 20300 0.0752 - -
1.3272 20400 0.082 - -
1.3337 20500 0.0763 - -
1.3402 20600 0.0727 - -
1.3467 20700 0.0793 - -
1.3532 20800 0.0759 - -
1.3597 20900 0.0666 - -
1.3662 21000 0.0714 - -
1.3727 21100 0.0636 - -
1.3792 21200 0.0724 - -
1.3857 21300 0.0703 - -
1.3922 21400 0.0687 - -
1.3987 21500 0.0748 - -
1.4052 21600 0.0761 - -
1.4117 21700 0.059 - -
1.4183 21800 0.0717 - -
1.4248 21900 0.0631 - -
1.4313 22000 0.0591 - -
1.4378 22100 0.0729 - -
1.4443 22200 0.0825 - -
1.4508 22300 0.0761 - -
1.4573 22400 0.0734 - -
1.4638 22500 0.0678 - -
1.4703 22600 0.0674 - -
1.4768 22700 0.0638 - -
1.4833 22800 0.0763 - -
1.4898 22900 0.0686 - -
1.4963 23000 0.0743 - -
1.5028 23100 0.0685 - -
1.5093 23200 0.0645 - -
1.5158 23300 0.0611 - -
1.5223 23400 0.0678 - -
1.5289 23500 0.0693 - -
1.5354 23600 0.0694 - -
1.5419 23700 0.0594 - -
1.5484 23800 0.0635 - -
1.5549 23900 0.069 - -
1.5614 24000 0.0609 - -
1.5679 24100 0.0673 - -
1.5744 24200 0.062 - -
1.5809 24300 0.0652 - -
1.5874 24400 0.0685 - -
1.5939 24500 0.0648 - -
1.6004 24600 0.0612 - -
1.6069 24700 0.0624 - -
1.6134 24800 0.0635 - -
1.6199 24900 0.0585 - -
1.6264 25000 0.066 - -
1.6329 25100 0.0678 - -
1.6395 25200 0.0619 - -
1.6460 25300 0.066 - -
1.6525 25400 0.058 - -
1.6590 25500 0.0649 - -
1.6655 25600 0.0626 - -
1.6720 25700 0.0687 - -
1.6785 25800 0.0593 - -
1.6850 25900 0.0632 - -
1.6915 26000 0.0705 - -
1.6980 26100 0.0598 - -
1.7045 26200 0.0667 - -
1.7110 26300 0.0595 - -
1.7175 26400 0.0635 - -
1.7240 26500 0.065 - -
1.7305 26600 0.0556 - -
1.7370 26700 0.0559 - -
1.7435 26800 0.0552 - -
1.7500 26900 0.0577 - -
1.7566 27000 0.0666 - -
1.7631 27100 0.06 - -
1.7696 27200 0.0465 - -
1.7761 27300 0.0621 - -
1.7826 27400 0.056 - -
1.7891 27500 0.062 - -
1.7956 27600 0.0554 - -
1.8021 27700 0.0656 - -
1.8086 27800 0.0573 - -
1.8151 27900 0.0555 - -
1.8216 28000 0.0611 - -
1.8281 28100 0.0538 - -
1.8346 28200 0.0573 - -
1.8411 28300 0.051 - -
1.8476 28400 0.0599 - -
1.8541 28500 0.0592 - -
1.8606 28600 0.0568 - -
1.8672 28700 0.0549 - -
1.8737 28800 0.0558 - -
1.8802 28900 0.0545 - -
1.8867 29000 0.048 - -
1.8932 29100 0.056 - -
1.8997 29200 0.054 - -
1.9062 29300 0.06 - -
1.9127 29400 0.0586 - -
1.9192 29500 0.0606 - -
1.9257 29600 0.0648 - -
1.9322 29700 0.0601 - -
1.9387 29800 0.0582 - -
1.9452 29900 0.0551 - -
1.9517 30000 0.0575 - -
1.9582 30100 0.0547 - -
1.9647 30200 0.0612 - -
1.9712 30300 0.0601 - -
1.9778 30400 0.0516 - -
1.9843 30500 0.0503 - -
1.9908 30600 0.0561 - -
1.9973 30700 0.0558 - -
2.0 30742 - 0.4783 0.6658
  • The bold row denotes the saved checkpoint.
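
For readers who want to plot or filter the log, the rows can be parsed as five whitespace-separated fields, with `-` standing for "not logged at this step". The sketch below is a minimal example under that assumption; the helper name `parse_log_row` and the dictionary keys are hypothetical, and the sample rows are copied from the table above. The `loss` column appears to hold the evaluation loss, which is only logged at epoch boundaries (`eval_strategy: epoch`).

```python
def parse_log_row(line):
    """Split one log row into typed fields; '-' becomes None."""
    epoch, step, train_loss, eval_loss, metric = line.split()

    def to_float(cell):
        return None if cell == "-" else float(cell)

    return {
        "epoch": float(epoch),
        "step": int(step),
        "train_loss": to_float(train_loss),
        "loss": to_float(eval_loss),
        "cosine_map@100": to_float(metric),
    }

# A handful of rows copied verbatim from the training log.
sample_rows = [
    "0 0 - - 0.4784",
    "0.0065 100 0.9364 - -",
    "1.0 15371 - 0.441 -",
    "1.9973 30700 0.0558 - -",
    "2.0 30742 - 0.4783 0.6658",
]

rows = [parse_log_row(r) for r in sample_rows]
train_rows = [r for r in rows if r["train_loss"] is not None]
eval_rows = [r for r in rows if r["loss"] is not None]

for r in eval_rows:
    print(r["step"], r["loss"], r["cosine_map@100"])
```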

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 3.1.0.dev0
  • Transformers: 4.41.2
  • PyTorch: 2.4.0+cu121
  • Accelerate: 0.31.0
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}