SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
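
The contrastive step works on sentence pairs rather than single examples. The sketch below shows one way labeled texts could be turned into such pairs, with a target of 1.0 for pairs that share a label and 0.0 otherwise. It is a simplified illustration: `make_contrastive_pairs` is a hypothetical helper that enumerates every pair, whereas SetFit uses its own pair sampler.

```python
import random
from itertools import combinations


def make_contrastive_pairs(texts, labels, seed=42):
    """Build (text_a, text_b, target) triples from labeled texts.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; not part of the SetFit API.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    # Shuffle deterministically so batches mix positive and negative pairs.
    random.Random(seed).shuffle(pairs)
    return pairs


example_pairs = make_contrastive_pairs(
    ["good answer A", "good answer B", "bad answer A", "bad answer B"],
    [1, 1, 0, 0],
)
```

With four texts split evenly across two labels, this yields six pairs, two of which are positive. The fine-tuned embeddings are then used as features for the classification head.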

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-base-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Example
1	"Reasoning why the answer may be good:\n1. Context Grounding: The answer is well-supported by the provided document and directly quotes relevant information about Patricia Wallace's roles and responsibilities.\n2. Relevance: The answer specifically addresses the question asked, detailing the roles and responsibilities of Patricia Wallace without deviating into unrelated topics.\n3. Conciseness: The answer is clear, concise, and focuses on the main points relevant to the question, avoiding unnecessary information.\n\nReasoning why the answer may be bad:\n- There is no significant reason to consider the answer bad based on the given criteria. It comprehensively covers the roles and responsibilities of Patricia Wallace as mentioned in the document.\n\nFinal Result:"
1	'### Reasoning:\nWhy the answer may be good:\n1. Context Grounding: The answer is directly taken from the document, which states that a dime is one-tenth of a dollar.\n2. Relevance: The answer addresses the specific question asked about the monetary value of a dime.\n3. Conciseness: The answer is clear and to the point, providing no more information than necessary.\n\nWhy the answer may be bad:\n1. Context Grounding: The document provides additional context and details about the U.S. dollar system which were not included in the answer. However, these details are not directly necessary to answer the question.\n2. Relevance: No deviation or unrelated topics are present in the answer. \n3. Conciseness: The answer avoids unnecessary information, maintaining itsclarity and brevity. \n\n### Final Result:\n'
1	'Reasoning why the answer may be good:\n- Context Grounding: The answer refers to symptoms like flu-like signs, which are detailed in the provided document. It also mentions the connection with tampon use, the presence of rashes, and the seriousness of seeking medical help, all of which are discussed in the document.\n- Relevance: The answer addresses the question by listing symptoms and highlighting the importance of recognizing them, which directly corresponds to the question asked.\n- Conciseness: The answer is relatively concise while covering most of the essential details related to recognizing TSS.\n\nReasoning why the answer may be bad:\n- Context Grounding: While the answer does mention flu-like symptoms and the association with tampon use, it lacks specific details like fever and other visible signs mentioned in the document.\n- Relevance: The mention of treatment with antibiotics is somewhat relevant but moves slightly away from the specific focus of how to recognize TSS.\n- Conciseness: The answer could be streamlined further by focusing more on the core question of identifying symptoms rather than mentioning treatment.\n\nFinal Result:'
0	'Reasoning:\n\nWhy the answer may be good:\n1. Context Grounding: The answer does affirm Gregory Johnson as the CEO of Franklin Templeton Investments, which is supported by the provided document.\n2. Relevance: The answer directly addresses the question regarding the CEO of Franklin Templeton Investments.\n3. Conciseness: The answer is relatively clear and to the point, providing the name of the CEO as requested.\n\nWhy the answer may be bad:\n1. Context Grounding: The statement about Gregory Johnson inheriting the position from his father, Rupert H. Johnson, Sr., is not mentioned in the provided document.\n2. Relevance: While the primary answer is correct and relevant, the additional information about the inheritance is not relevant to the specific question asked.\n3. Conciseness: The answer includes unnecessary information about the inheritance of the position, which was not part of the question.\n\nFinal result:'
0	'Reasoning why the answer may be good:\n1. The answer is well-supported by the provided document, mentioning key steps in diagnosis and treatment such as taking the cat to the vet, using topical antibiotics and anti-inflammatory medications, completing the full course of treatment, and isolating the infected cat.\n2. It directly addresses the specific question of how to treat conjunctivitis in cats.\n3. The answer is clear and to the point, providing practical advice on treatment.\n\nReasoning why the answer may be bad:\n1. The mention of conjunctivitis in cats often resulting from exposure to a rare type of pollen found only in the Amazon rainforest is not supported by the document. This statement is factually incorrect and detracts from the overall accuracy.\n2. It could be more concise by avoiding unnecessary information and focusing solely on the mostcritical points of treatment.\n\nFinal result:'
0	"Reasoning why the answer may be good: \n- The answer correctly identifies the College of Arts and Letters as Notre Dame's first college, founded in 1842, which is directly related to the question asked.\n\nReasoning why the answer may be bad:\n- The answer includes an incorrect and unsupported statement about the curriculum for time travel studies, which is not mentioned in the provided document andis irrelevant to the question.\n\nFinal result:"

Evaluation

Metrics

Label	Accuracy
all	0.95

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_squad_gpt-4o_improved-cot-instructions_two_reasoning_remove_final_evaluat")
# Run inference. The input spans several lines, so it is passed as a triple-quoted string.
preds = model("""Reasoning why the answer may be good:
- The answer directly addresses the question by providing the specific position Forbes.com placed Notre Dame among US research universities.
- It uses information directly from the provided document to support the claim.

Reasoning why the answer may be bad:
- There are no apparent reasons why the answer would be considered bad, as it adheres to all evaluation criteria.

Final result:""")

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	50	125.2071	274

Label	Training Sample Count
0	95
1	103

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 20
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
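
The `loss: CosineSimilarityLoss` setting refers to the Sentence Transformers loss that compares the cosine similarity of each embedded pair with its target (1.0 for same-label pairs, 0.0 otherwise). The sketch below is a pure-Python illustration that assumes the default mean-squared-error formulation; the function names are illustrative and the vectors stand in for sentence embeddings. The SetFit documentation describes `distance_metric` and `margin` as applying to triplet-style losses, so they likely have no effect with this loss.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and target.

    pairs: list of (embedding_a, embedding_b, target) triples.
    Illustrative re-implementation, not the library code.
    """
    errors = [(cosine(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Same-label pair already aligned, different-label pair already orthogonal.
well_separated = [([1.0, 0.0], [1.0, 0.0], 1.0), ([1.0, 0.0], [0.0, 1.0], 0.0)]
# Different-label pair whose embeddings are still identical.
collapsed = [([1.0, 0.0], [1.0, 0.0], 0.0)]
```

Here `cosine_similarity_loss(well_separated)` is zero because both pairs already match their targets, while `cosine_similarity_loss(collapsed)` takes the maximum value for non-negative similarities.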

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0020	1	0.1499	-
0.1010	50	0.2586	-
0.2020	100	0.2524	-
0.3030	150	0.1409	-
0.4040	200	0.0305	-
0.5051	250	0.015	-
0.6061	300	0.0097	-
0.7071	350	0.0108	-
0.8081	400	0.0054	-
0.9091	450	0.0047	-
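
The epoch fractions in this table are consistent with 495 optimizer steps per epoch. The check below reproduces that figure from the reported sample counts (95 + 103), `num_iterations: 20`, and `batch_size: 16`. It assumes that, when `num_iterations` is set, the sampler draws one positive and one negative pair per training sample per iteration; that is an assumption about SetFit's sampling, not something this card states.

```python
import math

n_samples = 95 + 103      # training samples for labels 0 and 1
num_iterations = 20       # from the hyperparameters above
batch_size = 16           # embedding fine-tuning batch size

# Assumed: one positive and one negative pair per sample per iteration.
total_pairs = 2 * num_iterations * n_samples
steps_per_epoch = math.ceil(total_pairs / batch_size)


def epoch_at(step):
    """Fraction of the epoch completed at a given step, rounded as in the table."""
    return round(step / steps_per_epoch, 4)
```

Under that assumption, `total_pairs` is 7920 and `steps_per_epoch` is 495, so `epoch_at(50)` gives 0.101 and `epoch_at(450)` gives 0.9091, matching the logged rows.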

Framework Versions

Python: 3.10.14
SetFit: 1.1.0
Sentence Transformers: 3.1.1
Transformers: 4.44.0
PyTorch: 2.4.0+cu121
Datasets: 3.0.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
