metadata
base_model: BAAI/bge-base-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: >-
Reasoning:
**Why the answer may be good:**
- Context Grounding: The document provides specific information that the
College of Arts and Letters was established in 1842. The answer given in
the response is directly supported by the document.
- Relevance: The answer addresses the specific question asked by providing
the year the college was created.
- Conciseness: The answer is clear, precise, and straight to the point.
**Why the answer may be bad:**
- There does not appear to be any reasons why the answer may be bad based
on the criteria specified.
Final result: ****
- text: >-
The answer provided is:
"The average student at Notre Dame travels more than 750 miles to study
there."
Reasoning:
**Good points:**
1. **Context Grounding**: The answer is supported by information present
in the document, which states, "the average student traveled more than 750
miles to Notre Dame".
2. **Relevance**: The answer directly addresses the specific question
asking about the number of miles the average student travels to study at
Notre Dame.
3. **Conciseness**: The answer is clear and to the point without any
unnecessary information.
**Bad points:**
- There are no bad points in this case as the answer aligns perfectly with
all the evaluation criteria.
Final Result: ****
- text: >-
Reasoning why the answer may be good:
- The answer correctly identifies Mick LaSalle as the writer for the San
Francisco Chronicle.
- The answer states that Mick LaSalle awarded "Spectre" a perfect score,
which is supported by the document.
Reasoning why the answer may be bad:
- The answer is concise and to the point, fulfilling the criteria for
conciseness and relevance.
- The document provided confirms that Mick LaSalle gave "Spectre" a
perfect score of 100.
- There is no deviation into unrelated topics, maintaining focus on the
question asked.
Final result:
- text: >-
Reasoning why the answer may be good:
1. Context Grounding: The document does mention that The Review of
Politics was inspired by German Catholic journals.
2. Relevance: The answer addresses the specific question about what
inspired The Review of Politics.
Reasoning why the answer may be bad:
1. Context Grounding: The document does not support the claim that it
predominantly featured articles written by Karl Marx. In fact, none of the
intellectual leaders mentioned in the document are Karl Marx, and the
document emphasizes a Catholic intellectual revival, which is inconsistent
with Marx's philosophy.
2. Conciseness: The additional information about Karl Marx is not needed
and is misleading, detracting from the core answer.
Final Result: Bad
The overall response, despite having a relevant and correct part, is
ultimately flawed due to significant inaccuracies and irrelevant
information.
- text: >-
Reasoning why the answer may be good:
- The answer directly addresses the question by providing the specific
position Forbes.com placed Notre Dame among US research universities.
- It uses information directly from the provided document to support the
claim.
Reasoning why the answer may be bad:
- There are no apparent reasons why the answer would be considered bad, as
it adheres to all evaluation criteria.
Final result:
inference: true
model-index:
- name: SetFit with BAAI/bge-base-en-v1.5
results:
- task:
type: text-classification
name: Text Classification
dataset:
name: Unknown
type: unknown
split: test
metrics:
- type: accuracy
value: 0.95
name: Accuracy
SetFit with BAAI/bge-base-en-v1.5
This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer (a minimal sketch of both stages is shown below).
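As a rough illustration of these two stages, the sketch below fine-tunes BAAI/bge-base-en-v1.5 with SetFit's standard Trainer and lets it fit the logistic-regression head. The toy texts and labels are invented for illustration; only the base model name and the setfit Trainer/TrainingArguments API come from this card, so treat it as a sketch rather than the exact training script.

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Toy few-shot dataset (labels: 1 = good reasoning, 0 = bad reasoning) -- illustrative only.
train_dataset = Dataset.from_dict({
    "text": [
        "Reasoning why the answer may be good: ... Final result:",
        "Reasoning why the answer may be bad: ... Final result:",
    ],
    "label": [1, 0],
})

# Load the Sentence Transformer body that will be fine-tuned.
model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")

args = TrainingArguments(batch_size=16, num_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# trainer.train() runs both stages: contrastive fine-tuning of the embedding
# body, then fitting the LogisticRegression head on the resulting embeddings.
trainer.train()

print(model.predict(["Reasoning why the answer may be good: ... Final result:"]))
```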
Model Details
Model Description
- Model Type: SetFit
- Sentence Transformer body: BAAI/bge-base-en-v1.5
- Classification head: a LogisticRegression instance
- Maximum Sequence Length: 512 tokens
- Number of Classes: 2 classes
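To check this description programmatically, the snippet below (an optional sketch, not part of the original card) loads the model and inspects its embedding body and classification head via the standard SetFitModel attributes model_body and model_head.

```python
from setfit import SetFitModel

model = SetFitModel.from_pretrained(
    "Netta1994/setfit_baai_squad_gpt-4o_improved-cot-instructions_two_reasoning_remove_final_evaluat"
)

# The embedding body is a SentenceTransformer; its maximum sequence length should be 512.
print(model.model_body.get_max_seq_length())

# The classification head should be a scikit-learn LogisticRegression instance.
print(type(model.model_head).__name__)
```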
Model Sources
- Repository: SetFit on GitHub
- Paper: Efficient Few-Shot Learning Without Prompts
- Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts
Model Labels
Label | Examples |
---|---|
1 | |
0 | |
Evaluation
Metrics
Label | Accuracy |
---|---|
all | 0.95 |
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_squad_gpt-4o_improved-cot-instructions_two_reasoning_remove_final_evaluat")
# Run inference
preds = model("Reasoning why the answer may be good:
- The answer directly addresses the question by providing the specific position Forbes.com placed Notre Dame among US research universities.
- It uses information directly from the provided document to support the claim.
Reasoning why the answer may be bad:
- There are no apparent reasons why the answer would be considered bad, as it adheres to all evaluation criteria.
Final result:")
Training Details
Training Set Metrics
Training set | Min | Median | Max |
---|---|---|---|
Word count | 50 | 125.2071 | 274 |
Label | Training Sample Count |
---|---|
0 | 95 |
1 | 103 |
Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
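These values correspond to fields of setfit.TrainingArguments. The snippet below is a reconstruction from the list above, not the exact script used to train this model; distance_metric is left at its cosine-distance default.

```python
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(16, 16),                 # (embedding phase, classifier phase)
    num_epochs=(1, 1),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=20,
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    loss=CosineSimilarityLoss,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```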
Training Results
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0020 | 1 | 0.1499 | - |
0.1010 | 50 | 0.2586 | - |
0.2020 | 100 | 0.2524 | - |
0.3030 | 150 | 0.1409 | - |
0.4040 | 200 | 0.0305 | - |
0.5051 | 250 | 0.015 | - |
0.6061 | 300 | 0.0097 | - |
0.7071 | 350 | 0.0108 | - |
0.8081 | 400 | 0.0054 | - |
0.9091 | 450 | 0.0047 | - |
Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.1
- Transformers: 4.44.0
- PyTorch: 2.4.0+cu121
- Datasets: 3.0.0
- Tokenizers: 0.19.1
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}