metadata
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: >-
Title: Robust Contextual Bandit via the Capped-$\ell_{2}$ norm,
Abstract: This paper considers the actor-critic contextual bandit for the
mobile health
(mHealth) intervention. The state-of-the-art decision-making methods in
mHealth
generally assume that the noise in the dynamic system follows the Gaussian
distribution. Those methods use the least-square-based algorithm to
estimate
the expected reward, which is prone to the existence of outliers. To deal
with
the issue of outliers, we propose a novel robust actor-critic contextual
bandit
method for the mHealth intervention. In the critic updating, the
capped-$\ell_{2}$ norm is used to measure the approximation error, which
prevents outliers from dominating our objective. A set of weights could be
achieved from the critic updating. Considering them gives a weighted
objective
for the actor updating. It provides the badly noised sample in the critic
updating with zero weights for the actor updating. As a result, the
robustness
of both actor-critic updating is enhanced. There is a key parameter in the
capped-$\ell_{2}$ norm. We provide a reliable method to properly set it by
making use of one of the most fundamental definitions of outliers in
statistics. Extensive experiment results demonstrate that our method can
achieve almost identical results compared with the state-of-the-art
methods on
the dataset without outliers and dramatically outperform them on the
datasets
noised by outliers.
- text: >-
Title: Increasing the Reusability of Enforcers with Lifecycle Events,
Abstract: Runtime enforcement can be effectively used to improve the
reliability of
software applications. However, it often requires the definition of ad hoc
policies and enforcement strategies, which might be expensive to identify
and
implement. This paper discusses how to exploit lifecycle events to obtain
useful enforcement strategies that can be easily reused across
applications,
thus reducing the cost of adoption of the runtime enforcement technology.
The
paper finally sketches how this idea can be used to define libraries that
can
automatically overcome problems related to applications misusing them.
- text: >-
Title: Generalized Minimum Distance Estimators in Linear Regression with
Dependent Errors,
Abstract: This paper discusses minimum distance estimation method in the
linear
regression model with dependent errors which are strongly mixing. The
regression parameters are estimated through the minimum distance
estimation
method, and asymptotic distributional properties of the estimators are
discussed. A simulation study compares the performance of the minimum
distance
estimator with other well celebrated estimator. This simulation study
shows the
superiority of the minimum distance estimator over another estimator.
KoulMde
(R package) which was used for the simulation study is available online.
See
section 4 for the detail.
- text: >-
Title: On the isoperimetric quotient over scalar-flat conformal classes,
Abstract: Let $(M,g)$ be a smooth compact Riemannian manifold of dimension
$n$ with
smooth boundary $\partial M$. Suppose that $(M,g)$ admits a scalar-flat
conformal metric. We prove that the supremum of the isoperimetric quotient
over
the scalar-flat conformal class is strictly larger than the best constant
of
the isoperimetric inequality in the Euclidean space, and consequently is
achieved, if either (i) $n\ge 12$ and $\partial M$ has a nonumbilic point;
or
(ii) $n\ge 10$, $\partial M$ is umbilic and the Weyl tensor does not
vanish at
some boundary point.
- text: >-
Title: Monte Carlo Tree Search with Sampled Information Relaxation Dual
Bounds,
Abstract: Monte Carlo Tree Search (MCTS), most famously used in game-play
artificial
intelligence (e.g., the game of Go), is a well-known strategy for
constructing
approximate solutions to sequential decision problems. Its primary
innovation
is the use of a heuristic, known as a default policy, to obtain Monte
Carlo
estimates of downstream values for states in a decision tree. This
information
is used to iteratively expand the tree towards regions of states and
actions
that an optimal policy might visit. However, to guarantee convergence to
the
optimal action, MCTS requires the entire tree to be expanded
asymptotically. In
this paper, we propose a new technique called Primal-Dual MCTS that
utilizes
sampled information relaxation upper bounds on potential actions, creating
the
possibility of "ignoring" parts of the tree that stem from highly
suboptimal
choices. This allows us to prove that despite converging to a partial
decision
tree in the limit, the recommended action from Primal-Dual MCTS is
optimal. The
new approach shows significant promise when used to optimize the behavior
of a
single driver navigating a graph while operating on a ride-sharing
platform.
Numerical experiments on a real dataset of 7,000 trips in New Jersey
suggest
that Primal-Dual MCTS improves upon standard MCTS by producing deeper
decision
trees and exhibits a reduced sensitivity to the size of the action space.
metrics:
- accuracy
pipeline_tag: text-classification
library_name: setfit
inference: false
datasets:
- bhujith10/multi_class_classification_dataset
base_model: google-bert/bert-large-uncased
SetFit with google-bert/bert-large-uncased
This is a SetFit model trained on the bhujith10/multi_class_classification_dataset dataset that can be used for text classification. It uses google-bert/bert-large-uncased as the Sentence Transformer embedding model, with a SetFitHead instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves two steps (sketched in code after the list):
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
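As a rough illustration of these two steps, the sketch below fine-tunes a SetFit model on a handful of labeled examples. The texts, labels, and head settings here are illustrative assumptions, not the actual training data or script used for this model.
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Toy few-shot examples (illustrative only; the real model was trained on
# bhujith10/multi_class_classification_dataset with 6 classes)
train_dataset = Dataset.from_dict({
    "text": [
        "Title: A paper about robust regression, Abstract: We study robust estimators.",
        "Title: A paper about tree search, Abstract: We study Monte Carlo Tree Search.",
    ],
    "label": [0, 1],
})

# The Sentence Transformer body is loaded here; a differentiable SetFitHead is attached on top
model = SetFitModel.from_pretrained(
    "google-bert/bert-large-uncased",
    use_differentiable_head=True,
    head_params={"out_features": 2},
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(batch_size=4, num_epochs=1),
    train_dataset=train_dataset,
)
trainer.train()  # step 1: contrastive fine-tuning of the body; step 2: training the classification head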
Model Details
Model Description
- Model Type: SetFit
- Sentence Transformer body: google-bert/bert-large-uncased
- Classification head: a SetFitHead instance
- Maximum Sequence Length: 512 tokens
- Number of Classes: 6 classes
- Training Dataset: bhujith10/multi_class_classification_dataset
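These components can be inspected directly on a loaded model. The snippet below is a small sketch; the attribute names follow the SetFit 1.x API, and labels may be None if class names were not stored with the model.
from setfit import SetFitModel

model = SetFitModel.from_pretrained("bhujith10/bert-large-uncased-setfit_finetuned")

print(type(model.model_body))           # SentenceTransformer built on bert-large-uncased
print(model.model_body.max_seq_length)  # maximum sequence length (512)
print(type(model.model_head))           # SetFitHead classification head
print(model.labels)                     # class labels, if stored (may be None)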
Model Sources
- Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
- Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
- Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("bhujith10/bert-large-uncased-setfit_finetuned")
# Run inference
preds = model("Title: On the isoperimetric quotient over scalar-flat conformal classes,
Abstract: Let $(M,g)$ be a smooth compact Riemannian manifold of dimension $n$ with
smooth boundary $\partial M$. Suppose that $(M,g)$ admits a scalar-flat
conformal metric. We prove that the supremum of the isoperimetric quotient over
the scalar-flat conformal class is strictly larger than the best constant of
the isoperimetric inequality in the Euclidean space, and consequently is
achieved, if either (i) $n\ge 12$ and $\partial M$ has a nonumbilic point; or
(ii) $n\ge 10$, $\partial M$ is umbilic and the Weyl tensor does not vanish at
some boundary point.")
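The call above returns the predicted class for a single input. For several inputs, or for class probabilities, the standard SetFitModel helpers can be used; the example texts below are placeholders.
# Batch prediction and class probabilities
texts = [
    "Title: Example paper one, Abstract: A short abstract.",
    "Title: Example paper two, Abstract: Another short abstract.",
]
preds = model.predict(texts)        # one predicted class per input
probs = model.predict_proba(texts)  # per-class probabilities, one row per input
print(preds, probs.shape)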
Training Details
Training Set Metrics
Training set | Min | Median | Max |
---|---|---|---|
Word count | 23 | 145.8467 | 280 |
Training Hyperparameters
- batch_size: (4, 4)
- num_epochs: (2, 2)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True
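For reference, these values map roughly onto setfit.TrainingArguments as shown below. This is a reconstruction from the list above rather than the original training script; tuples give the value for the embedding phase and the classifier phase respectively.
from sentence_transformers.losses import BatchHardTripletLossDistanceFunction, CosineSimilarityLoss
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(4, 4),
    num_epochs=(2, 2),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=True,
)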
Training Results
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0003 | 1 | 0.22 | - |
0.0138 | 50 | 0.3706 | - |
0.0276 | 100 | 0.2389 | - |
0.0414 | 150 | 0.1628 | - |
0.0551 | 200 | 0.1401 | - |
0.0689 | 250 | 0.1043 | - |
0.0827 | 300 | 0.1047 | - |
0.0965 | 350 | 0.098 | - |
0.1103 | 400 | 0.0931 | - |
0.1241 | 450 | 0.1002 | - |
0.1379 | 500 | 0.0837 | - |
0.1516 | 550 | 0.0673 | - |
0.1654 | 600 | 0.0709 | - |
0.1792 | 650 | 0.08 | - |
0.1930 | 700 | 0.0719 | - |
0.2068 | 750 | 0.0805 | - |
0.2206 | 800 | 0.059 | - |
0.2344 | 850 | 0.0957 | - |
0.2481 | 900 | 0.0614 | - |
0.2619 | 950 | 0.0887 | - |
0.2757 | 1000 | 0.0713 | - |
0.2895 | 1050 | 0.0734 | - |
0.3033 | 1100 | 0.0519 | - |
0.3171 | 1150 | 0.0802 | - |
0.3309 | 1200 | 0.0817 | - |
0.3446 | 1250 | 0.0665 | - |
0.3584 | 1300 | 0.0515 | - |
0.3722 | 1350 | 0.0764 | - |
0.3860 | 1400 | 0.0564 | - |
0.3998 | 1450 | 0.0512 | - |
0.4136 | 1500 | 0.052 | - |
0.4274 | 1550 | 0.0398 | - |
0.4411 | 1600 | 0.0473 | - |
0.4549 | 1650 | 0.0433 | - |
0.4687 | 1700 | 0.0621 | - |
0.4825 | 1750 | 0.0506 | - |
0.4963 | 1800 | 0.0395 | - |
0.5101 | 1850 | 0.0516 | - |
0.5238 | 1900 | 0.0431 | - |
0.5376 | 1950 | 0.037 | - |
0.5514 | 2000 | 0.0299 | - |
0.5652 | 2050 | 0.0398 | - |
0.5790 | 2100 | 0.0335 | - |
0.5928 | 2150 | 0.0438 | - |
0.6066 | 2200 | 0.0436 | - |
0.6203 | 2250 | 0.0345 | - |
0.6341 | 2300 | 0.0396 | - |
0.6479 | 2350 | 0.0381 | - |
0.6617 | 2400 | 0.0377 | - |
0.6755 | 2450 | 0.0287 | - |
0.6893 | 2500 | 0.0393 | - |
0.7031 | 2550 | 0.0309 | - |
0.7168 | 2600 | 0.0363 | - |
0.7306 | 2650 | 0.0347 | - |
0.7444 | 2700 | 0.0299 | - |
0.7582 | 2750 | 0.0305 | - |
0.7720 | 2800 | 0.0349 | - |
0.7858 | 2850 | 0.0385 | - |
0.7996 | 2900 | 0.0412 | - |
0.8133 | 2950 | 0.0336 | - |
0.8271 | 3000 | 0.0422 | - |
0.8409 | 3050 | 0.0249 | - |
0.8547 | 3100 | 0.0285 | - |
0.8685 | 3150 | 0.0258 | - |
0.8823 | 3200 | 0.0309 | - |
0.8961 | 3250 | 0.0246 | - |
0.9098 | 3300 | 0.0271 | - |
0.9236 | 3350 | 0.0285 | - |
0.9374 | 3400 | 0.0318 | - |
0.9512 | 3450 | 0.0287 | - |
0.9650 | 3500 | 0.0298 | - |
0.9788 | 3550 | 0.021 | - |
0.9926 | 3600 | 0.036 | - |
1.0 | 3627 | - | 0.1036 |
1.0063 | 3650 | 0.0257 | - |
1.0201 | 3700 | 0.02 | - |
1.0339 | 3750 | 0.0333 | - |
1.0477 | 3800 | 0.0339 | - |
1.0615 | 3850 | 0.0283 | - |
1.0753 | 3900 | 0.0233 | - |
1.0891 | 3950 | 0.0311 | - |
1.1028 | 4000 | 0.0296 | - |
1.1166 | 4050 | 0.0271 | - |
1.1304 | 4100 | 0.0321 | - |
1.1442 | 4150 | 0.0221 | - |
1.1580 | 4200 | 0.026 | - |
1.1718 | 4250 | 0.0283 | - |
1.1856 | 4300 | 0.0378 | - |
1.1993 | 4350 | 0.0225 | - |
1.2131 | 4400 | 0.0237 | - |
1.2269 | 4450 | 0.0254 | - |
1.2407 | 4500 | 0.0253 | - |
1.2545 | 4550 | 0.023 | - |
1.2683 | 4600 | 0.0265 | - |
1.2821 | 4650 | 0.0255 | - |
1.2958 | 4700 | 0.0278 | - |
1.3096 | 4750 | 0.0285 | - |
1.3234 | 4800 | 0.0234 | - |
1.3372 | 4850 | 0.0282 | - |
1.3510 | 4900 | 0.0197 | - |
1.3648 | 4950 | 0.0284 | - |
1.3785 | 5000 | 0.0326 | - |
1.3923 | 5050 | 0.0233 | - |
1.4061 | 5100 | 0.0386 | - |
1.4199 | 5150 | 0.0308 | - |
1.4337 | 5200 | 0.0218 | - |
1.4475 | 5250 | 0.0288 | - |
1.4613 | 5300 | 0.0251 | - |
1.4750 | 5350 | 0.0255 | - |
1.4888 | 5400 | 0.0261 | - |
1.5026 | 5450 | 0.0253 | - |
1.5164 | 5500 | 0.0313 | - |
1.5302 | 5550 | 0.0277 | - |
1.5440 | 5600 | 0.0252 | - |
1.5578 | 5650 | 0.0293 | - |
1.5715 | 5700 | 0.0334 | - |
1.5853 | 5750 | 0.0285 | - |
1.5991 | 5800 | 0.0269 | - |
1.6129 | 5850 | 0.0267 | - |
1.6267 | 5900 | 0.0313 | - |
1.6405 | 5950 | 0.0243 | - |
1.6543 | 6000 | 0.0301 | - |
1.6680 | 6050 | 0.0266 | - |
1.6818 | 6100 | 0.0276 | - |
1.6956 | 6150 | 0.0293 | - |
1.7094 | 6200 | 0.0291 | - |
1.7232 | 6250 | 0.031 | - |
1.7370 | 6300 | 0.0283 | - |
1.7508 | 6350 | 0.0238 | - |
1.7645 | 6400 | 0.0261 | - |
1.7783 | 6450 | 0.0196 | - |
1.7921 | 6500 | 0.034 | - |
1.8059 | 6550 | 0.0255 | - |
1.8197 | 6600 | 0.0231 | - |
1.8335 | 6650 | 0.0256 | - |
1.8473 | 6700 | 0.0207 | - |
1.8610 | 6750 | 0.0325 | - |
1.8748 | 6800 | 0.0238 | - |
1.8886 | 6850 | 0.0277 | - |
1.9024 | 6900 | 0.0239 | - |
1.9162 | 6950 | 0.0239 | - |
1.9300 | 7000 | 0.0227 | - |
1.9438 | 7050 | 0.0236 | - |
1.9575 | 7100 | 0.0216 | - |
1.9713 | 7150 | 0.0248 | - |
1.9851 | 7200 | 0.0244 | - |
1.9989 | 7250 | 0.0203 | - |
2.0 | 7254 | - | 0.1068 |
Framework Versions
- Python: 3.10.12
- SetFit: 1.1.0
- Sentence Transformers: 3.3.1
- Transformers: 4.45.2
- PyTorch: 2.1.0+cu118
- Datasets: 3.2.0
- Tokenizers: 0.20.3
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}