metadata

tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
widget:
  - text: >-
      Title: Robust Contextual Bandit via the Capped-$\ell_{2}$ norm,

      Abstract: This paper considers the actor-critic contextual bandit for the
      mobile health

      (mHealth) intervention. The state-of-the-art decision-making methods in
      mHealth

      generally assume that the noise in the dynamic system follows the Gaussian

      distribution. Those methods use the least-square-based algorithm to
      estimate

      the expected reward, which is prone to the existence of outliers. To deal
      with

      the issue of outliers, we propose a novel robust actor-critic contextual
      bandit

      method for the mHealth intervention. In the critic updating, the

      capped-$\ell_{2}$ norm is used to measure the approximation error, which

      prevents outliers from dominating our objective. A set of weights could be

      achieved from the critic updating. Considering them gives a weighted
      objective

      for the actor updating. It provides the badly noised sample in the critic

      updating with zero weights for the actor updating. As a result, the
      robustness

      of both actor-critic updating is enhanced. There is a key parameter in the

      capped-$\ell_{2}$ norm. We provide a reliable method to properly set it by

      making use of one of the most fundamental definitions of outliers in

      statistics. Extensive experiment results demonstrate that our method can

      achieve almost identical results compared with the state-of-the-art
      methods on

      the dataset without outliers and dramatically outperform them on the
      datasets

      noised by outliers.
  - text: >-
      Title: Increasing the Reusability of Enforcers with Lifecycle Events,

      Abstract: Runtime enforcement can be effectively used to improve the
      reliability of

      software applications. However, it often requires the definition of ad hoc

      policies and enforcement strategies, which might be expensive to identify
      and

      implement. This paper discusses how to exploit lifecycle events to obtain

      useful enforcement strategies that can be easily reused across
      applications,

      thus reducing the cost of adoption of the runtime enforcement technology.
      The

      paper finally sketches how this idea can be used to define libraries that
      can

      automatically overcome problems related to applications misusing them.
  - text: >-
      Title: Generalized Minimum Distance Estimators in Linear Regression with
      Dependent Errors,

      Abstract: This paper discusses minimum distance estimation method in the
      linear

      regression model with dependent errors which are strongly mixing. The

      regression parameters are estimated through the minimum distance
      estimation

      method, and asymptotic distributional properties of the estimators are

      discussed. A simulation study compares the performance of the minimum
      distance

      estimator with other well celebrated estimator. This simulation study
      shows the

      superiority of the minimum distance estimator over another estimator.
      KoulMde

      (R package) which was used for the simulation study is available online.
      See

      section 4 for the detail.
  - text: >-
      Title: On the isoperimetric quotient over scalar-flat conformal classes,

      Abstract: Let $(M,g)$ be a smooth compact Riemannian manifold of dimension
      $n$ with

      smooth boundary $\partial M$. Suppose that $(M,g)$ admits a scalar-flat

      conformal metric. We prove that the supremum of the isoperimetric quotient
      over

      the scalar-flat conformal class is strictly larger than the best constant
      of

      the isoperimetric inequality in the Euclidean space, and consequently is

      achieved, if either (i) $n\ge 12$ and $\partial M$ has a nonumbilic point;
      or

      (ii) $n\ge 10$, $\partial M$ is umbilic and the Weyl tensor does not
      vanish at

      some boundary point.
  - text: >-
      Title: Monte Carlo Tree Search with Sampled Information Relaxation Dual
      Bounds,

      Abstract: Monte Carlo Tree Search (MCTS), most famously used in game-play
      artificial

      intelligence (e.g., the game of Go), is a well-known strategy for
      constructing

      approximate solutions to sequential decision problems. Its primary
      innovation

      is the use of a heuristic, known as a default policy, to obtain Monte
      Carlo

      estimates of downstream values for states in a decision tree. This
      information

      is used to iteratively expand the tree towards regions of states and
      actions

      that an optimal policy might visit. However, to guarantee convergence to
      the

      optimal action, MCTS requires the entire tree to be expanded
      asymptotically. In

      this paper, we propose a new technique called Primal-Dual MCTS that
      utilizes

      sampled information relaxation upper bounds on potential actions, creating
      the

      possibility of "ignoring" parts of the tree that stem from highly
      suboptimal

      choices. This allows us to prove that despite converging to a partial
      decision

      tree in the limit, the recommended action from Primal-Dual MCTS is
      optimal. The

      new approach shows significant promise when used to optimize the behavior
      of a

      single driver navigating a graph while operating on a ride-sharing
      platform.

      Numerical experiments on a real dataset of 7,000 trips in New Jersey
      suggest

      that Primal-Dual MCTS improves upon standard MCTS by producing deeper
      decision

      trees and exhibits a reduced sensitivity to the size of the action space.
metrics:
  - accuracy
pipeline_tag: text-classification
library_name: setfit
inference: false
datasets:
  - bhujith10/multi_class_classification_dataset
base_model: google-bert/bert-large-uncased

SetFit with google-bert/bert-large-uncased

This is a SetFit model for text classification, trained on the bhujith10/multi_class_classification_dataset dataset. It uses google-bert/bert-large-uncased as the Sentence Transformer embedding model and a SetFitHead instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
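
The contrastive step works on pairs of texts rather than single examples. The following is a minimal sketch of how such pairs could be built from labeled data, assuming that texts sharing a label form a positive pair (target 1.0) and texts with different labels form a negative pair (target 0.0). The function name `build_pairs` and the toy data are hypothetical, and this is not SetFit's own sampler.

```python
import random
from itertools import combinations


def build_pairs(texts, labels, seed=42):
    """Return (text_a, text_b, target) triples for every unordered pair of texts.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    rng = random.Random(seed)
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    rng.shuffle(pairs)  # shuffle so batches mix positive and negative pairs
    return pairs


# Toy data, made up for illustration (not taken from the training dataset).
texts = ["bandit algorithms", "tree search", "minimum distance estimator", "regression errors"]
labels = ["cs", "cs", "stat", "stat"]

pairs = build_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Because the number of pairs grows quadratically with the number of examples, a small labeled set still yields many training signals for the embedding model, which is the idea behind the few-shot setup.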

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: google-bert/bert-large-uncased
Classification head: a SetFitHead instance
Maximum Sequence Length: 512 tokens
Number of Classes: 6 classes
Training Dataset: bhujith10/multi_class_classification_dataset

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("bhujith10/bert-large-uncased-setfit_finetuned")

# A raw triple-quoted string keeps the line breaks and the LaTeX backslashes intact
text = r"""Title: On the isoperimetric quotient over scalar-flat conformal classes,
Abstract: Let $(M,g)$ be a smooth compact Riemannian manifold of dimension $n$ with
smooth boundary $\partial M$. Suppose that $(M,g)$ admits a scalar-flat
conformal metric. We prove that the supremum of the isoperimetric quotient over
the scalar-flat conformal class is strictly larger than the best constant of
the isoperimetric inequality in the Euclidean space, and consequently is
achieved, if either (i) $n\ge 12$ and $\partial M$ has a nonumbilic point; or
(ii) $n\ge 10$, $\partial M$ is umbilic and the Weyl tensor does not vanish at
some boundary point."""

# Run inference
preds = model(text)
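
The example inputs on this card all follow the same layout: a `Title:` line ending in a comma, then an `Abstract:` block. A small helper can build that string from separate fields. This is a sketch only; the name `format_input` is hypothetical, and it assumes the model expects new inputs in the same layout as the examples.

```python
def format_input(title: str, abstract: str) -> str:
    """Join a title and an abstract in the 'Title: ...,' / 'Abstract: ...' layout of the example inputs."""
    return f"Title: {title.strip()},\nAbstract: {abstract.strip()}"


text = format_input(
    "Increasing the Reusability of Enforcers with Lifecycle Events",
    "Runtime enforcement can be effectively used to improve the reliability of software applications.",
)
print(text)

# With the model loaded as above, the formatted string can be passed in a list
# to get one prediction per text:
# preds = model([text])
```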

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	23	145.8467	280
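
Word-count statistics like these can be computed for any list of texts with the standard library. The sketch below uses made-up strings and whitespace tokenization, which is an assumption about how the counts were produced. Note that a median of whole-number counts is always a whole or half number, so a fractional value such as the one in the table may in fact be a mean; that too is an assumption, not something the card states.

```python
from statistics import mean, median

# Made-up texts for illustration only.
texts = ["a b c", "a b c d e", "a b", "a b c d e f g h"]

# Count whitespace-separated words in each text.
counts = [len(t.split()) for t in texts]

summary = {
    "min": min(counts),
    "median": median(counts),
    "mean": mean(counts),
    "max": max(counts),
}
print(summary)
```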

Training Hyperparameters

batch_size: (4, 4)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
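
The `CosineSimilarityLoss` named above compares the cosine similarity of two embeddings with a target value and penalizes the squared difference. The following pure-Python sketch illustrates that computation on hand-written 2-D vectors. It is a simplified illustration under the assumption that targets are 1.0 for same-class pairs and 0.0 otherwise, not the library's implementation, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and the target of each pair."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical vectors, same class: squared error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal vectors, different class: squared error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # identical vectors, different class: squared error 1
]
loss = cosine_similarity_loss(pairs)
print(round(loss, 4))
```

Minimizing this quantity pulls same-class embeddings together and pushes different-class embeddings toward orthogonality, which is what makes the later classification head easy to train.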

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0003	1	0.22	-
0.0138	50	0.3706	-
0.0276	100	0.2389	-
0.0414	150	0.1628	-
0.0551	200	0.1401	-
0.0689	250	0.1043	-
0.0827	300	0.1047	-
0.0965	350	0.098	-
0.1103	400	0.0931	-
0.1241	450	0.1002	-
0.1379	500	0.0837	-
0.1516	550	0.0673	-
0.1654	600	0.0709	-
0.1792	650	0.08	-
0.1930	700	0.0719	-
0.2068	750	0.0805	-
0.2206	800	0.059	-
0.2344	850	0.0957	-
0.2481	900	0.0614	-
0.2619	950	0.0887	-
0.2757	1000	0.0713	-
0.2895	1050	0.0734	-
0.3033	1100	0.0519	-
0.3171	1150	0.0802	-
0.3309	1200	0.0817	-
0.3446	1250	0.0665	-
0.3584	1300	0.0515	-
0.3722	1350	0.0764	-
0.3860	1400	0.0564	-
0.3998	1450	0.0512	-
0.4136	1500	0.052	-
0.4274	1550	0.0398	-
0.4411	1600	0.0473	-
0.4549	1650	0.0433	-
0.4687	1700	0.0621	-
0.4825	1750	0.0506	-
0.4963	1800	0.0395	-
0.5101	1850	0.0516	-
0.5238	1900	0.0431	-
0.5376	1950	0.037	-
0.5514	2000	0.0299	-
0.5652	2050	0.0398	-
0.5790	2100	0.0335	-
0.5928	2150	0.0438	-
0.6066	2200	0.0436	-
0.6203	2250	0.0345	-
0.6341	2300	0.0396	-
0.6479	2350	0.0381	-
0.6617	2400	0.0377	-
0.6755	2450	0.0287	-
0.6893	2500	0.0393	-
0.7031	2550	0.0309	-
0.7168	2600	0.0363	-
0.7306	2650	0.0347	-
0.7444	2700	0.0299	-
0.7582	2750	0.0305	-
0.7720	2800	0.0349	-
0.7858	2850	0.0385	-
0.7996	2900	0.0412	-
0.8133	2950	0.0336	-
0.8271	3000	0.0422	-
0.8409	3050	0.0249	-
0.8547	3100	0.0285	-
0.8685	3150	0.0258	-
0.8823	3200	0.0309	-
0.8961	3250	0.0246	-
0.9098	3300	0.0271	-
0.9236	3350	0.0285	-
0.9374	3400	0.0318	-
0.9512	3450	0.0287	-
0.9650	3500	0.0298	-
0.9788	3550	0.021	-
0.9926	3600	0.036	-
1.0	3627	-	0.1036
1.0063	3650	0.0257	-
1.0201	3700	0.02	-
1.0339	3750	0.0333	-
1.0477	3800	0.0339	-
1.0615	3850	0.0283	-
1.0753	3900	0.0233	-
1.0891	3950	0.0311	-
1.1028	4000	0.0296	-
1.1166	4050	0.0271	-
1.1304	4100	0.0321	-
1.1442	4150	0.0221	-
1.1580	4200	0.026	-
1.1718	4250	0.0283	-
1.1856	4300	0.0378	-
1.1993	4350	0.0225	-
1.2131	4400	0.0237	-
1.2269	4450	0.0254	-
1.2407	4500	0.0253	-
1.2545	4550	0.023	-
1.2683	4600	0.0265	-
1.2821	4650	0.0255	-
1.2958	4700	0.0278	-
1.3096	4750	0.0285	-
1.3234	4800	0.0234	-
1.3372	4850	0.0282	-
1.3510	4900	0.0197	-
1.3648	4950	0.0284	-
1.3785	5000	0.0326	-
1.3923	5050	0.0233	-
1.4061	5100	0.0386	-
1.4199	5150	0.0308	-
1.4337	5200	0.0218	-
1.4475	5250	0.0288	-
1.4613	5300	0.0251	-
1.4750	5350	0.0255	-
1.4888	5400	0.0261	-
1.5026	5450	0.0253	-
1.5164	5500	0.0313	-
1.5302	5550	0.0277	-
1.5440	5600	0.0252	-
1.5578	5650	0.0293	-
1.5715	5700	0.0334	-
1.5853	5750	0.0285	-
1.5991	5800	0.0269	-
1.6129	5850	0.0267	-
1.6267	5900	0.0313	-
1.6405	5950	0.0243	-
1.6543	6000	0.0301	-
1.6680	6050	0.0266	-
1.6818	6100	0.0276	-
1.6956	6150	0.0293	-
1.7094	6200	0.0291	-
1.7232	6250	0.031	-
1.7370	6300	0.0283	-
1.7508	6350	0.0238	-
1.7645	6400	0.0261	-
1.7783	6450	0.0196	-
1.7921	6500	0.034	-
1.8059	6550	0.0255	-
1.8197	6600	0.0231	-
1.8335	6650	0.0256	-
1.8473	6700	0.0207	-
1.8610	6750	0.0325	-
1.8748	6800	0.0238	-
1.8886	6850	0.0277	-
1.9024	6900	0.0239	-
1.9162	6950	0.0239	-
1.9300	7000	0.0227	-
1.9438	7050	0.0236	-
1.9575	7100	0.0216	-
1.9713	7150	0.0248	-
1.9851	7200	0.0244	-
1.9989	7250	0.0203	-
2.0	7254	-	0.1068
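
The Epoch column can be recovered from the Step column, since the table logs epoch 1.0 at step 3627 and epoch 2.0 at step 7254. The sketch below reproduces a few rows and derives some related quantities. It assumes the first element of `batch_size` applies to the embedding phase and that warmup is a plain fraction of the total steps; both are assumptions, and the exact rounding used by the trainer is not checked here.

```python
steps_per_epoch = 3627    # step at which epoch 1.0 is logged in the table
total_steps = 7254        # step at which epoch 2.0 is logged in the table
batch_size = 4            # assumed: first element of batch_size (4, 4)
warmup_proportion = 0.1   # from the hyperparameter list


def epoch_at(step):
    """Fractional epoch for a given step, rounded as in the table."""
    return round(step / steps_per_epoch, 4)


# Upper bound on pairs seen per epoch (the last batch may be smaller).
pairs_per_epoch_upper = steps_per_epoch * batch_size

# Approximate number of warmup steps under the stated assumption.
warmup_steps = total_steps * warmup_proportion

# Validation losses as reported in the table, keyed by step.
validation = {3627: 0.1036, 7254: 0.1068}
best_step = min(validation, key=validation.get)

print(epoch_at(50), epoch_at(3650), epoch_at(7250))
print(pairs_per_epoch_upper, warmup_steps, best_step)
```

The validation loss is slightly lower after the first epoch (0.1036) than after the second (0.1068), while the training loss keeps falling. If the best checkpoint is chosen by validation loss, which is an assumption here, `load_best_model_at_end: True` would keep the epoch-1 weights.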

Framework Versions

Python: 3.10.12
SetFit: 1.1.0
Sentence Transformers: 3.3.1
Transformers: 4.45.2
PyTorch: 2.1.0+cu118
Datasets: 3.2.0
Tokenizers: 0.20.3

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}