metadata
base_model: sentence-transformers/all-mpnet-base-v2
library_name: setfit
metrics:
  - accuracy
pipeline_tag: text-classification
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
widget:
  - text: >
      Having previously lived in D.C., Rochester and Detroit and having made
      regular trips on the thruways and turnpikes in-between, I can truly say
      that the rest stops along the New York Thruway are the least desirable for
      food offerings. Even the NJ Turnpike offers a much better selection, with
      Ohio striking the best balance overall. Delaware has the largest rest
      stop, which offers a great selection but at the cost of having to
      negotiate a mall-size parking lot. Although I don't begrudge those who
      like McDonald's, I can honestly say I've never eaten at a rest stop or
      airport McDonalds, even when there were no other options. There's nothing
      wrong with wanting better food, so long as there are options available at
      reasonable prices.If there's one thing for which I can give credit to the
      New York Thruway rest stops, it's in forcing us to seek out roadside
      alternatives in the many communities along the way. As a result, my wife
      has an extensive collection of books on diners that has morphed into
      somewhat of an obsession over the years. Of course with smartphones and
      apps such as Yelp, finding exceptional food along the way has never been
      easier. Put another way, I see the thruway rest stop as a place for an
      early morning snack or cup of coffee when we're desperate. Unfortunately,
      the options are at their worst at 2 am, no matter where one stops.
  - text: >
      Now that Iran is actively funneling missiles, warheads and drones to
      Russia for use in Ukraine, and Russia is funneling technical expertise and
      supplies to Iran to make more weapons, things are quickly heating up and
      the clock is approaching midnight as Iran get closer and closer to
      weaponizing a nuclear MIRV ICBM.The no so cold war between Iran and
      Israel, Egypt, Saudi Arabia and the UAE is about to get very hot and
      Israel's efforts to avoid aligning against Russia in Syrian airspace
      (thank you President Obama) is about to fail as the Russo-Nato proxy war
      in Ukraine spills into the Middle East and a heavily armed and nuclear
      Israel gets drawn into a very open conflict with Iran and Russia.  The
      bombing of an Iranian plant inside Iran is major escalation and I doubt
      that the CIA and DIA were blindsided by the IDF operation as such a strike
      was likely meant to cripple Iranian efforts to resupply Russia as much as
      Iranian efforts to resupply Hizbollah in Lebanon.  With the Turks waging
      war in Syria, the air space over Syria is clearly going to become very
      crowded and very dangerous very quickly as Russia is stumbling into a
      second war with Israel through its Iranian proxy and Israel unlike Ukraine
      can take out both Russian and Iranian offensive capabilities.  We just
      witnessed the opening salvo of a hot war which is why the DIA, CIA have
      been in Tel Aviv and Cairo recently - it is not really about the
      Palestinian territories.
  - text: >
      It's the year of our Lord, 2023; it's hard to believe that we are having
      this conversation about the urgent necessity of ammo and lethal weapons. 
      WWI, WWII, the Korean War, Gulf Wars I & II, Afghanistan, ISIS, etc., have
      come and gone. This does not include the multitude of conflicts in Africa,
      Georgia, and other hot spots. Mankind has not changed a bit.  We are still
      driven by fear, greed, and the curse of the ego and its lust for power.
      Another article in today's edition discusses the Doomsday Clock and its
      relentless ticking toward oblivion.  It's just a matter of time -and Boom!
  - text: >
      i'd go further than the correct interpretation that putin's "cease fire"
      was nothing more than "propaganda."i suggest that the russian attack on
      kramatorsk on january 7, which russia falsely claimed killed 600 ukrainian
      soldiers, reveals the expectation that a cease fire would gather
      ukrainians in a rest area where they could be killed en masse. the
      headline was preplanned before the event.i point readers to the Institute
      for the Study of War (ISW) as an excellent daily summary of open source
      information by highly skilled military analysts. they point out that putin
      is using a "grievance-revenge" framing of russian military activities
      (e.g., kramatorsk was revenge for the grievance of russians killed in
      makiivka). the ISW points out that this has only worsened the antagonism
      toward the kremlin and military from pro-invasion russian commentators,
      who ask why any "grievance event" was allowed to occur in the first place.
  - text: >
      I cannot entirely agree with this.  If there's a disconnect between what's
      being taught, and what the student really wants to learn, that can be a
      problem. I, for example, learned a _LOT_ about computers, back in '84 --
      and a fair bit of other stuff, too.  (I speak what I'll term
      "conversational" Spanish; I can't claim to be fluent, but I can absolutely
      carry on modest conversations and express myself.)But the teachers in my
      core subjects were uninspired or flatly failed me (e.g., the CompSci prof
      who lost my test, and gave me a zero; that really took the wind out of my
      sails, considering I thought I nailed it).  So I was having far more fun
      at 11:00 p.m. in the computer lab than I was doing school work.  Bombed
      out of college, but I've now worked at four Fortune 500 companies, and am
      currently a senior cloud admin.  Students _do_ need to have a desire to
      learn, yes, but teachers need to be equipped properly to teach them, too.
inference: true
model-index:
  - name: SetFit with sentence-transformers/all-mpnet-base-v2
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.9
            name: Accuracy

SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
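
The pair construction behind step 1 can be illustrated with a minimal sketch. It is not SetFit's own sampler: it enumerates every pair, whereas this model was trained with an oversampling strategy. The helper name `build_contrastive_pairs` and the toy texts are hypothetical. Each pair gets a target of 1.0 when both texts share a label and 0.0 otherwise, which is the kind of target a cosine-similarity loss is trained against.

```python
from itertools import combinations


def build_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label, 0.0 otherwise.
    Hypothetical helper for illustration; SetFit samples pairs instead
    of enumerating all of them.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Toy data, invented for this example (not taken from the training set).
texts = [
    "nuclear risk is rising",
    "the doomsday clock moved",
    "cougars were seen near my yard",
    "rest stop food is poor",
]
labels = ["yes", "yes", "no", "no"]

pairs = build_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even four labeled texts yield six training pairs, which is why contrastive fine-tuning can work with very few labeled examples.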

Model Details

Model Description

Model Sources

Model Labels

Label Examples
yes
  • 'TIME Magazine prediction for 2023 (3Jan2023)"A cornered Russia will turn from global player into the world’s most dangerous rogue state, posing a serious and pervasive danger to Europe, the U.S., and beyond. Bogged down in Ukraine, with little to lose from further isolation and Western retaliation, and facing intense domestic pressure to show strength, Russia will turn to asymmetric warfare against the West to inflict damage through a thousand 'paper cuts' rather than by overt aggression that depends on military and economic power that Russia no longer has.Putin’s nuclear saber-rattling will escalate. Kremlin-affiliated hackers will ramp up increasingly sophisticated cyberattacks on Western firms, governments, and infrastructure. Russia will intensify its offensive against Western elections by systematically supporting and funding disinformation and extremism. Attacks on Ukrainian infrastructure will continue.In short, Rogue Russia is a threat to global security, Western political systems, the cybersphere, and food security. Not to mention every Ukrainian civilian."\n'
  • "Bulletin of the Atomic Scientists advanced the Doomsday Clock, now to 90 seconds due to increasing nuclear risk.The rulers are putting humans in peril, an unconscionable and unethical danger since we haven't consented to such risk.In view of the fact that, over millennia, the rulers have killed hundreds of millions of innocent people, we can question their claimed legitimacy, and reject their bogus claim.\n"
  • 'This article explains the bad political rusults although rulers might be acting rationally within their ideological frameworks.It is based on plausible speculation of Biden and Putin's ideologies, yet other plausible facts could be animating the escalations. For instance, some describe 'getting ukrained' as "what happens to you if you ally with the U.S. government," and Joe Biden might be escalating to avoid such observations.Notice that these types of explanations do not rely on free will, but that rulers are prisoner to the constraints and incentives facing them, even if this ends with humanity being nuked again.Bulletin of Atomic Scientists advancing the Doomsday Clock is largely in line with rulers vs humanity framework, but as Douthat explains, this is different than the logic of the rulers.Another view, that of Prof. Mearshimer's presents a pessimistic view of this Ukraine War, while being remarkably prescient providing yet another framework to understand what's likely to happen; let's hope that he's wrong, althought lacking evidence for this optimism.\n'
no
  • "M Martínez - Doubtful. The US has been conducting virtually Perpetual War (mostly against smaller, weaker, brown-skinned nations) since day one and that hasn't dulled the Chickenhawk politicians (see: Bush the Lesser, George) from happily pushing us into the next one.Starting wars that are fought by Other Mother's Children and are profitable for the war-mongers will never cease.\n"
  • "I know it is easy to blame America always, but we are largely blameless. We opened trade with China and this allowed China to industrialize and build its economy. We in the west believe in Free markets and free people. Chinese state adopted a version of capitalism but instead of liberalizing like South Korea and Taiwan decided to become more insular. They restricted access to western products for their citizens. Movies, TV shows had to be censored. American social media companies cannot do business in China. Chinese citizens are not masters of their own destiny as the state dictates every aspect of their lives. Many of us in the west enjoy the benefits of western liberalism, namely - Free markets, Rule of law ( including contract enforcement) and individual rights. In the cold war era, we had to actively defend these values from Soviets. Now, we must brace ourselves to defend them from China. Liberal order will prevail because once people know the values of western liberal order, like Hongkongers, Taiwanese etc they will defend it. We in US, must help them, become the arsenal of democracy, supply planes, ships, munitions to Taiwan to defend themselves. Help Hong Kong citizens by giving the persecuted asylum in the west. We are not responsible for confrontation with China, Chinese state's disregard for Taiwanese and Hongkong citizens aspirations is responsible for this.\n"
  • 'We probably have male, transient cougars moving through the area more frequently than wildlife experts and state officials document. My neighbors woke to a partially eaten deer carcass in their backyard, but heard no coyotes the night before. We hadn't heard this story yet, when a week later, my husband had a very large animal run in front of his car. It had a very long tail, short hair of all tan color and bounded as tall as the hood of his sedan. I posted this on a local wildlife FB page, and a man replied his daughter saw it while walking one their 2 dogs, and reported it was as big as their mastiff. A week later, my neighbor was walking her dog at 7 am, and saw it in a neighboring yard, at the top of a hill, "sitting like a sphinx" under a large blue juniper bush. My neighbor clearly saw a broad feline face and large white torso. Several months later, I heard a jogger in another part of my town also saw it early in the morning, and and went to FB posting a stock picture of a cougar with the comment, ''This is what I saw." An email sent to CTDEEP with all this information wasn't taken seriously, with their reply stating reports are usually confusing other animals. It's hard to know what CTDEEP might think we are confused about, since coyote, fox, fisher, black bear and deer have all been sighted in our yard or near us, frequently.\n'

Evaluation

Metrics

| Label | Accuracy |
|-------|----------|
| all   | 0.9      |

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("davidadamczyk/setfit-model-4")
# Run inference
preds = model("It's the year of our Lord, 2023; it's hard to believe that we are having this conversation about the urgent necessity of ammo and lethal weapons.  WWI, WWII, the Korean War, Gulf Wars I & II, Afghanistan, ISIS, etc., have come and gone. This does not include the multitude of conflicts in Africa, Georgia, and other hot spots. Mankind has not changed a bit.  We are still driven by fear, greed, and the curse of the ego and its lust for power. Another article in today's edition discusses the Doomsday Clock and its relentless ticking toward oblivion.  It's just a matter of time -and Boom!")

Training Details

Training Set Metrics

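| Training set | Min | Median  | Max |
|--------------|-----|---------|-----|
| Word count   | 18  | 133.075 | 255 |

| Label | Training Sample Count |
|-------|-----------------------|
| no    | 18                    |
| yes   | 22                    |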

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 120
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

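| Epoch  | Step | Training Loss | Validation Loss |
|--------|------|---------------|-----------------|
| 0.0017 | 1    | 0.4133        | -               |
| 0.0833 | 50   | 0.188         | -               |
| 0.1667 | 100  | 0.0071        | -               |
| 0.25   | 150  | 0.0002        | -               |
| 0.3333 | 200  | 0.0001        | -               |
| 0.4167 | 250  | 0.0001        | -               |
| 0.5    | 300  | 0.0001        | -               |
| 0.5833 | 350  | 0.0001        | -               |
| 0.6667 | 400  | 0.0001        | -               |
| 0.75   | 450  | 0.0001        | -               |
| 0.8333 | 500  | 0.0001        | -               |
| 0.9167 | 550  | 0.0001        | -               |
| 1.0    | 600  | 0.0001        | -               |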
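
The 600 steps in one epoch are consistent with the hyperparameters and sample counts reported above. The sketch below assumes that each of the `num_iterations` passes draws one positive and one negative pair per training sample; that assumption is not confirmed by this card, and the function name `steps_per_epoch` is hypothetical.

```python
import math


def steps_per_epoch(n_samples, num_iterations, batch_size):
    """Estimate optimizer steps for one epoch of contrastive fine-tuning.

    Assumption: every iteration yields one positive and one negative
    pair per training sample, so pairs = 2 * num_iterations * n_samples.
    """
    n_pairs = 2 * num_iterations * n_samples
    return math.ceil(n_pairs / batch_size)


# Values taken from the training set metrics and hyperparameters in this card.
n_samples = 18 + 22  # "no" + "yes" training samples
total_steps = steps_per_epoch(n_samples, num_iterations=120, batch_size=16)

print(total_steps)                  # 600
print(round(1 / total_steps, 4))    # 0.0017, the epoch fraction logged at step 1
print(round(50 / total_steps, 4))   # 0.0833, the epoch fraction logged at step 50
```

Under this assumption, 40 samples produce 9,600 pairs, and batches of 16 give exactly the 600 steps shown in the table.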

Framework Versions

  • Python: 3.10.13
  • SetFit: 1.1.0
  • Sentence Transformers: 3.0.1
  • Transformers: 4.45.2
  • PyTorch: 2.4.0+cu124
  • Datasets: 2.21.0
  • Tokenizers: 0.20.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}