---
base_model: sentence-transformers/all-mpnet-base-v2
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 'It might have been more fun for everyone if the Thruway Authority had given individual contracts for each rest stop, with the stipulation that each reflect some local regional character. This could interest travelers to maybe get off at the next exit and explore some local places. With every stop the same, the traveler might as well be in Kansas. '
- text: 'I was scammed by a fake retailer appearing on a Google search for a popular product, a Patagonia backpack, offered at a significant discount. The website seemed legitimate; I was given a choice of colors and sizes. The scammer provided a tracking number from China. I have bought discounted items before from China that are sold on eBay and are sent by Chinese parcel post, for which tracking information is scant. When whatever item that was mailed finally arrived at a completely different address in another state several weeks later, I alerted my credit card company of the fraud and was refunded the amount, despite the time frame it took to determine the scam. '
- text: 'From Matt Stoller''s newsletter (edited for flow):LastPass was purchased by two private equity firms, Francisco Partners and Evergreen Coast Capital Corp. Typically, PE firms raise prices, lower quality, harm workers, and reduce customer service. They then decided to charge customers $36 to access the cumbersome passwords. This particular pricing move sparked a backlash from customers, and the two PE firms pledged to spin off the company and make it independent. But that hasn’t happened.Poor quality is common within private equity owned software firms, which means cybersecurity vulnerabilities quickly follow. We’ve seen this with PE-owned software firms facilitating the hacking of the NYC subway, nuclear weapons facilities, and criminal ransomware. And now it’s happened with LastPass. Lovely. '
- text: 'Maybe for the ''come latelies'' this is a big storm, but for folks who have lived there, this is not something new.When El Nino dumps in the Sierras...THAT, is a snow Storm! In ''82-83 the area near Squaw Valley got 800 inches! ''Dumps'' of 4-6+ feet happened about about 2x a month...we were living like snow moles, mimicing the great snow storms of the early 20th century - you may have seen these in historical photos.Homeowners were shoveling 3-5 feet of snow off their roofs, to prevent total collapse!We always had a good hearty Laugh at those CA flatlanders, driving to Tahoe on I 80 in the ''rain'' ties, with flakes like silver dollars, blotting out visibility.Remember, was it last winter when I 80 was closed and all the hip techies turned to their google maps and ended up on closed roads, in the boondocks? Like I 80 is closed and some 1 1/2 rural lane road, was going to be OPEN??? Hellarious!Of course, down in the flatlands, we''ve seen how folks THOUGHT they had ''amphibious'' cars...Any idea how folks became so....lame? (BTW: Mt Baker in WA has the record of 1100 inches of snow....keeping the smaller Mt St Helens-like volcano, sleeping!) Winter is great, if you respect Mother Nature; soooo many havent a clue, putting 1st Responders, at great risk! And 4 wheel drive, CAN keep you going straight, at a CAUTIOUS speed...not good, on icy curves!! '
- text: 'Ethan.
    The results of that great agricultural revolution are in and not much of it is admirable. More Food = More People = More Fossil Fuels = More Toxic Pollution = More Disease = More Greenhouse Gases = More Climate Change = end-of-the-line. Human population was able to grow as rapidly as fossil fuel inputs were increased. But now, we must reduce usage of fossil fuels and the resulting population logically goes in the same direction. All the green technologies are for naught. It comes down to fossil fuels. '
inference: true
model-index:
- name: SetFit with sentence-transformers/all-mpnet-base-v2
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.8
      name: Accuracy
---

# SetFit with sentence-transformers/all-mpnet-base-v2

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
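For intuition, the minimal sketch below shows step 2 in isolation: encoding texts with the Sentence Transformer body and fitting a `LogisticRegression` head on the embeddings. It is an illustration only, not this model's actual training pipeline (the contrastive fine-tuning of step 1 is handled by the `setfit` library and is skipped here), and the texts and labels are invented placeholders:

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Encode labeled texts with the (ideally fine-tuned) embedding body.
body = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

texts = ["a toy comment that should get label yes", "a toy comment that should get label no"]
labels = ["yes", "no"]

embeddings = body.encode(texts)  # shape (n_texts, 768) for all-mpnet-base-v2

# Fit a LogisticRegression head on the embedding features.
head = LogisticRegression()
head.fit(embeddings, labels)

print(head.predict(body.encode(["another unseen comment"])))
```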
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 384 tokens
- **Number of Classes:** 2 classes

### Model Sources
- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples |
|:------|:---------|
| yes   |          |
| no    |          |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.8      |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("davidadamczyk/setfit-model-3")
# Run inference
preds = model("It might have been more fun for everyone if the Thruway Authority had given individual contracts for each rest stop, with the stipulation that each reflect some local regional character. This could interest travelers to maybe get off at the next exit and explore some local places. With every stop the same, the traveler might as well be in Kansas. ")
```

## Training Details

### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count   | 43  | 140.9  | 262 |

| Label | Training Sample Count |
|:------|:----------------------|
| no    | 18                    |
| yes   | 22                    |

### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 120
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
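These field names match parameters of `setfit.TrainingArguments`, so a comparable run could plausibly be set up as sketched below. This is a rough reconstruction under that assumption, not the author's actual script; the two-example dataset is an invented placeholder, since the real 40-example training set is not published with this card:

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Placeholder data: substitute your own labeled examples here.
train_dataset = Dataset.from_dict({
    "text": ["a comment annotated as yes", "a comment annotated as no"],
    "label": ["yes", "no"],
})

# Start from the same embedding body this card uses.
model = SetFitModel.from_pretrained(
    "sentence-transformers/all-mpnet-base-v2",
    labels=["no", "yes"],
)

# Mirror the hyperparameters listed above.
args = TrainingArguments(
    batch_size=16,
    num_epochs=1,
    num_iterations=120,
    sampling_strategy="oversampling",
    body_learning_rate=2e-05,
    head_learning_rate=2e-05,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```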
### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0017 | 1    | 0.4637        | -               |
| 0.0833 | 50   | 0.2019        | -               |
| 0.1667 | 100  | 0.0063        | -               |
| 0.25   | 150  | 0.0003        | -               |
| 0.3333 | 200  | 0.0002        | -               |
| 0.4167 | 250  | 0.0001        | -               |
| 0.5    | 300  | 0.0001        | -               |
| 0.5833 | 350  | 0.0001        | -               |
| 0.6667 | 400  | 0.0001        | -               |
| 0.75   | 450  | 0.0001        | -               |
| 0.8333 | 500  | 0.0001        | -               |
| 0.9167 | 550  | 0.0001        | -               |
| 1.0    | 600  | 0.0001        | -               |

### Framework Versions
- Python: 3.10.13
- SetFit: 1.1.0
- Sentence Transformers: 3.0.1
- Transformers: 4.45.2
- PyTorch: 2.4.0+cu124
- Datasets: 2.21.0
- Tokenizers: 0.20.0

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```