	---
	base_model: sentence-transformers/all-mpnet-base-v2
	library_name: setfit
	metrics:
	- accuracy
	pipeline_tag: text-classification
	tags:
	- setfit
	- sentence-transformers
	- text-classification
	- generated_from_setfit_trainer
	widget:
	- text: 'I noticed something missing in Gail''s and Bret''s banter about the debt-ceiling
	vote that is typical republican mush!Bret gets Gail to agree that spending is
	too high, then Bret proceeds to suggest it''s time to raise the retirement age
	for Social Security! And then...wait for it......Bret mentions nothing about raising
	taxes on corporations and billionaires!Bret, you would agree that the quaint 1950s
	was a time of sanity in the GOP. ....Well, in those good ol'' days, top marginal
	tax rates were in the 70% range.....What''s more, our national debt was low, like
	around zero!?....And what''s even more, the USA was absolutely first in the world
	in reading and math scores.Enough.

	'
	- text: 'Denial is not limited to American politicians. It seems China is extreme
	in this category. All the ''Zero Covid'' policy did was delay the inevitable.
	China is the US under Trump. Using vaccines which, while home grown, are not as
	effective only placed its population a great risk. They will have the same strain
	on their healthcare system. Very Sad.

	'
	- text: 'China knows everything about its citizens, monitors every details in their
	lives but somehow can''t say how many people exactly died from Covid19 since it
	ended its zero covid policy.Why should we believe these numbers instead of last
	week numbers?

	'
	- text: 'Johnny G These figures are also not accurate or believable. Crematoria in
	China''s large cities have been overrun with bodies since the zero-covid policy
	ended--running at full capacity with long backlogs. Any back of the envelope calculation
	would give a much higher death figure than 60,000--and the virus hasn''t even
	ravaged the countryside yet. That will happen over the next 3-4 weeks as migrant
	workers and others return to their villages to celebrate the Chinese New Year
	on Jan. 21. Due to the backwardness of rural healthcare and the proportionally
	high concentration of elderly people in the countryside, the covid death toll
	in rural China within the next few weeks will be high but will also receive much
	less media attention.

	'
	- text: 'I was beaten and verbally abused until age 17, when I could escape my home. My
	family "looked" normal from the outside, but was not. Child abuse was not yet
	in the lexicon.I turned out normal! This I owe to visiting lots of friends and
	watching how their families interacted--they were kind. I asked their parents
	to adopt me. I watched family sitcoms--the opposite of my homelife. I did well
	in school, so I received praise there, and made friends.The folks wanted me to
	marry well and have kids. But the Zero Population Movement, and Women''s Lib,
	gave me a window into how humans harm the planet, and that women could do more
	than have babies and do laundry. I put myself through uni, had no children, and
	have had and have careers I love.Parenting is the most important, unpaid job one
	can take on because it demands selflessly developing a decent, caring, intellectually
	curious, kind, patient human. People lacking these qualities should re-think
	parenthood.Also, consider the childless life, to save the planet.

	'
	inference: true
	model-index:
	- name: SetFit with sentence-transformers/all-mpnet-base-v2
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    dataset:
	      name: Unknown
	      type: unknown
	      split: test
	    metrics:
	    - type: accuracy
	      value: 1.0
	      name: Accuracy
	---

	# SetFit with sentence-transformers/all-mpnet-base-v2

	This is a [SetFit](https://github.com/huggingface/setfit) model for text classification. It uses [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) as the Sentence Transformer embedding model and a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance as the classification head.

	The model has been trained using an efficient few-shot learning technique that involves:

	1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
	2. Training a classification head with features from the fine-tuned Sentence Transformer.
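
The first step trains on sentence pairs rather than on single labeled sentences. The sketch below illustrates the idea in plain Python: same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0. The function `make_contrastive_pairs` and the toy sentences are hypothetical and only illustrate the pairing scheme; this is not SetFit's actual sampler.

```python
import random


def make_contrastive_pairs(texts, labels, num_iterations, seed=42):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    For every sample and every iteration, one positive pair (same label,
    target 1.0) and one negative pair (different label, target 0.0) is drawn.
    Assumes each label has at least two samples and at least two labels exist.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    pairs = []
    for _ in range(num_iterations):
        for text, label in zip(texts, labels):
            same = [t for t in by_label[label] if t != text]
            other = [t for lab, ts in by_label.items() if lab != label for t in ts]
            pairs.append((text, rng.choice(same), 1.0))   # positive pair
            pairs.append((text, rng.choice(other), 0.0))  # negative pair
    return pairs


# Toy data, invented for illustration only.
texts = [
    "zero covid policy ended",
    "china lifted covid rules",
    "history curriculum debate",
    "school board vote",
]
labels = ["yes", "yes", "no", "no"]

pairs = make_contrastive_pairs(texts, labels, num_iterations=2)
print(len(pairs))  # 2 pairs * 2 iterations * 4 samples
```

Because every sample yields several pairs, a few dozen labeled sentences expand into thousands of training pairs, which is what makes the few-shot setting workable.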

	## Model Details

	### Model Description
	- Model Type: SetFit
	- Sentence Transformer body: [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
	- Classification head: a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
	- Maximum Sequence Length: 384 tokens
	- Number of Classes: 2 classes
	<!-- - Training Dataset: [Unknown](https://huggingface.co/datasets/unknown) -->
	<!-- - Language: Unknown -->
	<!-- - License: Unknown -->

	### Model Sources

	- Repository: [SetFit on GitHub](https://github.com/huggingface/setfit)
	- Paper: [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
	- Blogpost: [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

	### Model Labels
	| Label | Examples |
	|:------|:---------|
	| yes | <ul><li>'"Xi Jinping, China’s top leader, abandoned his “zero Covid” policy in early December. That policy had kept infections low but required costly precautions like mass testing — measures that exhausted the budgets of local governments."In a recent issue, The Economist magazine reported that China spent ~$250 billion on mass testing during a recent one-year period. The piece also indicated that an unnamed expert suggested that that number was likely to be much lower than the true amount. Even for China, this is a remarkable amount of resources devoted to that aspect of combating Covid. It\'s no wonder President Xi had to finally give up on zero Covid - in all its manifestations, China could no longer afford the strategy.\n'</li><li>'The huge excursions to and from China at the Dawn of 2020 for China\'s lunar year celebration, just after the Wuhan breakout in DEC 2019 and its aftermath of spreading Covid-19 as a wildfire across the globe has a lesson to compare the present situation.China\'s much advertised, the world\'s first stringent drive to eradicate Covid VIRUS by adopting "ZERO COVID " policy since 2019 was lifted on DEC,7,2022 after realizing its end point is a fiasco. The reporting 60k fatalities a week before the China\'s lunar year on 22,JAN,2023 is a caution to the international travelers. Any global viral spread in 2023 shan\'t become a justification for lifting Zero Covid policy and zero testing of the travelers- in and out by China.\n'</li><li>'Ace Not so black and white. China’s “No-COVID” policy during the early part of the pandemic, albeit draconian and heavy-handed, likely saved tens of thousands of lives. However, once vaccines became available, China should have 1) adopted Western mRNA vaccines which are more effective at preventing serious illness than the Chinese domestic versions. 2) Begin preparing for a gradual reopening by stockpiling antivirals to protect its most vulnerable citizens. By demonstrating the “superior” Chinese model with the prolonged strict no-COVID policy, President Xi was able to secure his unprecedented 3rd 5-year term.Liberals are against public health policies that are driven by political considerations rather than driven by science.\n'</li></ul> |
	| no | <ul><li>'Teaching history is, by its very nature, a matter of prioritization and opinion. When it is a mandatory requirement for a high school diploma, the requirement to learn a specific version of history and regurgitate it becomes a form of indoctrination. DeSantis is an easy target for his opponents (I am one) for obvious reasons, but the challenge remains the same. What is the version of history that we want to teach our children? Should the history of black Americans be enhanced? What about Mexicans ( a largely overlooked group), women, Asians (nary a word about the Chinese Exclusion Act), religious subgroups - the early plight of Catholics, Jewish immigrants, Mormons, Muslims, and the emergence of a non-secular movement? How would we propose to teach about abortion rights? Is it the quiet revolution of the unborn or the destruction of rights previously available to women? The list goes on. I find articles like this with outrage dripping, reductive, and of little value. A challenge with public schools is that they are an arm of the government. So, it is hardly surprising that the CEO of the state/legislature would exert influence. A debate no history is highly valuable but America goes immediately to war with itself and no longer debates\n'</li><li>"David Brook offers an interesting perspective on Biden and America's conduct in the world.Putin, Xi are all crazy people doing crazy things. In contrast, Biden is a steady hand guiding the American ship of the international rule based order.I suppose if I lived in the Washington bubble, I might have a similar view. But I come from a world of anti-imperialist struggle, and my world looks very different.I see the US undermining struggling nations all over the world, most recently in Africa. The ugly American fingerprints are also all over the coups in Honduras, Venezuela, Bolivia and Peru.Cuba is now in its sixtieth year of a crushing US blockade. US military bases now nun from Niger in West Africa, across the continent to Kenya.Active military operations are going on in Somalis, Syria and of course Ukraine.There's no difference between the referendums for autonomy held in Kosovo and the Donbas and Crimea, except that one was sponsored by the US and the other by Russia.According to the UN, world famine this year can be averted for 1.7 billion dollars. In contrast, our military funding for Ukraine is now at 122 billion.Under American leadership, corporations paid out $257bn to wealthy shareholders, while over 800 million people went to bed hungry.So, forgive me if I see Biden's ''steady hand” differently than the NYTimes crowd does.Perspective is everything, and the world looks very different when you see it from the bottom up.\n"</li><li>'LB and what would we do for our neighbors? What did we do when children were separated from their parents at the border under Trump? Most of us did nothing.\n'</li></ul> |

	## Evaluation

	### Metrics
	| Label | Accuracy |
	|:------|:---------|
	| all | 1.0 |

	## Uses

	### Direct Use for Inference

	First install the SetFit library:

	```bash
	pip install setfit
	```

	Then you can load this model and run inference.

	```python
	from setfit import SetFitModel

	# Download from the 🤗 Hub
	model = SetFitModel.from_pretrained("davidadamczyk/setfit-model-8")
	# Run inference
	preds = model("China knows everything about its citizens, monitors every details in their lives but somehow can't say how many people exactly died from Covid19 since it ended its zero covid policy.Why should we believe these numbers instead of last week numbers?")
	```
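
To classify several comments at once, a small wrapper can pair each input with its prediction. The sketch below assumes the model is a callable that takes a list of strings and returns one prediction per string, which is how `SetFitModel` behaves when called with a list. The helper `classify_batch` and the keyword-based `stub_model` are hypothetical. The stub only stands in for the real model so the example runs without downloading anything.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def classify_batch(
    model: Callable[[List[str]], Iterable],
    texts: Sequence[str],
) -> List[Tuple[str, str]]:
    """Run the model on a batch and return (text, predicted_label) pairs."""
    cleaned = [t.strip() for t in texts]  # drop stray leading/trailing newlines
    preds = model(cleaned)
    return [(text, str(pred)) for text, pred in zip(cleaned, preds)]


def stub_model(batch: List[str]) -> List[str]:
    """Hypothetical stand-in for the real model: a crude keyword rule."""
    return ["yes" if "covid" in text.lower() else "no" for text in batch]


results = classify_batch(
    stub_model,
    [
        "All the 'Zero Covid' policy did was delay the inevitable.\n",
        "Teaching history is a matter of prioritization and opinion.\n",
    ],
)
for text, label in results:
    print(f"{label}: {text}")
```

With the real model, pass the object returned by `SetFitModel.from_pretrained(...)` in place of `stub_model`.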

	<!--
	### Downstream Use

	List how someone could finetune this model on their own dataset.
	-->

	<!--
	### Out-of-Scope Use

	List how the model may foreseeably be misused and address what users ought not to do with the model.
	-->

	<!--
	## Bias, Risks and Limitations

	What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.
	-->

	<!--
	### Recommendations

	What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.
	-->

	## Training Details

	### Training Set Metrics
	| Training set | Min | Median | Max |
	|:-------------|:----|:-------|:----|
	| Word count | 13 | 141.375 | 287 |

	| Label | Training Sample Count |
	|:------|:----------------------|
	| no | 18 |
	| yes | 22 |
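
A quick consistency check on the figures above: with 40 samples and whole-number word counts, a true median must be a whole number or end in .5. The reported 141.375 does not fit that pattern, but it does fit an arithmetic mean, since 141.375 × 40 is a whole number. The sketch below verifies this arithmetic. The interpretation that the "Median" column actually holds a mean is an inference from the numbers, not something the card states.

```python
from fractions import Fraction

n_samples = 18 + 22              # "no" + "yes" training samples
reported = Fraction("141.375")   # value in the "Median" column

# A median of an even number of integers is the average of two integers,
# so doubling it must give a whole number.
is_possible_median = (reported * 2).denominator == 1

# A mean of n integers times n must give a whole number (the total word count).
total_if_mean = reported * n_samples
is_possible_mean = total_if_mean.denominator == 1

print(is_possible_median)  # False
print(is_possible_mean)    # True
print(total_if_mean)       # 5655
```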

	### Training Hyperparameters
	- batch_size: (16, 16)
	- num_epochs: (1, 1)
	- max_steps: -1
	- sampling_strategy: oversampling
	- num_iterations: 120
	- body_learning_rate: (2e-05, 2e-05)
	- head_learning_rate: 2e-05
	- loss: CosineSimilarityLoss
	- distance_metric: cosine_distance
	- margin: 0.25
	- end_to_end: False
	- use_amp: False
	- warmup_proportion: 0.1
	- l2_weight: 0.01
	- seed: 42
	- eval_max_steps: -1
	- load_best_model_at_end: False
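
The `loss: CosineSimilarityLoss` setting means the embedding body is trained so that the cosine similarity of each sentence pair matches its target (1.0 for same-label pairs, 0.0 for different-label pairs), measured by mean squared error. The sketch below computes that quantity in plain Python on made-up 2-dimensional vectors. It illustrates the objective only and is not the Sentence Transformers implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(batch):
    """Mean squared error between cosine similarity and the pair target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in batch]
    return sum(errors) / len(errors)


# Toy embeddings, invented for illustration.
batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical vectors, same label: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal vectors, different label: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # identical vectors, different label: error 1
]
loss = cosine_similarity_loss(batch)
print(round(loss, 4))  # (0 + 0 + 1) / 3
```

As far as I know, `distance_metric` and `margin` apply only to triplet-style losses, so they likely had no effect in this run.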

	### Training Results
	| Epoch | Step | Training Loss | Validation Loss |
	|:------:|:----:|:-------------:|:---------------:|
	| 0.0017 | 1 | 0.3089 | - |
	| 0.0833 | 50 | 0.1005 | - |
	| 0.1667 | 100 | 0.0014 | - |
	| 0.25 | 150 | 0.0004 | - |
	| 0.3333 | 200 | 0.0002 | - |
	| 0.4167 | 250 | 0.0002 | - |
	| 0.5 | 300 | 0.0002 | - |
	| 0.5833 | 350 | 0.0001 | - |
	| 0.6667 | 400 | 0.0001 | - |
	| 0.75 | 450 | 0.0001 | - |
	| 0.8333 | 500 | 0.0001 | - |
	| 0.9167 | 550 | 0.0001 | - |
	| 1.0 | 600 | 0.0001 | - |
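
The 600 steps in the table follow from the hyperparameters and sample counts listed earlier. The sketch below assumes one positive and one negative pair is generated per sample per iteration. That assumption is not stated in this card, but it reproduces the step count and the epoch fractions shown above.

```python
n_samples = 18 + 22      # training samples across both labels
num_iterations = 120     # from the training hyperparameters
batch_size = 16          # embedding-phase batch size

# Assumed: each iteration yields one positive and one negative pair per sample.
n_pairs = 2 * num_iterations * n_samples
steps_per_epoch = n_pairs // batch_size


def epoch_at(step):
    """Fraction of the single training epoch completed at a given step."""
    return round(step / steps_per_epoch, 4)


print(n_pairs)          # 9600
print(steps_per_epoch)  # 600
print(epoch_at(1), epoch_at(50), epoch_at(300))  # 0.0017 0.0833 0.5
```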

	### Framework Versions
	- Python: 3.10.13
	- SetFit: 1.1.0
	- Sentence Transformers: 3.0.1
	- Transformers: 4.45.2
	- PyTorch: 2.4.0+cu124
	- Datasets: 2.21.0
	- Tokenizers: 0.20.0

	## Citation

	### BibTeX
	```bibtex
	@article{https://doi.org/10.48550/arxiv.2209.11055,
	doi = {10.48550/ARXIV.2209.11055},
	url = {https://arxiv.org/abs/2209.11055},
	author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
	keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
	title = {Efficient Few-Shot Learning Without Prompts},
	publisher = {arXiv},
	year = {2022},
	copyright = {Creative Commons Attribution 4.0 International}
	}
	```

	<!--
	## Glossary

	Clearly define terms in order to be accessible across audiences.
	-->

	<!--
	## Model Card Authors

	Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.
	-->

	<!--
	## Model Card Contact

	Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.
	-->