	---
	base_model: bert-base-multilingual-uncased
	datasets:
	- sonos-nlu-benchmark/snips_built_in_intents
	license: apache-2.0
	tags:
	- embedding_space_map
	- BaseLM:bert-base-multilingual-uncased
	---

	# ESM sonos-nlu-benchmark/snips_built_in_intents

	<!-- Provide a quick summary of what the model is/does. -->



	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->

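	An Embedding Space Map (ESM) for the base model bert-base-multilingual-uncased, trained on the intermediate task sonos-nlu-benchmark/snips_built_in_intents.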

	- Developed by: David Schulte
	- Model type: ESM
	- Base Model: bert-base-multilingual-uncased
	- Intermediate Task: sonos-nlu-benchmark/snips_built_in_intents
	- ESM architecture: linear
	- Language(s) (NLP): [More Information Needed]
	- License: Apache-2.0

	## Training Details

	### Intermediate Task
	- Task ID: sonos-nlu-benchmark/snips_built_in_intents
	- Subset [optional]: default
	- Text Column: text
	- Label Column: label
	- Dataset Split: train
	- Sample size [optional]: 328
	- Sample seed [optional]:

	### Training Procedure [optional]

	<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

	#### Language Model Training Hyperparameters [optional]
	- Epochs: 3
	- Batch size: 32
	- Learning rate: 2e-05
	- Weight Decay: 0.01
	- Optimizer: AdamW

	#### ESM Training Hyperparameters [optional]
	- Epochs: 10
	- Batch size: 32
	- Learning rate: 0.001
	- Weight Decay: 0.01
	- Optimizer: AdamW
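
As a rough sanity check on the hyperparameters above, the number of optimizer steps per stage can be estimated from the sample size, batch size, and epoch count. This is only a sketch: it assumes both stages train on the same 328 samples, that the final partial batch is kept, and that no gradient accumulation is used. None of these is stated in this card, and `optimizer_steps` is a hypothetical helper.

```python
import math


def optimizer_steps(num_samples: int, batch_size: int, epochs: int) -> int:
    """Total optimizer steps, assuming the last partial batch is kept."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs


SAMPLE_SIZE = 328  # "Sample size" listed under Intermediate Task

# Language model fine-tuning: 3 epochs, batch size 32
lm_steps = optimizer_steps(SAMPLE_SIZE, batch_size=32, epochs=3)

# ESM training: 10 epochs, batch size 32
esm_steps = optimizer_steps(SAMPLE_SIZE, batch_size=32, epochs=10)

print(lm_steps, esm_steps)
```

Under these assumptions each epoch has 11 steps, so both stages are short training runs.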


	### Additional training details [optional]


	## Model Evaluation

	### Evaluation of fine-tuned language model [optional]


	### Evaluation of ESM [optional]
	MSE:

	### Additional evaluation details [optional]



	## What are Embedding Space Maps?

	<!-- This section describes the evaluation protocols and provides the results. -->
	Embedding Space Maps (ESMs) are neural networks that approximate the effect of fine-tuning a language model on a task. They can be used to quickly transform embeddings from a base model to approximate how a fine-tuned model would embed the input text.
	ESMs can be used for intermediate task selection with the ESM-LogME workflow.
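
This card lists the ESM architecture as linear and MSE as the evaluation metric. The sketch below illustrates what such a map could look like, assuming it is an affine transformation `W x + b` applied to a base-model embedding. The 3-dimensional vectors, weights, and function names are made up for illustration; they are not the trained parameters or the API of this model.

```python
def apply_linear_esm(weights, bias, embedding):
    """Map a base-model embedding to an approximate fine-tuned embedding."""
    return [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, bias)
    ]


def mean_squared_error(predicted, target):
    """Average squared difference between two equally long vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Toy parameters (hypothetical); a real ESM would be learned from data.
weights = [
    [1.0, 0.0, 0.5],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 1.0],
]
bias = [0.1, -0.2, 0.0]

base_embedding = [0.2, 0.4, -0.6]        # embedding from the base model
fine_tuned_embedding = [0.0, 0.6, -0.6]  # embedding from the fine-tuned model

approx = apply_linear_esm(weights, bias, base_embedding)
mse = mean_squared_error(approx, fine_tuned_embedding)
print(approx, mse)
```

Because the map is a single affine layer, applying it is far cheaper than running a second fine-tuned language model over the same text.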

	## How can I use Embedding Space Maps for Intermediate Task Selection?
	[![PyPI version](https://img.shields.io/pypi/v/hf-dataset-selector.svg)](https://pypi.org/project/hf-dataset-selector)

	We release hf-dataset-selector, a Python package for intermediate task selection using Embedding Space Maps.

	hf-dataset-selector fetches the ESMs available for a given language model and uses them to find the best dataset for intermediate training before fine-tuning on the target task. ESMs are found by their tags on the Hugging Face Hub.

	```python
	from hfselect import Dataset, compute_task_ranking

	# Load target dataset from the Hugging Face Hub
	dataset = Dataset.from_hugging_face(
	    name="stanfordnlp/imdb",
	    split="train",
	    text_col="text",
	    label_col="label",
	    is_regression=False,
	    num_examples=1000,
	    seed=42
	)

	# Fetch ESMs and rank tasks
	task_ranking = compute_task_ranking(
	    dataset=dataset,
	    model_name="bert-base-multilingual-uncased"
	)

	# Display top 5 recommendations
	print(task_ranking[:5])
	```
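
The tag-based lookup mentioned above can be illustrated offline. This card's front matter carries the tags `embedding_space_map` and `BaseLM:bert-base-multilingual-uncased`; the sketch below filters a hand-written list of model entries by those two tags. It is not hf-dataset-selector's actual implementation, and the repository IDs and the `find_esm_repos` helper are hypothetical.

```python
def find_esm_repos(models, base_model):
    """Return IDs of entries tagged as ESMs for the given base model."""
    required = {"embedding_space_map", f"BaseLM:{base_model}"}
    return [m["id"] for m in models if required <= set(m["tags"])]


# Hypothetical metadata, shaped like a list of Hub model entries.
models = [
    {
        "id": "example-user/esm-a",
        "tags": ["embedding_space_map", "BaseLM:bert-base-multilingual-uncased"],
    },
    {
        "id": "example-user/esm-b",
        "tags": ["embedding_space_map", "BaseLM:some-other-model"],
    },
    {
        "id": "example-user/not-an-esm",
        "tags": ["text-classification"],
    },
]

matches = find_esm_repos(models, "bert-base-multilingual-uncased")
print(matches)
```

Requiring both tags keeps ESMs trained for a different base model out of the candidate set.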

	For more information on how to use ESMs, please see the [official GitHub repository](https://github.com/davidschulte/hf-dataset-selector).

	## Citation


	<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
	If you use this Embedding Space Map, please cite our [paper](https://arxiv.org/abs/2410.15148).

	BibTeX:


	```
	@misc{schulte2024moreparameterefficientselectionintermediate,
	    title={Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning},
	    author={David Schulte and Felix Hamborg and Alan Akbik},
	    year={2024},
	    eprint={2410.15148},
	    archivePrefix={arXiv},
	    primaryClass={cs.CL},
	    url={https://arxiv.org/abs/2410.15148},
	}
	```


	APA:

	```
	Schulte, D., Hamborg, F., & Akbik, A. (2024). Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning. arXiv preprint arXiv:2410.15148.
	```

	## Additional Information