---
base_model: dicta-il/dictalm2.0-instruct
license: mit
datasets:
- HeNLP/HeDC4
language:
- he
---

# LLM2Vec applied on DictaLM-2.0

This is a Hebrew encoder model obtained by applying the [LLM2Vec](https://arxiv.org/abs/2404.05961) method to [DictaLM-2.0](https://huggingface.co/dicta-il/dictalm2.0), using the [HeDC4](https://huggingface.co/datasets/HeNLP/HeDC4) dataset for training.

## Usage

Requires the [`llm2vec`](https://github.com/McGill-NLP/llm2vec) package.

```python
import torch
from llm2vec import LLM2Vec


def get_device() -> str:
    """Pick the best available device."""
    if torch.backends.mps.is_available():
        return "mps"
    elif torch.cuda.is_available():
        return "cuda"
    return "cpu"


# Load the base MNTP model and apply the unsupervised SimCSE adapter on top.
l2v = LLM2Vec.from_pretrained(
    base_model_name_or_path="omriel1/LLM2Vec-DictaLM2.0-mntp",
    peft_model_name_or_path="omriel1/LLM2Vec-DictaLM2.0-mntp-unsup-simcse",
    device_map=get_device(),
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

texts = [
    "Hey, what's up?",  # "היי מה קורה?"
    "Is everything good with you?",  # "הכל טוב איתך?"
]
results = l2v.encode(texts)
print(results)
```
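The embeddings returned by `encode` can be compared directly, for example with cosine similarity. A minimal sketch, assuming `results` is a 2-D tensor of shape `(num_texts, hidden_dim)`; random placeholder embeddings stand in here for the model's output:

```python
import torch
import torch.nn.functional as F

# Placeholder for the tensor returned by l2v.encode(texts);
# the hidden dimension 4096 is illustrative, not guaranteed.
results = torch.randn(2, 4096)

# L2-normalize each embedding, then the dot product is cosine similarity.
normalized = F.normalize(results, p=2, dim=1)
similarity = normalized @ normalized.T  # (2, 2) pairwise similarity matrix

print(similarity)
```

Each diagonal entry is 1 (a text compared with itself); off-diagonal entries measure how close the two Hebrew sentences are in embedding space.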