I can't find the right model for me

#6
by Yhojann - opened

I have a dilemma. I wanted to test Dolphin Mistral on my local computer, and I found a quick and easy way to do it through the Ollama application: I ran it in a Docker container and it worked very well for me. I speak Spanish; I greet the model in Spanish, it answers me entirely in Spanish without any problem, and I can interact with it very fluidly.

The problem is that I wanted to make some fine-tuning adjustments, and that is where my trouble began. I realized that Ollama uses a proprietary format (Lepton3), so you cannot do your own training locally on your PC; you have to pay for a cloud service. For that reason I decided to become independent of Ollama: I downloaded the dolphin-2.9.4-llama3.1-8b model from Hugging Face and ran it from a Python project using transformers. But the responses are very incoherent, and even though I greet it in Spanish it writes back to me in English. No matter how much I tell it to answer in Spanish, I cannot get the same quality of output I had with Ollama, and I do not understand why.
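For reference, this is roughly the script I am running (a simplified sketch: the repo id is the one I downloaded, the generation parameters are just values I tried, and I am assuming the tokenizer's built-in chat template):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/dolphin-2.9.4-llama3.1-8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Dolphin models are trained on ChatML-style turns, so the prompt is
# built through the tokenizer's chat template rather than raw text.
messages = [
    {"role": "system", "content": "Eres un asistente útil. Responde siempre en español."},
    {"role": "user", "content": "Hola, ¿puedes ayudarme con un script de Python?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```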

The Ollama library page says it uses version 2.8 (https://ollama.com/library/dolphin-mistral), while the model I downloaded from Hugging Face is 2.9.

But the 2.9 model performs worse than Ollama's 2.8. Why? Which model should I download, and how should I configure it to best fit my needs? I have good computing power and plenty of RAM, and I want to use Dolphin Mistral to help me with programming and cybersecurity questions.

Can someone guide me?

You can try fine-tuning the model with Unsloth (https://huggingface.co/unsloth). If by fine adjustments you mean changes such as the model's system prompt or its temperature, you can load the model in GGUF format, adjust it in LM Studio, and run the local server from there. Ollama also supports GGUF models.
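For the training route, here is an untested sketch of what a LoRA fine-tune with Unsloth could look like. The dataset path and hyperparameters are placeholders, not recommendations, and the SFTTrainer arguments may differ slightly between trl versions:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model in 4-bit to keep memory usage low.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="cognitivecomputations/dolphin-2.9.4-llama3.1-8b",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# "my_dataset.jsonl" is a placeholder for your own training data,
# with a "text" column already formatted in the model's chat format.
dataset = load_dataset("json", data_files="my_dataset.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="dolphin-lora",
        per_device_train_batch_size=2,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
    ),
)
trainer.train()
```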
