---
base_model: HuggingFaceTB/SmolLM-360M
language:
- en
license: cc-by-sa-4.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
datasets:
- Aarushhh/Helpsteer2-helpfulness-SFT
---

# Smollm-360M Helpsteer2-helpfulness

## Description

This is a fine-tuned version of SmolLM-360M, trained on the helpfulness column of HelpSteer2.

## Use cases

This model can be used to evaluate LLM responses.

## Usage

The system prompt it was trained with is:

```
You are an expert evaluator designed to assess the helpfulness of responses given by an AI model. For each prompt-response pair, evaluate how well the response addresses the prompt, focusing on accuracy, relevance, clarity, and completeness. Your evaluation should be based on the following scale:

1 - Not Helpful: The response is completely irrelevant, incorrect, or uninformative.
2 - Slightly Helpful: The response addresses the prompt but with significant errors, missing information, or lacks clarity.
3 - Moderately Helpful: The response is somewhat helpful, with some errors or omissions but generally provides useful information.
4 - Helpful: The response is accurate, relevant, and clear, with minor issues that do not significantly affect its usefulness.
5 - Very Helpful: The response fully addresses the prompt with accurate, relevant, and clear information. It is complete and highly informative.

Provide a single numerical rating (1-5) based on the criteria above.
```

The model is trained to output only a single number from 1 to 5.

## Dataset used

This model was trained on [Aarushhh/Helpsteer2-helpfulness-SFT](https://huggingface.co/datasets/Aarushhh/Helpsteer2-helpfulness-SFT), a dataset I created.

## Base Model used

The base model is [HuggingFaceTB/SmolLM-360M](https://huggingface.co/HuggingFaceTB/SmolLM-360M).

### I was able to make this using only the Kaggle free tier

## License

[CC-BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en)

[](https://github.com/unslothai/unsloth)
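
To illustrate the usage described in this card, here is a minimal sketch of how one might score a prompt-response pair with `transformers`. Several parts are assumptions and not taken from the training setup: `MODEL_ID` is a placeholder for this model's Hub repository id, the `Prompt:` / `Response:` / `Rating:` layout in `build_prompt` is a guess at a reasonable input format (the exact template used during training is not documented here), and `build_prompt`, `parse_rating`, and `rate` are hypothetical helper names.

```python
import re
from typing import Optional

# Placeholder (assumption): replace with the actual Hub repository id of this model.
MODEL_ID = "<this-model-repo-id>"


def build_prompt(system_prompt: str, prompt: str, response: str) -> str:
    """Join the system prompt and one prompt-response pair into a single string.

    The layout is an assumption; adjust it to match the training data format.
    """
    return (
        f"{system_prompt.strip()}\n\n"
        f"Prompt: {prompt.strip()}\n"
        f"Response: {response.strip()}\n"
        "Rating: "
    )


def parse_rating(generated_text: str) -> Optional[int]:
    """Return the first standalone digit from 1 to 5, or None if there is none."""
    match = re.search(r"(?<!\d)[1-5](?!\d)", generated_text)
    return int(match.group()) if match else None


def rate(system_prompt: str, prompt: str, response: str,
         model_id: str = MODEL_ID) -> Optional[int]:
    """Generate a rating with the fine-tuned model and parse it to an int."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(system_prompt, prompt, response),
                       return_tensors="pt")
    # Greedy decoding with a tiny budget, since only one digit is expected.
    output_ids = model.generate(**inputs, max_new_tokens=2, do_sample=False)

    # Keep only the newly generated tokens, dropping the echoed input.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return parse_rating(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Calling `rate(system_prompt, prompt, response)` with the system prompt quoted in the Usage section should return an integer from 1 to 5, or `None` if the model emits something that does not parse as a rating.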