---
license: apache-2.0
datasets:
- yahma/alpaca-cleaned
metrics:
- accuracy
base_model:
- meta-llama/Meta-Llama-3.1-8B-Instruct
---

## Usage

Support for this model will be added in an upcoming `transformers` release. In the meantime, please install the library from source:

~~~
pip install git+https://github.com/huggingface/transformers
~~~

We can then run inference on this model:

~~~
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
model_path = "YaoLuzjut/partial-layer_fine-tuning_Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)

device = 'cuda'
dtype = torch.bfloat16
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=dtype, device_map=device)

# Prepare the input text
prompt = 'Complete the paragraph: our solar system is'
inputs = tokenizer.encode(prompt, return_tensors='pt').to(model.device)

# Generate the output (cap the number of newly generated tokens)
outputs = model.generate(inputs, max_new_tokens=20)

# Decode and print the output
output_text = tokenizer.decode(outputs[0])
print(output_text)
~~~

## Evaluation Results

Zero-shot performance. Evaluated using select datasets from the [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/main) with additions:
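
As a rough reproduction sketch, a zero-shot run with the harness can be launched through its Python API as shown below. This assumes `lm-eval` v0.4 or later; the task names here are illustrative examples only, not necessarily the exact task set used for the reported numbers.

~~~
import lm_eval

# Run a zero-shot evaluation of this model with the LM Evaluation Harness.
# NOTE: the tasks below are illustrative placeholders; substitute the
# actual task set you want to evaluate.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=YaoLuzjut/partial-layer_fine-tuning_Llama-3.1-8B-Instruct,dtype=bfloat16",
    tasks=["hellaswag", "winogrande"],
    num_fewshot=0,
    batch_size=8,
)

# Per-task metrics live under results["results"]
for task, metrics in results["results"].items():
    print(task, metrics)
~~~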