---
license: apache-2.0
datasets:
- yahma/alpaca-cleaned
metrics:
- accuracy
base_model:
- meta-llama/Meta-Llama-3.1-8B-Instruct
---
## Usage
Support for this model will be added in an upcoming `transformers` release. In the meantime, please install the library from source:

```sh
pip install git+https://github.com/huggingface/transformers
```
We can now run inference on this model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
model_path = "nvidia/Mistral-NeMo-Minitron-8B-Base"
tokenizer = AutoTokenizer.from_pretrained(model_path)

device = 'cuda'
dtype = torch.bfloat16
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=dtype, device_map=device)

# Prepare the input text
prompt = 'Complete the paragraph: our solar system is'
inputs = tokenizer.encode(prompt, return_tensors='pt').to(model.device)

# Generate the output
outputs = model.generate(inputs, max_length=20)

# Decode and print the output
output_text = tokenizer.decode(outputs[0])
print(output_text)
```
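Alternatively, the same generation can be run through the high-level `pipeline` API. This is a minimal sketch, not part of the original card; the model path and prompt are reused from the example above:

```python
import torch
from transformers import pipeline

# Build a text-generation pipeline (bfloat16 on GPU, matching the example above)
pipe = pipeline(
    "text-generation",
    model="nvidia/Mistral-NeMo-Minitron-8B-Base",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

# Generate a short completion for the same prompt
result = pipe("Complete the paragraph: our solar system is", max_length=20)
print(result[0]["generated_text"])
```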
## Evaluation Results
*Zero-shot performance.* Evaluated using select datasets from the [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) with additions:
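A zero-shot run of this kind can be reproduced with the harness's Python API. The sketch below is illustrative only, assuming a recent `lm-evaluation-harness` release; the task list is a placeholder, since the exact datasets and additions are not enumerated here:

```python
import lm_eval

# Zero-shot evaluation sketch; substitute the tasks actually reported on this card
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=nvidia/Mistral-NeMo-Minitron-8B-Base,dtype=bfloat16",
    tasks=["hellaswag", "winogrande", "arc_challenge"],  # placeholder task names
    num_fewshot=0,  # zero-shot
    batch_size=8,
)
print(results["results"])
```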