Commit a83418d (1 parent: 93aad82), committed by munish0838

Upload README.md with huggingface_hub

Files changed (1): README.md (+1 -1)
README.md CHANGED
@@ -21,7 +21,7 @@ This is quantized version of [amd/AMD-Llama-135m](https://huggingface.co/amd/AMD
 
 
 ## Introduction
-AMD-Llama-135m is a language model trained on AMD MI250 GPUs. Based on LLaMA2 model architecture, this model can be smoothly loaded as LlamaForCausalLM with huggingface transformers. Furthermore, we use the same tokenizer as LLaMA2, enabling it to be a draft model of speculative decoding for LLaMA2 and CodeLlama.
+AMD-Llama-135m is a language model trained on AMD Instinct MI250 accelerators. Based on LLama2 model architecture, this model can be smoothly loaded as LlamaForCausalLM with huggingface transformers. Furthermore, we use the same tokenizer as LLama2, enabling it to be a draft model of speculative decoding for LLama2 and CodeLlama.
 
 ## Model Details
 
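The Introduction paragraph in the diff above notes that the model loads as LlamaForCausalLM and shares the LLaMA2 tokenizer, which lets it serve as a draft model for speculative decoding. A minimal sketch of that usage with Hugging Face transformers assisted generation follows; the target model ID (meta-llama/Llama-2-7b-hf), prompt, and generation settings are illustrative assumptions, not part of this commit.

```python
# Minimal sketch: load AMD-Llama-135m with transformers and use it as a draft
# model for speculative (assisted) decoding against a larger LLaMA2 target.
# The target model ID and generation settings below are assumptions for illustration.
from transformers import AutoTokenizer, LlamaForCausalLM

# Draft and target share the LLaMA2 tokenizer, which assisted generation requires.
tokenizer = AutoTokenizer.from_pretrained("amd/AMD-Llama-135m")

# Assumed target model; any LLaMA2/CodeLlama causal LM with the same vocabulary works.
target = LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
draft = LlamaForCausalLM.from_pretrained("amd/AMD-Llama-135m")

inputs = tokenizer("The largest continent on Earth is", return_tensors="pt")

# Passing assistant_model enables assisted generation: the small draft model
# proposes tokens that the target model verifies, which can speed up decoding.
outputs = target.generate(**inputs, assistant_model=draft, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```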