Llama 2 Chat 70B for RK3588

This is a conversion of https://huggingface.co/meta-llama/Llama-2-70b-chat-hf to the RKLLM format for Rockchip devices. It runs on the RK3588's NPU.

Convert to one file

Run:

cat llama2-chat-70b-hf-0* > llama2-chat-70b-hf.rkllm
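If you want to convince yourself the merge works before downloading ~100 GB of parts, here is a minimal round-trip sketch with stand-in data (the file names below are dummies, not the real model parts):

```shell
# Dummy split/merge round trip; stand-in data, not the actual .rkllm model
head -c 1048576 /dev/urandom > model.rkllm   # 1 MiB stand-in for the converted model
split -b 262144 -d model.rkllm model-part-   # split into four numbered 256 KiB parts
cat model-part-* > model-merged.rkllm        # reassemble, same idea as the command above
cmp model.rkllm model-merged.rkllm           # exits 0 if the merge is byte-identical
```

The shell expands `model-part-*` in sorted order, which is why the numbered suffixes (`-00`, `-01`, ...) reassemble correctly.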

But wait... will this run on my RK3588?

No. But I found it interesting to see what would happen if I converted it anyway. Let's hope Microsoft never finds out I was using their SSDs as swap, since the student subscription doesn't allow more than 32 GB of RAM :P

[Screenshot: RAM + swap usage during the conversion]

And that was before the conversion finished; it will probably reach around 600 GB of RAM + swap in total.

But hey! You can always try it yourself: grab a 512 GB SSD (using around 100-250 GB of it as swap) and an SBC with 32 GB of RAM, have some patience, and see if it loads. Good luck with that!
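If you do want to try, setting up the SSD as swap looks roughly like this. A minimal sketch, assuming the SSD is mounted at /mnt/ssd (the path and the 200G size are examples, not values from this repo):

```shell
# Hypothetical swap-file setup on an SSD mounted at /mnt/ssd; size is an example
sudo fallocate -l 200G /mnt/ssd/swapfile   # reserve space (use dd if fallocate is unsupported)
sudo chmod 600 /mnt/ssd/swapfile           # swap files must not be readable by other users
sudo mkswap /mnt/ssd/swapfile              # write the swap signature
sudo swapon /mnt/ssd/swapfile              # activate; verify with `free -h`
```

Keep in mind that swapping hundreds of gigabytes through a consumer SSD is extremely slow and wears the drive; this is for experimentation, not regular use.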

Main repo

See this repo for my full collection of LLMs converted for the RK3588's NPU:

https://huggingface.co/Pelochus/ezrkllm-collection

License

Same as the original LLM:

https://huggingface.co/meta-llama/Llama-2-70b-chat-hf/blob/main/LICENSE.txt
