MLC version of nvidia/Llama3-ChatQA-1.5-8B, using q4f16_1
quantization.
Usage
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Register this repository as a custom model for WebLLM.
  const appConfig = {
    "model_list": [
      {
        "model": "https://huggingface.co/Felladrin/mlc-q4f16-Llama3-ChatQA-1.5-8B",
        "model_id": "mlc-q4f16-Llama3-ChatQA-1.5-8B",
        "model_lib": "https://huggingface.co/Felladrin/mlc-q4f16-Llama3-ChatQA-1.5-8B/resolve/main/model.wasm",
      },
    ],
  };

  // Fetch the model weights and the compiled model library, then initialize the engine.
  const engine = await CreateMLCEngine(
    "mlc-q4f16-Llama3-ChatQA-1.5-8B",
    { appConfig },
  );
}

main();
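
Once the engine has loaded, it can be queried through WebLLM's OpenAI-style chat API. The following is a minimal sketch rather than a reference implementation: the helper names `buildMessages` and `ask`, the system prompt wording, and the way the context is joined to the question are assumptions made for illustration and are not taken from this model card. The base model's card may recommend a different prompt format. The library is imported dynamically so that the helper can be used without loading it.

```javascript
// Hypothetical helper (not part of WebLLM): packs a context passage and a
// question into OpenAI-style chat messages. The system prompt is an assumption.
function buildMessages(context, question) {
  return [
    { role: "system", content: "Answer using only the provided context." },
    { role: "user", content: `${context}\n\n${question}` },
  ];
}

// Hypothetical wrapper: assumes a WebGPU-capable browser and the appConfig
// object from the Usage snippet.
async function ask(appConfig, context, question) {
  const { CreateMLCEngine } = await import("@mlc-ai/web-llm");
  const engine = await CreateMLCEngine("mlc-q4f16-Llama3-ChatQA-1.5-8B", {
    appConfig,
    // Log download and initialization progress.
    initProgressCallback: (report) => console.log(report.text),
  });
  const reply = await engine.chat.completions.create({
    messages: buildMessages(context, question),
  });
  return reply.choices[0].message.content;
}
```

Calling `ask(appConfig, passage, "What does the passage say about X?")` would resolve to the model's answer as a string. Creating the engine once and reusing it across questions avoids reloading the weights on every call.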