TheBloke/Llama-2-13B-chat-GPTQ

#52 opened 9 months ago by

tremblingbrain

add template

#51 opened 10 months ago by

philschmid

torch.cuda.OutOfMemoryError: CUDA out of memory.

#50 opened 12 months ago by

neo-benjamin

Can you please provide 'c4' version?

#49 opened about 1 year ago by

leeee1204

How much does it take to inference one sample?

#48 opened about 1 year ago by

andreaKIM

Issues with CUDA and exllama_kernels

9

#47 opened about 1 year ago by

ditchtech

Calling LlamaTokenizerFast.from_pretrained() with the path to a single file or url is not supported for this tokenizer. Use a model identifier or the path to a directory instead.

#46 opened about 1 year ago by

kidrah-yxalag

Hallucination issue in Llama-2-13B-chat-GPTQ

#45 opened about 1 year ago by

DivyanshTiwari7

Increasing the model's predefined max length

#44 opened about 1 year ago by

MLconArtist

[AUTOMATED] Model Memory Requirements

#43 opened about 1 year ago by

model-sizer-bot

Deploying TheBloke/Llama-2-13B-chat-GPTQ as a batch end point in sagemaker

#41 opened about 1 year ago by

vinaykakara

Deploying this on Text Generation Inference (TGI) server on AWS SageMaker

#38 opened about 1 year ago by

ZaydJamadar

Understanding materials

#37 opened over 1 year ago by

rishabh-gurbani

Temperature or top_p is not working

#35 opened over 1 year ago by

chintan4560

Train model with webui

#34 opened over 1 year ago by

Samitoo

HuggingFace's bitsandbytes vs AutoGPTQ?

#33 opened over 1 year ago by

chongcy

What library was used to quantize this model ?

#32 opened over 1 year ago by

ImWolf7

Dataset used for quantisation

#31 opened over 1 year ago by

CarlosAndrea

How to make it (Llama-2-13B-chat-GPTQ) work with Fastchat

4

#30 opened over 1 year ago by

Vishvendra

Error: Transformers import module musicgen

#29 opened over 1 year ago by

galdezanni

Finetuning the model using custom dataset.

#28 opened over 1 year ago by

Varanasi5213

Necessary material for llama2

#27 opened over 1 year ago by

Samitoo

Converting hf format model to 128g.safetensors

#26 opened over 1 year ago by

goodromka

Llama-2-13B-chat-GPTQ problem

#23 opened over 1 year ago by

nigsdf

Getting an error: AttributeError: module 'accelerate.utils' has no attribute 'modeling'. Please tell me what should i do?

#21 opened over 1 year ago by

Dhairye

Getting error while loading model_basename = "gptq_model-8bit-128g"

#20 opened over 1 year ago by

Pchaudhary

fine tune on custom chat dataset using QLORA & PEFT

#19 opened over 1 year ago by

yashk92

General Update Question for LLMs

#17 opened over 1 year ago by

Acrious

File not found error while loading model

19

#14 opened over 1 year ago by

Osamarafique998

CPU Inference

#13 opened over 1 year ago by

Ange09

Slow Inference Speed

#12 opened over 1 year ago by

asifahmed

Error while loading model from path

#11 opened over 1 year ago by

abhishekpandit

Censorship is hilarious

6

#10 opened over 1 year ago by

tea-lover-418

why it says no quantize_config.json file but it has

6

#9 opened over 1 year ago by

Mark000111888

Error loading model from a different branch with revision

9

#8 opened over 1 year ago by

amitj

Llama v2 GPTQ context length

6

#7 opened over 1 year ago by

andrewsameh

Is this model based on `chat` or `chat-hf` model of llama2?

#6 opened over 1 year ago by

pootow

Prompt format

8

#5 opened over 1 year ago by

mr96

Bravo! That was fast : )

#3 opened over 1 year ago by

jacobgoldenart

Doesn't contain the files