YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

Vicuna 13B V1.1 Chinese

This model was obtained from following repo:

  • uukuguy/vicuna-13b-v1.1
  • ziqingyang/chinese-alpaca-lora-13b

Merged using sciprts from: https://github.com/ymcui/Chinese-LLaMA-Alpaca

Then quanized using following command (no act order):

python llama.py ~/Chinese-LLaMA-Alpaca/alpaca-combined-hf c4 \
    --wbits 4 \
    --true-sequential \
    --groupsize 128 \
    --save_safetensors vicuna-13B-1.1-Chinese-GPTQ-4bit-128g.safetensors

Can confirm model output normal text, but question-answering quality is unknown

Vicuna Model Card

Model details

Model type: Vicuna is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. It is an auto-regressive language model, based on the transformer architecture.

Model date: Vicuna was trained between March 2023 and April 2023.

Organizations developing the model: The Vicuna team with members from UC Berkeley, CMU, Stanford, and UC San Diego.

Paper or resources for more information: https://vicuna.lmsys.org/

License: Apache License 2.0

Where to send questions or comments about the model: https://github.com/lm-sys/FastChat/issues

Intended use

Primary intended uses: The primary use of Vicuna is research on large language models and chatbots.

Primary intended users: The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.

Training dataset

70K conversations collected from ShareGPT.com.

Evaluation dataset

A preliminary evaluation of the model quality is conducted by creating a set of 80 diverse questions and utilizing GPT-4 to judge the model outputs. See https://vicuna.lmsys.org/ for more details.

Downloads last month
13
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.