This is openbmb/UltraLM-13b, recovered using huggyllama/llama-13b as the base model and quantized to 4-bit GPTQ with the following config:

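from auto_gptq import BaseQuantizeConfig

quantize_config = BaseQuantizeConfig(
    bits=4,          # quantize weights to 4 bits
    group_size=32,   # one set of quantization parameters per 32 weights
    desc_act=True,   # activation-order ("act-order") quantization
)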
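
As a rough guide to what group_size=32 costs in storage, the sketch below estimates the effective number of bits per weight. It assumes each group stores one 16-bit scale and one zero point packed at the same width as the weights, and it ignores the small per-row index that desc_act=True adds. These are assumptions about the storage layout, not details stated in this card, and the function name is hypothetical.

```python
def effective_bits_per_weight(bits: int, group_size: int, scale_bits: int = 16) -> float:
    """Estimate storage per weight for group-wise quantization.

    Assumption: every group of `group_size` weights shares one scale
    (`scale_bits` wide) and one zero point (`bits` wide).
    """
    per_group_overhead = scale_bits + bits
    return bits + per_group_overhead / group_size


if __name__ == "__main__":
    # Config from this card: bits=4, group_size=32
    print(effective_bits_per_weight(4, 32))   # 4.625
    # A larger group size, for comparison
    print(effective_bits_per_weight(4, 128))  # 4.15625
```

Under these assumptions, smaller groups spend more bits on per-group metadata in exchange for finer-grained scales.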

Original Model Card:

UltraLM-13b

These are the UltraLM-13b delta weights. UltraLM-13b is a chat language model trained on UltraChat.

Model Details

Model Description

The model is fine-tuned from LLaMA-13b using the multi-turn chat template below:

User: instruction 1<eos_token>
Assistant: response 1<eos_token>
User: instruction 2<eos_token>
Assistant: response 2<eos_token>
...
  • License: UltraLM is based on LLaMA and should be used under LLaMA's model license.
  • Finetuned from model: LLaMA-13b
  • Finetuned on data: UltraChat

Model Sources

Uses

To use this model, you need to recover the full model from the delta weights and perform inference following the template below:

[Optional]User: system prompt<eos_token>
User: user input<eos_token>
Assistant: 
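
The template above can be assembled programmatically. The following is a minimal sketch, not the authors' code: the function name `build_prompt` and its parameters are hypothetical, and it assumes `<eos_token>` is LLaMA's `</s>` and that turns are separated by a single newline, as the template's layout suggests.

```python
from typing import List, Optional, Tuple


def build_prompt(
    user_input: str,
    system_prompt: Optional[str] = None,
    history: Optional[List[Tuple[str, str]]] = None,
    eos_token: str = "</s>",  # assumption: LLaMA's end-of-sequence token
) -> str:
    """Format a conversation using the User/Assistant template.

    `history` holds earlier (instruction, response) pairs, oldest first.
    """
    lines = []
    if system_prompt:
        # The optional system prompt is sent as an extra "User:" turn.
        lines.append(f"User: {system_prompt}{eos_token}")
    for instruction, response in history or []:
        lines.append(f"User: {instruction}{eos_token}")
        lines.append(f"Assistant: {response}{eos_token}")
    lines.append(f"User: {user_input}{eos_token}")
    # Leave the final assistant turn open for the model to complete.
    lines.append("Assistant: ")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("What is GPTQ?", system_prompt="Answer briefly."))
```

The returned string would be tokenized and passed to the model's generate call; the text the model produces after the trailing "Assistant: " is the reply.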

Dataset used to train poisson-fish/ultralm-13b-GPTQ