Original Model Link: https://huggingface.co/togethercomputer/RedPajama-INCITE-Instruct-3B-v1

As of 5/13/2023, this will NOT work with llama.cpp, but it does work with the gpt-neox example in GGML (https://github.com/ggerganov/ggml/). It also works in my project, https://github.com/keldenl/gpt-llama.cpp, which uses ggml as an inference engine.

RedPajama-INCITE-Instruct-3B-v1

RedPajama-INCITE-Instruct-3B-v1 was developed by Together and leaders from the open-source AI community including Ontocord.ai, ETH DS3Lab, AAI CERC, Université de Montréal, MILA - Québec AI Institute, Stanford Center for Research on Foundation Models (CRFM), Stanford Hazy Research research group and LAION.

The model was fine-tuned for few-shot applications on the data of GPT-JT, with exclusion of tasks that overlap with the HELM core scenarios.

Model Details

  • Developed by: Together Computer.
  • Model type: Language Model
  • Language(s): English
  • License: Apache 2.0
  • Model Description: A 2.8B parameter pretrained language model.

Prompt Template

To prompt the model, use a typical instruction format with few-shot prompting, for example:

Paraphrase the given sentence into a different sentence.

Input: Can you recommend some upscale restaurants in New York?
Output: What upscale restaurants do you recommend in New York?

Input: What are the famous places we should not miss in Paris?
Output: Recommend some of the best places to visit in Paris?

Input: Could you recommend some hotels that have cheap price in Zurich?
Output:
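
The template above can also be assembled programmatically. The following is a minimal Python sketch, not part of the original model card: the helper name `build_few_shot_prompt` is hypothetical, and it assumes that blocks are separated by a single blank line, as in the example above. It only builds the prompt string; passing it to an inference engine is left out.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Join an instruction, worked Input/Output pairs, and a final open query.

    Hypothetical helper for illustration; the blank-line separator is an
    assumption taken from the example prompt in this card.
    """
    blocks = [instruction]
    for source, target in examples:
        blocks.append(f"Input: {source}\nOutput: {target}")
    # Leave the last "Output:" empty so the model completes it.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


prompt = build_few_shot_prompt(
    "Paraphrase the given sentence into a different sentence.",
    [
        ("Can you recommend some upscale restaurants in New York?",
         "What upscale restaurants do you recommend in New York?"),
        ("What are the famous places we should not miss in Paris?",
         "Recommend some of the best places to visit in Paris?"),
    ],
    "Could you recommend some hotels that have cheap price in Zurich?",
)
print(prompt)
```

Ending the prompt on an empty `Output:` line leaves the model to fill in the final answer, which is the pattern the example prompt follows.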

Which model to download?

  • The q4_0 file provides lower quality, but maximal compatibility. It will work with past and future versions of llama.cpp.
  • The q4_2 file offers the best combination of performance and quality. This format is still subject to change and there may be compatibility issues; see below.
  • The q5_0 file uses the brand-new 5-bit method released on April 26th. It is the 5-bit equivalent of q4_0.
  • The q5_1 file uses the brand-new 5-bit method released on April 26th. It is the 5-bit equivalent of q4_1.