---
license: cc-by-nc-4.0
datasets:
  - grammarly/coedit
language:
  - en
tags:
  - text-generation-inference
  - candle
widget:
  - text: >-
      Fix the grammar: When I grow up,
      I start to understand what he said is
      quite right.
    example_title: Fluency
  - text: >-
      Make this text coherent: Their flight
      is weak. They run quickly through
      the tree canopy.
    example_title: Coherence
  - text: >-
      Rewrite to make this easier to understand: A storm surge is what
      forecasters consider a hurricane's most treacherous aspect.
    example_title: Simplification
  - text: >-
      Paraphrase this: Do you know where I was born?
    example_title: Paraphrase
  - text: >-
      Write this more formally: omg i love that song im
      listening to it right now
    example_title: Formalize
  - text: >-
      Write in a more neutral way: The authors' exposé on nutrition studies.
    example_title: Neutralize
---
# Quantized candle weights for the CoEdIT model

Quantized weights of [CoEdIT](https://github.com/vipulraheja/coedit) for inference with [candle](https://github.com/huggingface/candle/tree/main/candle-examples/examples/quantized-t5).

## Usage

You can run the smaller models directly from the browser using this [space](https://huggingface.co/spaces/jbochi/Candle-CoEdIT-Wasm).

Clone [candle](https://github.com/huggingface/candle) and run the `quantized-t5` example:

```shell
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --prompt "Make this text coherent: Their flight is weak. They run quickly through the tree canopy." \
  --temperature 0
...
 Although their flight is weak, they run quickly through the tree canopy.
```

By default, it will use CoEdIT-large with q6k quantization (770M params, 643 MB).
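
To try one of the other large-model quantizations, only `--weight-file` needs to change. A minimal sketch using the q4_0 weights (the file name is taken from the table below, and the prompt is one of the widget examples above):

```shell
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --weight-file "model-q4_0.gguf" \
  --prompt "Fix the grammar: When I grow up, I start to understand what he said is quite right." \
  --temperature 0
```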

To use CoEdIT-xl (3B params, 2.34 GB) or any other provided model, specify the `--weight-file` and `--config-file` flags:

```shell
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --weight-file "model-xl.gguf" \
  --config-file "config-xl.json" \
  --prompt "Rewrite to make this easier to understand: Note that a storm surge is what forecasters consider a hurricane's most treacherous aspect." \
  --temperature 0
...
 Note that a storm surge is what forecasters consider a hurricane's most dangerous part.
```

## Models available

These are all the available formats. The weight file is named `{model}.gguf` and the config file `config-{base_model}.json` (see the download example after the table).

| Model | Base model | Quantization | # Params | Size |
| ----- | ---------- | ------------ | -------- | ---- |
| - | [large](https://huggingface.co/grammarly/coedit-large) | None | 770M | 3.13 GB |
| model | large | q6k | 770M | 643 MB |
| model-q4k | large | q4k | 770M | 441 MB |
| model-q4_0 | large | q4_0 | 770M | 441 MB |
| - | [xl](https://huggingface.co/grammarly/coedit-xl) | None | 3B | 11.4 GB |
| model-xl | xl | q6k | 3B | 2.34 GB |
| model-xl-q4k | xl | q4k | 3B | 1.6 GB |
| model-xl-q4_0 | xl | q4_0 | 3B | 1.6 GB |
| - | [xxl](https://huggingface.co/grammarly/coedit-xxl) | None | 11B | 44.5 GB |
| model-xxl | xxl | q6k | 11B | 9.14 GB |
| model-xxl-q4k | xxl | q4k | 11B | 6.27 GB |
| model-xxl-q4_0 | xxl | q4_0 | 11B | 6.27 GB |
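
To grab individual files without cloning the full repository, `huggingface-cli` (shipped with the `huggingface_hub` Python package) can download them by name. A sketch, assuming the CLI is installed and using the naming convention above:

```shell
# Download only the xl q4k weights and the matching config file
$ pip install -U "huggingface_hub[cli]"
$ huggingface-cli download jbochi/candle-coedit-quantized \
    model-xl-q4k.gguf config-xl.json --local-dir .
```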


## Model generation

The weights were quantized using candle's `tensor-tools` example:

```shell
# Quantize a downloaded CoEdIT checkpoint; output names follow the table above
# (model.gguf for large, model-xl.gguf for xl, model-xxl.gguf for xxl)
cargo run --example tensor-tools --release -- quantize \
  --quantization q6k \
  /path/to/coedit-<version>/model.safetensors \
  --out-file model<version>.gguf
```
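
The other formats in the table were presumably produced the same way, just with a different `--quantization` value. For example, a q4_0 sketch:

```shell
# Produce the 4_0 variant of the xl weights listed in the table above
cargo run --example tensor-tools --release -- quantize \
  --quantization q4_0 \
  /path/to/coedit-xl/model.safetensors \
  --out-file model-xl-q4_0.gguf
```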