---
license: cc-by-nc-4.0
datasets:
- grammarly/coedit
language:
- en
tags:
- text-generation-inference
- candle
widget:
- text: >-
Fix the grammar: When I grow up, I start to understand what he said is
quite right.
example_title: Fluency
- text: >-
Make this text coherent: Their flight is weak. They run quickly through
the tree canopy.
example_title: Coherence
- text: >-
Rewrite to make this easier to understand: A storm surge is what
forecasters consider a hurricane's most treacherous aspect.
example_title: Simplification
- text: 'Paraphrase this: Do you know where I was born?'
example_title: Paraphrase
- text: >-
Write this more formally: omg i love that song im listening to it right
now
example_title: Formalize
- text: 'Write in a more neutral way: The authors'' exposé on nutrition studies.'
example_title: Neutralize
---

# Quantized candle weights for the CoEdIT model

Quantized weights of the CoEdIT model for inference with candle.
## Usage
You can run the smaller models directly from the browser using this space.
Clone candle and run the `quantized-t5` example:
```bash
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --prompt "Make this text coherent: Their flight is weak. They run quickly through the tree canopy." \
  --temperature 0
...
Although their flight is weak, they run quickly through the tree canopy.
```
By default, it uses CoEdIT-large with q6k quantization (770M params, 643 MB).

To use CoEdIT-xl (3B params, 2.34 GB), or any other provided model, specify the `--weight-file` and `--config-file` arguments:
```bash
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --weight-file "model-xl.gguf" \
  --config-file "config-xl.json" \
  --prompt "Rewrite to make this easier to understand: Note that a storm surge is what forecasters consider a hurricane's most treacherous aspect." \
  --temperature 0
...
Note that a storm surge is what forecasters consider a hurricane's most dangerous part.
```
## Models available
These are all the available formats. The weight file is named `{model}.gguf` and the config file is `config-{base_model}.json`.
| Model | Base model | Quantization | # Params | Size |
|---|---|---|---|---|
| - | large | None | 770M | 3.13 GB |
| model | large | 6k | 770M | 643 MB |
| model-q4k | large | 4k | 770M | 441 MB |
| model-q4_0 | large | 4_0 | 770M | 441 MB |
| - | xl | None | 3B | 11.4 GB |
| model-xl | xl | 6k | 3B | 2.34 GB |
| model-xl-q4k | xl | 4k | 3B | 1.6 GB |
| model-xl-q4_0 | xl | 4_0 | 3B | 1.6 GB |
| - | xxl | None | 11B | 44.5 GB |
| model-xxl | xxl | 6k | 11B | 9.14 GB |
| model-xxl-q4k | xxl | 4k | 11B | 6.27 GB |
| model-xxl-q4_0 | xxl | 4_0 | 11B | 6.27 GB |
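
For example, to run the smaller q4k quantization of the large model, pick the weight and config file names following the `{model}.gguf` / `config-{base_model}.json` convention above (the prompt here is just one of the widget examples):

```bash
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --weight-file "model-q4k.gguf" \
  --config-file "config-large.json" \
  --prompt "Paraphrase this: Do you know where I was born?" \
  --temperature 0
```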
## Model generation
The weights were quantized using candle's `tensor-tools` example:
```bash
cargo run --example tensor-tools --release -- quantize \
  --quantization q6k \
  /path/to/coedit-<version>/model.safetensors \
  --out-file model<version>.gguf
```
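
The other variants in the table were produced the same way, changing the `--quantization` value and the output name. As a sketch, the xl q4k file could be generated like this (the input path is a placeholder):

```bash
cargo run --example tensor-tools --release -- quantize \
  --quantization q4k \
  /path/to/coedit-xl/model.safetensors \
  --out-file model-xl-q4k.gguf
```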