---
license: cc-by-nc-4.0
datasets:
- grammarly/coedit
language:
- en
tags:
- text-generation-inference
- candle
widget:
- text: >-
    Fix the grammar: When I grow up, I start to understand what he said is quite right.
  example_title: Fluency
- text: >-
    Make this text coherent: Their flight is weak. They run quickly through the tree canopy.
  example_title: Coherence
- text: >-
    Rewrite to make this easier to understand: A storm surge is what forecasters consider a hurricane's most treacherous aspect.
  example_title: Simplification
- text: >-
    Paraphrase this: Do you know where I was born?
  example_title: Paraphrase
- text: >-
    Write this more formally: omg i love that song im listening to it right now
  example_title: Formalize
- text: >-
    Write in a more neutral way: The authors' exposé on nutrition studies.
  example_title: Neutralize
---

# Quantized candle weights for the CoEdIT model

Quantized weights of [CoEdIT](https://github.com/vipulraheja/coedit) for inference with [candle](https://github.com/huggingface/candle/tree/main/candle-examples/examples/quantized-t5).

## Usage

You can run the smaller models directly from the browser using this [space](https://huggingface.co/spaces/jbochi/Candle-CoEdIT-Wasm).

Clone [candle](https://github.com/huggingface/candle), and run the `quantized-t5` example:

```shell
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --prompt "Make this text coherent: Their flight is weak. They run quickly through the tree canopy." \
  --temperature 0
...
Although their flight is weak, they run quickly through the tree canopy.
```

By default, it will use CoEdIT-large with q6k quantization (770M params, 643 MB).

To use CoEdIT-xl (3B params, 2.34 GB), or any other provided model, specify `--weight-file` and `--config-file`:

```shell
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --weight-file "model-xl.gguf" \
  --config-file "config-xl.json" \
  --prompt "Rewrite to make this easier to understand: Note that a storm surge is what forecasters consider a hurricane's most treacherous aspect." \
  --temperature 0
...
Note that a storm surge is what forecasters consider a hurricane's most dangerous part.
```

## Models available

These are all the available formats. The weight file is named `{model}.gguf` and the config file is `config-{base_model}.json`.

| Model | Base model | Quantization | # Params | Size |
| ----- | ---------- | ------------ | -------- | ---- |
| - | [large](https://huggingface.co/grammarly/coedit-large) | None | 770M | 3.13 GB |
| model | large | 6k | 770M | 643 MB |
| model-q4k | large | 4k | 770M | 441 MB |
| model-q4_0 | large | 4_0 | 770M | 441 MB |
| - | [xl](https://huggingface.co/grammarly/coedit-xl) | None | 3B | 11.4 GB |
| model-xl | xl | 6k | 3B | 2.34 GB |
| model-xl-q4k | xl | 4k | 3B | 1.6 GB |
| model-xl-q4_0 | xl | 4_0 | 3B | 1.6 GB |
| - | [xxl](https://huggingface.co/grammarly/coedit-xxl) | None | 11B | 44.5 GB |
| model-xxl | xxl | 6k | 11B | 9.14 GB |
| model-xxl-q4k | xxl | 4k | 11B | 6.27 GB |
| model-xxl-q4_0 | xxl | 4_0 | 11B | 6.27 GB |

## Model generation

The weights were quantized using candle:

```shell
cargo run --example tensor-tools --release -- quantize \
  --quantization q6k \
  /path/to/coedit-/model.safetensors \
  --out-file model.gguf
```
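
The naming convention from the table and the quantize command above can be combined into a small helper. This is a minimal sketch, not the script used to produce these files: `out_name` and `print_quantize_cmds` are hypothetical names, the input path is a placeholder, and it assumes `tensor-tools` accepts `q4k` and `q4_0` as `--quantization` values the same way it accepts `q6k`. It only prints the commands and does not run them.

```shell
#!/bin/sh
# Hypothetical helper: derive the weight file name listed in the table
# from a base model (large, xl, xxl) and a quantization (q6k, q4k, q4_0).
out_name() {
  base="$1"
  quant="$2"
  name="model"
  # The large model carries no suffix; xl and xxl are appended to the name.
  if [ "$base" != "large" ]; then
    name="$name-$base"
  fi
  # q6k is the default quantization and carries no suffix.
  if [ "$quant" != "q6k" ]; then
    name="$name-$quant"
  fi
  echo "$name.gguf"
}

# Hypothetical helper: print (not execute) one quantize command per format.
# $1 is the base model, $2 is the path to the unquantized safetensors file.
print_quantize_cmds() {
  base="$1"
  src="$2"
  for quant in q6k q4k q4_0; do
    # Assumption: q4k and q4_0 are valid --quantization values.
    echo "cargo run --example tensor-tools --release -- quantize" \
      "--quantization $quant $src --out-file $(out_name "$base" "$quant")"
  done
}

# Placeholder path; substitute the location of the downloaded weights.
print_quantize_cmds xl /path/to/model.safetensors
```

Printing the commands first makes it easy to check that each output name matches a row of the table before starting a long quantization run.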