|
--- |
|
license: cc-by-nc-4.0 |
|
datasets: |
|
- grammarly/coedit |
|
language: |
|
- en |
|
tags: |
|
- text-generation-inference |
|
- candle |
|
widget:
  - text: >-
      Fix the grammar: When I grow up, I start to understand what he said is
      quite right.
    example_title: Fluency
  - text: >-
      Make this text coherent: Their flight is weak. They run quickly through
      the tree canopy.
    example_title: Coherence
  - text: >-
      Rewrite to make this easier to understand: A storm surge is what
      forecasters consider a hurricane's most treacherous aspect.
    example_title: Simplification
  - text: >-
      Paraphrase this: Do you know where I was born?
    example_title: Paraphrase
  - text: >-
      Write this more formally: omg i love that song im listening to it right
      now
    example_title: Formalize
  - text: >-
      Write in a more neutral way: The authors' exposé on nutrition studies.
    example_title: Neutralize
|
--- |
|
# Quantized candle weights for the CoEdIT model |
|
|
|
Quantized weights of [CoEdIT](https://github.com/vipulraheja/coedit) for inference with [candle](https://github.com/huggingface/candle/tree/main/candle-examples/examples/quantized-t5). |
|
|
|
## Usage |
|
|
|
You can run the smaller models directly in the browser using this [space](https://huggingface.co/spaces/jbochi/Candle-CoEdIT-Wasm).
|
|
|
Alternatively, clone [candle](https://github.com/huggingface/candle) and run the `quantized-t5` example.
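A minimal sketch of the checkout step (the clone destination is up to you):

```shell
git clone https://github.com/huggingface/candle
cd candle
```

Then, from the repository root: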
|
|
|
```shell
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --prompt "Make this text coherent: Their flight is weak. They run quickly through the tree canopy." \
  --temperature 0
...
Although their flight is weak, they run quickly through the tree canopy.
```
|
|
|
By default, it will use CoEdIT-large with q6k quantization (770M params, 643 MB). |
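To try another quantization of the same base model, point `--weight-file` at one of the other files listed in the table below. A sketch, assuming the repo's default config (for the large model) still applies:

```shell
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --weight-file "model-q4k.gguf" \
  --prompt "Paraphrase this: Do you know where I was born?" \
  --temperature 0
```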
|
|
|
To use CoEdIT-xl (3B params, 2.34 GB) or any other provided model, pass the matching `--weight-file` and `--config-file`:
|
|
|
```shell
$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --weight-file "model-xl.gguf" \
  --config-file "config-xl.json" \
  --prompt "Rewrite to make this easier to understand: Note that a storm surge is what forecasters consider a hurricane's most treacherous aspect." \
  --temperature 0
...
Note that a storm surge is what forecasters consider a hurricane's most dangerous part.
```
|
|
|
## Models available |
|
|
|
These are all the available formats. The weight file is named `{model}.gguf` and the config file `config-{base_model}.json`.
|
|
|
| Model | Base model | Quantization | # Params | Size |
| ----- | ---------- | ------------ | -------- | ---- |
| - | [large](https://huggingface.co/grammarly/coedit-large) | None | 770M | 3.13 GB |
| model | large | q6k | 770M | 643 MB |
| model-q4k | large | q4k | 770M | 441 MB |
| model-q4_0 | large | q4_0 | 770M | 441 MB |
| - | [xl](https://huggingface.co/grammarly/coedit-xl) | None | 3B | 11.4 GB |
| model-xl | xl | q6k | 3B | 2.34 GB |
| model-xl-q4k | xl | q4k | 3B | 1.6 GB |
| model-xl-q4_0 | xl | q4_0 | 3B | 1.6 GB |
| - | [xxl](https://huggingface.co/grammarly/coedit-xxl) | None | 11B | 44.5 GB |
| model-xxl | xxl | q6k | 11B | 9.14 GB |
| model-xxl-q4k | xxl | q4k | 11B | 6.27 GB |
| model-xxl-q4_0 | xxl | q4_0 | 11B | 6.27 GB |
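
If you prefer to download the files ahead of time instead of letting the example fetch them, something like this `huggingface-cli` call should work (files land in the local Hugging Face cache):

```shell
huggingface-cli download jbochi/candle-coedit-quantized model-xl-q4k.gguf config-xl.json
```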
|
|
|
|
|
## Model generation |
|
|
|
The weights were quantized with candle's `tensor-tools` example:
|
|
|
```shell
cargo run --example tensor-tools --release -- quantize \
  --quantization q6k \
  /path/to/coedit-<version>/model.safetensors \
  --out-file model<version>.gguf
```
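
For example, producing the 4-bit variant of the large model would look roughly like this (the input path is a placeholder; the original safetensors come from the corresponding `grammarly/coedit-*` repository):

```shell
cargo run --example tensor-tools --release -- quantize \
  --quantization q4_0 \
  /path/to/coedit-large/model.safetensors \
  --out-file model-q4_0.gguf
```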
|
|