metadata

license: apache-2.0
datasets:
  - grammarly/coedit
language:
  - en
tags:
  - text-generation-inference
  - candle
widget:
  - text: >-
      Fix the grammar: When I grow up, I start to understand what he said is
      quite right.
    example_title: Fluency
  - text: >-
      Make this text coherent: Their flight is weak. They run quickly through
      the tree canopy.
    example_title: Coherence
  - text: >-
      Rewrite to make this easier to understand: A storm surge is what
      forecasters consider a hurricane's most treacherous aspect.
    example_title: Simplification
  - text: 'Paraphrase this: Do you know where I was born?'
    example_title: Paraphrase
  - text: >-
      Write this more formally: omg i love that song im listening to it right
      now
    example_title: Formalize
  - text: 'Write in a more neutral way: The authors'' exposé on nutrition studies.'
    example_title: Neutralize

Quantized candle weights for the CoEdIT model

Quantized weights of CoEdIT for inference with candle.

Usage

Clone the candle repository and run its quantized-t5 example:

$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --prompt "Make this text coherent: Their flight is weak. They run quickly through the tree canopy." \
  --temperature 0
...
 Although their flight is weak, they run quickly through the tree canopy.

By default, the example uses the 6k-quantized CoEdIT-large weights (770M params, 643 MB).

To use CoEdIT-xl (3B params, 2.34 GB), pass --weight-file and --config-file:

$ cargo run --example quantized-t5 --release -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --weight-file "model-xl.gguf" \
  --config-file "config-xl.json" \
  --prompt "Rewrite to make this easier to understand: Note that a storm surge is what forecasters consider a hurricane's most treacherous aspect." \
  --temperature 0
...
 Note that a storm surge is what forecasters consider a hurricane's most dangerous part.
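
The two invocations above differ only in the weight and config flags. A minimal sketch of a helper that prints those flags for a given base model is shown here. The function name coedit_flags is hypothetical, and the xxl file names are inferred from the naming scheme in the table below rather than from a tested run.

```shell
# Hypothetical helper: print the extra flags needed for a given base model.
# "large" is the default, so it needs no extra flags.
coedit_flags() {
  case "$1" in
    large)
      ;;
    xl|xxl)
      # File names follow the model-{base}.gguf / config-{base}.json pattern
      # (confirmed for xl above, assumed for xxl).
      printf '%s %s %s %s\n' \
        --weight-file "model-$1.gguf" \
        --config-file "config-$1.json"
      ;;
    *)
      echo "unknown base model: $1" >&2
      return 1
      ;;
  esac
}

# Example use (left unquoted on purpose so the flags split into words):
#   cargo run --example quantized-t5 --release -- \
#     --model-id "jbochi/candle-coedit-quantized" \
#     $(coedit_flags xl) \
#     --prompt "Paraphrase this: Do you know where I was born?" \
#     --temperature 0
```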

Models available

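These are all the available formats. Each weight file is named after the Model column with a .gguf extension (for example, model-xl.gguf), and the matching config file is config-{base}.json (for example, config-xl.json).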

Model	Base model	Quantization	# Params	Size
-	large	None	770M	3.13 GB
model	large	6k	770M	643 MB
model-4k	large	4k	770M	441 MB
model-4_0	large	4_0	770M	441 MB
-	xl	None	3B	11.4 GB
model-xl	xl	6k	3B	2.34 GB
model-xl-4k	xl	4k	3B	1.6 GB
model-xl-4_0	xl	4_0	3B	1.6 GB
-	xxl	None	11B	44.5 GB
model-xxl	xxl	6k	11B	9.14 GB
model-xxl-4k	xxl	4k	11B	WIP
model-xxl-4_0	xxl	4_0	11B	WIP
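
As a rough sanity check on the table, the file sizes can be converted into an effective number of bits per parameter. The sketch below assumes decimal units (1 MB = 10^6 bytes) and uses the rounded parameter counts from the table, so the results are approximate. The function name bits_per_param is hypothetical.

```shell
# Hypothetical helper: effective bits per parameter.
#   $1 = file size in MB (decimal), $2 = parameter count in millions
bits_per_param() {
  awk -v size_mb="$1" -v params_m="$2" \
    'BEGIN { printf "%.2f\n", size_mb * 8 / params_m }'
}

bits_per_param 3130 770   # large, unquantized -> 32.52
bits_per_param 643 770    # large, 6k          -> 6.68
bits_per_param 441 770    # large, 4k          -> 4.58
```

The unquantized figure is close to 32 bits per parameter, and the quantized files land slightly above 6 and 4 bits. The excess over the nominal bit width is what one would expect from per-block metadata and from rounding in the table, but that explanation is an assumption and not something the table states.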

Model generation

The weights were quantized using candle's tensor-tools example:

$ cargo run --example tensor-tools --release -- quantize \
  --quantization q6k \
  /path/to/coedit-<version>/model.safetensors \
  --out-file model<version>.gguf
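
To produce all three quantized variants listed in the table, the same command can be repeated with a different --quantization value. The dry-run sketch below only prints the commands. The helper name quantize_cmds is hypothetical, the values q4k and q4_0 are assumed to be accepted in the same way as q6k, and the output names are inferred from the Model column of the table.

```shell
# Dry-run sketch: print (do not execute) one quantize command per format.
#   $1 = path to the unquantized model.safetensors
#   $2 = output name prefix, e.g. "model" or "model-xl"
quantize_cmds() {
  src="$1"
  prefix="$2"
  for quant in q6k q4k q4_0; do
    if [ "$quant" = "q6k" ]; then
      out="$prefix.gguf"               # 6k files carry no suffix in the table
    else
      out="$prefix-${quant#q}.gguf"    # q4k -> -4k, q4_0 -> -4_0
    fi
    echo "cargo run --example tensor-tools --release -- quantize --quantization $quant $src --out-file $out"
  done
}

# Example: review the printed commands, then run them by hand.
#   quantize_cmds /path/to/model.safetensors model-xl
```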