Update README.md
README.md CHANGED
@@ -28,7 +28,7 @@ Refer to the [original model](https://huggingface.co/databricks/dolly-v2-12b) fo
 
 - total model size is only ~12.5 GB!
 - this enables low-RAM loading, i.e. Colab :)
-
+- **update**: generation speed can be greatly improved by setting `use_cache=True` and generating via contrastive search. [example notebook here](https://colab.research.google.com/gist/pszemraj/12c832952c88d77f6924c0718a2d257d/dolly-v2-12b-8bit-use_cache-bettertransformer.ipynb)
 ## Basic Usage
 
 
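
For context, the added bullet refers to standard `transformers` text generation: passing `use_cache=True` to `generate()` reuses cached key/value states between decoding steps, and contrastive search is enabled by combining `penalty_alpha` with `top_k`. Below is a minimal sketch of those settings; the repo id `pszemraj/dolly-v2-12b-8bit`, the prompt, and the specific `penalty_alpha`/`top_k` values are illustrative assumptions, not taken from the linked notebook.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pszemraj/dolly-v2-12b-8bit"  # assumed repo id for this 8-bit checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,   # requires bitsandbytes + accelerate on a CUDA GPU
    device_map="auto",
)

# dolly-v2 models expect an instruction-style prompt; a plain prompt is used here for brevity.
prompt = "Explain the difference between a list and a tuple in Python."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Contrastive search is triggered by penalty_alpha > 0 together with top_k;
# use_cache=True reuses past key/value states, which is where the speedup comes from.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    penalty_alpha=0.6,
    top_k=4,
    use_cache=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```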