pszemraj committed ed41f8e (parent: 5c2b2e5)

Update README.md

Files changed (1): README.md (+41, -0)

README.md CHANGED

---
license: mit
datasets:
- databricks/databricks-dolly-15k
language:
- en
pipeline_tag: text-generation
tags:
- dolly
- dolly-v2
- instruct
- sharded
inference: false
---

# dolly-v2-12b: sharded **8bit** checkpoint

This is a sharded checkpoint (with ~4GB shards) of the `databricks/dolly-v2-12b` model **in `8bit` precision** using `bitsandbytes`.

Refer to the [original model](https://huggingface.co/databricks/dolly-v2-12b) for all details about the model. For more info on loading 8bit models, refer to the [example repo](https://huggingface.co/ybelkada/bloom-1b7-8bit) and/or the `4.28.0` [release info](https://github.com/huggingface/transformers/releases/tag/v4.28.0).

- This enables low-RAM loading, e.g. on Colab :) (a sketch of how a checkpoint like this can be produced follows below)
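
For reference, here is a minimal, hypothetical sketch of how a checkpoint like this can be produced (not necessarily the exact commands used for this repo): load the original model in 8bit with `bitsandbytes`, then re-serialize it with a shard-size limit, which `transformers` supports as of `4.28.0`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the original model in 8bit (requires a CUDA GPU for bitsandbytes).
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dolly-v2-12b",
    load_in_8bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-12b")

# Re-serialize the quantized weights in ~4GB shards.
out_dir = "dolly-v2-12b-sharded-8bit"
model.save_pretrained(out_dir, max_shard_size="4GB")
tokenizer.save_pretrained(out_dir)
```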

## Basic Usage

Install `transformers`, `accelerate`, and `bitsandbytes`:

```bash
pip install -U -q transformers bitsandbytes accelerate
```

Load the model. Because the checkpoint is already serialized in 8bit, no quantization arguments are needed:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "ethzanalytics/dolly-v2-12b-sharded-8bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The quantization config saved with the checkpoint is picked up
# automatically, so the weights load directly in 8bit.
model = AutoModelForCausalLM.from_pretrained(model_name)
```
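
To sanity-check generation, you can use something like the sketch below. It assumes a CUDA GPU (required for the 8bit weights) and uses the instruction-style prompt template that the dolly-v2 models were trained on.

```python
# Optional: report the memory footprint of the 8bit weights.
print(f"Model size: {model.get_memory_footprint() / 1e9:.1f} GB")

# dolly-v2 expects an instruction-formatted prompt.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what a sharded checkpoint is in one sentence.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```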