This is the model after the SFT phase, before DPO has been applied. DPO performs better on benchmarks, but this version is likely better for creative writing, roleplay, etc.

## How to easily download and use this model

[Massed Compute](https://massedcompute.com/?utm_source=huggingface&utm_creative_format=model_card&utm_content=creator_jon) has created a Virtual Machine (VM) pre-loaded with TGI and Text Generation WebUI.

1) For this model, rent the [Jon Durbin 2xA6000](https://shop.massedcompute.com/products/jon-durbin-2x-a6000?utm_source=huggingface&utm_creative_format=model_card&utm_content=creator_jon) Virtual Machine
2) After you start your rental, you will receive an email with instructions on how to log in to the VM
3) Once inside the VM, open the terminal and run `conda activate text-generation-inference`
4) Then `cd Desktop/text-generation-inference/`
5) Run `volume=$PWD/data`
6) Run `model=jondurbin/bagel-34b-v0.2`
7) Run `sudo docker run --gpus '"device=0,1"' --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.3 --model-id $model`
8) The model will take some time to load...
9) Once loaded, the model will be available on port 8080
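
Step 8 can be scripted rather than watched by hand. A minimal Python sketch (standard library only) that polls TGI's `/health` route until the model has finished loading; it assumes the default TGI routes are unchanged in the Massed Compute image:

```python
import time
from urllib import request, error

def wait_until_ready(base_url: str, timeout_s: float = 1800, interval_s: float = 10) -> bool:
    """Poll the server until it reports ready, or give up after timeout_s seconds.

    TGI exposes a /health route that returns HTTP 200 once the model is
    loaded (assumption: default routes, not customized in this VM image).
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (error.URLError, OSError):
            pass  # container still starting; retry after a short sleep
        time.sleep(interval_s)
    return False

if __name__ == "__main__":
    if wait_until_ready("http://0.0.0.0:8080"):
        print("model loaded, port 8080 is serving")
```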

Sample command within the VM:

```
curl 0.0.0.0:8080/generate \
    -X POST \
    -d '{"inputs":"[INST] <</SYS>>\nYou are a friendly chatbot.\n<</SYS>>\n\nWhat type of model are you? [/INST]","parameters":{"do_sample": true, "max_new_tokens": 100, "repetition_penalty": 1.15, "temperature": 0.7, "top_k": 20, "top_p": 0.9, "best_of": 1}}' \
    -H 'Content-Type: application/json'
```

You can also access the model from outside the VM:

```
curl IP_ADDRESS_PROVIDED_BY_MASSED_COMPUTE_VM:8080/generate \
    -X POST \
    -d '{"inputs":"[INST] <</SYS>>\nYou are a friendly chatbot.\n<</SYS>>\n\nWhat type of model are you? [/INST]","parameters":{"do_sample": true, "max_new_tokens": 100, "repetition_penalty": 1.15, "temperature": 0.7, "top_k": 20, "top_p": 0.9, "best_of": 1}}' \
    -H 'Content-Type: application/json'
```
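
The same request can be issued from Python with only the standard library. A sketch that mirrors the curl payload above; the base URL (and the prompt you pass in) are placeholders, not fixed values:

```python
import json
from urllib import request

def build_generate_request(base_url: str, inputs: str) -> request.Request:
    """Build a POST to TGI's /generate route with the same sampling
    parameters as the curl examples above."""
    payload = {
        "inputs": inputs,
        "parameters": {
            "do_sample": True,
            "max_new_tokens": 100,
            "repetition_penalty": 1.15,
            "temperature": 0.7,
            "top_k": 20,
            "top_p": 0.9,
            "best_of": 1,
        },
    }
    return request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (with the server running):
# with request.urlopen(build_generate_request("http://0.0.0.0:8080",
#         "[INST] What type of model are you? [/INST]")) as resp:
#     print(json.loads(resp.read())["generated_text"])
```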

For assistance with the VM, join the [Massed Compute Discord Server](https://discord.gg/Mj4YMQY3DA)

### Data sources

*Yes, you will see benchmark names in the list, but this only uses the train splits, and a decontamination by cosine similarity is performed at the end as a sanity check*