Update README.md
README.md CHANGED
@@ -19,8 +19,10 @@ base_model: unsloth/mistral-7b-v0.3-bnb-4bit
 
 This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
+## Inference Example
+
 ```
-# alpaca_prompt =
+# alpaca_prompt = Copy from alpaca
 FastLanguageModel.for_inference(model) # Enable native 2x faster inference
 inputs = tokenizer(
 [
@@ -36,7 +38,6 @@ text_streamer = TextStreamer(tokenizer)
 _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 1280)
 ```
 
-
 ```
 Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
 <s>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
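For context, the `# alpaca_prompt = Copy from alpaca` comment in the diff refers to pasting the standard Alpaca prompt template used in Unsloth's example notebooks — the same template whose first line (`Below is an instruction that describes a task...`) appears in the sample output above. A minimal sketch of that template and how it is filled in for inference (the exact field wording and the `format_prompt` helper are assumptions based on the common Alpaca format, not part of this commit):

```python
# Alpaca-style prompt template (assumed; this is what "Copy from alpaca"
# points at in Unsloth's notebooks).
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

def format_prompt(instruction: str, input_text: str = "", response: str = "") -> str:
    # At inference time the response slot is left empty so the model
    # generates the completion after "### Response:".
    return alpaca_prompt.format(instruction, input_text, response)

prompt = format_prompt("Continue the fibonacci sequence.", "1, 1, 2, 3, 5, 8")
```

The resulting string is what would be passed to `tokenizer([...])` in the README's snippet before calling `model.generate` with a `TextStreamer`.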