thesven/Mistral-7B-Instruct-v0.3-GPTQ-4bit

@@ -23,174 +23,6 @@ Mistral-7B-v0.3 has the following changes compared to [Mistral-7B-v0.2](https://
 - Supports v3 Tokenizer
 - Supports function calling
-## Installation
-It is recommended to use `mistralai/Mistral-7B-Instruct-v0.3` with [mistral-inference](https://github.com/mistralai/mistral-inference). For HF transformers code snippets, please keep scrolling.
-```
-pip install mistral_inference
-```
-## Download
-```py
-from huggingface_hub import snapshot_download
-from pathlib import Path
-mistral_models_path = Path.home().joinpath('mistral_models', '7B-Instruct-v0.3')
-mistral_models_path.mkdir(parents=True, exist_ok=True)
-snapshot_download(repo_id="mistralai/Mistral-7B-Instruct-v0.3", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"], local_dir=mistral_models_path)
-```
-### Chat
-After installing `mistral_inference`, a `mistral-chat` CLI command should be available in your environment. You can chat with the model using
-```
-mistral-chat $HOME/mistral_models/7B-Instruct-v0.3 --instruct --max_tokens 256
-```
-### Instruction following
-```py
-from mistral_inference.model import Transformer
-from mistral_inference.generate import generate
-from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
-from mistral_common.protocol.instruct.messages import UserMessage
-from mistral_common.protocol.instruct.request import ChatCompletionRequest
-tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tokenizer.model.v3")
-model = Transformer.from_folder(mistral_models_path)
-completion_request = ChatCompletionRequest(messages=[UserMessage(content="Explain Machine Learning to me in a nutshell.")])
-tokens = tokenizer.encode_chat_completion(completion_request).tokens
-out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
-result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])
-print(result)
-```
-### Function calling
-```py
-from mistral_common.protocol.instruct.tool_calls import Function, Tool
-from mistral_inference.model import Transformer
-from mistral_inference.generate import generate
-from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
-from mistral_common.protocol.instruct.messages import UserMessage
-from mistral_common.protocol.instruct.request import ChatCompletionRequest
-tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tokenizer.model.v3")
-model = Transformer.from_folder(mistral_models_path)
-completion_request = ChatCompletionRequest(
-    tools=[
-        Tool(
-            function=Function(
-                name="get_current_weather",
-                description="Get the current weather",
-                parameters={
-                    "type": "object",
-                    "properties": {
-                        "location": {
-                            "type": "string",
-                            "description": "The city and state, e.g. San Francisco, CA",
-                        },
-                        "format": {
-                            "type": "string",
-                            "enum": ["celsius", "fahrenheit"],
-                            "description": "The temperature unit to use. Infer this from the user's location.",
-                        },
-                    },
-                    "required": ["location", "format"],
-                },
-            )
-        )
-    ],
-    messages=[
-        UserMessage(content="What's the weather like today in Paris?"),
-        ],
-)
-tokens = tokenizer.encode_chat_completion(completion_request).tokens
-out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
-result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])
-print(result)
-```
-## Generate with `transformers`
-To generate text with Hugging Face `transformers`, pass the chat messages to a `text-generation` pipeline:
-```py
-from transformers import pipeline
-messages = [
-    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
-    {"role": "user", "content": "Who are you?"},
-]
-chatbot = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.3")
-chatbot(messages)
-```
-## Function calling with `transformers`
-To use this example, you'll need `transformers` version 4.42.0 or higher. Please see the
-[function calling guide](https://huggingface.co/docs/transformers/main/chat_templating#advanced-tool-use--function-calling)
-in the `transformers` docs for more information.
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-import torch
-model_id = "mistralai/Mistral-7B-Instruct-v0.3"
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-def get_current_weather(location: str, format: str):
-    """
-    Get the current weather
-    Args:
-        location: The city and state, e.g. San Francisco, CA
-        format: The temperature unit to use. Infer this from the user's location. (choices: ["celsius", "fahrenheit"])
-    """
-    pass
-conversation = [{"role": "user", "content": "What's the weather like in Paris?"}]
-tools = [get_current_weather]
-# Render the tool-use prompt as a string:
-tool_use_prompt = tokenizer.apply_chat_template(
-    conversation,
-    tools=tools,
-    tokenize=False,
-    add_generation_prompt=True,
-)
-model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
-# Move the tokenized prompt to the model's device before generating.
-inputs = tokenizer(tool_use_prompt, return_tensors="pt").to(model.device)
-outputs = model.generate(**inputs, max_new_tokens=1000)
-print(tokenizer.decode(outputs[0], skip_special_tokens=True))
-```
-Note that, for reasons of space, this example does not show a complete cycle of calling a tool and adding the tool call and tool
-results to the chat history so that the model can use them in its next generation. For a full tool calling example, please
-see the [function calling guide](https://huggingface.co/docs/transformers/main/chat_templating#advanced-tool-use--function-calling),
-and note that Mistral **does** use tool call IDs, so these must be included in your tool calls and tool results. They should be
-exactly 9 alphanumeric characters.
 ## Limitations
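
The removed function-calling note above says that Mistral tool call IDs must be included in both the tool call and the tool result, and must be exactly 9 alphanumeric characters. The following is a minimal sketch of one way to produce and check such an ID. The helper names `make_tool_call_id` and `is_valid_tool_call_id` are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the message dictionaries only illustrate an assumed chat-history shape; the function calling guide remains the reference for the exact format.

```python
import re
import secrets
import string

# Assumption: "alphanumeric" means ASCII letters and digits only.
_ALPHABET = string.ascii_letters + string.digits
_ID_PATTERN = re.compile(r"[A-Za-z0-9]{9}")


def make_tool_call_id() -> str:
    """Return a random 9-character alphanumeric ID (hypothetical helper)."""
    return "".join(secrets.choice(_ALPHABET) for _ in range(9))


def is_valid_tool_call_id(value: str) -> bool:
    """Check that the value is exactly 9 ASCII alphanumeric characters."""
    return _ID_PATTERN.fullmatch(value) is not None


tool_call_id = make_tool_call_id()

# Assumed message shape: the same ID appears in the assistant's tool call
# and in the matching tool result, so the model can pair them up.
assistant_message = {
    "role": "assistant",
    "tool_calls": [
        {
            "id": tool_call_id,
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "arguments": {"location": "Paris, France", "format": "celsius"},
            },
        }
    ],
}
tool_message = {
    "role": "tool",
    "tool_call_id": tool_call_id,
    "name": "get_current_weather",
    "content": "22.0",  # placeholder value, not a real weather reading
}
```

Both messages would then be appended to the conversation before the next call to `apply_chat_template`, so that the model sees the tool result alongside the call that produced it.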