Jeronymous committed
Commit 3b2e99d
1 Parent(s): 9bfab89

Update sample code, and fix a wrong link

Files changed (1):
  1. README.md +21 -34
README.md CHANGED
@@ -14,8 +14,6 @@ base_model: OpenLLM-France/Claire-7B-0.1
 
 ## Model Details
 
- ### Model Description
-
 This is the instruction-finetuned model based on [OpenLLM-France/Claire-7B-0.1](https://huggingface.co/OpenLLM-France/Claire-7B-0.1), using the [Vigogne dataset](https://github.com/bofenghuang/vigogne).
 Note: This is not a chat model. The finetuning was done on instruction-following data, and the model should be used with the template as shown in "How to Get Started with the Model".
 
@@ -24,11 +22,6 @@ Note: This is not a chat model. The finetuning was done on instruction-following
 - **License:** CC-BY-NC-SA 4.0
 - **Finetuned from model:** [OpenLLM-France/Claire-7B-0.1](https://huggingface.co/OpenLLM-France/Claire-7B-0.1)
 
- ### Model Sources
-
- - **Repository:** [OpenLLM-France/Claire-7B-0.1](https://huggingface.co/OpenLLM-France/Claire-7B-EN-0.1)
- - **Paper:** [Claire: Large Language Models for Spontaneous French Dialogue](https://aclanthology.org/2024.jeptalnrecital-taln.36/)
-
 
 ## Uses
 
@@ -46,40 +39,34 @@ This model may reflect biases present in the data it was trained on, potentially
 Use the code below to get started with the model.
 
 ```python
+ import transformers
 import torch
- from peft import PeftModel
- from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaTokenizer, StoppingCriteria, StoppingCriteriaList, TextIteratorStreamer
 
- from transformers import (
-     AutoConfig,
-     AutoModelForCausalLM,
-     AutoTokenizer,
- )
-
- model_name = 'OpenLLM-France/Claire-7B-FR-Instruct-0.1'
+ model_name = "OpenLLM-France/Claire-7B-FR-Instruct-0.1"
 
- model= AutoModelForCausalLM.from_pretrained(
-     model_name,
-     device_map="cuda:0",
-     trust_remote_code=True,
+ tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
+ model = transformers.AutoModelForCausalLM.from_pretrained(model_name,
+     device_map="auto",
+     torch_dtype=torch.bfloat16,
+     load_in_4bit=True  # For efficient inference, if supported by the GPU card
 )
 
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- tokenizer.pad_token = tokenizer.eos_token
-
-
- new_prompt = """Utilisateur: {instruction}
-
- Assistant:"""
-
- inputs = tokenizer([new_prompt], return_tensors = "pt")
- inputs = {k:v.to('cuda') for k, v in inputs.items()}
-
- outputs = model.generate(**inputs, max_new_tokens = 400, use_cache = True, do_sample=True, top_k=50, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
+ pipeline = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer)
+ generation_kwargs = dict(
+     num_return_sequences=1,  # Number of variants to generate.
+     return_full_text=False,  # Do not include the prompt in the generated text.
+     max_new_tokens=200,  # Maximum length for the output text.
+     do_sample=True, top_k=10, temperature=1.0,  # Sampling parameters.
+     pad_token_id=tokenizer.eos_token_id,  # Just to avoid a harmless warning.
+ )
 
- decoded_output = tokenizer.batch_decode(outputs)
- print(decoded_output[0])
+ prompt = "Utilisateur: {}\n\nAssistant: ".format(
+     "Qui était le président Français en 1995 ?"
+ )
 
+ completions = pipeline(prompt, **generation_kwargs)
+ for completion in completions:
+     print(prompt + " […]" + completion['generated_text'])
 ```
 
  ## Training Details
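Note on the updated snippet: it passes `load_in_4bit=True` directly to `from_pretrained`, which requires `bitsandbytes` and a compatible GPU; recent `transformers` releases usually express the same thing through a `BitsAndBytesConfig`. Below is a minimal sketch of that variant (not part of this commit, an editorial illustration only), reusing the same single-turn prompt template:

```python
# Alternative loading sketch (assumption, not from the commit): recent transformers
# versions route 4-bit loading through BitsAndBytesConfig rather than load_in_4bit=True.
# Assumes bitsandbytes is installed and a CUDA GPU is available.
import torch
import transformers

model_name = "OpenLLM-France/Claire-7B-FR-Instruct-0.1"

quantization_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,                      # Quantize the weights to 4 bits at load time.
    bnb_4bit_compute_dtype=torch.bfloat16,  # Run the matrix multiplications in bfloat16.
)

tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=quantization_config,
)

# Same single-turn template as in the README sample code.
prompt = "Utilisateur: {}\n\nAssistant: ".format("Qui était le président Français en 1995 ?")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    top_k=10,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, i.e. skip the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

The `quantization_config` route keeps the quantization settings and the compute dtype in one place, and decoding only the newly generated tokens plays the same role as `return_full_text=False` in the pipeline version.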