---
base_model: Universal-NER/UniNER-7B-all
tags:
- named entity recognition
- ner
model-index:
- name: daisd-ai/UniNER-W4A16
  results: []
license: cc-by-nc-4.0
inference: false
---

## Introduction

This model is a quantized version of [Universal-NER/UniNER-7B-all](https://huggingface.co/Universal-NER/UniNER-7B-all).

## Quantization

Quantization was applied with [LLM Compressor](https://github.com/vllm-project/llm-compressor), using 512 random examples from the [Universal-NER/Pile-NER-definition](https://huggingface.co/datasets/Universal-NER/Pile-NER-definition) dataset for calibration.

The quantization recipe:
```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]
```

## Inference

We added a chat template to the tokenizer, so the model can be used directly with vLLM without any additional preprocessing compared to the original model.

Example:
```python
import json

from vllm import LLM, SamplingParams

# Load the model and its tokenizer
llm = LLM(model="daisd-ai/UniNER-W4A16")
tokenizer = llm.get_tokenizer()
sampling_params = SamplingParams(temperature=0, max_tokens=256)

# Define the text and the entity types to extract
text = "Some long text with multiple entities"
entities_types = ["entity type 1", "entity type 2"]

# Build one prompt per entity type using the chat template
prompts = []
for entity_type in entities_types:
    messages = [
        {
            "role": "user",
            "content": f"Text: {text}",
        },
        {"role": "assistant", "content": "I've read this text."},
        {"role": "user", "content": f"What describes {entity_type} in the text?"},
    ]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    prompts.append(prompt)

# Run inference
outputs = llm.generate(prompts, sampling_params)
outputs = [output.outputs[0].text for output in outputs]

# Results are returned in JSON format; parse each one into a Python list
results = []
for lst in outputs:
    try:
        entities = list(set(json.loads(lst)))
    except Exception:
        entities = []

    results.append(entities)
```
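
The JSON parsing at the end of the example can be factored into a small helper. This is a sketch, not part of the model's API, and the function name `parse_entities` is our own; unlike `list(set(...))` above, it also preserves the order in which entities first appear:

```python
import json


def parse_entities(raw_output: str) -> list:
    """Parse a model output string into a deduplicated list of entities.

    Returns an empty list when the output is not valid JSON or is not
    a JSON list (e.g. the model produced free-form text instead).
    """
    try:
        entities = json.loads(raw_output)
    except json.JSONDecodeError:
        return []
    if not isinstance(entities, list):
        return []
    # Deduplicate while preserving first-seen order
    seen = set()
    unique = []
    for entity in entities:
        if entity not in seen:
            seen.add(entity)
            unique.append(entity)
    return unique


print(parse_entities('["Paris", "London", "Paris"]'))  # ['Paris', 'London']
print(parse_entities("not json"))                      # []
```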