daisd-ai
/

UniNER-W4A16

named entity recognition

compressed-tensors

arynkiewicz committed 29 days ago

Commit

ae553a7

1 Parent(s): 379f8d5

Create README.md

Files changed (1)

README.md +74 -0

README.md ADDED

	@@ -0,0 +1,74 @@

+---
+base_model: Universal-NER/UniNER-7B-all
+tags:
+- named entity recognition
+- ner
+model-index:
+- name: daisd-ai/UniNER-W4A16
+  results: []
+license: cc-by-nc-4.0
+inference: false
+---
+## Introduction
+This model is a W4A16-quantized (4-bit weights, 16-bit activations) version of [Universal-NER/UniNER-7B-all](https://huggingface.co/Universal-NER/UniNER-7B-all).
+## Quantization
+Quantization was performed with [LLM Compressor](https://github.com/vllm-project/llm-compressor), using 512 random examples from the [Universal-NER/Pile-NER-definition](https://huggingface.co/datasets/Universal-NER/Pile-NER-definition) dataset as calibration data.
+The quantization recipe:
+```python
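+from llmcompressor.modifiers.quantization import GPTQModifier
+from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
+# Smooth activations first, then quantize all Linear layers except the output head
+recipe = [
+    SmoothQuantModifier(smoothing_strength=0.8),
+    GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
+]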
+```
+## Inference
+We added a chat template to the tokenizer, so the model can be used directly with vLLM without the extra preprocessing that the original model requires.
+Example:
+```python
+import json
+from vllm import LLM, SamplingParams
+# Load the model and use greedy decoding
+llm = LLM(model="daisd-ai/UniNER-W4A16")
+sampling_params = SamplingParams(temperature=0, max_tokens=256)
+# The model's tokenizer carries the chat template
+tokenizer = llm.get_tokenizer()
+# Define the text and the entity types to extract
+text = "Some long text with multiple entities"
+entity_types = ["entity type 1", "entity type 2"]
+# Build one prompt per entity type using the chat template
+prompts = []
+for entity_type in entity_types:
+    messages = [
+        {
+            "role": "user",
+            "content": f"Text: {text}",
+        },
+        {"role": "assistant", "content": "I've read this text."},
+        {"role": "user", "content": f"What describes {entity_type} in the text?"},
+    ]
+    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+    prompts.append(prompt)
+# Run inference
+outputs = llm.generate(prompts, sampling_params)
+outputs = [output.outputs[0].text for output in outputs]
+# Results are returned in JSON format; parse each one into a Python list
+results = []
+for output_text in outputs:
+    try:
+        entities = list(set(json.loads(output_text)))
+    except (ValueError, TypeError):
+        # Malformed JSON or an unexpected structure: treat as no entities found
+        entities = []
+    results.append(entities)
+```
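
The message construction and JSON parsing in the example can also be factored into two small helpers that run without a GPU. This is a minimal sketch and not part of the committed README: the names `build_messages` and `parse_entities` are hypothetical, and it assumes that the model returns a JSON list of strings, as the example implies.

```python
import json


def build_messages(text: str, entity_type: str) -> list[dict]:
    """Build the three-turn conversation from the README for one entity type."""
    return [
        {"role": "user", "content": f"Text: {text}"},
        {"role": "assistant", "content": "I've read this text."},
        {"role": "user", "content": f"What describes {entity_type} in the text?"},
    ]


def parse_entities(raw: str) -> list[str]:
    """Parse one model output into a de-duplicated list of entity strings.

    Assumption: the output is a JSON list. Anything else yields an empty list.
    """
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    # dict.fromkeys keeps first-seen order, which set() does not guarantee.
    return list(dict.fromkeys(str(item) for item in parsed))
```

Unlike `list(set(...))`, this variant keeps entities in the order the model produced them, which can make results easier to compare across runs.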