| Dataset | Task/Source | Size | Scenario |
| --- | --- | --- | --- |
| DISC-LawLLM-SFT-Pair | Legal information extraction | 32K | Legal professional assistant |
| | Legal event detection | 27K | Legal professional assistant |
| | Legal case classification | 20K | Legal professional assistant |
| | Legal judgement prediction | 11K | Legal professional assistant |
| | Legal case matching | 8K | Legal professional assistant |
| | Legal text summarization | 9K | Legal professional assistant |
| | Judicial public opinion summarization | 6K | Legal professional assistant |
| | Legal question answering | 93K | Legal consultation services |
| | Legal reading comprehension | 38K | Judicial examination assistant |
| | Judicial examination | 12K | Judicial examination assistant |
| DISC-LawLLM-SFT-Triple | Legal judgement prediction | 16K | Legal professional assistant |
| | Legal question answering | 23K | Legal consultation services |
| General | Alpaca-GPT4 | 48K | General scenarios |
| | Firefly | 60K | General scenarios |
| **Total** | | 403K | |
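
As a quick sanity check on the dataset composition, the sizes listed above can be summed per dataset. This is only a sketch: the numbers are copied by hand from the table (in thousands of samples), and the names `SIZES_K` and `subtotals` are illustrative, not part of the DISC-LawLLM code.

```python
# Sizes in thousands of samples, copied by hand from the table above.
SIZES_K = {
    "DISC-LawLLM-SFT-Pair": [32, 27, 20, 11, 8, 9, 6, 93, 38, 12],
    "DISC-LawLLM-SFT-Triple": [16, 23],
    "General": [48, 60],
}


def subtotals(sizes):
    """Sum the per-task sizes for each dataset."""
    return {name: sum(values) for name, values in sizes.items()}


totals = subtotals(SIZES_K)
grand_total = sum(totals.values())

# Print each dataset's subtotal and its share of the whole.
for name, total in totals.items():
    print(f"{name}: {total}K ({total / grand_total:.1%})")
print(f"Total: {grand_total}K")
```

The subtotals come to 256K, 39K, and 108K, which add up to the 403K reported in the table.
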
# Using through Hugging Face Transformers
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

# trust_remote_code=True lets Transformers load the model code shipped in the repository.
tokenizer = AutoTokenizer.from_pretrained("ShengbinYue/DISC-LawLLM", use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("ShengbinYue/DISC-LawLLM", device_map="auto", torch_dtype=torch.float16, trust_remote_code=True)
model.generation_config = GenerationConfig.from_pretrained("ShengbinYue/DISC-LawLLM")

# Question: "How is the crime of producing and selling counterfeit or substandard goods sentenced?"
messages = [{"role": "user", "content": "生产销售假冒伪劣商品罪如何判刑?"}]
response = model.chat(tokenizer, messages)
print(response)
```
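
For a follow-up question, one possible approach is to keep a running message history and pass it back to `model.chat`. The sketch below assumes the model's chat code accepts earlier replies as `"assistant"` turns, which this README does not confirm. `chat_turn` and `EchoModel` are hypothetical names; `EchoModel` is only a stand-in so the example runs without downloading the weights.

```python
def chat_turn(model, tokenizer, messages, user_text):
    """Send one user turn and record the reply in the shared history."""
    messages.append({"role": "user", "content": user_text})
    response = model.chat(tokenizer, messages)
    # Assumption: earlier replies are passed back under the "assistant" role.
    messages.append({"role": "assistant", "content": response})
    return response


class EchoModel:
    """Hypothetical stand-in exposing the same chat(tokenizer, messages) call."""

    def chat(self, tokenizer, messages):
        return f"(reply to: {messages[-1]['content']})"


# Dry run with the stand-in; with the real model, pass the loaded
# `model` and `tokenizer` objects instead.
history = []
chat_turn(EchoModel(), None, history, "first question")
chat_turn(EchoModel(), None, history, "follow-up question")
print([m["role"] for m in history])
```

Keeping the history in a plain list makes it easy to truncate or reset between conversations.
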
## Disclaimer
DISC-LawLLM has the issues and limitations that current LLMs have yet to overcome. While it can provide Chinese legal services across a wide variety of tasks and scenarios, the model should be used for reference only and cannot replace professional lawyers and legal experts. We encourage users of DISC-LawLLM to evaluate the model critically. We do not take responsibility for any issues, risks, or adverse consequences that may arise from the use of DISC-LawLLM.
## Citation
If you find our work helpful, please cite it as follows:
```
@misc{yue2023disclawllm,
title={DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Services},
author={Shengbin Yue and Wei Chen and Siyuan Wang and Bingxuan Li and Chenchen Shen and Shujun Liu and Yuxuan Zhou and Yao Xiao and Song Yun and Wei Lin and Xuanjing Huang and Zhongyu Wei},
year={2023},
eprint={2309.11325},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
## License
DISC-LawLLM is available under the Apache License.