---
pipeline_tag: text-generation
tags:
  - openvino
  - mpt
  - sparse
  - quantization
library_name: OpenVINO
---

# mpt-7b-gsm8k-dummy

See the benchmark scripts in this repo.

```bash
pip install deepsparse-nightly[llm]==1.6.0.20231120
pip install openvino==2023.3.0
```

## Benchmarking

1. Clone this repo.
2. Concatenate the split fp32 IR weights:

   ```bash
   cd ./models/neuralmagic/mpt-7b-gsm8k-pt/fp32
   cat openvino_model.bin.part-a* > openvino_model.bin
   ```

3. Reproduce the Neural Magic paper results: `deepsparse_reproduce.bash` (see the sketch below).
4. Run the OpenVINO benchmark_app scripts: `benchmarkapp_*.bash` (see the sketch below).
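
The exact arguments for the DeepSparse run are in `deepsparse_reproduce.bash`; as a rough sketch only, a DeepSparse benchmark can be launched like this (the model path below is a placeholder, not the actual deployment used in the script):

```bash
# Placeholder model path; substitute the sparse/quantized MPT deployment
# referenced in deepsparse_reproduce.bash.
deepsparse.benchmark ./path/to/deployment/model.onnx
```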
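
For the OpenVINO side, `benchmarkapp_*.bash` holds the options actually used; a minimal sketch against the concatenated fp32 IR might look like the following (the `.xml` filename, device, and performance hint are assumptions):

```bash
# Assumes openvino_model.xml sits next to the concatenated openvino_model.bin;
# see benchmarkapp_*.bash for the flags actually used in this repo.
benchmark_app \
  -m ./models/neuralmagic/mpt-7b-gsm8k-pt/fp32/openvino_model.xml \
  -d CPU \
  -hint latency
```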

## Generating these IRs

https://github.com/yujiepan-work/24h1-sparse-quantized-llm-ov