Download the accelerated GLM6B model from COS and save it (the URL must be quoted, because it contains `&` and `;`): wget 'https://chuangxin-research-1258344705.cos.ap-guangzhou.myqcloud.com/cfs-4a8cd28be/vanewu/glm6b-kv-cache-dy-bs8.ftm?q-sign-algorithm=sha1&q-ak=AKIDBF6i7GCtKWS8ZkgOtACzX3MQDl37xYty&q-sign-time=1680756811;1689396811&q-key-time=1680756811;1689396811&q-header-list=&q-url-param-list=&q-signature=95924587fc3c8268c386db06bdb2bdb537074149'
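
The signed URL above contains `&` and `;`, which a shell treats as control operators when they appear unquoted, so the whole URL has to reach `wget` as one quoted argument. By default `wget` also names the saved file after the last URL component, query string included. The following is a minimal sketch of one way to handle both points. The helper names `url_basename` and `fetch_model` are hypothetical and used only for illustration; they are not part of the original instructions.

```shell
# Sketch only: helper names are illustrative assumptions, not part of the project.

# Print the file name from a URL path, dropping the query string.
url_basename() {
  _path="${1%%\?*}"            # strip everything from the first '?' onward
  printf '%s\n' "${_path##*/}" # keep only the last path component
}

# Download a URL and save it under its path basename.
# Quoting "$1" keeps '&' and ';' literal instead of letting the shell split the command.
fetch_model() {
  wget -O "$(url_basename "$1")" "$1"
}
```

To use it, pass the full signed URL in single quotes, for example `fetch_model 'https://...glm6b-kv-cache-dy-bs8.ftm?q-sign-algorithm=...'`, which would save the file as `glm6b-kv-cache-dy-bs8.ftm` in the current directory.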