File size: 4,520 Bytes
cc9034a a1acaf5 cc9034a a1acaf5 3035108 d3b6733 3035108 a1acaf5 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 |
---
<<<<<<< HEAD
frameworks:
- Pytorch
license: other
tasks:
- image-to-video
#model-type:
##如 gpt、phi、llama、chatglm、baichuan 等
#- gpt
#domain:
##如 nlp、cv、audio、multi-modal
#- nlp
#language:
##语言代码列表 https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
#- cn
#metrics:
##如 CIDEr、Blue、ROUGE 等
#- CIDEr
#tags:
##各种自定义,包括 pretrained、fine-tuned、instruction-tuned、RL-tuned 等训练方法和其他
#- pretrained
#tools:
##如 vllm、fastchat、llamacpp、AdaSeq 等
#- vllm
---
### 当前模型的贡献者未提供更加详细的模型介绍。模型文件和权重,可浏览“模型文件”页面获取。
#### 您可以通过如下git clone命令,或者ModelScope SDK来下载模型
SDK下载
```bash
#安装ModelScope
pip install modelscope
```
```python
#SDK模型下载
from modelscope import snapshot_download
model_dir = snapshot_download('ZhipuAI/CogVideoX1.1-5B-SAT')
```
Git下载
```
#Git模型下载
git clone https://www.modelscope.cn/ZhipuAI/CogVideoX1.1-5B-SAT.git
```
<p style="color: lightgrey;">如果您是本模型的贡献者,我们邀请您根据<a href="https://modelscope.cn/docs/ModelScope%E6%A8%A1%E5%9E%8B%E6%8E%A5%E5%85%A5%E6%B5%81%E7%A8%8B%E6%A6%82%E8%A7%88" style="color: lightgrey; text-decoration: underline;">模型贡献文档</a>,及时完善模型卡片内容。</p>
=======
license: other
language:
- en
base_model:
- THUDM/CogVideoX-5b
- THUDM/CogVideoX-5b-I2V
pipeline_tag: image-to-image
---
# CogVideoX1.1-5B-SAT
<p style="text-align: center;">
<div align="center">
<img src=https://modelscope.oss-cn-beijing.aliyuncs.com/resource/cogvideologo.svg width="50%"/>
</div>
<p align="center">
<a href="https://huggingface.co/THUDM/CogVideoX1.1-5B-SAT/blob/main/README_zh.md">📄 中文阅读</a> |
<a href="https://github.com/THUDM/CogVideo">🌐 Github </a> |
<a href="https://arxiv.org/pdf/2408.06072">📜 arxiv </a>
</p>
<p align="center">
📍 Visit <a href="https://chatglm.cn/video?lang=en?fr=osm_cogvideo">QingYing</a> and <a href="https://open.bigmodel.cn/?utm_campaign=open&_channel_track_key=OWTVNma9">API Platform</a> to experience commercial video generation models.
</p>
CogVideoX is an open-source video generation model originating from [Qingying](https://chatglm.cn/video?fr=osm_cogvideo). CogVideoX1.1 is the upgraded version of the open-source CogVideoX model.
The CogVideoX1.1-5B series model supports **10-second** videos and higher resolutions. The `CogVideoX1.1-5B-I2V` variant supports **any resolution** for video generation.
This repository contains the SAT-weight version of the CogVideoX1.1-5B model, specifically including the following modules:
## Transformer
Includes weights for both I2V and T2V models. Specifically, it includes the following modules:
```
├── transformer_i2v
│ ├── 1000
│ │ └── mp_rank_00_model_states.pt
│ └── latest
└── transformer_t2v
├── 1000
│ └── mp_rank_00_model_states.pt
└── latest
```
Please select the corresponding weights when performing inference.
## VAE
The VAE part is consistent with the CogVideoX-5B series and does not require updating. You can also download it directly from here. Specifically, it includes the following modules:
```
└── vae
└── 3d-vae.pt
```
## Text Encoder
Consistent with the diffusers version of CogVideoX-5B, no updates are necessary. You can also download it directly from here. Specifically, it includes the following modules:
```
├── t5-v1_1-xxl
├── added_tokens.json
├── config.json
├── model-00001-of-00002.safetensors
├── model-00002-of-00002.safetensors
├── model.safetensors.index.json
├── special_tokens_map.json
├── spiece.model
└── tokenizer_config.json
0 directories, 8 files
```
## Model License
This model is released under the [CogVideoX LICENSE](LICENSE).
## Citation
```
@article{yang2024cogvideox,
title={CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer},
author={Yang, Zhuoyi and Teng, Jiayan and Zheng, Wendi and Ding, Ming and Huang, Shiyu and Xu, Jiazheng and Yang, Yuanming and Hong, Wenyi and Zhang, Xiaohan and Feng, Guanyu and others},
journal={arXiv preprint arXiv:2408.06072},
year={2024}
}
```
>>>>>>> d3b67333b542bd8c97cb01bbd5e89088b27a5ae6
|