aidealab/AIdeaLab-VideoJP

+---
+datasets:
+- HuggingFaceFV/finevideo
+- LanguageBind/Open-Sora-Plan-v1.0.0
+language:
+- ja
+- en
+library_name: diffusers
+license: apache-2.0
+pipeline_tag: text-to-video
+tags:
+- art
+---
+# Model Card for CommonVideo
+This is a text-to-video model trained on images under CC-BY, CC-0, or similar licenses.
+## Model Details
+### Model Description
+At AI Picasso, we develop AI technology through active dialogue with creators, aiming for mutual understanding and cooperation.
+We strive to solve challenges faced by creators and grow together.
+One of these challenges is that some creators and fans want to use image generation but can't, likely due to the lack of permission to use certain images for training.
+To address this issue, we have developed CommonVideo.
+#### Features of CommonVideo
+- Primarily uses images for which training permission has been obtained
+- Understands both Japanese and English text inputs directly
+- Minimizes the risk of exact reproduction of training images
+- Utilizes cutting-edge technology for high quality and efficiency
+### Misc.
+- **Developed by:** alfredplpl, matty
+- **Funded by:** AIdeaLab, Inc.
+- **Shared by:** AI Picasso, Inc.
+- **Model type:** Rectified Flow Transformer
+- **Language(s) (NLP):** Japanese, English
+- **License:** Apache-2.0
+### Model Sources
+- **Repository:** TBA
+- **Paper:** TBA
+## How to Get Started with the Model
+- diffusers (for GPUs with 16 GB or more of VRAM)
+1. Install the libraries.
+```bash
+pip install transformers diffusers
+```
+2. Run the following script.
+```python
+TBA
+```
+## Uses
+### Direct Use
+- Assistance in creating illustrations, manga, and anime
+  - For both commercial and non-commercial purposes
+  - Communication with creators when making requests
+- Commercial provision of image generation services
+  - Please be cautious when handling generated content
+- Self-expression
+  - Using this AI to express "your" uniqueness
+- Research and development
+  - Fine-tuning (also known as additional training) such as LoRA
+  - Merging with other models
+  - Examining the performance of this model using metrics like FID
+- Education
+  - Graduation projects for art school or vocational school students
+  - University students' graduation theses or project assignments
+  - Teachers demonstrating the current state of image generation AI
+- Uses described in the Hugging Face Community
+  - Please ask questions in Japanese or English
+### Out-of-Scope Use
+- Generating misinformation or disinformation.
+## Bias, Risks, and Limitations
+TBA
+## Training Details
+### Training Data
+We used the following datasets to train the transformer:
+- [Pixabay (via Open-Sora-Plan v1.0.0)](https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.0.0)
+- [FineVideo](https://huggingface.co/datasets/HuggingFaceFV/finevideo)
+## Technical Specifications
+### Model Architecture and Objective
+#### Model Architecture
+[CogVideoX-based architecture](https://github.com/THUDM/CogVideo)
+#### Objective
+[]
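+The card lists the model type as a Rectified Flow Transformer but does not state the objective. For orientation only, the generic rectified-flow training objective can be written as below. The symbols are assumptions introduced for illustration ($x_0$: clean video latent, $\epsilon$: Gaussian noise, $c$: text conditioning, $v_\theta$: the transformer's velocity prediction), and this is not confirmed to be the exact loss used for CommonVideo.
+```latex
+x_t = (1 - t)\,x_0 + t\,\epsilon,
+\qquad \epsilon \sim \mathcal{N}(0, I),
+\qquad t \in [0, 1]
+
+\mathcal{L}(\theta) =
+\mathbb{E}_{x_0,\,\epsilon,\,t}
+\left\| v_\theta(x_t, t, c) - (\epsilon - x_0) \right\|_2^2
+```
+In this formulation the network regresses the constant velocity $\epsilon - x_0$ along the straight line between data and noise, which is what distinguishes rectified flow from noise-prediction diffusion objectives.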
+### Compute Infrastructure
+Google Cloud (Tokyo Region).
+#### Hardware
+We used 4 nodes of NVIDIA L4 x8 instances (32 L4 GPUs in total).
+#### Software
+[Finetrainers-based code](https://github.com/a-r-r-o-w/finetrainers)
+## Model Card Contact
+- [Contact page](https://aidealab.com/contact)
+## Acknowledgement
+We appreciate the video providers.
+Thanks to them, we are **standing on the shoulders of giants**.