---
license: openrail
pipeline_tag: text-to-image
datasets:
- HuggingFaceTB/everyday-conversations-llama3.1-2k
language:
- ab
metrics:
- bertscore
base_model: microsoft/Phi-3.5-vision-instruct
library_name: espnet
tags:
- art
---
# ImageDream Model Card

This model card describes the models released with the [ImageDream paper](https://image-dream.github.io/).

The code base is available at https://github.com/ByteDance/ImageDream.

## Description of Files

`sd-v2.1-base-4view-ipmv.pt`

- The ImageDream-Pixel diffusion model, fine-tuned from [MVDream v2.1](https://huggingface.co/MVDream/MVDream).

`sd-v2.1-base-4view-ipmv-local.pt`

- The ImageDream diffusion model without the pixel controller, fine-tuned from [MVDream v2.1](https://huggingface.co/MVDream/MVDream).
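
As a minimal sketch of how the two checkpoints might be selected in a script, the helper below maps a variant label to the filenames listed above. The labels `pixel` and `local`, the helper names, and the directory layout are assumptions made for illustration; they are not part of the ImageDream code base. The internal structure of the checkpoints is not documented here, so `load_state` only returns whatever object `torch.load` produces.

```python
from pathlib import Path

# Filenames as listed in this model card; the keys are illustrative labels.
CHECKPOINTS = {
    "pixel": "sd-v2.1-base-4view-ipmv.pt",
    "local": "sd-v2.1-base-4view-ipmv-local.pt",
}


def checkpoint_path(variant: str, root: str = ".") -> Path:
    """Return the expected checkpoint path under `root` (hypothetical layout)."""
    if variant not in CHECKPOINTS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(CHECKPOINTS)}"
        )
    return Path(root) / CHECKPOINTS[variant]


def load_state(variant: str, root: str = "."):
    """Load the raw checkpoint object onto the CPU. Requires PyTorch."""
    import torch  # imported lazily so checkpoint_path works without PyTorch

    return torch.load(checkpoint_path(variant, root), map_location="cpu")
```

For example, `checkpoint_path("local", "checkpoints")` resolves to `checkpoints/sd-v2.1-base-4view-ipmv-local.pt`. To build the actual model from a checkpoint, refer to the code base linked above.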

## Citation

```bibtex
@article{wang2023imagedream,
  title={ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation},
  author={Wang, Peng and Shi, Yichun},
  journal={arXiv preprint arXiv:2312.02201},
  year={2023}
}
```

## Misuse, Malicious Use, and Out-of-Scope Use

The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive, as well as content that propagates historical or current stereotypes.