arxiv:2406.16273

YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals

Published on Jun 24 · Submitted by oindrila13saha on Jun 26
#2 Paper of the day
Abstract

3D generation guided by text-to-image diffusion models enables the creation of visually compelling assets. However, previous methods explore generation based only on image or text, so the boundaries of creativity are limited by what can be expressed in words or by the images that can be sourced. We present YouDream, a method to generate high-quality, anatomically controllable animals. YouDream is guided by a text-to-image diffusion model controlled by 2D views of a 3D pose prior. Our method generates 3D animals that cannot be created using previous text-to-3D generative methods. Additionally, our method preserves anatomic consistency in the generated animals, an area where prior text-to-3D approaches often struggle. Moreover, we design a fully automated pipeline for generating commonly found animals. To circumvent the need for human intervention to create a 3D pose, we propose a multi-agent LLM that adapts poses from a limited library of animal 3D poses to represent the desired animal. A user study conducted on the outcomes of YouDream shows that participants prefer the animal models generated by our method over those of other methods. Turntable results and code are released at https://youdream3d.github.io/

Community

TL;DR

YouDream achieves multi-view consistency without being trained on 3D datasets and generates imaginary assets that prior methods cannot produce.

A 2D TetraPose ControlNet guides the 3D generation. Because each control image is a 2D view of the 3D pose, it implicitly encodes both the pose and the camera angle. The ControlNet is trained to generate tetrapod animals such as birds, reptiles, amphibians, and mammals given an input 2D pose image.
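To illustrate how a single 2D pose image can carry both pose and camera information, here is a minimal sketch that projects 3D keypoints into 2D for a camera orbiting the origin. This is not the authors' code: the function name `project_pose`, the joint names, the pinhole camera model, and the azimuth/elevation parametrization are all assumptions made for illustration only.

```python
import math

def project_pose(keypoints_3d, azimuth_deg, elevation_deg, radius=3.0, focal=1.0):
    """Project named 3D keypoints to 2D for a camera orbiting the origin.

    Hypothetical helper: the camera sits on a sphere of the given radius
    (y axis up) and looks at the origin; a simple pinhole model is assumed.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    projected = {}
    for name, (x, y, z) in keypoints_3d.items():
        # Undo the azimuth: rotate about the y axis so the camera lies in the y-z plane.
        x1 = x * math.cos(az) - z * math.sin(az)
        z1 = x * math.sin(az) + z * math.cos(az)
        # Undo the elevation: rotate about the x axis so the camera lies on the +z axis.
        y2 = y * math.cos(el) - z1 * math.sin(el)
        z2 = y * math.sin(el) + z1 * math.cos(el)
        # The camera is now at (0, 0, radius) looking toward -z.
        depth = radius - z2
        if depth <= 0:
            raise ValueError(f"keypoint {name!r} is behind the camera")
        projected[name] = (focal * x1 / depth, focal * y2 / depth)
    return projected

# Hypothetical three-joint pose; real animal skeletons have many more joints.
pose = {
    "nose": (1.0, 0.5, 0.0),
    "tail": (-1.0, 0.4, 0.0),
    "front_paw": (0.6, -0.5, 0.2),
}

side_view = project_pose(pose, azimuth_deg=0, elevation_deg=0)
front_view = project_pose(pose, azimuth_deg=90, elevation_deg=0)
```

The same 3D pose yields different 2D layouts in `side_view` and `front_view`, which is the sense in which a rendered control image depends jointly on the pose and on the viewpoint.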

Excitingly, YouDream can generate never-before-seen imaginary animals from a 3D pose designed by an artist. Previous methods do not faithfully follow the text in these underrepresented scenarios, while YouDream can generate compelling, creative 3D assets.

Code will be released soon at this link: https://github.com/YouDream3D/YouDream/
For more results and visual comparisons, check out our webpage: https://youdream3d.github.io
