## Important configuration options for [inference.py](../inference.py):

### 1. General configs
| Configuration | Default | Explanation |
|:------------- |:----- | :------------- |
| `--image_dir` | './test/images/fruit.png' | Image file path |
| `--out_dir` | './output' | Output directory |
| `--device` | 'cuda:0' | The device to use |
| `--exp_name` | None | Experiment name; the image file name is used by default |
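
As a minimal sketch of how the general options above might be assembled into a command line: the helpers `default_exp_name` and `build_general_args` are hypothetical names introduced here for illustration and are not part of `inference.py`. The sketch also assumes the default experiment name is the image file name without its extension, which the table does not state explicitly.

```python
import os
import shlex


def default_exp_name(image_path):
    # Assumption: the default name is the file name without its extension.
    return os.path.splitext(os.path.basename(image_path))[0]


def build_general_args(image_dir="./test/images/fruit.png",
                       out_dir="./output",
                       device="cuda:0",
                       exp_name=None):
    # Defaults mirror the table; --exp_name is passed only when set explicitly.
    args = ["--image_dir", image_dir, "--out_dir", out_dir, "--device", device]
    if exp_name is not None:
        args += ["--exp_name", exp_name]
    return args


# Print the command rather than running it, so the sketch has no side effects.
command = ["python", "inference.py"] + build_general_args()
print(shlex.join(command))
```

Leaving `--exp_name` out keeps the documented default behaviour, so the flag is only needed when several runs on the same image should be kept apart.
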
### 2. Point cloud render configs
#### The definition of the world coordinate system and tips for adjusting the point cloud render configs are given in the [render document](./render_help.md).
| Configuration | Default | Explanation |
|:------------- |:----- | :------------- |
| `--mode` | 'single_view_txt' | Currently supported modes: 'single_view_txt' and 'single_view_target' |
| `--traj_txt` | None | Required for 'single_view_txt' mode; a txt file that specifies the camera trajectory |
| `--elevation` | 5. | The elevation angle of the input image, in degrees. Estimate a rough value by visual judgment |
| `--center_scale` | 1. | Scale factor for the spherical radius (r). By default, r is set to the depth value of the center pixel (H//2, W//2) of the reference image |
| `--d_theta` | 10. | Required for 'single_view_target' mode; the target theta angle is (theta + d_theta) |
| `--d_phi` | 30. | Required for 'single_view_target' mode; the target phi angle is (phi + d_phi) |
| `--d_r` | -.2 | Required for 'single_view_target' mode; the target radius is (r + r*d_r) |
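
The target-pose arithmetic for 'single_view_target' mode can be sketched as follows. The function names `sphere_radius` and `target_pose` are hypothetical and only restate the formulas in the table. That `--center_scale` multiplies the center-pixel depth is an assumption drawn from its description, not a confirmed detail of the implementation.

```python
def sphere_radius(center_depth, center_scale=1.0):
    # Assumption: r is the center-pixel depth multiplied by --center_scale.
    return center_depth * center_scale


def target_pose(theta, phi, r, d_theta=10.0, d_phi=30.0, d_r=-0.2):
    # Angles shift additively; the radius changes by a fraction of itself.
    return theta + d_theta, phi + d_phi, r + r * d_r


# Example with an illustrative center depth of 2.0 and the default offsets.
r = sphere_radius(center_depth=2.0)
theta_t, phi_t, r_t = target_pose(theta=0.0, phi=0.0, r=r)
print(theta_t, phi_t, r_t)
```

Because `d_r` is relative, the default of -.2 brings the camera 20% closer to the sphere center regardless of the scene's depth scale.
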
### 3. Diffusion configs
| Configuration | Default | Explanation |
|:------------- |:----- | :------------- |
| `--ckpt_path` | './checkpoints/ViewCrafter_25.ckpt' | Checkpoint path |
| `--config` | './configs/inference_pvd_1024.yaml' | Config (yaml) path |
| `--ddim_steps` | 50 | Number of DDIM steps if positive; otherwise DDPM is used. Reduce to 10 to speed up inference |
| `--ddim_eta` | 1.0 | Eta for DDIM sampling (0.0 yields deterministic sampling) |
| `--bs` | 1 | Batch size for inference; should be 1 |
| `--height` | 576 | Image height, in pixel space |
| `--width` | 1024 | Image width, in pixel space |
| `--frame_stride` | 10 | Fixed |
| `--unconditional_guidance_scale` | 7.5 | Classifier-free guidance scale for the prompt |
| `--seed` | 123 | Seed for seed_everything |
| `--video_length` | 25 | Inference video length; change to 16 if you use the 16-frame model |
| `--negative_prompt` | False | Unused |
| `--text_input` | False | Unused |
| `--prompt` | 'Rotating view of a scene' | Fixed |
| `--multiple_cond_cfg` | False | Whether to use multi-condition CFG |
| `--cfg_img` | None | Guidance scale for image conditioning |
| `--timestep_spacing` | "uniform_trailing" | How the timesteps are spaced. See Table 2 of [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://huggingface.co/papers/2305.08891) for more information. |
| `--guidance_rescale` | 0.7 | Guidance rescale factor from [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://huggingface.co/papers/2305.08891) |
| `--perframe_ae` | True | Whether to use per-frame AE decoding. Set it to True to save GPU memory, especially for the 576x1024 model |
| `--n_samples` | 1 | Number of samples per prompt |
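
To illustrate how `--ddim_steps` and `--timestep_spacing` interact, here is a rough sketch of the "trailing" spacing rule from Table 2 of the paper linked above. It assumes that "uniform_trailing" corresponds to that rule and that the model was trained with 1000 timesteps. Neither is confirmed by this document, and `trailing_timesteps` is a hypothetical helper, not a function from the repository.

```python
def trailing_timesteps(num_train_steps=1000, ddim_steps=50):
    # Assumption: 1000 training timesteps; adjust to match the actual model.
    stride = num_train_steps / ddim_steps
    # Start from the last training timestep and step down in equal strides,
    # so that sampling always begins at the noisiest timestep.
    return [round(num_train_steps - i * stride) - 1 for i in range(ddim_steps)]


# Fewer steps give a coarser schedule that still starts at timestep 999.
print(trailing_timesteps(ddim_steps=10))
print(len(trailing_timesteps(ddim_steps=50)))
```

Under these assumptions, reducing `--ddim_steps` from 50 to 10 widens the stride between sampled timesteps from 20 to 100 while keeping the starting timestep unchanged.
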
