Important configuration options for inference.py:
1. General configs
| Configuration | Default | Explanation |
|---|---|---|
| `--image_dir` | './test/images/fruit.png' | Image file path |
| `--out_dir` | './output' | Output directory |
| `--device` | 'cuda:0' | The device to use |
| `--exp_name` | None | Experiment name; uses the image file name by default |
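
As a rough illustration of how the general options and their defaults fit together, here is a minimal `argparse` sketch. It is not the actual parser in inference.py: `build_parser` and `resolve_exp_name` are hypothetical helpers, and the fallback to the file name without its extension is an assumption about what "image file name" means.

```python
import argparse
from pathlib import Path


def build_parser():
    # Hypothetical parser that mirrors only the general options from the table.
    parser = argparse.ArgumentParser(description="Sketch of the general configs")
    parser.add_argument("--image_dir", type=str, default="./test/images/fruit.png",
                        help="Image file path")
    parser.add_argument("--out_dir", type=str, default="./output",
                        help="Output directory")
    parser.add_argument("--device", type=str, default="cuda:0",
                        help="The device to use")
    parser.add_argument("--exp_name", type=str, default=None,
                        help="Experiment name; falls back to the image file name")
    return parser


def resolve_exp_name(args):
    # Assumption: the fallback is the image file name without its extension.
    if args.exp_name is not None:
        return args.exp_name
    return Path(args.image_dir).stem


if __name__ == "__main__":
    args = build_parser().parse_args([])  # empty list: use defaults only
    print(resolve_exp_name(args))
```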
2. Point cloud render configs
The definition of the world coordinate system and tips for adjusting the point cloud render configs are illustrated in the render document.
| Configuration | Default | Explanation |
|---|---|---|
| `--mode` | 'single_view_txt' | Render mode; 'single_view_txt' and 'single_view_target' are currently supported |
| `--traj_txt` | None | Required for 'single_view_txt' mode; a txt file that specifies the camera trajectory |
| `--elevation` | 5. | The elevation angle of the input image, in degrees. Estimate a rough value by visual judgment |
| `--center_scale` | 1. | Scale factor for the spherical radius (r). By default, r is set to the depth value of the center pixel (H//2, W//2) of the reference image |
| `--d_theta` | 10. | Required for 'single_view_target' mode; specifies the target theta angle as (theta + d_theta) |
| `--d_phi` | 30. | Required for 'single_view_target' mode; specifies the target phi angle as (phi + d_phi) |
| `--d_r` | -.2 | Required for 'single_view_target' mode; specifies the target radius as (r + r*d_r) |
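
The arithmetic behind `--center_scale`, `--d_theta`, `--d_phi`, and `--d_r` can be sketched as follows. This is an illustration of the formulas stated in the table, not the repository's rendering code: `center_radius` and `target_pose` are hypothetical helpers, the depth map is a plain nested list, and applying `center_scale` as a simple multiplier on the center depth is an assumption.

```python
def center_radius(depth, center_scale=1.0):
    # depth: H x W nested list of depth values for the reference image.
    # Assumption: r is the center pixel depth multiplied by center_scale.
    h = len(depth)
    w = len(depth[0])
    return center_scale * depth[h // 2][w // 2]


def target_pose(theta, phi, r, d_theta=10.0, d_phi=30.0, d_r=-0.2):
    # Offsets follow the table: angles are shifted additively,
    # the radius is shifted by a fraction of itself.
    return theta + d_theta, phi + d_phi, r + r * d_r


if __name__ == "__main__":
    depth = [
        [1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0],
    ]
    r = center_radius(depth)  # depth at (H//2, W//2)
    print(target_pose(theta=0.0, phi=0.0, r=r))
```

With the defaults, a negative `d_r` shrinks the radius by 20%, which moves the target camera closer to the scene center.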
3. Diffusion configs
| Configuration | Default | Explanation |
|---|---|---|
| `--ckpt_path` | './checkpoints/ViewCrafter_25.ckpt' | Checkpoint path |
| `--config` | './configs/inference_pvd_1024.yaml' | Config (yaml) path |
| `--ddim_steps` | 50 | Number of DDIM steps if positive, otherwise DDPM is used; reduce to 10 to speed up inference |
| `--ddim_eta` | 1.0 | Eta for DDIM sampling (0.0 yields deterministic sampling) |
| `--bs` | 1 | Batch size for inference; should be 1 |
| `--height` | 576 | Image height, in pixel space |
| `--width` | 1024 | Image width, in pixel space |
| `--frame_stride` | 10 | Fixed |
| `--unconditional_guidance_scale` | 7.5 | Classifier-free guidance scale for the prompt |
| `--seed` | 123 | Seed for seed_everything |
| `--video_length` | 25 | Inference video length; change to 16 if you use the 16-frame model |
| `--negative_prompt` | False | Unused |
| `--text_input` | False | Unused |
| `--prompt` | 'Rotating view of a scene' | Fixed |
| `--multiple_cond_cfg` | False | Whether to use multi-condition CFG |
| `--cfg_img` | None | Guidance scale for image conditioning |
| `--timestep_spacing` | "uniform_trailing" | How the timesteps are spaced. Refer to Table 2 of "Common Diffusion Noise Schedules and Sample Steps are Flawed" for more information |
| `--guidance_rescale` | 0.7 | Guidance rescale factor from "Common Diffusion Noise Schedules and Sample Steps are Flawed" |
| `--perframe_ae` | True | Whether to use per-frame AE decoding; set to True to save GPU memory, especially for the 576x1024 model |
| `--n_samples` | 1 | Number of samples per prompt |
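
To make `--timestep_spacing` and `--guidance_rescale` more concrete, the sketch below follows the "trailing" spacing and the guidance rescale as described in "Common Diffusion Noise Schedules and Sample Steps are Flawed". It is a pure-Python illustration on flat lists of floats, not the sampler used by inference.py. The function names are hypothetical, and the 1000 training steps are an assumed value.

```python
import statistics


def trailing_timesteps(num_train_steps=1000, num_sample_steps=50):
    # "Trailing" spacing: start from the last training step and walk back
    # in equal strides, so the final (noisiest) timestep is always included.
    stride = num_train_steps / num_sample_steps
    return [round(num_train_steps - i * stride) - 1
            for i in range(num_sample_steps)]


def rescale_guidance(cond, uncond, scale=7.5, rescale=0.7):
    # Classifier-free guidance on flat lists of floats.
    cfg = [u + scale * (c - u) for c, u in zip(cond, uncond)]
    std_cfg = statistics.pstdev(cfg)
    if std_cfg == 0.0:
        return cfg  # nothing to rescale
    # Match the standard deviation of the conditional prediction,
    # then blend with the plain CFG output by the rescale factor.
    factor = statistics.pstdev(cond) / std_cfg
    return [rescale * (x * factor) + (1.0 - rescale) * x for x in cfg]


if __name__ == "__main__":
    print(trailing_timesteps(1000, 10))
    print(rescale_guidance([1.0, 2.0, 3.0], [0.0, 0.0, 0.0]))
```

A rescale factor of 0 leaves the classifier-free guidance output unchanged, while 1 fully matches its standard deviation to that of the conditional prediction. The default of 0.7 sits between the two.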