Spaces: AmitIsraeli/PopYou


AmitIsraeli committed 22 days ago

Commit 7a7d1a1 • 1 Parent(s): f6d4208

change some stuff

Files changed (1)

app.py +3 -6

@@ -88,7 +88,7 @@ class InferenceTextVAR(nn.Module):
         self.var = get_peft_model(self.var, lora_config)
     @torch.no_grad()
-    def generate_image(self, text, beta=1, seed=None, more_smooth=False, top_k=0, top_p=0.9):
+    def generate_image(self, text, beta=1, seed=None, more_smooth=False, top_k=0, top_p=0.5):
         if seed is None:
             seed = random.randint(0, 2**32 - 1)
         inputs = self.text_processor([text], padding="max_length", return_tensors="pt").to(self.device)
@@ -159,9 +159,6 @@ if __name__ == '__main__':
         - **Model Fine-tuning:** Fine-tuned the [Visual AutoRegressive (VAR)](https://arxiv.org/abs/2404.02905) model, pretrained on ImageNet, to adapt it for Funko Pop! generation by injecting a custom embedding representing the "doll" class.
         - **Adapter Training:** Trained an adapter with the frozen [SigLIP image encoder](https://github.com/FoundationVision/VAR) and a lightweight LoRA module to map image embeddings to text representation in a large language model.
         - **Text-to-Image Generation:** Enabled text-to-image generation by replacing the SigLIP image encoder with its text encoder, retaining frozen components such as the VAE and generator for efficiency and quality.
-        ![VAR Explained](VAR_explained.png)
         ## Generate Your Own Funko Pop!
         """)
@@ -226,9 +223,9 @@ if __name__ == '__main__':
             image = model.generate_image(prompt)
             return image
-        famous_name_input = gr.Dropdown(choices=["None", "Donald Trump", "Johnny Depp", "Oprah Winfrey"], label="Famous Name", value="None")
+        famous_name_input = gr.Dropdown(choices=["None", "Donald Trump", "Johnny Depp", "Oprah Winfrey,Lebron James"], label="Famous Name", value="None")
         character_input = gr.Dropdown(choices=["None", "Alien", "Robot"], label="Character", value="None")
-        action_input = gr.Dropdown(choices=["None", "Playing the Guitar", "Holding the Sword"], label="Action", value="None")
+        action_input = gr.Dropdown(choices=["None", "Playing the Guitar", "Holding the Sword","wearing headphone"], label="Action", value="None")
         custom_generate_button = gr.Button("Generate Custom Funko Pop!")
         custom_image_output = gr.Image(label="Custom Funko Pop!")
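
Note on the `top_p` change: the first hunk lowers the default `top_p` of `generate_image` from 0.9 to 0.5. The sampling code itself is not part of this diff, so the following is only a generic sketch of nucleus (top-p) filtering, assuming `top_p` is used in the usual way. The helper `top_p_filter` and the probabilities are hypothetical and are not taken from app.py. It illustrates why a lower value tends to make sampling less varied: fewer candidates survive the cutoff.

```python
def top_p_filter(probs, top_p):
    """Return the indices kept by nucleus (top-p) filtering, most probable first.

    Hypothetical helper for illustration only; not the code used in app.py.
    """
    # Rank candidate indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        # Stop once the kept candidates cover at least top_p of the mass.
        if cumulative >= top_p:
            break
    return kept


# Made-up distribution over six candidate tokens.
probs = [0.45, 0.25, 0.15, 0.08, 0.04, 0.03]

print(top_p_filter(probs, 0.9))  # old default: four candidates remain
print(top_p_filter(probs, 0.5))  # new default: two candidates remain
```

With this toy distribution, the old default keeps four candidates and the new one keeps two, so generated images would likely be more consistent but less diverse.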
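
Note on the new "Famous Name" choice: the added entry `"Oprah Winfrey,Lebron James"` is a single string, so the dropdown would presumably list it as one option rather than two. If two separate options were intended (an assumption, since the commit message does not say), the list could be built as in the sketch below. The variable names are hypothetical, and the resulting list would be passed as `choices=` to `gr.Dropdown` as in the diff.

```python
# Choices exactly as added in this commit: the last entry is one string.
famous_names_committed = [
    "None",
    "Donald Trump",
    "Johnny Depp",
    "Oprah Winfrey,Lebron James",
]

# Assumed intent: every comma-separated name becomes its own option.
famous_names_split = [
    name.strip()
    for entry in famous_names_committed
    for name in entry.split(",")
]

print(len(famous_names_committed))  # number of options as committed
print(famous_names_split)           # options after splitting
```

Splitting on commas only works here because none of the existing names contains a comma; writing the two names as separate list items directly in app.py would be the simpler fix.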