Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it work in new Forge (Gradio 4) #215

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

altoiddealer
Copy link

Resolves #211

  • The Preview Segmentation button was using a custom instance of Toolbutton, but there is some odd issue with inheritance in new Forge. To resolve this, we just import Toolbutton from modules.ui_components found in all WebUIs. Result: works correctly in all WebUIs.
  • gr.Label has more limited value types in Gradio 4... used by sam_dummy_component, a "hidden" component. This is resolved by using another component which supports more value types, gr.Textbox. The caveat with this is the point_coords (set in sam_dummy_component) are captured as strings, so a helper function now converts these to proper lists of lists.
  • gr.Gallery now expects different input/output types... used by the update_mask function. This is resolved by identifying the Gradio version on startup, then using separate methods depending on the Gradio version.

I've tested all the functions in the main tab, in both A1111 and new Forge with these changes, and all is working expectedly.

In Gradio 4, Label only expects input types that are str, int, float, or a dict[str, float]. Therefore, we change this hidden element to a Textbox which supports more input types.  As a consequence, the points data needs a conversion to be interpreted correctly.
In Gradio 4, Gallery has different values that need to be processed differently.  The solution is to check the Gradio version and use a dedicated method.
scripts/sam.py Outdated
mask_image = image_tuple[0] # Get the image part of the tuple
if isinstance(mask_image, str): # If it's a file path
mask_image = Image.open(mask_image)
elif isinstance(mask_image, np.ndarray): # If it's an ndarray
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you seen a case where while using this extension we're stumbling upon np.ndarray or PIL images ? I've only filename strings passed here.

@chnccy
Copy link

chnccy commented Nov 13, 2024 via email

scripts/sam.py Outdated

return [blended_image, mask_image, Image.fromarray(matted_image)]

def update_mask_gr_three(mask_gallery, chosen_mask, dilation_amt, input_image):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory we could just use update_mask without splitting the functions, or the additional get_gradio_version code. Something like this.

Suggested change
def update_mask_gr_three(mask_gallery, chosen_mask, dilation_amt, input_image):
def update_mask(mask_gallery, chosen_mask, dilation_amt, input_image):
print("Dilation Amount: ", dilation_amt)
if isinstance(mask_gallery, list):
image_data = mask_gallery[chosen_mask + 3]
# In Gradio 4 or above, this is an (image, caption) tuple
if isinstance(image_data, tuple) and len(image_data) == 2:
mask_image = Image.open(image_data[0])
# In Gradio 3 this is a dict with 'name', 'data', 'is_file' keys
elif isinstance(image_data, dict) and 'name' in image_data:
mask_image = Image.open(image_data['name'])
else:
raise TypeError("Cannot locate mask image while expanding mask")
else:
mask_image = mask_gallery

This implies that we would crash if np.ndarray or PIL images are passed from a Gradio 4 gallery, but I actually haven't seen this case while using the extension.

scripts/sam.py Outdated
@@ -183,9 +222,16 @@ def create_mask_batch_output(
output_blend.save(os.path.join(dino_batch_dest_dir, f"{filename}_{idx}_blend{ext}"))


def parse_points(points):
if isinstance(points, str):
points = eval(points) # Convert string representation of list back to list
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any specific reason for the eval ? It's not great for security reasons. In theory this should be json compliant and json.loads should do the trick.

Suggested change
points = eval(points) # Convert string representation of list back to list
points = json.loads(points) # Convert string representation of list back to list

@iddl
Copy link

iddl commented Nov 13, 2024

Hey @continue-revolution ! Is there any chance you could give a review to this PR ? The extension is currently completely broken in Forge, I think getting Forge to work is important because it has a fairly decent usage.

Btw @altoiddealer, thanks so much for putting up PR ! Huge help.

@altoiddealer
Copy link
Author

altoiddealer commented Nov 13, 2024

Hey @continue-revolution ! Is there any chance you could give a review to this PR ? The extension is currently completely broken in Forge, I think getting Forge to work is important because it has a fairly decent usage.

Btw @altoiddealer, thanks so much for putting up PR ! Huge help.

Heya @iddl - about the changes that you proposed, did you test on both A1111 and Forge? I had done extensive testing back when I figured this out… lots of print statements to check value types, etc…

@iddl
Copy link

iddl commented Nov 14, 2024

@altoiddealer I tested on both A1111 and Forge.

For classic A1111 I've never been able to get gallery_values that don't look like the dictionaries with file paths (printed below)

[{'name': '/tmp/gradio/tmphf6ummqy.png', 'data': 'http://127.0.0.1:7860/file=/tmp/gradio/tmphf6ummqy.png', 'is_file': True}, ... ]

while for Forge

[('/tmp/gradio/tmpnvly1lrh.png', None),...]

If you got PIL images or nd.arrays instead of file paths, then my bad, I didn't stumble upon that scenario.

For parse_points, I've always been getting strings like this on both Forge and A1111, which should be valid JSON

[[725.3776908023484, 709.2592093933463]]

Let me know if you want any help testing specific things.

@altoiddealer
Copy link
Author

Truth is, I can’t quite remember - it’s been awhile now lol.

I’m also not an amazing coder, I had lots of help from ChatGPT trying to figure this one out.

I’ll try out those changes you proposed. If @continue-revolution doesn’t reply in a week or so, I might just fork this as a Forge variant and request Forge team to add it to their extensions tab

@iddl
Copy link

iddl commented Nov 14, 2024

@altoiddealer sounds good !

@iddl
Copy link

iddl commented Nov 20, 2024

Hey @altoiddealer, I'm happy to make the changes to your branch if you invite me as a collaborator to your fork.

If the author doesn't come online we can ask the maintainers of https://github.com/AUTOMATIC1111/stable-diffusion-webui-extensions to point to this fork, there is a note there that points out they can redirect projects to forks if the original projects stop being maintained.

@altoiddealer
Copy link
Author

@iddl thank you very much for the suggested changes. I applied them to a separate branch on my repo, then merged them to my main branch.

I tested in both A1111 and Forge, and it is working without any issues.

I requested for Forge to update their Extensions list. It really doesn't matter if A1111 does or not as the extension is working fine there.

lllyasviel/stable-diffusion-webui-forge#1754 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gradio 4. forge
3 participants