add option to download default models #200

Closed
wants to merge 2 commits into from

ljchang
Member

@ljchang ljchang commented Jan 2, 2024

This PR adds an option when installing py-feat via pip to also download the default models.

pip install py-feat[default_models]

@ejolly Let's make sure this works before merging, as I haven't really tested it yet.

@ljchang ljchang requested a review from ejolly January 2, 2024 17:53
@ejolly
Contributor

ejolly commented Mar 29, 2024

@ljchang Unfortunately this isn't going to work because you can only include package names in extras_require.
In fact, pip doesn't seem to support any kind of post-install step beyond installing other packages, for security reasons. That's also why they suggest including any package data within the package if you need it. I don't think that makes sense for us, as our pip installs would be huge and model weights would be tied to package versions.
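
To illustrate the constraint, here is a minimal sketch of the shape `extras_require` takes in a setuptools-based `setup.py`. The dependency name is a placeholder chosen for illustration, not an actual py-feat requirement.

```python
# Sketch of keyword arguments a setup.py might pass to setuptools.setup().
# "some-extra-dependency" is a placeholder, not a real py-feat requirement.
setup_kwargs = {
    "name": "py-feat",
    "extras_require": {
        # Each extra maps to a list of requirement strings (package names with
        # optional version specifiers). pip resolves and installs them; nothing
        # in this mapping can name a script or download step to run.
        "default_models": ["some-extra-dependency>=1.0"],
    },
}

for extra, requirements in setup_kwargs["extras_require"].items():
    print(f"pip install py-feat[{extra}] -> also installs {requirements}")
```

Because every value is only a list of requirement strings, an extra can pull in more packages but cannot trigger a model download.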

I've added an alternative solution, which is a compromise, but still a little annoying:

  1. User runs pip install py-feat
  2. User runs the feat_get_models command in their terminal, which is set up automatically after they pip install

It's not that different from simply downloading the models on the first run of Detector, so I'm torn about whether it's worth adding. What do you think?
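
The two options compared above (a `feat_get_models` command versus downloading on first use) could share one helper. The following is a rough sketch under stated assumptions, not py-feat's implementation: the model file names, the `feat.cli:main` entry point path, and `fake_fetch` (an offline stand-in for a real download) are all hypothetical.

```python
import argparse
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List

# Hypothetical file names; the real default model list is not given in this thread.
DEFAULT_MODELS = ["face_detector.pth", "au_detector.pth"]


def ensure_models(model_dir: Path, names: Iterable[str],
                  fetch: Callable[[str, Path], None]) -> List[str]:
    """Fetch every model file missing from model_dir; return the names fetched."""
    model_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in names:
        target = model_dir / name
        if not target.exists():  # already-present files are left untouched
            fetch(name, target)
            fetched.append(name)
    return fetched


def fake_fetch(name: str, target: Path) -> None:
    """Stand-in for a real download so the sketch runs offline."""
    target.write_bytes(b"placeholder weights for " + name.encode())


def main(argv=None, fetch=fake_fetch) -> int:
    """Body of a feat_get_models-style command.

    Assumed registration in setup.py (the module path is made up):
        entry_points={"console_scripts": ["feat_get_models = feat.cli:main"]}
    """
    parser = argparse.ArgumentParser(description="Download default models")
    parser.add_argument("--model-dir", type=Path,
                        default=Path(tempfile.gettempdir()) / "feat_models")
    args = parser.parse_args(argv)
    fetched = ensure_models(args.model_dir, DEFAULT_MODELS, fetch)
    print(f"fetched {len(fetched)} model file(s) into {args.model_dir}")
    return 0
```

A first-run download would call `ensure_models` from the detector's setup code instead of from `main`, which is why the two approaches end up so similar.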

@ljchang
Member Author

ljchang commented Jul 18, 2024

@ejolly, I've only scratched the surface of my deep dive into hugging face repositories, but I definitely think this is the way to go. I'm going to keep adding notes here as I learn more.

  • I've created an organization for the lab to host datasets or model repositories.
  • models and datasets can be public or private and can solicit community feedback or block it.
  • models and datasets can be versioned
  • webhooks are possible. One thing I've wanted for a long time is to build a benchmarking server, which I think will be possible with hugging face. We can post our test data as private to hugging face (our EULAs prevent us from making it public). Every time a model is updated or a new one is added, a webhook can trigger our benchmarking tests on that model or on all of them. Honestly, I don't care if we have to pay for compute time on one of their spaces; this would be amazing and would enable a living benchmark for py-feat.
  • there is a python cli for working with repositories and model i/o.
  • models can be standalone and downloaded, OR they can be integrated into a code repository. I think we would want to do this so you can download models from py-feat, just like you can from the transformers library.
  • jupyter notebooks can be rendered and linked to colab. This could be nice for demos or tutorials
  • We should do a deep dive into the possibility of porting py-feat to be an integrated library
  • There are widgets to create live demos for each model. Not sure whether this will work for us.
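
As a rough illustration of the versioned model download mentioned in the notes above, the sketch below wraps `hf_hub_download` from the `huggingface_hub` package. The wrapper name `fetch_model`, its `download` parameter, and the repository and file names in the usage note are assumptions made for illustration; the thread does not name the lab's organization or its model files.

```python
from typing import Callable, Optional


def fetch_model(repo_id: str, filename: str, revision: Optional[str] = None,
                download: Optional[Callable[..., str]] = None) -> str:
    """Return the local path of one file from a Hugging Face model repository.

    `revision` may be a branch, tag, or commit hash, which is what lets model
    weights be versioned independently of the package release.
    """
    if download is None:
        # Imported lazily so this module can be imported without huggingface_hub.
        from huggingface_hub import hf_hub_download as download
    return download(repo_id=repo_id, filename=filename, revision=revision)
```

With `huggingface_hub` installed, a call such as `fetch_model("example-org/example-model", "weights.pth", revision="v1.0")` (hypothetical names) would download the file into the local cache on first use and reuse the cached copy afterwards. The optional `download` argument exists only so the wrapper can be exercised offline with a stub.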

@ljchang ljchang self-assigned this Jul 18, 2024
@ljchang
Member Author

ljchang commented Aug 5, 2024

this is addressed in issue #221

@ejolly
Contributor

ejolly commented Oct 19, 2024

Subsumed by #228

@ejolly ejolly closed this Oct 19, 2024
2 participants