Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Soft import #9561

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

Soft import #9561

wants to merge 9 commits into from

Conversation

scott-huberty
Copy link

@scott-huberty scott-huberty commented Sep 30, 2024

I haven't replaced all individual instances of try: import optional_dep ... except: raise, but wanted to get this on the board earlier rather than later so folks have time to give feedback!

  • Added a function named soft_import to xarray.core.utils. It makes use of a function you already have named module_available.
  • Anytime I needed to use soft_import more than once or twice for the same optional dependency, I created a helper function for it that is just a thin wrapper around soft_import
  • Added a test

Copy link

welcome bot commented Sep 30, 2024

Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
If you have questions, some answers may be found in our contributing guidelines.

Copy link
Collaborator

@max-sixty max-sixty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems excellent!

(though others know this code better, so will wait for them to comment)

Copy link
Member

@TomNicholas TomNicholas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is a great contribution!

@max-sixty
Copy link
Collaborator

Can we resolve merge conflicts and then merge?

resolved conflict stemming from pydata#9505 by accepting this branches changes (pydata#9505 was just a slight edit on the old try import .. except pattern)
@scott-huberty
Copy link
Author

Can we resolve merge conflicts and then merge?

@max-sixty sorry about that! I just pushed a commit ( 3c5d35d ) to resolve the merge conflict, which stemmed from #9505. That PR had touched one of the try: import ... except lines in xarray.plot.utils

@scott-huberty scott-huberty marked this pull request as draft October 3, 2024 00:29
@scott-huberty
Copy link
Author

Marking as draft until I can look into the failures. I think the proposed soft_import mechanism trips up the static type checking..

pinging @hoechenberger as the person driving the static type checking adoption in MNE-Python.. Do you recall whether MNE's _soft_import function ever caused problems with mypy (and if so, was there a work around?) 🙂

@hoechenberger
Copy link
Contributor

@scott-huberty I don't remember details, but I do recall those soft imports to be an absolute nightmare to work with when using static type checkers, and I gave up trying to add those type hints at one point.

@headtr1ck
Copy link
Collaborator

I think it is basically impossible to have such imports work properly with static type checking. The imported modules will all be Any (ModuleType is effectively Any).

@hoechenberger
Copy link
Contributor

Yes. It also means that you won't receive useful help from your IDE during development. I'd advise against this change, but that's just my very personal opinion without knowing any of the considerations that went into this work.

@scott-huberty
Copy link
Author

Ah.. OK, thanks @hoechenberger ! Yes agreed if this proposal breaks Xarray functionality then maybe it's not a good idea after all.. I'll let the devs make the call on that.

@max-sixty
Copy link
Collaborator

Agree it doesn't make sense if we really lose all type checking. but is it really not possible? Couldn't we even have a if TYPE_CHECKING vs if not TYPE_CHECKING as a hammer?

@headtr1ck
Copy link
Collaborator

In the strict=True case one could just check if available and raise a nice error if not and then import directly afterwards.

Something like

assert_xxx_installed()
import xxx

But this ofc doesn't protect against import errors.

@hoechenberger
Copy link
Contributor

Couldn't we even have a if TYPE_CHECKING vs if not TYPE_CHECKING as a hammer?

I tried that and never got it to work. I think the language just doesn't allow for this.

@scott-huberty
Copy link
Author

scott-huberty commented Oct 3, 2024

@headtr1ck I think I see what you are suggesting. So for example if we just start out with the optional dependencies that get imported a lot across the codebase (and thus could benefit from a dedicated function; e.g. cftime), this would look like:

def check_installed(name, purpose, strict=True):
    is_installed = module_available(name)
    if not is_installed and strict:
        raise ImportError(
            f"For {purpose}, {name} is required. Please install it via pip or conda."
        )
    return is_installed

def check_cftime_installed(strict=True):
    """Check if cftime is installed, otherwise raise an ImportError."""
    purpose = "working with dates with non-standard calendars"
    return check_installed("cftime", purpose, strict=strict)


def some_function_that_needs_cftime():
    check_cftime_intalled()
    import cftime
    ...

It's basically just a wrapper around the module_available function you added. Is this what you are thinking / does this still seem worth adding?

@headtr1ck
Copy link
Collaborator

It's basically just a wrapper around the module_available function you added. Is this what you are thinking / does this still seem worth adding?

You are right. Maybe it makes more sense to add the new functionality to module_available?

Maybe the TYPE_CHECKING idea is indeed better?

if TYPE_CHECKING:
    import xxx
else:
    xxx = check_xxx_installed()

But somehow all of this is adding a lot of boilerplate for just an improved error message, but I guess that's what the issue is for...

@max-sixty
Copy link
Collaborator

One proposal, small change to the current version, retaining type checking and integration with module_available:

  • attempt_import(module, reason=None) -> Module
    • so don't have a strict flag, use module_available for those cases
    • possibly have a few common reasons which get looked up in the function so we don't need to pass "plotting" each time etc
  • surround with if TYPE_CHECKING as above

I agree it's a bit of boilerplate, but I do think the experience is much better for newer users. And if we do this across the library it'll be an easily recognizable pattern.

@scott-huberty
Copy link
Author

Thanks both! I think the TYPE_CHECKING idea would work fine as long as you all agree that the informative error is worth the extra boilerplate 🙂 I tend to err on the side of being very explicit but others may disagree.

If folks don't want the boilerplate, one last idea is something like:

def import_matplotlib():
    """Raise an ImportError if matplotlib is not installed."""
    if module_available("matplotlib"):
        import matplotlib as mpl
        return mpl
    raise ImportError("For plotting, matplotlib is required. Please install it via pip or conda.")

Which would work fine with mypy since we are using a direct import. But the downside to that is that you would need to define one of these functions for every package / submodule that you want to install. So maybe it's not the most scalable.

@headtr1ck
Copy link
Collaborator

Which would work fine with mypy since we are using a direct import.

Unfortunately that doesn't work, because the return type is simply ModuleType. One cannot annotate specific modules (at least last time I have checked).

@scott-huberty
Copy link
Author

Which would work fine with mypy since we are using a direct import.

Unfortunately that doesn't work, because the return type is simply ModuleType. One cannot annotate specific modules (at least last time I have checked).

Ah ok, you are right that annotating import_matplotlib_installed to return ModuleType, prevents a static type checker from accessing the modules attributes. FWIW When I tested the import_matplotlib example I gave, I didn't include a return type annotation - and in that case my static type checker (pylance) was able to access the attributes of the modules (and mypy wasn't throwing any errors locally..)

Screenshot 2024-10-05 at 8 51 29 AM

Anyways I am still new to mypy and so could be missing something. Happy to move forward with the proposed IF TYPE_CHECKING approach.

Per the proposal at pydata#9561 (comment)

This pairs any use of (a now simplified) `attempt_import` with a direct import of the same module, guarded by an `if TYPE_CHECKING` block.
I was able resolve the conflicts in the listed files below by accepting a combination of the changes from the main branch and this PR. Basically just stacked the changes that were recently merged to main with the changed in this PR. I did not modify the content of the changes from main at all.
@scott-huberty
Copy link
Author

scott-huberty commented Oct 25, 2024

Hello, apologies for the delay here.

Per @max-sixty 's proposal, I've re-implemented (a now simplified) attempt_import function so that it is paired with a direct import of the module (guarded by an if TYPE_CHECKING clause). At least locally, it seems to be playing well with mypy now..

Feel free to provide feedback on the current pattern, and I'll mark as ready for review once the CI's are green!

EDIT:


Here is a MWE of the new behavior, assuming you only have the base dependencies installed:

import xarray as xr

ds = xr.tutorial.scatter_example_dataset(seed=42)
ds["A"].plot()
ImportError: The matplotlib package is required for plotting but could not be imported. Please install it with your package manager (e.g. conda or pip).

Copy link
Collaborator

@max-sixty max-sixty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super!

Any objections? Otherwise let's merge...

-----
Static type checkers will not be able to infer the type of the returned module,
so it is recommended to precede this function with a direct import of the module,
guarded by an ``if TYPE_CHECKING`` block, to preserve type checker functionality.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we copy & paste an example into the docstring?

@max-sixty max-sixty added the plan to merge Final call for comments label Oct 26, 2024
Copy link
Collaborator

@headtr1ck headtr1ck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good thanks.
Only a couple of minor remarks.

if TYPE_CHECKING:
import seaborn as sns
else:
sns = attempt_import("seaborn")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before the ImportError was caught, maybe we should restore that functionality?

return importlib.import_module(module)
install_name = install_mapping.get(package_name, package_name)
reason = package_purpose.get(package_name, "")
raise ImportError(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we should try to import the package and then add the traceback to this error.
Might have some additional useful information.

xarray/tests/test_utils.py Outdated Show resolved Hide resolved
@headtr1ck headtr1ck marked this pull request as ready for review October 26, 2024 07:53
@scott-huberty
Copy link
Author

Thanks for the review and for your patience @max-sixty and @headtr1ck ! I haven't forgotten about this, I just unfortunately had to focus on a grant deadline. I should be able to return to this in the next few days.

@headtr1ck headtr1ck removed the plan to merge Final call for comments label Nov 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Throw informative error if missing optional dependency
6 participants