Micro optimize dataset.isel for speed on large datasets #9003

Open
wants to merge 8 commits into
base: main

Commits on Jun 22, 2024

  1. Micro optimize dataset.isel for speed on large datasets

    This targets optimization for datasets with many "scalar" variables
    (that is, variables without any dimensions). This can happen when you
    have many small pieces of metadata that relate to various facts about
    an experimental condition.
    
    For example, we have about 80 of these in our datasets (and I want to
    increase this number).
    
    Our datasets are quite large (on the order of 1 TB uncompressed), so we
    often have one dimension with a length in the tens of thousands.
    
    However, it has become quite slow to index into the dataset.
    
    We therefore often "carefully slice out the metadata we need" prior to
    doing anything with our dataset, but that isn't quite possible when you
    want to orchestrate things with a parent application.
    
    These optimizations are likely "minor" but considering the results of
    the benchmark, I think they are quite worthwhile:
    
    * main (as of pydata#9001) - 2.5k its/s
    * With pydata#9002 - 4.2k its/s
    * With this Pull Request (on top of pydata#9002) -- 6.1k its/s
    
    Thanks for considering.
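
A rough way to reproduce the kind of measurement described in the commit message is sketched below. It is not the benchmark behind the numbers quoted above: the variable names (`signal`, `meta_0`, ...), the sizes, and the helpers `make_dataset` and `isel_rate` are all hypothetical and chosen only for illustration. It builds a dataset with one long dimension plus many dimensionless variables, then times `Dataset.isel` on the full dataset and on a pre-sliced subset (the "slice out the metadata we need" workaround).

```python
import timeit

import numpy as np
import xarray as xr


def make_dataset(n_scalars=80, n_time=10_000):
    """Build a dataset with one long dimension and many 0-d metadata variables."""
    data_vars = {"signal": ("time", np.zeros(n_time))}
    # Hypothetical metadata names; a plain float becomes a dimensionless variable.
    for i in range(n_scalars):
        data_vars[f"meta_{i}"] = float(i)
    return xr.Dataset(data_vars)


def isel_rate(ds, number=200):
    """Return the approximate number of isel calls per second along 'time'."""
    elapsed = timeit.timeit(lambda: ds.isel(time=0), number=number)
    return number / elapsed


ds = make_dataset()
subset = ds[["signal"]]  # workaround: keep only the variables that are needed

print(f"full dataset: {isel_rate(ds):,.0f} calls/s")
print(f"subset only:  {isel_rate(subset):,.0f} calls/s")
```

Absolute rates depend on the machine and the installed xarray version, so only the relative difference between the two lines is meaningful.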
    hmaarrfk committed Jun 22, 2024
    1f9eb68
  2. 53dc76e
  3. Add more comment

    hmaarrfk committed Jun 22, 2024
    b016dce
  4. Add whats-new

    hmaarrfk committed Jun 22, 2024
    4e0701a
  5. update release note

    hmaarrfk committed Jun 22, 2024
    5bcf40c
  6. cleanup

    dcherian authored and hmaarrfk committed Jun 22, 2024
    f7945a3
  7. Revert "cleanup"

    This reverts commit f7945a3.
    hmaarrfk committed Jun 22, 2024
    1c2e5c6

Commits on Sep 18, 2024

  1. fac4178