Partial Loading PR1: Tidy ModelCache #7492

RyanJDick · 2024-12-23T16:46:33Z

Summary

This PR tidies up the model cache code in preparation for further refactoring to support partial loading of models onto the GPU. These code changes should not change the functional behavior in any way.

Changes:

Remove the ModelCacheBase class. ModelCache is the only implementation, so there is no benefit to the separate abstract class.
Split CacheRecord and CacheStats out into their own files.
Remove the ModelLocker class. This extra layer of indirection was not providing any benefit. Locking is now done directly with the ModelCache.
Tidy up relative imports that were contributing to circular import issues.
Pull the 'submodel' concern out of the ModelCache. The ModelCache should not need to be aware of the model manager submodel system.
Delete unused properties from the ModelCache (e.g. .lazy_offloading and .storage_device).
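To make two of the changes above concrete (locking directly on the cache, and keeping the submodel concern outside it), here is a minimal sketch. It is not the InvokeAI implementation: the class bodies, the field names, the `put`/`get` helpers, and the signature of `get_model_cache_key` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


def get_model_cache_key(model_key: str, submodel_type: Optional[str] = None) -> str:
    """Build a cache key outside the cache (hypothetical signature)."""
    return f"{model_key}:{submodel_type}" if submodel_type else model_key


@dataclass
class CacheRecord:
    """Simplified stand-in for a cache entry; the fields are assumptions."""

    key: str
    model: Any
    locks: int = 0


class ModelCache:
    """Toy cache keyed by opaque strings; it knows nothing about submodels."""

    def __init__(self) -> None:
        self._records: Dict[str, CacheRecord] = {}

    def put(self, key: str, model: Any) -> None:
        self._records[key] = CacheRecord(key=key, model=model)

    def get(self, key: str) -> CacheRecord:
        return self._records[key]

    def lock(self, record: CacheRecord) -> None:
        # Locking happens on the cache itself, with no separate locker object.
        record.locks += 1

    def unlock(self, record: CacheRecord) -> None:
        if record.locks == 0:
            raise RuntimeError(f"Record {record.key!r} is not locked")
        record.locks -= 1


# The caller builds the key, so the cache only ever sees a plain string.
cache = ModelCache()
key = get_model_cache_key("sdxl-base", "unet")
cache.put(key, model=object())

record = cache.get(key)
cache.lock(record)
try:
    pass  # run inference with record.model here
finally:
    cache.unlock(record)
```

In this sketch the caller pairs `lock` with `unlock` in a `try`/`finally`, so the lock count returns to zero even if inference raises.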

QA Instructions

I ran smoke tests with a variety of SD1, SDXL and FLUX models. No change to behavior is expected.

Merge Plan

Checklist

The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

github-actions bot added the api, python, backend, services, python-tests, and docs labels Dec 23, 2024

RyanJDick mentioned this pull request Dec 23, 2024

Partial Loading PR2: Add utils to support partial loading of models from CPU to GPU #7494

RyanJDick marked this pull request as ready for review December 23, 2024 23:32

RyanJDick requested review from lstein, blessedcoolant, brandonrising, hipsterusername and psychedelicious as code owners December 23, 2024 23:32

hipsterusername approved these changes Dec 24, 2024

RyanJDick enabled auto-merge (rebase) December 24, 2024 14:21

RyanJDick disabled auto-merge December 24, 2024 14:22

RyanJDick added 9 commits December 24, 2024 14:23

Rip out ModelLockerBase.

e48dee4

Move CacheRecord out to its own file.

ce11a19

Move CacheStats to its own file.

83ea642

Remove ModelCacheBase.

e0bfa61

Rename model_cache_default.py -> model_cache.py.

d30a9ce

Pull get_model_cache_key(...) out of ModelCache. The ModelCache should not be concerned with implementation details like the submodel_type.

a7c7299

Move lock(...) and unlock(...) logic from ModelLocker to the ModelCache and make a bunch of ModelCache properties/methods private.

a39bcf7

Get rid of ModelLocker. It was an unnecessary layer of indirection.

7dc3e0f

(minor) Add TODO comment regarding the location of get_model_cache_key().

55b13c1

RyanJDick force-pushed the ryan/model-offload-1-tidy branch from 6b2a3b2 to 55b13c1 Compare December 24, 2024 14:23

RyanJDick merged commit d3916db into main Dec 24, 2024
15 checks passed

RyanJDick deleted the ryan/model-offload-1-tidy branch December 24, 2024 14:30
