Fix finding cached files for flavor in load() #471

Merged
5 commits merged on Dec 6, 2024

Conversation

hagenw
Member

@hagenw hagenw commented Nov 29, 2024

Closes #324

Ensures that media files are found in the cache when using audb.load(..., format=format) and format differs from the original format of the media files. This is achieved by replacing the file extension of the original file (original file names are given by the dependency table) with the requested format when checking whether the file exists in the cache. The pull request also adds a test for the expected behavior, which fails on the current main branch.
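
A minimal sketch of the lookup described above, using only the standard library and not audb's actual code. The helper name `cache_candidate` and its parameters are hypothetical, and `os.path.splitext` stands in for the extension replacement the pull request performs.

```python
import os


def cache_candidate(db_root, file, format=None):
    """Return the path at which `file` would be expected in the cache.

    Hypothetical helper for illustration:
    if a target format is requested,
    the original extension is swapped for it
    before the path is built.
    """
    if format is not None:
        stem, _ = os.path.splitext(file)
        file = f"{stem}.{format}"
    return os.path.join(db_root, file)


# The dependency table lists the original name (WAV),
# but the cache holds the converted file (FLAC)
print(cache_candidate("cache", "audio/001.wav", format="flac"))
```

Checking for the unmodified name would report the file as missing even though its converted copy is already cached, which matches the behavior reported in #324.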

/cc @ureichel

@hagenw hagenw marked this pull request as ready for review November 29, 2024 13:11
@audeering audeering deleted a comment from sourcery-ai bot Nov 29, 2024
Contributor

sourcery-ai bot commented Nov 29, 2024

Reviewer's Guide by Sourcery

The PR fixes a bug in the cache file lookup logic when loading media files with a different format than the original. The implementation introduces a new helper function is_cached() that handles different file type scenarios and properly considers format conversion when checking for cached media files.

Class diagram for updated cache file lookup logic

classDiagram
    class Load {
        +_missing_files(files, files_type, db_root, flavor, verbose)
    }
    class Load {
        +is_cached(file) bool
    }
    Load --|> is_cached : uses

    note for is_cached "Helper function to check if a file is cached, considering format conversion for media files."

File-Level Changes

Change: Refactored file existence checking logic into a dedicated function
Details:
  • Introduced new is_cached() helper function to encapsulate file existence logic
  • Simplified main function by using list comprehension with the new helper
  • Removed explicit missing_files list manipulation
Files: audb/core/load.py

Change: Added support for checking cached files with different formats
Details:
  • Added special handling for media files when flavor format is specified
  • Uses audeer.replace_file_extension to check for files with converted format
Files: audb/core/load.py

Change: Added test coverage for format conversion cache scenario
Details:
  • Added test case to verify cache lookup with different media formats
  • Tests conversion from WAV to FLAC format
Files: tests/test_load.py
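
The changes listed above could look roughly like the following sketch. It is not the code from audb/core/load.py: `missing_files` is a simplified, hypothetical stand-in for `audb.core.load._missing_files` (which also takes `flavor` and `verbose`), the `format` argument stands in for the flavor's format, and `os.path.splitext` replaces `audeer.replace_file_extension` to keep the example free of dependencies. Non-media file types are assumed to be checked under their unchanged name.

```python
import os
import tempfile


def missing_files(files, files_type, db_root, format=None):
    """Return the entries of `files` that are not found below `db_root`."""

    def is_cached(file):
        # Media files may have been converted,
        # so look for the file under the requested extension
        if files_type == "media" and format is not None:
            file = os.path.splitext(file)[0] + "." + format
        return os.path.exists(os.path.join(db_root, file))

    # List comprehension instead of appending to a list in a loop
    return [file for file in files if not is_cached(file)]


with tempfile.TemporaryDirectory() as db_root:
    # Simulate a cache that holds one file converted from WAV to FLAC
    open(os.path.join(db_root, "f1.flac"), "w").close()
    files = ["f1.wav", "f2.wav"]
    print(missing_files(files, "media", db_root, format="flac"))  # ['f2.wav']
    print(missing_files(files, "media", db_root))  # ['f1.wav', 'f2.wav']
```

The second call shows the old behavior in miniature: without taking the requested format into account, the converted file is not recognized as cached.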

Assessment against linked issues

Issue Objective Addressed Explanation
#324 Fix caching behavior when loading media files with a different format than the original

Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey @hagenw - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


@hagenw hagenw requested a review from ChristianGeng November 29, 2024 13:13
Member

@ChristianGeng ChristianGeng left a comment

Review

The changes are quite localized: the private function _missing_files is used exclusively by another private function, _load_files.

Therefore, for reviewing it makes sense to focus on the tests.
As the function is directly callable, it can also be tested in isolation.

The only concern one might raise is that
audb.core.load._missing_files is only called with one parameter combination.
I am assuming that this is ok and will approve the changes.

@hagenw
Member Author

hagenw commented Dec 6, 2024

The other parameter combinations are tested implicitly by other functions. Usually, we do not write extra tests for private functions; the exception is when there are concrete bugs that need to be fixed, as is the case here. Hence, I added an explicit test for audb.core.load._missing_files.

@hagenw hagenw merged commit 675c485 into main Dec 6, 2024
8 checks passed
@hagenw hagenw deleted the fix-finding-missing-files-for-flavor branch December 6, 2024 09:10
Development

Successfully merging this pull request may close these issues.

Media conversion in audb.load() does not load from cache
2 participants