add archive reading support #192

Draft
wants to merge 2 commits into
base: main

Conversation

yggdrasil75
Contributor

This serves two purposes: saving disk space and adding general support for comics. It is a draft because I first want to figure out how calibre stores comic metadata, then add that properly when the embed option is enabled.

In addition, this changes the settings screen quite a bit, because I thought I was adding too many options to keep it as one screen.
I can split the two changes if you want to track them separately. For the settings, I also plan to move over some auto-tagging options that won't change more than once a session (e.g. the captioning device won't be changed often).

@jhc13
Owner

jhc13 commented Jun 11, 2024

Splitting the settings into multiple tabs could be useful in the future.

However, support for comics is definitely out of scope for the program.

@yggdrasil75
Contributor Author

I will split the PRs then.

However, on comics: would a more generic "index archives" option be more in scope? Users could then include comic book archives in their list of indexed archive formats (since cbz is just a renamed zip, etc.). I am asking for two reasons: archives of comics (especially ones with consistent formatting) are much smaller on disk even with minimal compression, and it would help quite a bit with organizing a calibre comic library or a similar ebook library.

@yggdrasil75
Contributor Author

Also, I definitely grabbed the wrong source revision when making this PR. So many conflicts.

included extraneous files by accident
@jhc13
Owner

jhc13 commented Jun 11, 2024

> However, on comics: would a more generic "index archives" option be more in scope? Users could then include comic book archives in their list of indexed archive formats (since cbz is just a renamed zip, etc.).

If they are simple zip files containing images (without special processing required for comic formats), it could be in scope. But I'm not sure if displaying the images, adding caption text files, etc., without decompressing the archives will be easy to achieve while also not being too slow.
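For illustration, here is a minimal sketch of how a caption text file might be added to a zip-based archive without extracting it, using only Python's standard `zipfile` module. The function name `add_caption` and the convention of storing the caption as a same-stem `.txt` member next to the image are assumptions made for this example, not something taken from TagGUI's code.

```python
import io
import zipfile
from pathlib import PurePosixPath


def add_caption(archive, image_name: str, caption: str) -> str:
    """Append a caption member for `image_name` and return the member's name.

    `archive` may be a path or a seekable file-like object.
    Hypothetical helper: the same-stem .txt naming is an assumption.
    """
    caption_name = str(PurePosixPath(image_name).with_suffix(".txt"))
    # Append mode adds the new member at the end of the archive and rewrites
    # the central directory; existing members are not recompressed.
    with zipfile.ZipFile(archive, "a") as zf:
        names = set(zf.namelist())
        if image_name not in names:
            raise KeyError(f"{image_name} is not in the archive")
        if caption_name in names:
            # zipfile cannot overwrite a member in place, so refuse here
            # instead of silently creating a duplicate entry.
            raise FileExistsError(f"{caption_name} already exists")
        zf.writestr(caption_name, caption)
    return caption_name


if __name__ == "__main__":
    # Build a tiny archive in memory so the example is self-contained.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("page_001.png", b"fake image bytes")
    print(add_caption(buffer, "page_001.png", "a cat, sitting"))  # page_001.txt
```

Adding a new caption this way is cheap, but editing an existing one is not: the standard `zipfile` module cannot remove or replace a member, so changing a caption would mean rewriting the whole archive.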

> and it would help quite a bit with organizing a calibre comic library or a similar ebook library.

As mentioned in the README, TagGUI is designed for managing image datasets for generative AI models. Other use cases are not specifically supported.

@yggdrasil75
Contributor Author

Comics require no special processing. A .cbz is literally just a zip archive with a different extension (the same goes for cbr and rar, cb7 and 7z, cbt and tar).
The archive could be decompressed in memory (depending on available RAM and the size of the archive) with minimal writing to disk, if I use the proper library. Any extra files (e.g. an unsupported .comicinfo file, thumbs.db and so on) would then be dropped from memory by automatic garbage collection.
The only real issue would be writing, especially if the archive is solid or heavily compressed. For a basic Windows .zip, anything newer than a 4th-gen i5 can probably do it at nearly the same rate as uncompressed files, but solid-block archive methods (manual settings on a 7z) would start costing processing time. Leaving it as an option would let users disable it if it slows everything down.
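A rough sketch of the in-memory reading idea, for the zip/cbz case only and using Python's standard `zipfile` module. The function name `read_images_from_zip` and the `IMAGE_SUFFIXES` set are assumptions made for this example; rar and 7z archives are not handled by the standard library and would need a separate library.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Assumed list of image extensions; the real list would come from the application.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".gif"}


def read_images_from_zip(archive) -> dict[str, bytes]:
    """Return {member name: raw bytes} for the image members of a zip archive.

    `archive` may be a path (.zip or .cbz) or a seekable file-like object.
    Non-image members (metadata files, Thumbs.db, ...) are skipped without
    being decompressed, so nothing is written to disk.
    """
    images = {}
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            suffix = PurePosixPath(info.filename).suffix.lower()
            if suffix not in IMAGE_SUFFIXES:
                continue
            # Decompress this one member straight into memory.
            images[info.filename] = zf.read(info)
    return images


if __name__ == "__main__":
    # Build a tiny fake comic archive in memory so the example is self-contained.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("page_001.png", b"fake image bytes")
        zf.writestr("ComicInfo.xml", "<ComicInfo/>")
        zf.writestr("Thumbs.db", b"\x00")
    pages = read_images_from_zip(buffer)
    print(sorted(pages))  # ['page_001.png']
```

This version loads every image at once, which is the simplest form but the most memory-hungry. Reading members one at a time as they are displayed would keep memory use closer to a single page.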

@yggdrasil75 changed the title from "add comics" to "add archive reading support" on Jun 11, 2024
@jhc13
Owner

jhc13 commented Jun 11, 2024

Alright. If everything works smoothly, it could be a useful addition.

I'm just slightly worried that it might end up being too slow and it would be a lot of wasted effort.
