Caching of assets? #707

Open
siefkenj opened this issue Mar 19, 2024 · 10 comments

Comments

@siefkenj
Contributor

Currently there is some support for rebuilding assets only if they've changed, but it seems to rely on document structure. Since assets are extracted and then compiled in isolation, I imagine that if you stored <md5sum>.svg files in some .cache folder, you could just detect whether the asset contents were the same and copy over the cached version instead of running compile again. This method would not rely on document structure at all.

@StevenClontz
Member

+1

So we have an element like <latex-image xml:id="bar">FOO</latex-image>, we checksum FOO to abc123, then save the result to .cache/latex-image/abc123.svg as well as generated-assets/latex-image/bar.svg. Then on future builds, we simply copy .cache/latex-image/abc123.svg to generated-assets/latex-image/bar.svg (or wherever it should be, in case the filename changes).
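A minimal sketch of that flow, under stated assumptions: `build_asset`, `fake_compile`, and the temporary directory layout are hypothetical names for illustration, not PreTeXt or PreTeXt-CLI APIs, and SHA-256 stands in for the md5sum mentioned above.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List


def build_asset(source: str, out_path: Path, cache_dir: Path,
                compile_fn: Callable[[str, Path], None]) -> bool:
    """Write the asset for `source` to out_path; return True on a cache hit."""
    # The cache key depends only on the asset's source text, not on its xml:id.
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cached = cache_dir / f"{digest}{out_path.suffix}"
    hit = cached.exists()
    if not hit:
        cache_dir.mkdir(parents=True, exist_ok=True)
        compile_fn(source, cached)  # the expensive step runs only on a miss
    out_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(cached, out_path)
    return hit


# Hypothetical stand-in for the real compile step; it records each call.
calls: List[str] = []


def fake_compile(source: str, dest: Path) -> None:
    calls.append(source)
    dest.write_text(f"<svg><!-- {source} --></svg>")


root = Path(tempfile.mkdtemp())
cache = root / ".cache" / "latex-image"
generated = root / "generated-assets" / "latex-image"

first = build_asset("FOO", generated / "bar.svg", cache, fake_compile)
# Same source under a different output name: served from the cache.
second = build_asset("FOO", generated / "renamed.svg", cache, fake_compile)
```

In this sketch the second call never reaches `fake_compile`, so a changed filename costs one file copy rather than a rebuild.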

@rbeezer
Collaborator

rbeezer commented Mar 19, 2024 via email

@oscarlevin
Member

I'm not sure I understand what issue this resolves. Currently, if you have an asset with xml:id="bar" (or if bar is the id of the youngest ancestor of the asset that has an xml:id), then we store the hash of the asset with the xml:id. If the author changes the asset, then the hashes won't match, so we ask for the asset to be regenerated (and put into the generated-assets).
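For comparison, the id-keyed check described here might look roughly like the following. This is only an illustration of the idea, not the actual CLI code; `needs_regeneration` and the in-memory `table` are hypothetical.

```python
import hashlib
from typing import Dict


def needs_regeneration(asset_id: str, source: str,
                       hash_table: Dict[str, str]) -> bool:
    """Return True when the hash stored under asset_id differs from the source's hash."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    if hash_table.get(asset_id) == digest:
        return False
    # Record the new hash so the next unchanged build is skipped.
    hash_table[asset_id] = digest
    return True


table: Dict[str, str] = {}
results = [
    needs_regeneration("bar", "FOO", table),     # first build
    needs_regeneration("bar", "FOO", table),     # unchanged source, same id
    needs_regeneration("bar", "FOO v2", table),  # source edited
    needs_regeneration("baz", "FOO v2", table),  # same source, new id
]
```

Because the lookup key is the id, the last call reports that a rebuild is needed even though the source text has been seen before.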

With this proposal, we keep a copy of the generated asset in .cache. If the author changes the asset, the hash will no longer match, so we regenerate the asset (and put it in .cache and generated-assets).

In both cases, if the asset isn't changed, nothing gets regenerated.

Last case: the asset isn't changed, but the xml:id is changed. Now, the asset is regenerated. Under the proposal, the asset isn't regenerated, but a new copy is made with the new name. I see there is an advantage here, but the disadvantage is keeping every version of the generated asset in the cache and copying over every asset from the cache to generated-assets.

What am I missing?

@StevenClontz
Member

Another potential use case: the user has <latex-image xml:id="foo">BAR</latex-image> and later <latex-image xml:id="baz">BAR</latex-image>. Maybe it's an anti-pattern that should have been solved with an xref, but this would avoid building the same image twice.

@siefkenj
Contributor Author

This would also mean images are cached without assigning an ID to them.

@StevenClontz
Member

StevenClontz commented Jun 16, 2024

I'm waiting on https://github.com/TeamBasedInquiryLearning/precalculus/actions/runs/9538778663 and I'm seeing a lot of duplication of assets being generated. This could probably be avoided through cleverer configuration of the action, but I still think having a .generated-cache directory that contains a bunch of ELEMENT/FORMAT/HASH.FMT files that is checked before every build and copied over (barring some kind of --force-regenerate) would be excellent.

Another use case: I change my sageplot from blue to green, then hate it, then change it back to blue. The old blue version is still cached so I get it immediately.
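A rough sketch of such a lookup, assuming the ELEMENT/FORMAT/HASH.FMT layout above. The names `cache_path`, `generate`, `fake_render`, and the `force_regenerate` parameter are hypothetical, chosen only to mirror the --force-regenerate idea; none of them are existing PreTeXt options.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, List


def cache_path(cache_root: Path, element: str, fmt: str, source: str) -> Path:
    """Map an asset's source text to ELEMENT/FORMAT/HASH.FMT under cache_root."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return cache_root / element / fmt / f"{digest}.{fmt}"


def generate(source: str, element: str, fmt: str, cache_root: Path,
             render: Callable[[str], bytes],
             force_regenerate: bool = False) -> bytes:
    """Return the rendered asset, rendering only on a miss or when forced."""
    path = cache_path(cache_root, element, fmt, source)
    if force_regenerate or not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(render(source))
    return path.read_bytes()


# Hypothetical stand-in for the real renderer; it records each call.
renders: List[str] = []


def fake_render(source: str) -> bytes:
    renders.append(source)
    return f"<svg>{source}</svg>".encode("utf-8")


root = Path(tempfile.mkdtemp()) / ".generated-cache"
blue, green = "plot(color='blue')", "plot(color='green')"

# Blue, then green, then back to blue: the revert is a cache hit.
for version in (blue, green, blue):
    generate(version, "sageplot", "svg", root, fake_render)
renders_after_revert = len(renders)

# Forcing bypasses the cache and renders again.
generate(blue, "sageplot", "svg", root, fake_render, force_regenerate=True)
```

The trade-off raised earlier in the thread still applies: every version ever rendered stays in the cache directory until something prunes it.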

@oscarlevin
Member

I am coming around to really liking this idea. I think this would be handled by core though, correct? So definitely something we will want to collaborate on.

@StevenClontz
Member

> I think this would be handled by core though, correct?

💯 - and this is a good week to do it

@StevenClontz
Member

Caching should be used in tandem with https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows to speed up CI/CD for PreTeXt projects.

@StevenClontz
Member


4 participants