Improve Performance on Letterpress and Other Augmentations Relying on Noise Generation #214

Open
cs-mshah opened this issue Dec 4, 2022 · 2 comments

cs-mshah commented Dec 4, 2022

Augmentations that rely on Perlin noise generation, such as Letterpress, are particularly slow.

It would be great if the slower augmentations could be made more efficient or could leverage the GPU. As it stands, the ones at the bottom of the list are too slow to use in practice for training.

I tried training a model with Letterpress and found that one epoch took 12x longer than without the augmentation. I timed most of the augmentations on 7 images; here are the results:

[Screenshot from 2022-12-04 12-45-04: timing results for each augmentation]

Here is the code for timing:

import time

from augraphy import *  # assumes Augraphy is installed and exports these augmentations

# imgs: list of input images (e.g. NumPy arrays), loaded elsewhere
aug_list = [
    DirtyDrum(line_concentration=0.5, noise_intensity=1.0, direction=2),
    BleedThrough(intensity_range=(0.6, 1.0), offsets=(7, 7), alpha=0.5),
    DirtyRollers(),
    Dithering(),
    Faxify(),
    InkBleed(severity=(0.5, 0.8)),
    Letterpress(),
    LowInkRandomLines(count_range=(10, 15)),
    Markup(),
    PencilScribbles(size_range=(250, 400), count_range=(1, 10), stroke_count_range=(1, 6)),
    BrightnessTexturize(),
    ColorPaper(),
    Gamma(),
    Geometric(rotate_range=(-3, 3)),
    LightingGradient(),
    PageBorder(width_range=(5, 10)),
    SubtleNoise(subtle_range=25),
    BadPhotoCopy(),
    BindingsAndFasteners(ntimes=(2, 4)),
    Folding(fold_count=4),
    Jpeg(),
    NoiseTexturize(),
]

times = []

for aug in aug_list:
    # Time one pass of this augmentation over all images
    start_time = time.perf_counter()
    aug_imgs = [aug(img) for img in imgs]
    times.append(time.perf_counter() - start_time)
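
A single wall-clock measurement per augmentation can be noisy, since the first call may include one-off costs such as JIT compilation or cache warm-up. A minimal sketch of a reusable timing helper that takes the median over several passes might look like this. `time_augmentation` and its `repeats` parameter are hypothetical names for illustration and are not part of Augraphy; the helper only assumes that an augmentation is a callable taking an image.

```python
import time
from statistics import median

import numpy as np


def time_augmentation(aug, imgs, repeats=3):
    """Return the median wall-clock seconds needed to apply `aug` to all `imgs`.

    `aug` is any callable that takes an image and returns an image.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        for img in imgs:
            aug(img)
        durations.append(time.perf_counter() - start)
    return median(durations)


# Stand-in images and augmentations (assumptions, so the sketch runs without
# Augraphy installed); swap in real augmentation instances as needed.
rng = np.random.default_rng(0)
demo_imgs = [rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8) for _ in range(7)]
demo_augs = {
    "identity": lambda img: img,
    "invert": lambda img: 255 - img,
}

for name, aug in demo_augs.items():
    print(f"{name:>10}: {time_augmentation(aug, demo_imgs):.6f} s")
```

Reporting the median rather than a single run makes it easier to compare augmentations before and after an optimization.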
Collaborator

kwcckw commented Dec 4, 2022

Thanks for the feedback. Performance improvements are on our roadmap and should be included in the next major update.

@jboarman changed the title from "performance improvement needed for a few augmentations" to "Performance Improvement Needed on a Few Augmentations" on Mar 18, 2023
@jboarman moved this to Todo in Augraphy Roadmap on Mar 18, 2023
@jboarman moved this from Todo to In Progress in Augraphy Roadmap on Mar 18, 2023
@jboarman changed the title from "Performance Improvement Needed on a Few Augmentations" to "Improve Performance on Letterpress and Other Augmentations Relying on Noise Generation" on Apr 10, 2023
@jboarman moved this from In Progress to Todo in Augraphy Roadmap on Apr 10, 2023
Member

jboarman commented Apr 11, 2023

The key issue with these slower augmentations is the noise generation process. We will use this issue to focus on approaches that speed up noise generation while retaining an essential level of random variation in the distortions.

These augmentations should all be improved once we can improve the noise generation process:

  • Letterpress
  • BleedThrough
  • BadPhotoCopy
  • LightingGradient
  • PageBorder
  • NoiseTexturize
  • DirtyDrum
  • InkBleed
  • Faxify
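
One way to trade generation cost against variation, sketched here under stated assumptions, is to generate a larger noise field once and then draw a randomly placed, randomly flipped crop for each image. `NoisePool` and its parameters are hypothetical and not part of Augraphy, and plain uniform noise from NumPy stands in for whatever generator (Perlin or otherwise) an augmentation actually uses.

```python
import numpy as np


class NoisePool:
    """Generate one large noise field up front and serve random crops of it."""

    def __init__(self, pool_shape=(2048, 2048), seed=None):
        self.rng = np.random.default_rng(seed)
        # Stand-in generator: uniform noise in [0, 1). An expensive noise
        # function would be called here once instead of once per image.
        self.pool = self.rng.random(pool_shape, dtype=np.float32)

    def sample(self, height, width):
        """Return a (height, width) patch taken at a random offset with random flips."""
        pool_h, pool_w = self.pool.shape
        if height > pool_h or width > pool_w:
            raise ValueError("requested patch is larger than the noise pool")
        top = self.rng.integers(0, pool_h - height + 1)
        left = self.rng.integers(0, pool_w - width + 1)
        patch = self.pool[top:top + height, left:left + width]
        if self.rng.random() < 0.5:
            patch = patch[::-1, :]  # vertical flip
        if self.rng.random() < 0.5:
            patch = patch[:, ::-1]  # horizontal flip
        # Copy so callers can modify the patch without touching the pool.
        return patch.copy()


pool = NoisePool(pool_shape=(1024, 1024), seed=0)
noise = pool.sample(256, 256)
print(noise.shape, noise.dtype)
```

Patches drawn this way all come from the same field, so they are less varied than freshly generated noise; regenerating the pool every so often is one possible way to balance speed against variation.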

We recently released a performance improvement via #270, which included the use of Numba to optimize loops. However, we found that there remains a lot of opportunity to improve the noise generation processes, which most heavily impact augmentation performance.
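
For context, the kind of loop optimization Numba offers can be sketched as follows. This is an illustrative example and not the code from #270: `apply_noise_mask` and its parameters are hypothetical, and the `try`/`except` fallback is an assumption made so that the sketch still runs (as plain Python) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption): without Numba, leave the function uncompiled.
    def njit(func):
        return func


@njit
def apply_noise_mask(image, noise, threshold, value):
    """Set pixels to `value` wherever `noise` exceeds `threshold`.

    `image` is a 2D uint8 array and `noise` is a 2D float array of equal shape.
    The input image is left unchanged; a modified copy is returned.
    """
    out = image.copy()
    rows, cols = image.shape
    for y in range(rows):
        for x in range(cols):
            if noise[y, x] > threshold:
                out[y, x] = value
    return out


rng = np.random.default_rng(0)
image = np.full((128, 128), 255, dtype=np.uint8)
noise = rng.random((128, 128))
speckled = apply_noise_mask(image, noise, 0.95, 0)
print(int((speckled == 0).sum()), "pixels changed")
```

A loop this simple could also be written as `np.where(noise > threshold, value, image)`; compiled loops tend to matter more when each pixel's result depends on neighbouring pixels or on earlier iterations.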

See the performance improvements of more than 100% from recent Augraphy updates:
#270 (comment)

@jboarman assigned ss756 and unassigned kwcckw on May 13, 2023
@ss756 moved this from Todo to In Progress in Augraphy Roadmap on May 26, 2023