Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maybe get rid of Cython #741

Open
amueller opened this issue Nov 10, 2023 · 4 comments
Open

Maybe get rid of Cython #741

amueller opened this issue Nov 10, 2023 · 4 comments

Comments

@amueller
Copy link
Owner

amueller commented Nov 10, 2023

It would be really nice to get rid of the Cython code. The equivalent in numpy is

def query_integral_image(integral_image, size_x, size_y, random_state):
    x, y = integral_image.shape

    # calculate all areas
    i, j = np.ogrid[:x-size_x, :y-size_y]
    area = integral_image[i, j] + integral_image[i+size_x, j+size_y] - integral_image[i+size_x, j] - integral_image[i, j+size_y]
    zero_areas = np.argwhere(area == 0)

    if zero_areas.size == 0:
        # no room left
        return None

    # pick a location at random
    goal = random_state.choice(zero_areas.shape[0])
    return tuple(zero_areas[goal])

however, that seems to be much slower.
Not sure if there's a faster way to do it. cc @thomasjpfan

@thomasjpfan
Copy link
Collaborator

thomasjpfan commented Nov 26, 2023

Here is the best I can do with pure NumPy:

def query_integral_image_np(integral_image, size_x, size_y, random_state):
    area = integral_image[:-size_x, :-size_y]
    area += integral_image[size_x:,size_y:]
    area -= integral_image[size_x:, :-size_y]
    area -= integral_image[:-size_x, size_y:]
    flat_area_zero = np.flatnonzero(area == 0)
    hits = flat_area_zero.shape[0]
    
    if hits == 0:
        # no room left
        return None

    goal = random_state.randint(0, hits)
    flat_idx = flat_area_zero[goal]
    
    w, h = integral_image.shape
    row = flat_idx // w
    col = flat_idx - row * w
    return row, col

The Cython code still is ~ 2.6 times faster than this.

@ccbogel
Copy link

ccbogel commented Nov 27, 2023

@thomasjpfan I kept getting an IndexError:

flat_idx = flat_area_zero[goal]

IndexError: index 635 is out of bounds for axis 0 with size 600

@amueller
Copy link
Owner Author

amueller commented Nov 27, 2023

@thomasjpfan hm if there's now a way to do version-independent binary wheels maybe that's the better way forward? Also clearly your numpy skills are better than GPT4 ;)
But maybe 2.6 times is fine also given the maintenance overhead?

@thomasjpfan
Copy link
Collaborator

@amueller I tried using ABI3 with word_cloud, but it will not work until the minimum support Python version is 3.11. query_integral_image uses the Buffer protocol, which became stable in 3.11: https://docs.python.org/3/c-api/buffer.html#buffer-structure

Although even with 3.11 as minimum version, Cython still uses non-stable CAPI calls. I suspect Cython's limited ABI support is not mature enough yet.

But maybe 2.6 times is fine also given the maintenance overhead?

I cannot tell without running benchmarks to see how much slower generate_from_frequencies gets.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants