Concurrent instances for mapnik rebased #447
In #401 (comment) I was asked to add a test for this, but I need some support:
|
This pull-request speeds up mapnik operations by at least a factor of 4, and I would love to see it merged in mapproxy so others can benefit from the improvement, too. |
It’s been another 6 months now, so I’d like to ask for support again: Could you give me a hint where to best locate these tests so that they fit your test structure? Adding mapnik to the Travis CI setup is likely out of scope for me: I don’t have experience with configuring CI for GitHub, so it would take too much experimentation to get it finished. Is there already a test that uses mapnik? |
In a recent project starting to use MP with Mapnik and ran into performance issues when serving tiles. Direct Mapnik So tried the
Sometimes when flooding tile requests via But never saw these errors when seeding (concurrency 8), which seemed much faster than serving via So I guess the errors seen could come from a complex multiprocessing/threading interplay of Anyway I hope we can move this PR forward somehow, as its gains are significant. Maybe as a config option ( |
Thanks @ArneBab for your contribution so far and sorry for the silence. Regarding the tests, you can find an existing mapnik test over here (maybe it helps): |
Thank you for your feedback @weskamm , and I’m happy that this helps you, @justb4 ! We are currently only using this for seeding. I don’t know yet when I’ll be able to do the config, but I have it on my todo list now (can’t do it with high priority yet). I’m trying to make the time to do this, but it might take a while. |
Yes, I see/understand @ArneBab. Been using the Even more optimization: when developing a Mapnik style it is beneficial to immediately see results, as there are no direct tools like Tilemill here and we're not yet using CartoCSS. For this I integrated Python Watchdog in this PR's |
I realise my proposed addition will require additional dependencies (pypi Watchdog) and rigorous testing. But for me this has been a life-saver in a development and CI/CD environment: not only are Mapnik objects cached (the purpose of this PR), but MapProxy also no longer needs to be restarted, because Watchdog triggers a reload of the Mapnik config on any change. In the (map-)development phase, just saving any file in the Mapnik tree magically reloads the config, and the effect is visible in seconds. Or just requesting tiles over various zoom levels in different areas. It gets almost interactive then... The code is now public under the (ISC) license: |
@weskamm I have rights to approve the workflow and merge, but you have a better understanding of MapProxy internals. Making this feature optional and rebasing is a big step forward. Once merged, I can start PR-ing my "Hot Mapnik conf reloading" (see above and here ) in a similar, i.e. making it optional, fashion. And @ArneBab thanks for patience! |
I created a third branch that’s rebased where all the workflows pass. (though the workflows do not run the mapnik-tests, so this only shows that the change does not break something else) Would you like me to file a PR from this here, too? Or would you prefer that I force-push into the current branch so this PR gets updated? @justb4 thank you for getting this moving again! |
This sidesteps the missing thread safety in mapnik, because the threadpool can and does re-use per-thread-and-process mapnik Map objects. To keep the memory requirement bounded, this patch also adds a simple garbage collector. It kicks in when there are far more cached Map objects than active threads.
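The idea in this commit message can be sketched roughly as follows. This is a minimal illustration and not the code in the patch: the class name `PerThreadMapCache`, the `factory` callable (a stand-in for creating and loading a mapnik Map object) and the `slack` threshold are hypothetical names chosen for the example.

```python
import os
import threading


class PerThreadMapCache:
    """Cache one object per (process id, thread id, mapfile).

    Hypothetical sketch: `factory` stands in for building a mapnik Map,
    and `slack` is an assumed threshold, not a MapProxy setting.
    """

    def __init__(self, factory, slack=4):
        self._factory = factory
        self._slack = slack
        self._cache = {}
        self._lock = threading.Lock()

    def get(self, mapfile):
        # The key is private to the calling thread, so the cached object
        # is never shared between threads or processes.
        key = (os.getpid(), threading.get_ident(), mapfile)
        with self._lock:
            obj = self._cache.get(key)
        if obj is None:
            # Build outside the lock: only this thread ever uses this key.
            obj = self._factory(mapfile)
            with self._lock:
                self._cache[key] = obj
                self._collect()
        return obj

    def _collect(self):
        # Simple garbage collector: only runs when the cache holds far
        # more entries than there are live threads.
        alive = {t.ident for t in threading.enumerate()}
        alive.add(threading.get_ident())
        if len(self._cache) <= self._slack * len(alive):
            return
        pid = os.getpid()
        for key in list(self._cache):
            # Drop entries inherited from another process or owned by
            # threads that no longer exist.
            if key[0] != pid or key[1] not in alive:
                del self._cache[key]
```

Calling `cache.get(mapfile)` twice from the same thread returns the same object, while a different thread gets its own instance. Entries of reaped threads are only dropped once the cache has grown past the threshold, which keeps memory bounded without collecting on every request.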
This avoids delays after periods of inactivity due to reaped threads by preparing Map objects for the least recently used mapfile.
This reverts commit e58fdeb.
I now force-pushed the rebased changes into concurrent-instances-for-mapnik-rebased. |
I will have a final look now |
so tests still missing? |
I did not create additional tests, because I did not get the existing mapnik tests to work on a clean master (without these changes — I spent far more time than expected trying to get a mapnik+seeding docker-setup working for testing all tests in test/system/test_mapnik.py) and they are skipped in the github workflow, so I would have had to write these blind. But with the config defaulting to Others have tested the output of seeding yesterday and today. Over the weekend we will be running a large seeding to detect potential problems. |
If a test is a strict requirement (with concurrency, because that’s the only place where this can go wrong), I’ll see what I can do in the coming weeks, but this is a very slow process, because I have to ping-pong with those who have a full setup in which the existing mapnik tests work. @justb4 can you check whether the speed improvements persist with the refactoring I did? |
Tests should usually be included if possible, but I understand the troubles you have here, so it should be ok. So maybe let's wait for final feedback from @justb4 and then we can merge this in. |
@justb4 do you have a chance to test this again? |
I am merging this now without further feedback. As it's configurable and defaults to the old behaviour, we should not expect any issues for default installations. Thanks for the contribution! |
Thank you! |
Very sorry, missed this completely! I can only say that I am revisiting |
No worries, I know too well how easy it is to miss a PR. Thank you for the info! |
This pull-request recovers caching of the mapnik object by segmenting the cache by process_id and thread_id, both for seeding and for serving. To avoid delays in serving after threads are reaped, mapnik.Map(…) objects are pregenerated for the least recently used mapfile.
The segmented cache speeds up the seeding process in our setup by at least a factor of 4, and pregeneration of the maps reduces the delay for serving not-yet-cached map tiles from ~5 seconds to ~1 second.
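To illustrate why pregeneration shortens the delay for a fresh thread, here is a minimal sketch under stated assumptions. It is not the implementation in this pull request: `PrebuiltPool`, `prepare`, `take` and the `factory` callable (a stand-in for constructing a `mapnik.Map(…)`) are hypothetical names, and which mapfile to prepare for is left to the caller.

```python
import queue
import threading


class PrebuiltPool:
    """Hand out objects that were built ahead of time.

    Hypothetical sketch: `factory` stands in for the expensive creation
    of a mapnik Map object for a given mapfile.
    """

    def __init__(self, factory, size=1):
        self._factory = factory
        self._size = size
        self._queues = {}
        self._lock = threading.Lock()

    def _queue_for(self, mapfile):
        with self._lock:
            return self._queues.setdefault(mapfile, queue.Queue())

    def prepare(self, mapfile):
        # Meant to be called ahead of time, e.g. from a background
        # thread, so that the expensive build is off the request path.
        q = self._queue_for(mapfile)
        while q.qsize() < self._size:
            q.put(self._factory(mapfile))

    def take(self, mapfile):
        # A new thread takes a prebuilt object if one is waiting and
        # only pays the full build cost when the pool is empty.
        try:
            return self._queue_for(mapfile).get_nowait()
        except queue.Empty:
            return self._factory(mapfile)
```

A thread that finds a prepared object skips the expensive build. The pool would be refilled afterwards, for example with `threading.Thread(target=pool.prepare, args=(mapfile,), daemon=True).start()`.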
Licensing under Apache license is OK.
I’d be glad if you could merge these changes and hope they prove useful!
rebase of #401 on top of current master.