How to debug index usage crashes? #220

Open

laurikoobas opened this issue Jun 11, 2018 · 6 comments

@laurikoobas

My code was successfully running with 350 million points and 300 polygons.
Now the number of polygons went up to 450 and it started crashing. I did some tests and it still crashes with 10 points (not 10 million, just 10) and those 450 polygons. It's still fine if I limit the number of polygons to 300 though.

Right now I just disabled the index use, but I'd like to get to the root of the issue. Could the problem be in a weird polygon? The largest polygon we have has 174 points.

During my tests, these were some of the error messages:

WARN BlockManagerMasterEndpoint: No more replicas available for rdd_77_0 !
WARN BlockManagerMasterEndpoint: No more replicas available for rdd_61_0 !
ERROR YarnScheduler: Lost executor 2 on blaah: Container killed by YARN for exceeding memory limits. 5.5 GB of 5.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
...
java.lang.OutOfMemoryError: Java heap space
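
For reference, the overhead setting named in that warning can be raised when the Spark session is built; a minimal sketch (the values are placeholders, not recommendations, and must be set before the application starts):

```scala
import org.apache.spark.sql.SparkSession

// Placeholder values for illustration only; tune to the actual cluster.
val spark = SparkSession.builder()
  .appName("magellan-spatial-join")
  // Off-heap overhead per executor, in MB; this is the setting the YARN
  // error message suggests boosting.
  .config("spark.yarn.executor.memoryOverhead", "1024")
  // On-heap executor memory.
  .config("spark.executor.memory", "4g")
  .getOrCreate()
```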

@harsha2010
Owner

@laurikoobas How big a cluster are you using, and what is the node configuration?
If you can share the polygon dataset, it would be easier to debug this. Otherwise, one thing you can do is collect a heap dump during the execution and send it over.
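
One way to capture such a heap dump automatically is to have the executor JVMs write one when they hit an OOM; a minimal sketch (the dump path and session setup here are illustrative, not Magellan-specific):

```scala
import org.apache.spark.sql.SparkSession

// Illustrative sketch: ask each executor JVM to write a heap dump on
// OutOfMemoryError, so the dump can be pulled off the node afterwards.
val spark = SparkSession.builder()
  .appName("magellan-heap-dump-debug")
  .config("spark.executor.extraJavaOptions",
    "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/executor-heap.hprof")
  .getOrCreate()
```

Alternatively, `jmap -dump:live,format=b,file=<path> <executor-pid>` against a still-running executor gives a dump without waiting for the OOM.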

@laurikoobas
Author

Running it as an AWS Glue job on 40 DPUs. It makes sense that the polygon dataset is the cause of this, but I can't share it. What could there be in the polygons that would make index use a problem, though?

@harsha2010
Owner

I'm not familiar with Glue, but I think the amount of memory you need for these polygons might be tipping you over the 5 GB limit you have set for the YARN job... What index precision are you using?

@laurikoobas
Author

I used just the 30 that's in the example. Do you have guidelines or documentation on what it means and which values make sense for which use cases?

@harsha2010
Owner

You want to pick a precision that can eliminate a large fraction of polygons. For example, if your polygons are US states and you pick a precision of, say, 10 or 15, each polygon roughly falls into O(1) grids at that precision.

If you pick precision 30 that still holds true, but we now spend more time computing the grids that overlap with each polygon and more space storing those grids, since there will be a lot more of them.
Each time you subdivide you get 4x more grids, so if you pick too fine a precision you will pay for it in storage and time.
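
To put rough numbers on that growth, a small back-of-the-envelope sketch (the function below is purely illustrative; 2 extra bits of precision split each cell in both longitude and latitude, hence 4x per subdivision):

```scala
// Worldwide geohash-style cell count at a given bit precision: 2^bits.
def cellCount(bits: Int): Long = 1L << bits

// At the example precision of 30 bits there are ~1.07 billion cells worldwide;
// moving to 35 bits multiplies the cell count (and, roughly, the number of
// grids covering a polygon once cells are smaller than the polygon) by 2^5 = 32.
println(cellCount(30))                  // 1073741824
println(cellCount(35) / cellCount(30))  // 32
```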

@harsha2010
Owner

harsha2010 commented Jul 8, 2018

Precision is nothing but the geohash precision: https://gis.stackexchange.com/questions/115280/what-is-the-precision-of-a-geohash

Instead of characters, we are using the bit size (so to convert to geohash character length, simply divide by 5). E.g., a precision of 35 = a 7-character geohash.
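
As a trivial illustration of that conversion (assuming, as in the geohash encoding, 5 bits per character):

```scala
// Convert Magellan's bit precision to the equivalent geohash character length.
def geohashChars(bitPrecision: Int): Int = bitPrecision / 5

println(geohashChars(35)) // 7-character geohash
println(geohashChars(30)) // 6 characters: the precision used in the example
```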
