How to debug index usage crashes? #220

Open

laurikoobas opened this issue Jun 11, 2018 · 6 comments

@laurikoobas

My code was successfully running with 350 million points and 300 polygons.
Now the number of polygons went up to 450 and it started crashing. I did some tests and it still crashes with 10 points (not 10 million, just 10) and those 450 polygons. It's still fine if I limit the number of polygons to 300 though.

Right now I just disabled the index use, but I'd like to get to the root of the issue. Could the problem be in a weird polygon? The largest polygon we have has 174 points.

During my tests, these were some of the error messages:

WARN BlockManagerMasterEndpoint: No more replicas available for rdd_77_0 !
WARN BlockManagerMasterEndpoint: No more replicas available for rdd_61_0 !
ERROR YarnScheduler: Lost executor 2 on blaah: Container killed by YARN for exceeding memory limits. 5.5 GB of 5.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
...
java.lang.OutOfMemoryError: Java heap space
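
For reference, the overhead setting named in that warning can be raised when the Spark session is built; a minimal sketch (the values are placeholders, not recommendations, and must be set before the application starts):

```scala
import org.apache.spark.sql.SparkSession

// Placeholder values for illustration only; tune to the actual cluster.
val spark = SparkSession.builder()
  .appName("magellan-spatial-join")
  // Off-heap overhead per executor, in MB; this is the setting the YARN
  // error message suggests boosting.
  .config("spark.yarn.executor.memoryOverhead", "1024")
  // On-heap executor memory.
  .config("spark.executor.memory", "4g")
  .getOrCreate()
```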

@harsha2010
Owner

@laurikoobas How big a cluster are you using, and what is the node configuration?
If you can share the polygon dataset, it would be easier to debug this. Otherwise, one thing you can do is collect a heap dump during the execution and send it over.
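
One way to capture such a heap dump automatically is to have the executor JVMs write one when they hit an OOM; a minimal sketch (the dump path and session setup here are illustrative, not Magellan-specific):

```scala
import org.apache.spark.sql.SparkSession

// Illustrative sketch: ask each executor JVM to write a heap dump on
// OutOfMemoryError, so the dump can be pulled off the node afterwards.
val spark = SparkSession.builder()
  .appName("magellan-heap-dump-debug")
  .config("spark.executor.extraJavaOptions",
    "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/executor-heap.hprof")
  .getOrCreate()
```

Alternatively, `jmap -dump:live,format=b,file=<path> <executor-pid>` against a still-running executor gives a dump without waiting for the OOM.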

@laurikoobas
Author

Running it as an AWS Glue job on 40 DPUs. It makes sense that the polygon dataset is the cause of this, but I can't share it. What could there be in the polygons that would make index use a problem, though?

@harsha2010
Owner

I'm not familiar with Glue, but I think the amount of memory you need for these polygons might be tipping you over the 5 GB limit you have set for the YARN job... What index precision are you using?

@laurikoobas
Author

I used just the 30 that's in the example. Do you have guidelines or documentation on what it means and which values make sense for which use cases?

@harsha2010
Owner

You want to pick a precision that can eliminate a large fraction of polygons. For example, if your polygons are US states and you pick a precision of, say, 10 or 15, each polygon roughly falls into O(1) grids at that precision.

If you pick precision 30 that still holds true, but we now spend more time computing the grids that overlap with each polygon and more space storing those grids, since there will be a lot more of them.
Each time you subdivide you get 4x more grids, so if you pick too fine a precision you will pay for it in storage and time.
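
To put rough numbers on that growth, a small back-of-the-envelope sketch (the function below is purely illustrative; 2 extra bits of precision split each cell in both longitude and latitude, hence 4x per subdivision):

```scala
// Worldwide geohash-style cell count at a given bit precision: 2^bits.
def cellCount(bits: Int): Long = 1L << bits

// At the example precision of 30 bits there are ~1.07 billion cells worldwide;
// moving to 35 bits multiplies the cell count (and, roughly, the number of
// grids covering a polygon once cells are smaller than the polygon) by 2^5 = 32.
println(cellCount(30))                  // 1073741824
println(cellCount(35) / cellCount(30))  // 32
```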

@harsha2010
Owner

harsha2010 commented Jul 8, 2018

Precision is nothing but the geohash precision: https://gis.stackexchange.com/questions/115280/what-is-the-precision-of-a-geohash

Instead of characters, we are using the bit size (so to convert to geohash character length, simply divide by 5). E.g., a precision of 35 = a 7-character geohash.
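
As a trivial illustration of that conversion (assuming, as in the geohash encoding, 5 bits per character):

```scala
// Convert Magellan's bit precision to the equivalent geohash character length.
def geohashChars(bitPrecision: Int): Int = bitPrecision / 5

println(geohashChars(35)) // 7-character geohash
println(geohashChars(30)) // 6 characters: the precision used in the example
```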
