
Figure out how match flattening affects us #113

Open

goldfirere opened this issue Apr 9, 2015 · 6 comments

Comments

@goldfirere
Owner

In the master branch, all pattern matches are "flattened" before further processing. This is done from scLetDec in squashWildcards, called from singTopLevelDecs. This is a drastic change: after flattening, no patterns overlap anymore. How is this change affecting us? Is the code bloat truly terrible? How are compilation times? Pattern-match flattening allows some nice things -- in particular, it allows singletonization of functions with overlapping patterns. But is the cost too high? If it is, is there a way to detect when match flattening is a good idea? Should we ask the user for assistance?
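
For readers unfamiliar with the transformation, here is a hand-written sketch of what flattening does to a match (the names zip' and zipFlat are invented for illustration; this is not the library's actual output):

```haskell
-- Overlapping clauses: the wildcard clause catches everything the first missed.
zip' :: [a] -> [b] -> [(a, b)]
zip' (x:xs) (y:ys) = (x, y) : zip' xs ys
zip' _      _      = []

-- After flattening, each scrutinee is matched one constructor at a time,
-- so no branches overlap -- but the fallback alternative gets duplicated.
zipFlat :: [a] -> [b] -> [(a, b)]
zipFlat xs0 ys0 =
  case xs0 of
    []     -> []
    x : xs -> case ys0 of
      []     -> []
      y : ys -> (x, y) : zipFlat xs ys
```

The duplicated fallback is where the bloat comes from: with more scrutinees and more constructors per type, the flattened case tree can grow multiplicatively.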

@jstolarek
Collaborator

It looks like match flattening was the cause of the huge memory usage during compilation of the Data.Singletons.Prelude.List module. See #96.

@jstolarek
Collaborator

It also seems that there might have been a bug in the implementation of match flattening, which caused the failure of the Singletons/Nat test. See #97.

goldfirere pushed a commit that referenced this issue Sep 16, 2015
This fails because takeWhile can't be singletonized,
which in turn is because of #113 and friends. Oh well.
@goldfirere
Owner Author

There is lots of good data on this issue in #96. But it's not exactly the data that we need.

The big question I have about the performance blowup is where, precisely, it's happening.

First, we need to know what the trigger is. The data in #96 suggests that the blowup happens with both zipWithN and unzipN. unzipN is simpler, so let's focus on that. Question 1: why unzipN and not zipN?
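
To make the question concrete, here is a plausible hand-written shape for the simpler of the two (a sketch only; the actual definitions in the library may differ). Note that unzip-style functions match a tuple pattern inside the list pattern and scrutinize the recursive result, which flattening decomposes into a cascade of single-constructor case expressions:

```haskell
-- Hypothetical unzip3, written with explicit pattern matching.
unzip3' :: [(a, b, c)] -> ([a], [b], [c])
unzip3' []                 = ([], [], [])
unzip3' ((x, y, z) : rest) =
  case unzip3' rest of
    (xs, ys, zs) -> (x : xs, y : ys, z : zs)
```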

Question 2: Where, precisely, is the blowup happening? Here are some possibilities:

  1. th-desugar takes a long time to process the code, but its output is reasonable.
  2. th-desugar produces lots and lots of code. Way more than it needs to. A good basis of comparison here is to see what GHC produces when it translates normal (that is, with no TH silliness) definitions of unzipN to Core. The match-flattening algorithm is copied directly out of GHC, so the output should be similar.
  3. singletons takes a very long time processing the output of th-desugar. By "very long", I mean worse than linear in the length of the th-desugar output (do not compare against the N in unzipN -- we need to look at the size of the AST that th-desugar produces).
  4. singletons produces lots and lots of code during promotion. Way more than it needs to. A good basis of comparison is to take the output of th-desugar and promote it by hand (a rough sketch of hand promotion follows this list). Is the trickery that singletons is doing to get promotion working producing the blowup? Or is it doing a reasonably efficient thing?
  5. singletons produces lots and lots of code during singletonization. Way more than it needs to. A good basis of comparison is to take the output of th-desugar and singletonize by hand. Is the trickery that singletons is doing to deal with binders producing the blowup? Or is it doing a reasonably efficient thing?
  6. GHC's typechecker is choking on the output from singletons. That is, GHC's performance is worse than linear in the size of the code passed to it, but the output of the typechecker (as seen by -ddump-ds) is reasonable. Does it matter whether you write the code by hand vs. by TH? If you were to write unzipN in a natural way (without all the silliness that singletons does), is GHC still slow? Is the slowness in dealing with the promoted code or the singleton code? Can we isolate the possibility that the problem is simply type family reduction? (Type family reduction has been a bottleneck in the past.)
  7. GHC's desugaring to Core is choking. That is, the size of the -ddump-ds output is growing too fast -- way faster than it should.
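
To make item 4 concrete, here is a rough, hand-written guess at what hand-promoting an unzip-like function to type families could look like. This is a sketch only: the names Unzip3 and Cons3 are invented for illustration, and singletons' real generated code (with its defunctionalization machinery) will look different.

```haskell
{-# LANGUAGE DataKinds, PolyKinds, TypeFamilies, TypeOperators #-}

-- Promote the value-level recursion to a closed type family.
type family Unzip3 (ps :: [(a, b, c)]) :: ([a], [b], [c]) where
  Unzip3 '[]                  = '( '[], '[], '[] )
  Unzip3 ('(x, y, z) ': rest) = Cons3 x y z (Unzip3 rest)

-- Helper standing in for the case expression on the recursive result.
type family Cons3 (x :: a) (y :: b) (z :: c)
                  (r :: ([a], [b], [c])) :: ([a], [b], [c]) where
  Cons3 x y z '(xs, ys, zs) = '(x ': xs, y ': ys, z ': zs)
```

Comparing the size of code like this against what singletons actually emits for the same input would show whether promotion itself is the source of the blowup.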

Perhaps the blowup is happening at multiple places! But we need to discover which pass is causing problems before we can start addressing them. I don't think it's terribly hard to do this, but it would take some time and attention to detail. Once we're sure that unzipN causes the problem, it may just be easiest to work with unzip7 and observe what happens manually (a minimal measurement setup is sketched below) -- I'm not convinced that finding the growth rate will unlock this problem as much as a higher-level understanding of what's wrong.
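
For anyone picking this up, a minimal measurement setup might look like the following. Everything here is an assumption-laden sketch: the module name UnzipProbe and the unzip3' definition are invented, and the definition is assumed to go through singletonization (if it doesn't, any unzipN from the library's own code will do). The OPTIONS_GHC flags are standard GHC dump flags: -ddump-splices shows the code TH generates, -ddump-ds shows the desugared Core, -ddump-to-file writes both to files so their sizes can be compared pass by pass, and -dshow-passes reports each compiler pass as it runs.

```haskell
{-# LANGUAGE TemplateHaskell, DataKinds, PolyKinds, TypeFamilies,
             GADTs, KindSignatures, ScopedTypeVariables,
             UndecidableInstances #-}
-- Dump the TH splices and the desugared Core to files, and report
-- compiler passes, so the blowup can be localized to a single stage.
{-# OPTIONS_GHC -ddump-splices -ddump-ds -ddump-to-file -dshow-passes #-}
module UnzipProbe where

import Data.Singletons.TH

$(singletons [d|
  unzip3' :: [(a, b, c)] -> ([a], [b], [c])
  unzip3' []                 = ([], [], [])
  unzip3' ((x, y, z) : rest) = case unzip3' rest of
    (xs, ys, zs) -> (x : xs, y : ys, z : zs)
  |])
```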

And then there's question 3: even when things aren't terrible, how bad is match flattening in the average case? Does it increase all compilation times by 20%? Code size? Is it worth the cost?

If you want to explore this, you'll want to re-enable match flattening. Take a look at commit 87aa901: restore the Data.Singletons.Single.Squash module and the few lines of change in Data.Singletons.Single. There's a lot of other stuff in that commit, but don't worry about it -- none of it should matter for testing purposes.

Do let me know if you're taking this on! :)

@jstolarek
Collaborator

> Do let me know if you're taking this on! :)

Not in the next few weeks -- I'm totally swamped with teaching, and once that's over (in two weeks) I plan to take on generalized injectivity.

@goldfirere
Owner Author

To be clear: my "you" was addressed to any reader out there in cyberspace. Not you, Janek, specifically. :)

@jstolarek
Collaborator

I admit that this possibility crossed my mind when writing my comment :-)
