Transient "Sample is empty" runtime error. #161
I suspect this might have a symptom in common with #160, but again, it's hard for me to tell, since I can't reproduce this on my machine (answers to the questions in #160 (comment) would be helpful). One hunch is that there's …
I can replicate this bug consistently using the code here: https://github.com/patrickdoc/criterion-bug

I did some tracing and found the issue to be with this block of code. When I run the benchmark, the overhead adjustment makes all of the measurements negative. They are then filtered out, causing the "Sample is empty" error.
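For illustration, here is a minimal, self-contained sketch of that failure mode; the sample values and `estimatedOverhead` are made-up numbers chosen to trigger the bug, and this is a deliberate simplification rather than criterion's actual code:

```haskell
import Data.List (genericLength)

-- Raw wall-clock times per batch, in seconds (hypothetical values).
rawTimes :: [Double]
rawTimes = [4.0e-7, 4.5e-7, 4.2e-7]

-- If the clock-overhead estimate exceeds the measurements themselves...
estimatedOverhead :: Double
estimatedOverhead = 5.0e-7

-- ...then deducting it makes every measurement negative...
adjusted :: [Double]
adjusted = map (subtract estimatedOverhead) rawTimes

-- ...and filtering out non-positive times leaves an empty sample.
surviving :: [Double]
surviving = filter (> 0) adjusted

mean :: [Double] -> Double
mean xs
  | null xs   = error "Sample is empty"  -- the error reported in this issue
  | otherwise = sum xs / genericLength xs

main :: IO ()
main = do
  print surviving        -- []
  print (mean surviving) -- throws: Sample is empty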
Thanks for digging into this, @patrickdoc. I wasn't able to reproduce the bug using your repo with an unmodified `criterion`. From …
I might be a bit out of my element here in trying to debug this, so I'll ask for help. @harendra-kumar, I noticed that you've changed this sort of code in vincenthz/hs-gauge@92436a4#diff-5ac48df8dda31e28f3b275e7dd76177c. Would it make sense to adopt the same change in `criterion`?
@patrickdoc is correct: this error is due to all samples getting filtered out. There are two issues here. The first one is the incorrect use of overhead; it does not make sense, and is maybe a leftover of some old code. We do not need it, which is why I removed it in that gauge commit. The second one is the use of ….

To answer your question, I think it makes sense to get rid of the overhead stuff. I guess you will have to make changes in ….
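As a hedged sketch of the direction being suggested here, this base-only snippet times a batch of iterations and keeps the raw elapsed time, with no separately estimated overhead subtracted that could drive it negative; `timeBatch` is a hypothetical stand-in, not gauge's or criterion's implementation:

```haskell
import Control.Monad (replicateM_)
import System.CPUTime (getCPUTime)

-- Time one batch of n iterations of an action; return raw elapsed seconds.
timeBatch :: Int -> IO () -> IO Double
timeBatch n action = do
  start <- getCPUTime
  replicateM_ n action
  end <- getCPUTime
  -- getCPUTime reports picoseconds; keep the raw difference unadjusted.
  return (fromIntegral (end - start) / 1e12)

main :: IO ()
main = do
  t <- timeBatch 1000 (return ())
  -- Raw and non-negative by construction: nothing to filter away.
  print t
```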
Although I can't comment on the correctness of removing the …, I also removed ….

On …:

On the …:
I don't really have an explanation, but this does seem like an improvement. The …
Yes, it is expected to make results more consistent, because we are incorrectly deducting overhead from the timing measurements (using `fixTime`), which makes the measurements incorrect. Removing it actually makes them sane and should reduce variance. Also, the mean becomes saner with this fix: if you notice, without this fix the mean provided by criterion does not even make sense all the time; sometimes it could even be outside the range. This overhead stuff is absolutely nonsensical.
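A made-up numeric illustration of the "mean outside the range" symptom: with hypothetical raw times of 100/110/105 and an (over)estimated overhead of 8, the adjusted mean lands below every raw measurement:

```haskell
-- Hypothetical raw per-batch times, e.g. in nanoseconds.
rawTimes :: [Double]
rawTimes = [100, 110, 105]

-- An (over)estimated clock overhead.
overhead :: Double
overhead = 8

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

main :: IO ()
main = do
  let adjusted = map (subtract overhead) rawTimes  -- [92.0,102.0,97.0]
  -- 97.0: below the minimum raw time (100), so the reported mean
  -- falls outside the range of the measurements you actually saw.
  print (mean adjusted)
```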
@RyanGlScott I have good news! This fix solves all three of #160, #161, and #162. It turns out they are all special cases of this overhead code. In short, it comes down to how many samples are left after (incorrectly) filtering:

- When 0 samples are left, you get this issue.
- When 1 sample is left, you get #160. Because there is only 1 sample, the ….
- When 2 samples are left, the bootstrap algorithm breaks due to this bug in Statistics, and you get #162. It tries to index a vector with INT_MIN, causing the Vector error.
- When more than 2 samples remain, things operate as normal.

This would also explain why you had trouble reproducing it. It is very tight on timings, which will be different from computer to computer.

Patching out the overhead code solves all of these issues. I'm getting around 15 samples where the old code produced 0/1/2, and the mean is much more reasonable.
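Recapping that case analysis as code (a schematic summary of the list above, not anything from criterion itself):

```haskell
-- How the count of surviving samples maps onto the reported issues.
data Outcome
  = SampleIsEmpty  -- 0 surviving samples: the "Sample is empty" error (#161)
  | LoneSample     -- 1 surviving sample: #160
  | BootstrapBug   -- 2 surviving samples: the Statistics bootstrap bug (#162)
  | Normal         -- 3+ samples: analysis proceeds as expected
  deriving Show

classify :: Int -> Outcome
classify 0 = SampleIsEmpty
classify 1 = LoneSample
classify 2 = BootstrapBug
classify _ = Normal

main :: IO ()
main = mapM_ (print . classify) [0 .. 3]
```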
Wonderful! Can you open a pull request for this?
Done!
This should be fixed after we merged #178. @kindaro, can you verify if the issue still persists with HEAD?

I'll optimistically close this issue, since @patrickdoc reports not being able to reproduce it anymore, but feel free to re-open if that is not the case.
- using `cabal sandbox add-source criterion` to get the fix from master branch
- see haskell/criterion#161
- should be removed after the new criterion release
The issue has not been fixed for me.
Good to know, @kapralVV. It seems that there is more to the story, which @patrickdoc has been investigating in #179 and #180.
@kapralVV I really don't think it should be possible for this to happen with HEAD. There are checks in place that should make everything run smoothly (though not 100% correctly). I took a glance at your Travis logs (like this most recent failure), and I am worried that your script somehow isn't using the downloaded Criterion. Perhaps you are running into this? You might be able to check by adding ….

I tried running your benchmarks locally, but got: …
@patrickdoc … So you may see that ….

Facing the issue as well. The last build (#44) showed the following issues: …
Aha, I was able to reproduce it on my machine. It turns out the check I was referring to is actually slightly different from what I had thought. Thank you for the report; it also turned up another bug! I can submit a band-aid patch for this, but I also have bigger changes coming down the pike that will cover this. If you would like one now, though, let me know.
I'd be fine with whatever is more convenient. I'm not in a rush to put out a new release, given the multitude of other bugs you've found :)
- `criterion` bench will always be `green` even if it fails
- should be removed later
- Issues: haskell/criterion#161
Let me start with the terminal output: …
This issue is transient: I launched my script about 10 times tonight, and 3 of the attempts ended like this.
The code I use can be reviewed here: github.com/kindaro/criterion-wut.