Add deepseq on the input for nf and whnf #227

Open
nikita-volkov opened this issue Jul 24, 2020 · 3 comments

Comments

@nikita-volkov

This is to ensure that evaluating the input is not counted as part of the benchmark's execution.

Here is the patch that I'm suggesting:

-nf :: NFData b => (a -> b) -> a -> Benchmarkable
-nf f x = toBenchmarkable (nf' rnf f x)
+nf :: (NFData a, NFData b) => (a -> b) -> a -> Benchmarkable
+nf f x = deepseq x (toBenchmarkable (nf' rnf f x))
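
For completeness, the analogous change for whnf would presumably look like this (a sketch mirroring the nf patch above; the unpatched definition shown here is my assumption of what criterion-measurement contains, not copied from the thread):

-whnf :: (a -> b) -> a -> Benchmarkable
-whnf f x = toBenchmarkable (whnf' f x)
+whnf :: NFData a => (a -> b) -> a -> Benchmarkable
+whnf f x = deepseq x (toBenchmarkable (whnf' f x))
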
@RyanGlScott
Member

The env function is usually recommended for inputs that take a long time to compute. Moreover, I'm not sure that using deepseq in nf will actually keep the input's evaluation out of the measured time, since the payload of a Benchmarkable (which nf returns) is not invoked until after criterion's timer starts.
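
For reference, a minimal sketch of the env-based approach (the benchmark name and the expensiveInput setup action are illustrative, not from this thread):

import Criterion.Main

-- Stand-in for an input that is costly to construct.
expensiveInput :: IO [Int]
expensiveInput = pure [1 .. 1000000]

main :: IO ()
main = defaultMain
  [ -- env forces the input to normal form before any timing begins.
    env expensiveInput $ \xs ->
      bench "sum" (nf sum xs)
  ]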

@nikita-volkov
Author

What do you think about forcing the Benchmarkable in the place where it's going to be executed prior to execution? In effect this would force the evaluation of deepseq x.

@RyanGlScott
Member

What do you think about forcing the Benchmarkable in the place where it's going to be executed prior to execution?

Forcing the Benchmarkable itself probably isn't going to do much good here. That is because a Benchmarkable stores the computation to be run as a higher-order function of type a -> Int64 -> IO (). In the particular case of nf, that higher-order function is const (nf' rnf f x). Forcing this function won't accomplish what you seek, since the argument x is suspended in the closure for this higher-order function.
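
To illustrate the point, here is a standalone sketch (not criterion code) showing that forcing a function value to WHNF does not evaluate thunks captured in its closure:

import Control.Exception (evaluate)

main :: IO ()
main = do
  let x   = error "x is still unevaluated" :: Int
      run = \n -> print (x + n)   -- x is suspended inside this closure
  _ <- evaluate run               -- forces run to WHNF; x is left untouched
  putStrLn "forcing the closure did not evaluate x"
  -- calling `run 1` here would finally force x and raise the error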

More generally, I'm skeptical that this is worthwhile overall. The Haddocks for env mention this:

Motivation. In earlier versions of criterion, all benchmark inputs were always created when a program started running. By deferring the creation of an environment until its associated benchmarks need its contents, we avoid two problems that this strategy caused:

  • Memory pressure distorted the results of unrelated benchmarks. If one benchmark needed e.g. a gigabyte-sized input, it would force the garbage collector to do extra work when running some other benchmark that had no use for that input. Since the data created by an environment is only available when it is in scope, it should be garbage collected before other benchmarks are run.
  • The time cost of generating all needed inputs could be significant in cases where no inputs (or just a few) were really needed. This occurred often, for instance when just one out of a large suite of benchmarks was run, or when a user would list the collection of benchmarks without running any.

Note that I wasn't the maintainer of criterion when the documentation above was written, so I don't have an exact sense of how severe these problems were at the time. But I'm hesitant to retread this ground without a compelling reason to do so.
