Add input for stestr run --concurrency flag for fractional nprocs #202

mtreinish · 2018-11-07T01:53:46Z

Right now the --concurrency flag for stestr run takes in an integer, if that value is 0 it runs nproc workers, if that value is greater than 0 it runs that many workers. But we don't have a way to specify a value that is a fraction of nproc easily (like nproc/2 or nproc/4). This can be useful for some test suites, especially if they're very resource intensive, where running nproc workers isn't viable but you still want the number of workers to scale dynamically based on the size of the machine.

There are 2 ways I can see handling this first we also take a string in for concurrency and define an input for the fractions. Like stestr run --concurrency 'n/2' will run with nprocs / 2 workers. Or we could use the negative space and stick with integers to define the same thing, so something like stest run --concurrency -2 will run with nprocs / 2 workers.

The text was updated successfully, but these errors were encountered:

mtreinish · 2018-11-07T01:53:55Z

@masayukig do you have any thoughts on this?

masayukig · 2018-11-07T05:19:30Z

yeah, I think it's a good feature. I actually don't need like that option so far personally. But it makes sense to me.

So, regarding the option argument, I'm fine with either way. But I prefer -2 style because it's simpler.

I was also think like --concurrency '0.5' but I feel that would cause a confusion and inconsistency.

mtreinish · 2018-11-07T15:07:34Z

Well it mostly came up for me because of: Qiskit/qiskit#1228 where we're hitting occasionally memory issues. I'd rather set it to nproc/2 or nproc/4 rather than hard coding it to 2 workers.

mtreinish · 2018-11-07T15:10:13Z

As for the specific input definition I'm fine with anything as long as we make it clear. I only suggested '-2' or 'n/2' because I thought the implementation would be simpler. If we take a float my concern is that someone will do something like --concurrency 0.124332678212341. Granted we will always have the cast the result as an int so maybe that isn't a problem.

mtreinish · 2018-11-16T15:28:09Z

I had a thought instead of doing it as a fraction we could also just do subtraction. So --concurrency=-2 is nproc - 2 workers. Not sure if that's better though

masayukig · 2018-12-11T09:23:20Z

Hrm. I actually don't have a strong opinion here. If we know the nproc already, the subtraction might work well. But I also think, if we know that, can we specify the concurrency directly? So, I guess we use this feature when the nproc is ambiguous. If so, I feel treating the concurrency as a fraction might be better.

And, I'm curious about a corner case of the subtraction. What will happen when nproc =< abs(concurrency) && 0 > concurrency? For example, nproc=32 and --concurrency=-32.
Of course, I think we can treat this just no concurrency in this case, though.

mtreinish · 2018-12-13T04:48:40Z

Yeah, I think treating it as a fraction is probably better. I just wrote that down because it was an idea I had related to this, after thinking more I agree with subtraction is not a good idea.

As for that corner case yeah there are probably 2 ways to handle it, either we exit with an error code and say it's not a valid value, or we treat it as running serially. Actually, this made me think what happens today if we pass a negative number to --concurrency, so I ran a test:

stestr run --concurrency -23
list index out of range

We probably should change that error message to be more descriptive.

masayukig · 2018-12-14T07:40:33Z

heh, yeah. Oh, did you already start to implement this?

This commit adds a detailed error message when a user passes in a concurrency outside the expected range (which is >= 0). Previously we'd raise an unhandled IndexError because of going outside the list bounds deeper in the code, which wasn't super useful or informative for users. This fixes that by detecting when we have an invalid input up front and print that out before we run anything. Related to: #202

In Qiskit#1228 we decreased the concurrency used for the test runner in CI to avoid memory pressure and failing jobs. In that commit we set it to 2 as a quick fix so we could unblock CI and continue landing patches. A followup was to do the analysis and find the sweet spot for concurrency and stability. At the same time we opened mtreinish/stestr#202 to add support to the test runner for running fractional nprocs. That is still under discussion, but in the meantime this commit adds support to the makefile for using fractional nprocs concurrency. It adjusts are default concurrency from 2 to be nprocs / 2 if nprocs > 3. This should enable us to increase the number of test workers we run on travis linux jobs but still maintain reliability.

* Make test running concurrency dynamic based on nprocs In #1228 we decreased the concurrency used for the test runner in CI to avoid memory pressure and failing jobs. In that commit we set it to 2 as a quick fix so we could unblock CI and continue landing patches. A followup was to do the analysis and find the sweet spot for concurrency and stability. At the same time we opened mtreinish/stestr#202 to add support to the test runner for running fractional nprocs. That is still under discussion, but in the meantime this commit adds support to the makefile for using fractional nprocs concurrency. It adjusts are default concurrency from 2 to be nprocs / 2 if nprocs > 3. This should enable us to increase the number of test workers we run on travis linux jobs but still maintain reliability. * Hard code osx to 2 workers * Add debug statement about core count and worker count * Use printf to round floats to int

* Make test running concurrency dynamic based on nprocs In Qiskit#1228 we decreased the concurrency used for the test runner in CI to avoid memory pressure and failing jobs. In that commit we set it to 2 as a quick fix so we could unblock CI and continue landing patches. A followup was to do the analysis and find the sweet spot for concurrency and stability. At the same time we opened mtreinish/stestr#202 to add support to the test runner for running fractional nprocs. That is still under discussion, but in the meantime this commit adds support to the makefile for using fractional nprocs concurrency. It adjusts are default concurrency from 2 to be nprocs / 2 if nprocs > 3. This should enable us to increase the number of test workers we run on travis linux jobs but still maintain reliability. * Hard code osx to 2 workers * Add debug statement about core count and worker count * Use printf to round floats to int

mtreinish added enhancement good first issue labels Nov 7, 2018

mtreinish mentioned this issue Dec 18, 2018

Add error message for invalid concurrency value #208

Merged

mtreinish mentioned this issue Mar 5, 2019

Ensure concurrency is always an int #232

Merged

mtreinish mentioned this issue Mar 16, 2019

Make test running concurrency dynamic based on nprocs Qiskit/qiskit#1973

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add input for stestr run --concurrency flag for fractional nprocs #202

Add input for stestr run --concurrency flag for fractional nprocs #202

mtreinish commented Nov 7, 2018

mtreinish commented Nov 7, 2018

masayukig commented Nov 7, 2018

mtreinish commented Nov 7, 2018

mtreinish commented Nov 7, 2018

mtreinish commented Nov 16, 2018

masayukig commented Dec 11, 2018

mtreinish commented Dec 13, 2018

masayukig commented Dec 14, 2018

Add input for stestr run --concurrency flag for fractional nprocs #202

Add input for stestr run --concurrency flag for fractional nprocs #202

Comments

mtreinish commented Nov 7, 2018

mtreinish commented Nov 7, 2018

masayukig commented Nov 7, 2018

mtreinish commented Nov 7, 2018

mtreinish commented Nov 7, 2018

mtreinish commented Nov 16, 2018

masayukig commented Dec 11, 2018

mtreinish commented Dec 13, 2018

masayukig commented Dec 14, 2018