-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feat] IDSSE-306: round_half_away utility #7
Conversation
Adding test coverage report here, since our CI doesn't have it replicating to the README or anything yet: (.venv) mackenzie.grimes@macgrimes-lt idsse-engine-commons % cd python/idsse_common/idsse/common
(.venv) mackenzie.grimes@macgrimes-lt common % pytest ../../test --cov=. --cov-report=term-missing
Name Stmts Miss Cover Missing
--------------------------------------------------
__init__.py 0 0 100%
aws_utils.py 78 6 92% 136, 145-146, 168-169, 187
config.py 74 9 88% 72-74, 84, 89-91, 100, 126
json_message.py 22 22 0% 12-70
log_util.py 38 38 0% 12-76
path_builder.py 127 10 92% 41, 44, 95, 100, 253, 256, 259, 323, 343-344
publish_confirm.py 141 141 0% 4-345
utils.py 82 11 87% 57, 60, 67, 70-71, 92-94, 97, 100-101
--------------------------------------------------
TOTAL 562 237 58% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -166,3 +167,19 @@ def datetime_gen(dt_: datetime, | |||
for i in range(0, max_num): | |||
logger.debug('dt generator %d/%d', i, max_num) | |||
yield dt_ + time_delta * i | |||
|
|||
|
|||
def round_half_up(number: float, precision: int = 0) -> Union[int, float]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a required task, but do you know the timing for this function vs the ceil/floor approach vs the built in round? I understand that the ceil/floor only works with precision=0, but that is way well use round for GridProj. Most important is that we have function that behaves the way we need, I'm just curious to know if there will be any real impact in speed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decimal
has been called out to be a bit slower, yes (for example in this Stack Overflow discussion). Although I didn't find it to be 100x slower than round()
like it may have once been; in python 3.11 it seems to be only 1.5-2x times slower on average.
- This was with 1 million randomly generated floats between negative and positive 1 million, and I timed each function rounding all the floats to 6 decimal places. Code snippet here
% py round_benchmark.py
[INFO] Running benchmark with dataset size 100000
[INFO]
[INFO] | Method | Avg (ns) | Min (ns) | Max (ns) | Std Dev (ns)
[INFO] | ------------ | ------------ | ------------ | ------------ | ------------
[INFO] | round | 1020 | 708 | 19208 | 245
[INFO] | math.ceil | 667 | 459 | 33417 | 338
[INFO] | decimal | 1431 | 1167 | 26542 | 325
For the sake of having the precision
parameter functionality I think it might be acceptable, unless there's a way to use math.ceil
and also support decimal precision, which looks to be faster than anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: added a hybrid function that uses both Decimal
if we need precision and math.ceil
/floor
if we don't. I also tweaked my benchmarking to round to 0 decimal places 80% of the time--I had previously had the functions always round to 6 decimal places.
The hybrid option is now even faster than the built-in round
, on average:
[INFO] Running benchmark with dataset size 100000
[INFO] Dataset was rounded to precision 0: 80.12%, precision 6: 19.88% of tests
[INFO]
[INFO] | Method | Avg (ns) | Min (ns) | Max (ns) | Std Dev (ns)
[INFO] | ------------ | ------------ | ------------ | ------------ | ------------
[INFO] | round | 608 | 375 | 22625 | 272
[INFO] | math.ceil | 421 | 250 | 18833 | 266
[INFO] | decimal | 1035 | 750 | 19500 | 349
[INFO] | hybrid | 509 | 208 | 14125 | 465
This is of course "faster" due to the assumption I made--that actual usage of the function will be "round to the nearest int" more often than it will be "round to x decimal places".
If in reality we never used this function to round to the nearest int, the hybrid would be just as slow as the regular decimal function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update to the update:
Unfortunately this benchmarking was all a bit of a waste. I figured out how to round with precision using only math.ceiling() and math.floor().
The pseudocode is essentially this:
1. Multiply by 10 ^ <decimal_places> to slide decimal point over
2. If decimal place is now at least 0.5, round away from zero. Otherwise round towards zero
3. Divide by 10 ^ < decimal_places> to slide the decimal point back to its original spot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for add enough tests to get coverage up.
Pylint is failing in the main branch (another task exists to resolve all the warnings), not the fault of these changes. |
Closing this PR since it has been made redundant by PR #5 (these changes were merged to |
Linear Task
IDSSE-306
Changes
round_half_away()
function to util.pyprecision
to control how many decimal places to preservemath.floor()
ormath.ceil()
to enforce "round half away from zero", instead of the default behavior of python's built-inround()
, which is "round half to even", a.k.a. "banker's rounding".round_half_up(2.5)
= 3, whereround(2.5)
= 2round_half_up(-2.5)
= -3, whereround(-2.5)
= 2