[feat] IDSSE-306: round_half_away utility #7

mackenzie-grimes-noaa · 2023-08-11T22:20:00Z

Linear Task

Changes

Add round_half_away() function to util.py
- Accepts optional parameter precision to control how many decimal places to preserve
- Uses math.floor() or math.ceil() to enforce "round half away from zero", instead of the default behavior of python's built-in round(), which is "round half to even", a.k.a. "banker's rounding".
- For example:
  - round_half_up(2.5) = 3, where round(2.5) = 2
  - round_half_up(-2.5) = -3, where round(-2.5) = 2
- Why? Python is attempting to reduce bias towards larger numbers when rounding.
  - Floating point numbers have innate imprecision in how they are stored in computers. Floating point guide describes the cause in more detail
Unit tests included, over 80% coverage for this file
- Had to add tests for a few pre-existing utils functions too

mackenzie-grimes-noaa · 2023-08-11T22:29:28Z

Adding test coverage report here, since our CI doesn't have it replicating to the README or anything yet:

(.venv) mackenzie.grimes@macgrimes-lt idsse-engine-commons % cd python/idsse_common/idsse/common
(.venv) mackenzie.grimes@macgrimes-lt common % pytest ../../test --cov=. --cov-report=term-missing

Name                 Stmts   Miss  Cover   Missing
--------------------------------------------------
__init__.py              0      0   100%
aws_utils.py            78      6    92%   136, 145-146, 168-169, 187
config.py               74      9    88%   72-74, 84, 89-91, 100, 126
json_message.py         22     22     0%   12-70
log_util.py             38     38     0%   12-76
path_builder.py        127     10    92%   41, 44, 95, 100, 253, 256, 259, 323, 343-344
publish_confirm.py     141    141     0%   4-345
utils.py                82     11    87%   57, 60, 67, 70-71, 92-94, 97, 100-101
--------------------------------------------------
TOTAL                  562    237    58%

Geary-Layne

LGTM

Geary-Layne · 2023-08-12T15:05:23Z

python/idsse_common/idsse/common/utils.py

@@ -166,3 +167,19 @@ def datetime_gen(dt_: datetime,
    for i in range(0, max_num):
        logger.debug('dt generator %d/%d', i, max_num)
        yield dt_ + time_delta * i
+
+
+def round_half_up(number: float, precision: int = 0) -> Union[int, float]:


This is not a required task, but do you know the timing for this function vs the ceil/floor approach vs the built in round? I understand that the ceil/floor only works with precision=0, but that is way well use round for GridProj. Most important is that we have function that behaves the way we need, I'm just curious to know if there will be any real impact in speed.

Decimal has been called out to be a bit slower, yes (for example in this Stack Overflow discussion). Although I didn't find it to be 100x slower than round() like it may have once been; in python 3.11 it seems to be only 1.5-2x times slower on average.

This was with 1 million randomly generated floats between negative and positive 1 million, and I timed each function rounding all the floats to 6 decimal places. Code snippet here

% py round_benchmark.py [INFO] Running benchmark with dataset size 100000 [INFO] [INFO] | Method | Avg (ns) | Min (ns) | Max (ns) | Std Dev (ns) [INFO] | ------------ | ------------ | ------------ | ------------ | ------------ [INFO] | round | 1020 | 708 | 19208 | 245 [INFO] | math.ceil | 667 | 459 | 33417 | 338 [INFO] | decimal | 1431 | 1167 | 26542 | 325

For the sake of having the precision parameter functionality I think it might be acceptable, unless there's a way to use math.ceil and also support decimal precision, which looks to be faster than anything else.

Update: added a hybrid function that uses both Decimal if we need precision and math.ceil/floor if we don't. I also tweaked my benchmarking to round to 0 decimal places 80% of the time--I had previously had the functions always round to 6 decimal places.

The hybrid option is now even faster than the built-in round, on average:

[INFO] Running benchmark with dataset size 100000 [INFO] Dataset was rounded to precision 0: 80.12%, precision 6: 19.88% of tests [INFO] [INFO] | Method | Avg (ns) | Min (ns) | Max (ns) | Std Dev (ns) [INFO] | ------------ | ------------ | ------------ | ------------ | ------------ [INFO] | round | 608 | 375 | 22625 | 272 [INFO] | math.ceil | 421 | 250 | 18833 | 266 [INFO] | decimal | 1035 | 750 | 19500 | 349 [INFO] | hybrid | 509 | 208 | 14125 | 465

This is of course "faster" due to the assumption I made--that actual usage of the function will be "round to the nearest int" more often than it will be "round to x decimal places".

If in reality we never used this function to round to the nearest int, the hybrid would be just as slow as the regular decimal function.

Update to the update:

Unfortunately this benchmarking was all a bit of a waste. I figured out how to round with precision using only math.ceiling() and math.floor().

The pseudocode is essentially this:

1. Multiply by 10 ^ <decimal_places> to slide decimal point over 2. If decimal place is now at least 0.5, round away from zero. Otherwise round towards zero 3. Divide by 10 ^ < decimal_places> to slide the decimal point back to its original spot

Geary-Layne · 2023-08-12T15:06:15Z

python/idsse_common/test/test_utils.py

Thanks for add enough tests to get coverage up.

mackenzie-grimes-noaa · 2023-08-16T20:23:07Z

Pylint is failing in the main branch (another task exists to resolve all the warnings), not the fault of these changes.

mackenzie-grimes-noaa · 2023-08-21T16:01:18Z

Closing this PR since it has been made redundant by PR #5 (these changes were merged to main by that PR already)

mackenzie-grimes-noaa added 2 commits August 11, 2023 14:39

add round_half_up() utility function, unit tests

89d7611

add tests for Map class, exec_cmd, dict_copy_with

9a4c6eb

mackenzie-grimes-noaa marked this pull request as ready for review August 11, 2023 22:34

mackenzie-grimes-noaa requested review from Geary-Layne, rabellino-noaa and paulhamer-noaa August 11, 2023 22:34

add unit test for Map __setattr__

e6776d9

paulhamer-noaa approved these changes Aug 11, 2023

View reviewed changes

mackenzie-grimes-noaa mentioned this pull request Aug 11, 2023

feat: IDSSE-256: GridProj #5

Merged

Geary-Layne approved these changes Aug 12, 2023

View reviewed changes

rabellino-noaa approved these changes Aug 14, 2023

View reviewed changes

mackenzie-grimes-noaa and others added 2 commits August 15, 2023 14:27

round_half_away() now uses only math ceil/floor, no Decimal

8aa1685

Merge branch 'main' into feat/idsse-306/round-half-up

1dd8a63

mackenzie-grimes-noaa changed the title ~~[feat] IDSSE-306: round_half_up utility~~ [feat] IDSSE-306: round_half_away utility Aug 15, 2023

mackenzie-grimes-noaa requested a review from Geary-Layne August 15, 2023 20:29

cleanup pylint warnings that are my fault

252f9c9

mackenzie-grimes-noaa closed this Aug 21, 2023

mackenzie-grimes-noaa deleted the feat/idsse-306/round-half-up branch September 18, 2023 18:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[feat] IDSSE-306: round_half_away utility #7

[feat] IDSSE-306: round_half_away utility #7

mackenzie-grimes-noaa commented Aug 11, 2023 •

edited

Loading

mackenzie-grimes-noaa commented Aug 11, 2023

Geary-Layne left a comment

Geary-Layne Aug 12, 2023

mackenzie-grimes-noaa Aug 15, 2023 •

edited

Loading

mackenzie-grimes-noaa Aug 15, 2023 •

edited

Loading

mackenzie-grimes-noaa Aug 15, 2023

Geary-Layne Aug 12, 2023

mackenzie-grimes-noaa commented Aug 16, 2023

mackenzie-grimes-noaa commented Aug 21, 2023

[feat] IDSSE-306: round_half_away utility #7

[feat] IDSSE-306: round_half_away utility #7

Conversation

mackenzie-grimes-noaa commented Aug 11, 2023 • edited Loading

Linear Task

Changes

mackenzie-grimes-noaa commented Aug 11, 2023

Geary-Layne left a comment

Choose a reason for hiding this comment

Geary-Layne Aug 12, 2023

Choose a reason for hiding this comment

mackenzie-grimes-noaa Aug 15, 2023 • edited Loading

Choose a reason for hiding this comment

mackenzie-grimes-noaa Aug 15, 2023 • edited Loading

Choose a reason for hiding this comment

mackenzie-grimes-noaa Aug 15, 2023

Choose a reason for hiding this comment

Geary-Layne Aug 12, 2023

Choose a reason for hiding this comment

mackenzie-grimes-noaa commented Aug 16, 2023

mackenzie-grimes-noaa commented Aug 21, 2023

mackenzie-grimes-noaa commented Aug 11, 2023 •

edited

Loading

mackenzie-grimes-noaa Aug 15, 2023 •

edited

Loading

mackenzie-grimes-noaa Aug 15, 2023 •

edited

Loading