Fix Bug in add_ncu #196
The revised logic for `dupe_cols` looks good, but there's a pretty major issue in handling `chosen_metrics` that could easily break `NCUReader` if not addressed.
@@ -602,6 +593,15 @@ def _rep_agg_func(col):
if chosen_metrics:
    ncu_df = ncu_df[chosen_metrics]
You need to have a type check before doing this. Since you don't validate the type of `chosen_metrics` anywhere, you could get an input that would cause various parts of the rest of the code to error. Off the top of my head, some scenarios are:
- Passing a number (e.g., `int`, `float`) would cause this line to fail with a `KeyError` iff there is not a column with an integer or float index
- Passing a string would cause this line to succeed, but it would also return a `Series` instead of a `DataFrame`. This would cause all the code following this line to fail.
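One way the requested type check could look is sketched below. The helper name `normalize_chosen_metrics` and its error messages are hypothetical and not part of the PR. The sketch assumes `chosen_metrics` is meant to be either `None` or a list/tuple of metric-name strings, and it rejects numbers and bare strings up front so the `ncu_df[chosen_metrics]` line only ever sees a list.

```python
from typing import List, Optional


def normalize_chosen_metrics(chosen_metrics) -> Optional[List[str]]:
    """Hypothetical helper: return None or a list of metric-name strings.

    Raises TypeError for any other input so the failure happens before
    the DataFrame is indexed.
    """
    if chosen_metrics is None:
        # None is assumed to mean "keep every NCU metric".
        return None
    if isinstance(chosen_metrics, str):
        # A bare string would select a single column and yield a Series
        # instead of a DataFrame, so it is rejected explicitly.
        raise TypeError(
            "chosen_metrics must be a list of strings, not a single string"
        )
    if not isinstance(chosen_metrics, (list, tuple)):
        # Covers int, float, and any other unsupported type.
        raise TypeError(
            "chosen_metrics must be None or a list of strings, "
            f"got {type(chosen_metrics).__name__}"
        )
    metrics = list(chosen_metrics)
    if not all(isinstance(m, str) for m in metrics):
        raise TypeError("every entry in chosen_metrics must be a string")
    return metrics
```

Calling this once at the top of the function would leave the existing `if chosen_metrics:` branch unchanged, since an empty list is still falsy.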
LGTM.
@pearce8 this PR is ready for your review
@michaelmckinsey1 Can you please resolve conflicts on this PR? Apologies if I merged these in the wrong order.
b7cc225 to 23d57ad
#178 introduced a bug when `chosen_metrics` was not provided. When `chosen_metrics=None`, all NCU metrics are added to the perf dataframe. This PR fixes the bug by modifying the check and moving it to where we can check for duplicate columns when `chosen_metrics=None` as well.
`chosen_metrics` before reading NCU
`chosen_metrics` is `None` we will know which metrics are present.
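The idea of running the duplicate check after the NCU data is read can be sketched roughly as follows. The function name `split_ncu_columns` and its arguments are hypothetical, and the sketch works on plain lists of column names rather than the real dataframes. What the PR actually does with the duplicates is not shown in this thread, so the sketch only reports them.

```python
from typing import List, Optional, Sequence, Tuple


def split_ncu_columns(
    perf_cols: Sequence[str],
    ncu_cols: Sequence[str],
    chosen_metrics: Optional[Sequence[str]] = None,
) -> Tuple[List[str], List[str]]:
    """Hypothetical sketch: return (new_cols, dupe_cols).

    ncu_cols is only known after the NCU report has been read, which is
    why the check runs at this point rather than earlier.
    """
    if chosen_metrics is None:
        # No selection given: every NCU metric is a candidate.
        candidates = list(ncu_cols)
    else:
        missing = [m for m in chosen_metrics if m not in ncu_cols]
        if missing:
            raise KeyError(f"metrics not found in NCU data: {missing}")
        candidates = list(chosen_metrics)

    existing = set(perf_cols)
    # Candidates that already exist in the perf dataframe are duplicates.
    dupe_cols = [c for c in candidates if c in existing]
    new_cols = [c for c in candidates if c not in existing]
    return new_cols, dupe_cols
```

Because the candidate list is built from the NCU columns themselves when `chosen_metrics` is `None`, the same duplicate check covers both cases.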