Mingshan/Adding resnet50 validation script #478
base: master
LGTM! 👍
…naSystems/ngraph-tf into mingshan/validate_resnet50
…naSystems/ngraph-tf into mingshan/validate_resnet50
def check_validation_results(norm_dict, metric):
    test_pass = True
    for norm in norm_dict:
        if norm_dict[norm] > 0.1:
So if we get ref accuracy = 75 and ng accuracy = 75.3, is that a failure?
yes
This script does not compare accuracy. It compares the training loss value at every iteration.
Let me check.
Same thing for loss. If ref loss is 1, and we get 0.8, is the test passing?
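To make the question in this thread concrete, here is a minimal sketch of a threshold check on the difference between reference and nGraph values. It assumes that `norm_dict` holds norms of the element-wise difference between the two runs, which the snippet under review does not confirm. The helper names `difference_norms` and `norms_within_threshold` are hypothetical and not part of the PR; only the 0.1 cutoff is taken from the diff.

```python
import math

THRESHOLD = 0.1  # same cutoff as in the snippet under review


def difference_norms(ref_values, ng_values):
    """Hypothetical helper: L1, L2 and L-infinity norms of (ref - ng)."""
    if len(ref_values) != len(ng_values):
        raise ValueError("reference and nGraph runs must have the same length")
    diffs = [abs(r - n) for r, n in zip(ref_values, ng_values)]
    return {
        "l1": sum(diffs),
        "l2": math.sqrt(sum(d * d for d in diffs)),
        "inf": max(diffs, default=0.0),
    }


def norms_within_threshold(norm_dict, threshold=THRESHOLD):
    """Pass only if every norm is at or below the threshold."""
    return all(value <= threshold for value in norm_dict.values())
```

Under this assumption the check is symmetric: a reference loss of 1 against an nGraph loss of 0.8 gives a difference of about 0.2, so the test fails even though the nGraph value is lower. An absolute cutoff of this kind also ignores the scale of the metric, which may be why the reviewer asks about 75 versus 75.3.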
    return total_loss, top1_acc, top5_acc


def parse_reference_file(filename):
`parse_reference_file` and `parse_training_output` can be a single function. I think they are separate because one parses a file and the other parses a string. Maybe we keep the string-parsing function, read the file into a string, and reuse it.
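The suggestion above could look roughly like the following sketch, in which the file variant only reads the file and delegates to the string parser. The log line format and the regular expression are assumptions made up for illustration; the real format of the training output is not shown in this PR. Only the function names and the returned triple come from the diff.

```python
import re

# Hypothetical log format, assumed only for this sketch:
#   "total_loss: 6.91 top1_acc: 0.10 top5_acc: 0.50"
_RESULT_LINE = re.compile(
    r"total_loss:\s*(\S+)\s+top1_acc:\s*(\S+)\s+top5_acc:\s*(\S+)"
)


def parse_training_output(output):
    """Collect per-iteration loss and accuracy values from a log string."""
    total_loss, top1_acc, top5_acc = [], [], []
    for match in _RESULT_LINE.finditer(output):
        loss, top1, top5 = (float(group) for group in match.groups())
        total_loss.append(loss)
        top1_acc.append(top1)
        top5_acc.append(top5)
    return total_loss, top1_acc, top5_acc


def parse_reference_file(filename):
    """Read the reference file into a string and reuse the string parser."""
    with open(filename) as reference_file:
        return parse_training_output(reference_file.read())
```

With this shape there is a single place that knows the log format, so the reference results and the fresh training output cannot drift apart in how they are parsed.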
This PR adds a validation script for resnet50 training with both synthetic data and real data.
The TF reference results under the tfGPU/ folder were collected by running the same command as in the script on TF GPU.
It also includes a patch that makes the data loader for real data deterministic, and a patch that removes the average_loss encapsulates from the training graph.