-
Hello @ghsanti, thank you for pointing the issue out. I suspect division by 0 or near-0 numbers is the culprit here. Best regards,
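On the near-0 division point: this is exactly why BatchNormalization adds a small `epsilon` to the variance before taking the square root (Keras' default is `1e-3`). A tiny NumPy illustration of the failure mode, assuming the standard BN normalization formula (this is not the Keras source, just a sketch):

```python
import numpy as np

# A perfectly constant feature has zero batch variance; normalizing it
# without a safeguard divides 0 by 0 and yields NaN.
x = np.full(32, 5.0)

raw = (x - x.mean()) / np.sqrt(x.var())          # 0 / 0 -> NaN everywhere
safe = (x - x.mean()) / np.sqrt(x.var() + 1e-3)  # epsilon keeps it finite (all zeros)

print(np.isnan(raw).all())       # True
print(np.allclose(safe, 0.0))    # True
```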
-
I've just read this issue in tf-keras, and ended up testing a simple convnet on MNIST to understand the problem.
I suspect the cause may be BatchNormalization, but I don't know enough to understand why.
Here is a simple image comparing the mean absolute error between `model(X, training=True)` and `model.predict(X)`, which disappears when `training=False`:

But why would that make the outputs different between `model(X, training=False)` (or `model.predict`) and `model(X, training=True)`?
I don't have a colab link, but the snippet I used is:
Is there a simple explanation? I should read the code, but maybe someone knows already.
Can anyone help, please?
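The behavior described above is consistent with how BatchNormalization is defined: in training mode it normalizes each batch with that batch's own mean and variance (and updates moving averages as a side effect), while in inference mode it normalizes with the accumulated moving averages. Until those averages converge toward the data statistics, the two modes give different outputs. A minimal NumPy sketch of the two code paths (not the Keras implementation; `gamma`/`beta` are omitted for brevity, and `momentum=0.99` mirrors a common default):

```python
import numpy as np

def batchnorm(x, moving_mean, moving_var, training, momentum=0.99, eps=1e-3):
    """Standard BatchNormalization forward pass (gamma=1, beta=0 for brevity)."""
    if training:
        # Training mode: normalize with the *current batch* statistics
        # and update the moving averages as a side effect.
        mean, var = x.mean(axis=0), x.var(axis=0)
        moving_mean = momentum * moving_mean + (1 - momentum) * mean
        moving_var = momentum * moving_var + (1 - momentum) * var
    else:
        # Inference mode: normalize with the accumulated moving averages.
        mean, var = moving_mean, moving_var
    return (x - mean) / np.sqrt(var + eps), moving_mean, moving_var

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(32, 4))

# Fresh layer: moving averages start at mean=0, var=1, so the two modes
# disagree until the averages have drifted toward the data statistics.
y_train, mm, mv = batchnorm(x, np.zeros(4), np.ones(4), training=True)
y_infer, _, _ = batchnorm(x, mm, mv, training=False)
print(np.abs(y_train - y_infer).mean())  # large gap early in training
```

This would also explain why the gap shrinks as training progresses: each update moves the moving averages a little closer to the batch statistics, so `training=True` and `training=False` outputs converge.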