I was following this demonstration to interpret my DistilBERT-based model. I found that, in a few cases, IG does not converge even with a high `n_steps` value. However, if I swap the input and the baseline, expecting the sum of IG over all dimensions to flip sign relative to the original, it does converge.
The reference (baseline) input consists only of the start, end, and padding tokens.
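For illustration, here is a minimal sketch of the kind of setup I am describing, running the attribution in both directions. The checkpoint name, the sequence-classification head, and the target class are placeholder assumptions rather than my actual code:

```python
import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
from captum.attr import LayerIntegratedGradients

# Placeholder model/tokenizer; my real model is DistilBERT-based but task-specific.
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")
model.eval()

def forward_pass(input_ids, attention_mask):
    # Return a scalar per example (here: the class-1 logit) so IG can attribute it.
    return model(input_ids, attention_mask=attention_mask).logits[:, 1]

enc = tokenizer("an example sentence", return_tensors="pt")
input_ids, attention_mask = enc["input_ids"], enc["attention_mask"]

# Baseline: keep the start/end tokens, fill everything else with [PAD].
ref_input_ids = torch.full_like(input_ids, tokenizer.pad_token_id)
ref_input_ids[0, 0] = tokenizer.cls_token_id
ref_input_ids[0, -1] = tokenizer.sep_token_id

lig = LayerIntegratedGradients(forward_pass, model.distilbert.embeddings)

# Original direction: attribute the input against the reference baseline.
attr_1, delta_1 = lig.attribute(
    inputs=input_ids,
    baselines=ref_input_ids,
    additional_forward_args=(attention_mask,),
    n_steps=300,
    return_convergence_delta=True,
)

# Swapped direction: baseline used as input, input used as baseline.
attr_ref, delta_ref = lig.attribute(
    inputs=ref_input_ids,
    baselines=input_ids,
    additional_forward_args=(attention_mask,),
    n_steps=300,
    return_convergence_delta=True,
)

print(delta_1.item(), delta_ref.item())
```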
I found that the two deltas converge to different values. Specifically, for `n_steps = 300`, `delta_1 = -0.001` and `delta_ref = 0.387`. Even after increasing to `n_steps = 900`, the deltas remain the same. I would like to ask if there is an explanation for this.
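Continuing the sketch above, the deltas can also be checked by hand against the completeness axiom. My understanding is that the convergence delta Captum reports is the attribution sum minus the difference of the forward outputs at the input and at the baseline, so swapping the two should flip the sign of that difference:

```python
# delta ≈ sum(attributions) - (F(input) - F(baseline))
# Swapping input and baseline negates the right-hand term, so the
# attribution sum is expected to flip sign as well.
with torch.no_grad():
    f_input = forward_pass(input_ids, attention_mask)
    f_ref = forward_pass(ref_input_ids, attention_mask)

manual_delta_1 = attr_1.sum() - (f_input - f_ref)
manual_delta_ref = attr_ref.sum() - (f_ref - f_input)
print(manual_delta_1.item(), manual_delta_ref.item())
```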
*Note: my `predict` and `forward_pass` functions are defined analogously to `squad_pos_forward_func` in the tutorial.*