Feature add cosine proximity loss #30
base: master
Conversation
```python
y_true = l2_normalize(y, axis=-1)
y_pred = l2_normalize(y_pred, axis=-1)
return 1. - np.sum(y_true * y_pred, axis=-1)
```
I saw two different implementations. The first one is

```python
return -np.sum(y_true * y_pred, axis=-1)
```

or

```python
return 1. - np.sum(y_true * y_pred, axis=-1)
```

Which one should we choose for our implementation? They are the same in nature.
I like the first, since it ranges between -1 and 1, like the cosine similarity itself.
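For reference, a minimal sketch contrasting the two candidates (the `l2_normalize` helper is assumed from the diff above and written out here in plain NumPy; this is illustrative, not the actual numpy-ml code):

```python
import numpy as np

def l2_normalize(v, axis=-1):
    # Divide by the L2 norm along `axis`, guarding against division by zero.
    norm = np.linalg.norm(v, axis=axis, keepdims=True)
    return v / np.maximum(norm, np.finfo(float).eps)

y_true = l2_normalize(np.array([1., 2., 3.]))
y_pred = l2_normalize(np.array([1., 1., 1.]))

# Variant 1: negative cosine similarity, ranges over [-1, 1].
loss_1 = -np.sum(y_true * y_pred, axis=-1)
# Variant 2: cosine distance, ranges over [0, 2].
loss_2 = 1. - np.sum(y_true * y_pred, axis=-1)

# The two differ only by the constant 1, so their gradients are identical.
assert np.isclose(loss_2 - loss_1, 1.)
```

Since the variants differ only by an additive constant, the choice does not affect training; it only shifts the reported loss value.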
```python
return 1. - np.sum(y_true * y_pred, axis=-1)

@staticmethod
def grad(y, y_pred, z, act_fn):
```
Is there an efficient way to compute the grad of cosine?
Doing this from my phone, so please check for errors. If

```
f(x, y) = (x @ y) / (norm(x) * norm(y))
```

then we have

```
df/dy = x / (norm(x) * norm(y)) - (f(x, y) * y) / (norm(y) ** 2)
```

where `norm` is just the 2-norm.
Note that since the cosine loss == negative cosine similarity, you should multiply df/dy by -1.
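A sketch of what that gradient could look like in NumPy, folding in the factor of -1 (`cosine_loss_grad` is a hypothetical name for illustration, not the numpy-ml API):

```python
import numpy as np

def cosine_loss_grad(x, y):
    # f(x, y) = (x @ y) / (norm(x) * norm(y))
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    f = (x @ y) / (nx * ny)
    # df/dy = x / (norm(x) * norm(y)) - (f(x, y) * y) / norm(y) ** 2
    df_dy = x / (nx * ny) - f * y / ny ** 2
    # loss = -f(x, y), so dloss/dy = -df/dy
    return -df_dy
```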
```python
vector_length_max = 100

for j in range(2, vector_length_max):
    x = np.random.uniform(0., 1., [j, ])
```
To generate the random vector array, I set the bounds from 0. to 1.
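One possible way to sanity-check the loss values on such random vectors is to compare against SciPy: `scipy.spatial.distance.cosine` returns `1 - cos_sim`, so the negative-similarity loss should equal SciPy's value minus 1 (this sketch assumes the first variant discussed above):

```python
import numpy as np
from scipy.spatial.distance import cosine

vector_length_max = 100
for j in range(2, vector_length_max):
    x = np.random.uniform(0., 1., [j, ])
    y = np.random.uniform(0., 1., [j, ])
    # Negative cosine similarity of the normalized vectors.
    loss = -np.sum((x / np.linalg.norm(x)) * (y / np.linalg.norm(y)))
    assert np.isclose(loss, cosine(x, y) - 1.)
```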
@WuZhuoran - Just ping me when this is finished and I'll take a look.
@ddbourgin Thank you. I think I need some help with the grad of the cos loss, and with how to test the grad function.
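For the testing half of that question, one common approach is a centered finite-difference check against the analytic gradient from the derivation above. All of the names below are illustrative, not the actual numpy-ml API:

```python
import numpy as np

def cosine_loss(x, y):
    # Negative cosine similarity (the first variant discussed above).
    return -(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

def analytic_grad(x, y):
    # dloss/dy = -(x / (norm(x) * norm(y)) - f(x, y) * y / norm(y) ** 2)
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    f = (x @ y) / (nx * ny)
    return -(x / (nx * ny) - f * y / ny ** 2)

def numeric_grad(x, y, eps=1e-6):
    # Perturb each coordinate of y and take centered differences.
    g = np.zeros_like(y)
    for i in range(y.size):
        y_hi, y_lo = y.copy(), y.copy()
        y_hi[i] += eps
        y_lo[i] -= eps
        g[i] = (cosine_loss(x, y_hi) - cosine_loss(x, y_lo)) / (2 * eps)
    return g

x = np.random.uniform(0., 1., 10)
y = np.random.uniform(0., 1., 10)
assert np.allclose(analytic_grad(x, y), numeric_grad(x, y), atol=1e-5)
```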
General comment: It looks like right now the documentation is copied directly from …
I will update the documentation in the next commits. Thanks.
This pull request closes #29.

- What I did

Add Cosine Proximity Loss function.

- How I did it

Refer to the class comments.

- How to verify it

This pull request adds a new feature to numpy-ml. Ask @ddbourgin to take a look.