This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

Smooth-BLEU bug fixed #488

Open. Wants to merge 1 commit into base: master


Aktsvigun

Hi,
The current implementation of smooth-BLEU contains a bug: it smooths unigrams as well. Consequently, when the reference and the translation share no tokens at all, it still returns a non-zero value (please see the attached image).

This, however, contradicts the paper that proposed smooth-BLEU (Chin-Yew Lin, Franz Josef Och. ORANGE: a method for evaluating automatic evaluation metrics for machine translation. COLING 2004):

Add one count to the n-gram hit and total ngram count for n > 1. Therefore, for candidate translations with less than n words, they can still get a positive smoothed BLEU score from shorter n-gram matches; however if nothing matches then they will get zero scores.

This pull request aims at fixing this bug.
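
For illustration, the rule quoted above can be sketched as follows. This is a minimal, self-contained sketch and not the repository's code: the names `ngram_counts` and `smoothed_sentence_bleu` are hypothetical, and the brevity penalty is assumed to follow the standard BLEU definition.

```python
import collections
import math


def ngram_counts(tokens, max_order):
    """Count every n-gram of order 1..max_order in a list of tokens."""
    counts = collections.Counter()
    for order in range(1, max_order + 1):
        for i in range(len(tokens) - order + 1):
            counts[tuple(tokens[i:i + order])] += 1
    return counts


def smoothed_sentence_bleu(reference, translation, max_order=4):
    """Sentence-level BLEU with add-one smoothing applied only for n > 1.

    Hypothetical helper for illustration; single reference only.
    """
    ref_counts = ngram_counts(reference, max_order)
    hyp_counts = ngram_counts(translation, max_order)
    overlap = ref_counts & hyp_counts  # clipped n-gram matches

    matches = [0] * max_order
    for ngram, count in overlap.items():
        matches[len(ngram) - 1] += count

    log_precision_sum = 0.0
    for order in range(1, max_order + 1):
        hit = matches[order - 1]
        total = max(len(translation) - order + 1, 0)
        if order > 1:
            # Add one to both hit and total counts, but never for unigrams.
            hit, total = hit + 1, total + 1
        if hit == 0 or total == 0:
            # No unigram matches (or empty translation): the score is zero.
            return 0.0
        log_precision_sum += math.log(hit / total)
    geo_mean = math.exp(log_precision_sum / max_order)

    # Standard brevity penalty (assumed); both lengths are non-zero here
    # because at least one unigram matched.
    ratio = len(translation) / len(reference)
    brevity_penalty = 1.0 if ratio > 1.0 else math.exp(1.0 - 1.0 / ratio)
    return geo_mean * brevity_penalty
```

With this sketch, `smoothed_sentence_bleu(["a", "b"], ["c", "d"])` returns zero because no unigram matches, while a short translation with some unigram overlap still gets a positive score from the smoothed higher orders.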

[Image: Screenshot 2022-06-29 at 17.39.42]


google-cla bot commented Jun 29, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
