The W2V model has a vulnerability: if the input stream data contains empty lines (lines with no characters other than the newline), a memory error occurs.
Cause
While the stream data is being read, a line that contains only a newline character still increments the `num_nnz` variable by 1.
Later on, `num_nnz` is used as `total_lines` in the `_sort_and_compressed_binarization()` function. The values stored in the `path` file are passed to the `records` vector, and this vector is read based on `total_lines`.
If `num_nnz` is inflated by such empty lines, the read index exceeds the size of the `records` vector, leading to out-of-bounds references. Reading these unexpected values then triggers a segmentation fault or a program malfunction.
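The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the project's actual reader: `ReadResult`, `read_counting_every_line`, `would_read_out_of_bounds`, and `record_at` are hypothetical names, and each non-empty line is assumed to yield exactly one record.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the real reader: it counts every line
// (like the inflated num_nnz) but only stores non-empty lines as records.
struct ReadResult {
    std::size_t num_nnz = 0;           // lines counted, including empty ones
    std::vector<std::string> records;  // lines that actually produced a record
};

inline ReadResult read_counting_every_line(std::istream& in) {
    ReadResult result;
    std::string line;
    while (std::getline(in, line)) {
        ++result.num_nnz;  // incremented even when the line is empty
        if (!line.empty()) {
            result.records.push_back(line);
        }
    }
    return result;
}

// True when iterating records[0 .. num_nnz) would run past the end.
inline bool would_read_out_of_bounds(const ReadResult& r) {
    return r.num_nnz > r.records.size();
}

// Debug-build guard: aborts instead of silently reading past the end.
inline const std::string& record_at(const ReadResult& r, std::size_t i) {
    assert(i < r.records.size());
    return r.records[i];
}
```

Feeding this reader the three-line input `"a b\n\nc d\n"` gives a count of 3 but only 2 records, so a loop bounded by the count would index one element past the end of the vector.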
Changes
Empty input lines are now skipped with a `continue` statement. In addition, a typo found during debugging has been fixed.
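The shape of the fix can be sketched as follows. This is an illustration under assumptions rather than the patched code itself: `read_skipping_empty_lines` is a hypothetical function, each non-empty line is treated as one record, and lines ending in `\r\n` are not handled specially.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader illustrating the fix: empty lines are skipped
// before anything is counted, so the count matches the stored records.
inline std::size_t read_skipping_empty_lines(std::istream& in,
                                             std::vector<std::string>& records) {
    records.clear();
    std::size_t num_nnz = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) {
            continue;  // ignore the empty line; do not count it
        }
        records.push_back(line);
        ++num_nnz;
    }
    // Invariant the later read loop relies on.
    assert(num_nnz == records.size());
    return num_nnz;
}
```

With the input `"a b\n\n\nc d\n"`, the function returns 2 and `records` holds exactly two entries, so a later loop bounded by the returned count stays inside the vector.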