fix: Add text truncation in ONNX encoding. (#58)
Signed-off-by: wxywb <[email protected]>
wxywb authored Dec 23, 2024
1 parent 2e93f00 commit ddc1268
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion milvus_model/dense/onnx.py
@@ -32,7 +32,7 @@ def _encode(self, texts: List[str]) -> List[np.array]:
         return [self._to_embedding(text) for text in texts]

     def _to_embedding(self, data: str, **_):
-        encoded_text = self.tokenizer.encode_plus(data, padding="max_length")
+        encoded_text = self.tokenizer.encode_plus(data, padding="max_length", truncation=True)

         ort_inputs = {
             "input_ids": np.array(encoded_text["input_ids"]).astype("int64").reshape(1, -1),
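
To illustrate why the added `truncation=True` matters, here is a minimal sketch built around a hypothetical helper, `pad_and_truncate`, which is not part of milvus_model or of the tokenizer library. It only mimics the length handling: with `padding="max_length"` alone, text longer than the maximum length stays over-long, while adding truncation yields a fixed-length sequence. A real tokenizer also accounts for special tokens, which this toy version ignores.

```python
from typing import List


def pad_and_truncate(
    token_ids: List[int],
    max_length: int,
    truncation: bool,
    pad_id: int = 0,  # assumed padding id, for illustration only
) -> List[int]:
    """Toy model of padding="max_length" with optional truncation."""
    if truncation:
        # Cut anything beyond max_length, as truncation=True is meant to do.
        token_ids = token_ids[:max_length]
    # Pad only when the sequence is shorter than max_length.
    padding = [pad_id] * max(0, max_length - len(token_ids))
    return token_ids + padding


long_ids = list(range(1, 11))  # 10 made-up token ids
short_ids = [1, 2, 3]

before_fix = pad_and_truncate(long_ids, max_length=8, truncation=False)
after_fix = pad_and_truncate(long_ids, max_length=8, truncation=True)
short_padded = pad_and_truncate(short_ids, max_length=8, truncation=True)

print(len(before_fix))  # 10: over-long input passes through unchanged
print(len(after_fix))   # 8: clipped to max_length
print(short_padded)     # [1, 2, 3, 0, 0, 0, 0, 0]
```

Short inputs behave the same before and after the change; only inputs longer than the maximum length are affected.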
