Describe the bug
When the Mistral 7B Instruct v0.3 model is deployed as a SageMaker endpoint, the tokenizer always produces one more token than the same tokenizer loaded locally.
To reproduce
Pull the vanilla Mistral model from Hugging Face (link here) to a local directory and upload it to S3. Assume the model is at the following location:
s3://bucket/models/mistral-7b-instruct-v0.3-hf/
Deploy the endpoint with the Messaging API enabled (using sagemaker==2.232.2).
Note that the deployment sets max_input_length to 256.
Now try to run inference:
import sagemaker
from sagemaker.huggingface.model import HuggingFacePredictor

# Attach a predictor to the endpoint that was deployed in the previous step.
sess = sagemaker.Session()
endpoint_name = "<endpoint name from above>"
predictor = HuggingFacePredictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sess,
)
query = "(CNN) -- Beef from Brazil is on Iranian dinner tables. An Iranian-built hospital treats patients near Bolivia's capital. Iranian-funded factories dot the Venezuelan countryside. Iran has forged hundreds of agreements with Latin American nations and pledged billions of dollars to fund them. More deals could be in store this week as Iranian President Mahmoud Ahmadinejad embarks on a trip that starts in Venezuela on Sunday and includes stops in Nicaragua, Cuba and Ecuador. Well before the Iranian leader's arrival in Caracas, his plans for a Latin America tour grabbed global attention as tensions grow between many Western powers and Iran over the nation's nuclear program. \"As the regime feels increasing pressure, it is desperate for friends and flailing around in interesting places to find new friends,\" U.S. State Department spokeswoman Victoria Nuland told reporters Friday. But analysts say Ahmadinejad's visit is the latest step in a longstanding, calculated effort to shore up support in the region. As Iran strives to improve its image, get around stiffening sanctions, dampen America's global influence and secure a stronger foothold in the United States' backyard, relationships with Latin American countries have become increasingly important. Iran's state-run Press TV described cooperation with Latin American nations as one of the \"top priorities of the Islamic Republic's foreign policy\" in a recent article about this week's trip. \"Iran has an extremely active diplomatic move afoot,\" said Larry Birns, director of the Council on Hemispheric Affairs in Washington. 'Cultural ties' Last month, a film portraying the life of Mary and the birth of Jesus from an Islamic point of view beamed out over international airwaves -- in Spanish. The movie was the first program aired on HispanTV, according to a report in the Tehran Times. And the target audience was thousands of miles away from the government-sponsored broadcasting hub in Iran's capital. 
At a ceremony marking the station's official launch last month, HispanTV's managers said the new Spanish network aims to paint a true picture of Iran and link the Islamic republic with Latin America. Other Spanish-language channels are \"not independent and only serve the interest of the United States and certain allies,\" said Mohammed Sarafraz, director of Iranian broadcasting's world service, according to Press TV. \"It's all about cultural ties between Iran and the Spanish-speaking community,\" network manager Ali Ejaredar told a Press TV reporter. Online previews of upcoming programming include videos showing scenic stretches of the Iranian countryside, bustling marketplaces and Persian calligraphy. An analyst on one program criticizes Western imperialism, saying \"five countries cannot decide the destiny of the world.\" A guest on another show slams U.S. immigration laws. Spanish-language headlines on the network's website last week described Israeli spies, foreign intervention in Syria, a report that Japan plans to \"disobey\" U.S. sanctions against Iran and an allegation that airport security screening machines in the United States cause death. Stephen Johnson, who directs the Americas program at the Center for Strategic and International Studies, compared Iran's efforts to use the media to improve its image abroad to the U.S.-government-funded Voice of America radio network. \"They're taking a page out of our playbook,\" he said. Despite Iran's overtures, there are still rifts to overcome, Johnson said. Some high-profile missteps have accompanied Iran's increasing forays into Latin America, he said. A requirement that female employees wear the hijab at an Iran-funded hospital in El Alto, Bolivia, drew criticism from local officials. Uruguay's foreign minister condemned statements by an Iranian ambassador who told reporters in the South American country that figures saying that millions died in the Holocaust were false. 
Last year, Iran received the lowest ranking out of nine countries in the Latinobarometro public opinion survey, based on interviews of more than 20,000 residents in 18 Latin American countries (not including Cuba). Only 25% of those surveyed said they viewed Iran as \"good\" or \"very good,\" while 72% said they viewed the United States positively. \"I think with Iran, it's a question of trust as to what are they up to, and what are their nuclear objectives,\" Johnson said. Ahmadinejad's 'direct, personal role' Experts say Iran has been building relations in Latin America for decades. Cuba was one of the first countries to recognize Iran's government after the 197"
parameters = {
"model": "/opt/ml/model",
"do_sample": False,
"max_new_tokens": 64,
"return_full_text": False
}
messages = [{"role": "user", "content": query}]
payload = {"messages": messages, **parameters}
response = predictor.predict(payload)
The endpoint then raises the following exception:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (422) from primary with message "{"error":"Input validation error: `inputs` must have less than 256 tokens. Given: 1025","error_type":"validation"}"
The error reports the input as 1025 tokens long.
Expected behavior
The input should be tokenized to 1024 tokens, not 1025. Using the exact same query, run the following locally:
Python 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained(
... "/mistral-models/mistral-7b-instruct-v0.3-hf"
... )
>>> query = "(CNN) -- Beef from Brazil is on Iranian dinner tables. An Iranian-built hospital treats patients near Bolivia's capital. Iranian-funded factories dot the Venezuelan countryside. Iran has forged hundreds of agreements with Latin American nations and pledged billions of dollars to fund them. More deals could be in store this week as Iranian President Mahmoud Ahmadinejad embarks on a trip that starts in Venezuela on Sunday and includes stops in Nicaragua, Cuba and Ecuador. Well before the Iranian leader's arrival in Caracas, his plans for a Latin America tour grabbed global attention as tensions grow between many Western powers and Iran over the nation's nuclear program. \"As the regime feels increasing pressure, it is desperate for friends and flailing around in interesting places to find new friends,\" U.S. State Department spokeswoman Victoria Nuland told reporters Friday. But analysts say Ahmadinejad's visit is the latest step in a longstanding, calculated effort to shore up support in the region. As Iran strives to improve its image, get around stiffening sanctions, dampen America's global influence and secure a stronger foothold in the United States' backyard, relationships with Latin American countries have become increasingly important. Iran's state-run Press TV described cooperation with Latin American nations as one of the \"top priorities of the Islamic Republic's foreign policy\" in a recent article about this week's trip. \"Iran has an extremely active diplomatic move afoot,\" said Larry Birns, director of the Council on Hemispheric Affairs in Washington. 'Cultural ties' Last month, a film portraying the life of Mary and the birth of Jesus from an Islamic point of view beamed out over international airwaves -- in Spanish. The movie was the first program aired on HispanTV, according to a report in the Tehran Times. And the target audience was thousands of miles away from the government-sponsored broadcasting hub in Iran's capital. 
At a ceremony marking the station's official launch last month, HispanTV's managers said the new Spanish network aims to paint a true picture of Iran and link the Islamic republic with Latin America. Other Spanish-language channels are \"not independent and only serve the interest of the United States and certain allies,\" said Mohammed Sarafraz, director of Iranian broadcasting's world service, according to Press TV. \"It's all about cultural ties between Iran and the Spanish-speaking community,\" network manager Ali Ejaredar told a Press TV reporter. Online previews of upcoming programming include videos showing scenic stretches of the Iranian countryside, bustling marketplaces and Persian calligraphy. An analyst on one program criticizes Western imperialism, saying \"five countries cannot decide the destiny of the world.\" A guest on another show slams U.S. immigration laws. Spanish-language headlines on the network's website last week described Israeli spies, foreign intervention in Syria, a report that Japan plans to \"disobey\" U.S. sanctions against Iran and an allegation that airport security screening machines in the United States cause death. Stephen Johnson, who directs the Americas program at the Center for Strategic and International Studies, compared Iran's efforts to use the media to improve its image abroad to the U.S.-government-funded Voice of America radio network. \"They're taking a page out of our playbook,\" he said. Despite Iran's overtures, there are still rifts to overcome, Johnson said. Some high-profile missteps have accompanied Iran's increasing forays into Latin America, he said. A requirement that female employees wear the hijab at an Iran-funded hospital in El Alto, Bolivia, drew criticism from local officials. Uruguay's foreign minister condemned statements by an Iranian ambassador who told reporters in the South American country that figures saying that millions died in the Holocaust were false. 
Last year, Iran received the lowest ranking out of nine countries in the Latinobarometro public opinion survey, based on interviews of more than 20,000 residents in 18 Latin American countries (not including Cuba). Only 25% of those surveyed said they viewed Iran as \"good\" or \"very good,\" while 72% said they viewed the United States positively. \"I think with Iran, it's a question of trust as to what are they up to, and what are their nuclear objectives,\" Johnson said. Ahmadinejad's 'direct, personal role' Experts say Iran has been building relations in Latin America for decades. Cuba was one of the first countries to recognize Iran's government after the 197"
>>>
>>>
>>> messages = [
... {
... "role": "user",
... "content": query
... },
... ]
>>> a = tokenizer.apply_chat_template(
... messages,
... legacy=False
... )
>>> len(a)
1024
The same token count is obtained with Mistral's own tokenizer (mistral_common), without Hugging Face:
Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>>
>>>
>>>
>>> from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
>>> from mistral_common.protocol.instruct.messages import UserMessage
>>> from mistral_common.protocol.instruct.request import ChatCompletionRequest
>>> query = "(CNN) -- Beef from Brazil is on Iranian dinner tables. An Iranian-built hospital treats patients near Bolivia's capital. Iranian-funded factories dot the Venezuelan countryside. Iran has forged hundreds of agreements with Latin American nations and pledged billions of dollars to fund them. More deals could be in store this week as Iranian President Mahmoud Ahmadinejad embarks on a trip that starts in Venezuela on Sunday and includes stops in Nicaragua, Cuba and Ecuador. Well before the Iranian leader's arrival in Caracas, his plans for a Latin America tour grabbed global attention as tensions grow between many Western powers and Iran over the nation's nuclear program. \"As the regime feels increasing pressure, it is desperate for friends and flailing around in interesting places to find new friends,\" U.S. State Department spokeswoman Victoria Nuland told reporters Friday. But analysts say Ahmadinejad's visit is the latest step in a longstanding, calculated effort to shore up support in the region. As Iran strives to improve its image, get around stiffening sanctions, dampen America's global influence and secure a stronger foothold in the United States' backyard, relationships with Latin American countries have become increasingly important. Iran's state-run Press TV described cooperation with Latin American nations as one of the \"top priorities of the Islamic Republic's foreign policy\" in a recent article about this week's trip. \"Iran has an extremely active diplomatic move afoot,\" said Larry Birns, director of the Council on Hemispheric Affairs in Washington. 'Cultural ties' Last month, a film portraying the life of Mary and the birth of Jesus from an Islamic point of view beamed out over international airwaves -- in Spanish. The movie was the first program aired on HispanTV, according to a report in the Tehran Times. And the target audience was thousands of miles away from the government-sponsored broadcasting hub in Iran's capital. 
At a ceremony marking the station's official launch last month, HispanTV's managers said the new Spanish network aims to paint a true picture of Iran and link the Islamic republic with Latin America. Other Spanish-language channels are \"not independent and only serve the interest of the United States and certain allies,\" said Mohammed Sarafraz, director of Iranian broadcasting's world service, according to Press TV. \"It's all about cultural ties between Iran and the Spanish-speaking community,\" network manager Ali Ejaredar told a Press TV reporter. Online previews of upcoming programming include videos showing scenic stretches of the Iranian countryside, bustling marketplaces and Persian calligraphy. An analyst on one program criticizes Western imperialism, saying \"five countries cannot decide the destiny of the world.\" A guest on another show slams U.S. immigration laws. Spanish-language headlines on the network's website last week described Israeli spies, foreign intervention in Syria, a report that Japan plans to \"disobey\" U.S. sanctions against Iran and an allegation that airport security screening machines in the United States cause death. Stephen Johnson, who directs the Americas program at the Center for Strategic and International Studies, compared Iran's efforts to use the media to improve its image abroad to the U.S.-government-funded Voice of America radio network. \"They're taking a page out of our playbook,\" he said. Despite Iran's overtures, there are still rifts to overcome, Johnson said. Some high-profile missteps have accompanied Iran's increasing forays into Latin America, he said. A requirement that female employees wear the hijab at an Iran-funded hospital in El Alto, Bolivia, drew criticism from local officials. Uruguay's foreign minister condemned statements by an Iranian ambassador who told reporters in the South American country that figures saying that millions died in the Holocaust were false. 
Last year, Iran received the lowest ranking out of nine countries in the Latinobarometro public opinion survey, based on interviews of more than 20,000 residents in 18 Latin American countries (not including Cuba). Only 25% of those surveyed said they viewed Iran as \"good\" or \"very good,\" while 72% said they viewed the United States positively. \"I think with Iran, it's a question of trust as to what are they up to, and what are their nuclear objectives,\" Johnson said. Ahmadinejad's 'direct, personal role' Experts say Iran has been building relations in Latin America for decades. Cuba was one of the first countries to recognize Iran's government after the 197"
>>>
>>>
>>> tokenizer = MistralTokenizer.from_file(f"/home/ec2-user/workspace/mistral/mistral-runs/2024_08_20_experiment_07/runs/01/checkpoints/checkpoint_000300/merged/tokenizer.model.v3")
>>> completion_request = ChatCompletionRequest(
... messages=[UserMessage(content=query)]
... )
>>> tokens = tokenizer.encode_chat_completion(completion_request).tokens
>>> len(tokens)
1024
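If the token ids produced by the endpoint can be obtained, one way to locate the extra token is to diff the two id sequences rather than compare lengths only. The following is a minimal sketch using Python's standard `difflib`; the helper name `find_token_differences` is hypothetical and the ids in the example are toy values, not real tokenizer output.

```python
from difflib import SequenceMatcher


def find_token_differences(local_ids, remote_ids):
    """Return (tag, local_slice, remote_slice) for every non-matching region."""
    # autojunk=False keeps frequent token ids from being ignored on long inputs.
    matcher = SequenceMatcher(a=local_ids, b=remote_ids, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, list(local_ids[i1:i2]), list(remote_ids[j1:j2])))
    return differences


# Toy ids only: the second sequence carries one extra leading token.
print(find_token_differences([1, 3, 5], [1, 1, 3, 5]))
```

An `insert` entry with an empty local slice means the remote sequence contains tokens that the local one does not, and the slice shows which ids they are.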
Screenshots or logs
N/A
System information
A description of your system. Please provide:
SageMaker Python SDK version: 2.232.2
Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
Framework version:
Python version:
CPU or GPU:
Custom Docker image (Y/N): TGI
Additional context
Is it possible to peek at the tokenized input when hitting the SageMaker endpoint?
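One option that might work, sketched here under assumptions: TGI's generation parameters include `details` and `decoder_input_details`, which ask the server to return the prompt tokens it actually processed. The sketch assumes the SageMaker container accepts them in a plain `inputs` payload (not the Messaging API payload) and that the response carries a `details.prefill` list of objects with `id` and `text` fields. The helper names are hypothetical, and the sample response is hand-written to show the assumed shape.

```python
def build_prefill_payload(prompt):
    """Build a plain `inputs` payload that asks TGI to echo the prompt tokens."""
    return {
        "inputs": prompt,
        "parameters": {
            "do_sample": False,
            "max_new_tokens": 1,
            "details": True,
            "decoder_input_details": True,
        },
    }


def extract_prefill_tokens(response):
    """Return (id, text) pairs from a response assumed to carry details.prefill."""
    # Assumption: the endpoint returns either one dict or a list with one dict.
    item = response[0] if isinstance(response, list) else response
    prefill = item.get("details", {}).get("prefill", [])
    return [(token["id"], token["text"]) for token in prefill]


# Hand-written response used only to show the assumed shape; not real endpoint output.
sample_response = [{
    "generated_text": "...",
    "details": {"prefill": [{"id": 1, "text": "<s>"}, {"id": 3, "text": "[INST]"}]},
}]
print(extract_prefill_tokens(sample_response))
```

With the predictor from the reproduction, this would be called as `extract_prefill_tokens(predictor.predict(build_prefill_payload(short_prompt)))`. The prompt has to be short enough to pass input validation, otherwise the request is rejected before any tokens are returned.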
I also tried deploying the endpoint without the messaging API enabled and having it tokenize the following string directly: <s>[INST] {query}[/INST]. This results in the same number of tokens (1025) as the variant with the messaging API enabled. Another option is to remove the leading <s>, but I don't want to guess what happens behind the scenes.
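One hypothesis, not confirmed anywhere in this report, is that the server tokenizes a string that already begins with `<s>` with special tokens enabled, which would prepend a second BOS token. This can at least be checked locally. The sketch below renders the chat template to text and tokenizes it both ways; `count_with_and_without_special_tokens` is a hypothetical helper, and it assumes a Hugging Face tokenizer whose chat template emits the BOS token as text.

```python
def count_with_and_without_special_tokens(tokenizer, messages):
    """Render the chat template to text, then tokenize that text both ways."""
    # tokenize=False returns the rendered prompt string instead of token ids.
    text = tokenizer.apply_chat_template(messages, tokenize=False)
    with_special = tokenizer(text, add_special_tokens=True)["input_ids"]
    without_special = tokenizer(text, add_special_tokens=False)["input_ids"]
    return {
        "rendered_prefix": text[:20],
        "with_special_tokens": len(with_special),
        "without_special_tokens": len(without_special),
        "leading_ids_with_special": with_special[:3],
    }
```

Calling this with the `tokenizer` and `messages` from the local session above shows whether the two counts differ by one. If they do, and `leading_ids_with_special` starts with the BOS id twice, a duplicated `<s>` would be a plausible source of the 1025 count.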
Could someone provide some guidance on how to get to the root of the issue? How can I ensure consistent tokenization between the local version and the SageMaker-deployed variant? The underlying model is exactly the same.