
Add support for non aws models - openAI + gemini #206

Open: wants to merge 24 commits into main
Conversation

madhurprash (Collaborator):
This PR contains the following:

  1. Added `external_predictor.py` for handling inference for OpenAI and Gemini models
  2. Prompt templates for Gemini and OpenAI
  3. Configuration files for OpenAI-only, Gemini-only, and combined OpenAI/Gemini/Bedrock runs

@madhurprash madhurprash linked an issue Oct 1, 2024 that may be closed by this pull request
@aarora79 (Contributor) left a comment:
I think we also have to make a change in the metrics calculation notebook because it also does some pricing-related calculations.

@@ -0,0 +1,27 @@
# Benchmark non AWS models on FMBench

Since this is specific to OpenAI and Gemini, I would mention that directly instead of saying non AWS, because any 3P or open-source model is non AWS in that sense. Change to "Benchmark OpenAI and Gemini models".

@@ -0,0 +1,27 @@
# Benchmark non AWS models on FMBench

This feature enables users to benchmark non AWS models on FMBench, such as OpenAI and Gemini models. Current models that are tested with this feature are: `gpt-4o`, `gpt-4o-mini`, `gemini-1.5-pro` and `gemini-1.5-flash`.

FMBench -> FMBench.

@@ -0,0 +1,27 @@
# Benchmark non AWS models on FMBench

This feature enables users to benchmark non AWS models on FMBench, such as OpenAI and Gemini models. Current models that are tested with this feature are: `gpt-4o`, `gpt-4o-mini`, `gemini-1.5-pro` and `gemini-1.5-flash`.

"...users to benchmark non AWS..." -> "users to benchmark external models such as OpenAI and Gemini models on FMBench"


### Prerequisites

To benchmark a non AWS model, the configuration file requires an **API Key**. Mention your custom API key within the `inference_spec` section in the `experiments` within the configuration file. View an example below:
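A minimal sketch of such a section, based on the `inference_spec` fields quoted later in this review (the experiment name and placement of fields are illustrative, not a verbatim FMBench config):

```yaml
experiments:
  - name: gpt-4o-external        # illustrative experiment name
    model_id: gpt-4o
    inference_script: external_predictor.py
    inference_spec:
      split_input_and_parameters: no
      api_key: <your-api-key>    # placeholder; see the review discussion below
```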

API Key -> an API key issued by the model provider (such as an OpenAI key or a Gemini key)


We should not configure the key directly, but rather the path to the API key. This should be handled in the same way as we handle `hf_token.txt` to read the HF token.
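A minimal sketch of what this file-based key handling could look like, mirroring the existing `hf_token.txt` pattern (the function name and paths are illustrative, not FMBench's actual API):

```python
import os
from pathlib import Path

def load_api_key(key_file: Path, env_var: str) -> bool:
    """Read an API key from a file and export it as an environment
    variable, mirroring how hf_token.txt is read for the HF token.
    Returns True if the key file was found and loaded."""
    if key_file.exists():
        os.environ[env_var] = key_file.read_text().strip()
        return True
    return False
```

Keeping the key in a file rather than in the config also keeps secrets out of version-controlled configuration files.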

read_bucket: {read_bucket}
scripts_prefix: scripts ## add your own scripts in case you are using anything that is not on jumpstart
script_files:
- hf_token.txt ## add your scripts files you have in s3 (including inference files, serving stacks, if any)

I would add the path to openai_key.txt and gemini_key.txt in this list.
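Following this suggestion, the `script_files` list might look like this (the `openai_key.txt` / `gemini_key.txt` names follow the reviewer's wording and are not yet part of the codebase):

```yaml
scripts_prefix: scripts
script_files:
  - hf_token.txt      # HF token, already read this way today
  - openai_key.txt    # suggested: OpenAI API key file
  - gemini_key.txt    # suggested: Gemini API key file
```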

max_length_in_tokens: 6000
payload_file: payload_en_5000-6000.jsonl
- language: en
min_length_in_tokens: 305

remove the 305 to 3997



metrics:
dataset_of_interest: en_500-1000 # en_5000-6000

change to 3000-4000

inference_script: external_predictor.py
inference_spec:
split_input_and_parameters: no
api_key: <your-api-key>

Remove this parameter. If an external predictor is being used, it should automatically check whether `openai_key.txt` or `gemini_key.txt` is present and set the key into env vars; we do not need to have this parameter here.

@@ -213,7 +213,10 @@
# token counting logic on the client side (does not impact the tokenizer the model uses)
# NOTE: if tokenizer files are provided in the tokenizer directory then they take precedence
# if the files are not present then we load the tokenizer for this model id from Hugging Face
TOKENIZER_MODEL_ID = config['experiments'][0]['model_id']
if config['experiments'][0].get('model_id', None) is not None:

rebaseline from main

# The inference format for each option (OpenAI/Gemini) is the same using LiteLLM
# for streaming/non-streaming
# set the environment for the specific model
if 'gemini' in self.endpoint_name:

We should just do this based on the presence of a file and not rely on the endpoint name.
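The file-presence approach the reviewer suggests could be sketched as follows. The key-file names follow the reviewer's wording, and the env var names are what LiteLLM conventionally reads (`GEMINI_API_KEY` in particular is an assumption, not confirmed FMBench behavior):

```python
import os
from pathlib import Path

# Map suggested key files to the env vars LiteLLM reads for each provider.
KEY_FILES = {
    "openai_key.txt": "OPENAI_API_KEY",
    "gemini_key.txt": "GEMINI_API_KEY",
}

def configure_external_providers(scripts_dir: Path) -> list:
    """Set provider env vars based on which key files exist on disk,
    instead of inferring the provider from the endpoint name.
    Returns the list of env vars that were set."""
    found = []
    for fname, env_var in KEY_FILES.items():
        key_file = scripts_dir / fname
        if key_file.exists():
            os.environ[env_var] = key_file.read_text().strip()
            found.append(env_var)
    return found

# With the env vars set, a single LiteLLM call shape covers both providers,
# streaming or not, e.g.:
#   litellm.completion(model="gpt-4o", messages=[...], stream=False)
#   litellm.completion(model="gemini/gemini-1.5-pro", messages=[...], stream=True)
```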

@madhurprash madhurprash force-pushed the add-support-for-non-aws-models branch from 90950d4 to daf7a2b Compare October 2, 2024 15:16
@madhurprash madhurprash force-pushed the add-support-for-non-aws-models branch from 1a00eb6 to 542179b Compare October 21, 2024 14:44
@madhurprash madhurprash force-pushed the add-support-for-non-aws-models branch from aa98452 to eefa311 Compare December 8, 2024 15:39

Successfully merging this pull request may close these issues:

- Add support for non AWS models to be benchmarked
2 participants