Overview

The project implements AI DIAL API for language models from AWS Bedrock.

Supported models

Chat completion models

The following models support POST SERVER_URL/openai/deployments/DEPLOYMENT_NAME/chat/completions endpoint along with an optional support of POST /tokenize and POST /truncate_prompt endpoints:

Note that a model supports /truncate_prompt endpoint if and only if it supports max_prompt_tokens request parameter.

Vendor	Model	Deployment name	Modality	`/tokenize`	`/truncate_prompt`, `max_prompt_tokens`	tools/functions
Anthropic	Claude 3.5 Sonnet	[us.\|eu.]anthropic.claude-3-5-sonnet-20240620-v1:0	text-to-text, image-to-text	🟡	🟡	✅
Anthropic	Claude 3.5 Sonnet 2.0	[us.]anthropic.claude-3-5-sonnet-20241022-v2:0	text-to-text, image-to-text	🟡	🟡	✅
Anthropic	Claude 3 Sonnet	[us.\|eu.]anthropic.claude-3-sonnet-20240229-v1:0	text-to-text, image-to-text	🟡	🟡	✅
Anthropic	Claude 3 Haiku	[us.\|eu.]anthropic.claude-3-haiku-20240307-v1:0	text-to-text, image-to-text	🟡	🟡	✅
Anthropic	Claude 3 Opus	[us.]anthropic.claude-3-opus-20240229-v1:0	text-to-text, image-to-text	🟡	🟡	✅
Anthropic	Claude 2.1	anthropic.claude-v2:1	text-to-text	✅	✅	✅
Anthropic	Claude 2	anthropic.claude-v2	text-to-text	✅	✅	❌
Anthropic	Claude Instant 1.2	anthropic.claude-instant-v1	text-to-text	🟡	🟡	❌
Meta	Llama 3.1 405B Instruct	meta.llama3-1-405b-instruct-v1:0	text-to-text	🟡	🟡	❌
Meta	Llama 3.1 70B Instruct	meta.llama3-1-70b-instruct-v1:0	text-to-text	🟡	🟡	❌
Meta	Llama 3.1 8B Instruct	meta.llama3-1-8b-instruct-v1:0	text-to-text	🟡	🟡	❌
Meta	Llama 3 Chat 70B Instruct	meta.llama3-70b-instruct-v1:0	text-to-text	🟡	🟡	❌
Meta	Llama 3 Chat 8B Instruct	meta.llama3-8b-instruct-v1:0	text-to-text	🟡	🟡	❌
Meta	Llama 2 Chat 70B	meta.llama2-70b-chat-v1	text-to-text	🟡	🟡	❌
Meta	Llama 2 Chat 13B	meta.llama2-13b-chat-v1	text-to-text	🟡	🟡	❌
Stability AI	SDXL 1.0	stability.stable-diffusion-xl-v1	text-to-image	❌	🟡	❌
Stability AI	SD3 Large 1.0	stability.sd3-large-v1:0	text-to-image / image-to-image	❌	🟡	❌
Stability AI	Stable Image Ultra 1.0	stability.stable-image-ultra-v1:0	text-to-image	❌	🟡	❌
Stability AI	Stable Image Core 1.0	stability.stable-image-core-v1:0	text-to-image	❌	🟡	❌
Amazon	Titan Text G1 - Express	amazon.titan-tg1-large	text-to-text	🟡	🟡	❌
AI21 Labs	Jurassic-2 Ultra	ai21.j2-jumbo-instruct	text-to-text	🟡	🟡	❌
AI21 Labs	Jurassic-2 Ultra v1	ai21.j2-ultra-v1	text-to-text	🟡	🟡	❌
AI21 Labs	Jurassic-2 Mid	ai21.j2-grande-instruct	text-to-text	🟡	🟡	❌
AI21 Labs	Jurassic-2 Mid v1	ai21.j2-mid-v1	text-to-text	🟡	🟡	❌
Cohere	Command	cohere.command-text-v14	text-to-text	🟡	🟡	❌
Cohere	Command Light	cohere.command-light-text-v14	text-to-text	🟡	🟡	❌

✅, 🟡, and ❌ denote degrees of support of the given feature:

	`/tokenize`, `/truncate_prompt`, `max_prompt_token`	tools/functions
✅	Fully supported via an official tokenization algorithm	Fully supported via native tools API or official prompts to enable tools
🟡	Partially supported, because tokenization algorithm wasn't made public by the model vendor. An approximate tokenization algorithm is used instead. It conservatively counts every byte in UTF-8 encoding of a string as a single token.	Partially supported, because the model doesn't support tools natively. Prompt engineering is used instead to emulate tools, which may not be very reliable.
❌	Not supported	Not supported

Embedding models

The following models support SERVER_URL/openai/deployments/DEPLOYMENT_NAME/embeddings endpoint:

Model	Deployment name	Modality
Titan Multimodal Embeddings Generation 1 (G1)	amazon.titan-embed-image-v1	image/text-to-embedding
Amazon Titan Text Embeddings V2	amazon.titan-embed-text-v2:0	text-to-embedding
Titan Embeddings G1 – Text v1.2	amazon.titan-embed-text-v1	text-to-embedding
Cohere Embed English	cohere.embed-english-v3	text-to-embedding
Cohere Multilingual	cohere.embed-multilingual-v3	text-to-embedding

Developer environment

This project uses Python>=3.11 and Poetry>=1.6.1 as a dependency manager.

Check out Poetry's documentation on how to install it on your system before proceeding.

To install requirements:

poetry install

This will install all requirements for running the package, linting, formatting and tests.

IDE configuration

The recommended IDE is VSCode. Open the project in VSCode and install the recommended extensions.

The VSCode is configured to use PEP-8 compatible formatter Black.

Alternatively you can use PyCharm.

Set-up the Black formatter for PyCharm manually or install PyCharm>=2023.2 with built-in Black support.

Run

Run the development server:

make serve

Open localhost:5001/docs to make sure the server is up and running.

Environment Variables

Copy .env.example to .env and customize it for your environment:

Variable	Default	Description
AWS_ACCESS_KEY_ID	NA	AWS credentials with access to Bedrock service
AWS_SECRET_ACCESS_KEY	NA	AWS credentials with access to Bedrock service
AWS_DEFAULT_REGION		AWS region e.g. `us-east-1`
AWS_ASSUME_ROLE_ARN		AWS assume role arn e.g. `arn:aws:iam::123456789012:role/RoleName`
LOG_LEVEL	INFO	Log level. Use DEBUG for dev purposes and INFO in prod
AIDIAL_LOG_LEVEL	WARNING	AI DIAL SDK log level
DIAL_URL		URL of the core DIAL server. If defined, images generated by Stability are uploaded to the DIAL file storage and attachments are returned with URLs pointing to the images. Otherwise, the images are returned as base64 encoded strings.
WEB_CONCURRENCY	1	Number of workers for the server

Load balancing

If you use DIAL Core load balancing mechanism, you can provide extraData upstream setting with different aws account credentials/regions to use different model deployments:

{
  "upstreams": [
    {
      "extraData": {
        "region": "eu-west-1",
        "aws_access_key_id": "key_id_1",
        "aws_secret_access_key": "access_key_1"
      }
    },
    {
      "extraData": {
        "region": "eu-west-1",
        "aws_access_key_id": "key_id_2",
        "aws_secret_access_key": "access_key_2"
      }
    },
    {
      "extraData": {
        "region": "eu-west-1",
        "aws_assume_role_arn": "arn:aws:iam::123456789012:role/BedrockAccessAdapterRoleName"
      }
    }
  ]
}

Supported extraData fields:

region
aws_access_key_id
aws_secret_access_key
aws_assume_role_arn

Docker

Run the server in Docker:

make docker_serve

Lint

Run the linting before committing:

make lint

To auto-fix formatting issues run:

make format

Test

Run unit tests locally:

make test

Run unit tests in Docker:

make docker_test

Run integration tests locally:

make integration_tests

Clean

To remove the virtual environment and build artifacts:

make clean

Name		Name	Last commit message	Last commit date
Latest commit History 133 Commits
.github		.github
.vscode		.vscode
aidial_adapter_bedrock		aidial_adapter_bedrock
scripts		scripts
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.flake8		.flake8
.gitignore		.gitignore
.ort.yml		.ort.yml
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
Dockerfile.test		Dockerfile.test
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
SECURITY.md		SECURITY.md
noxfile.py		noxfile.py
poetry.lock		poetry.lock
poetry.toml		poetry.toml
pyproject.toml		pyproject.toml
trivy.yaml		trivy.yaml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Overview

Supported models

Chat completion models

Embedding models

Developer environment

IDE configuration

Run

Environment Variables

Load balancing

Docker

Lint

Test

Clean

About

Releases 24

Packages

Contributors 13

Languages

License

epam/ai-dial-adapter-bedrock

Folders and files

Latest commit

History

Repository files navigation

Overview

Supported models

Chat completion models

Embedding models

Developer environment

IDE configuration

Run

Environment Variables

Load balancing

Docker

Lint

Test

Clean

About

Topics

Resources

License

Security policy

Stars

Watchers

Forks

Releases 24

Packages 0

Contributors 13

Languages

Packages