Stop limiting num-connections based on num-known-IPs (Improve S3-Express performance) (#407)

**Issue:**
Disappointing S3-Express performance.

**Description of changes:**
Stop limiting num-connections based on num-known-IPs.

**Diagnosing the issue:**
num-connections never got very high, because [num-connections scaled with num-known-IPs](https://github.com/awslabs/aws-c-s3/blob/593c2ab24608d3e78708d51657be22f6ab99cb50/source/s3_client.c#L179). S3-Express endpoints have very few IPs, so their num-connections stayed low.

The algorithm added 10 connections per known IP, up to an ideal IP count of throughput-target / 4Gb/s. On a 100Gb/s machine, that is 25 IPs, so it maxed out at 250 connections once 25 IPs were known. But S3-Express endpoints only have 4 unique IPs, so they never got higher than 40 connections.

This algorithm was written back when S3 returned 1 IP per DNS query. The intention was to throttle connections until more IPs were known, in order to spread load among S3's server fleet. However, as of Aug 2023 [S3 provides multiple IPs per DNS query](https://aws.amazon.com/about-aws/whats-new/2023/08/amazon-s3-multivalue-answer-response-dns-queries/). So now, we can scale up to max connections after the first DNS query and still be spreading load.
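
The one place where the number of known IPs still matters after this change is the period before the first DNS answer arrives. A rough sketch of that check, based on the diff to `s_s3_client_should_update_meta_request`, is below. The function name and parameters are hypothetical, and the real function performs other checks that are not shown here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as g_min_num_connections in the diff. */
static const uint32_t min_num_connections = 10;

/* Hypothetical helper: returns false when no IPs are known yet and enough
 * requests are already queued or being prepared to fill the minimum number
 * of connections. Once at least one IP is known, this check never blocks. */
static bool may_prepare_more_requests(
    size_t num_known_ips,
    uint32_t num_requests_being_prepared,
    uint32_t request_queue_size) {

    if (num_known_ips == 0 &&
        (num_requests_being_prepared + request_queue_size) >= min_num_connections) {
        return false;
    }
    return true;
}
```

So with zero known IPs the client holds back once 10 requests are pending, and as soon as a single IP is known it can ramp up to the full limit.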

We also believed that spreading load was a key to good performance. But I found that spreading the load didn't have much impact on performance (at least now, in 2024, on the 100Gb/s machine I was using). Tests where I hard-coded a single IP and hit it with max-connections didn't differ much from tests where the load was spread among 8 IPs or 100 IPs.

I want to get this change out quickly and help S3-Express, so I picked magic numbers where the num-connections math ends up with the same result as the old algorithm (250 connections for a 100Gb/s target, and a minimum of 10). Normal S3 performance is mildly improved (max-connections is reached immediately, instead of scaling up over 30 seconds as it finds more IPs). S3 Express performance is MUCH improved.

**Future Work:**
Improve this algorithm further:
- expect higher throughput on connections to S3 Express
- expect lower throughput on connections transferring small objects
- dynamic scaling without a bunch of magic numbers ??? (sounds cool, but I don't have any ideas how this would work yet)
graebm authored Feb 29, 2024
1 parent 593c2ab commit 59569e3
Showing 4 changed files with 47 additions and 104 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -5,7 +5,7 @@ The AWS-C-S3 library is an asynchronous AWS S3 client focused on maximizing thro
### Key features:
- **Automatic Request Splitting**: Improves throughput by automatically splitting the request into part-sized chunks and performing parallel uploads/downloads of these chunks over multiple connections. There's a cap on the throughput of single S3 connection, the only way to go faster is multiple parallel connections.
- **Automatic Retries**: Increases resilience by retrying individual failed chunks of a file transfer, eliminating the need to restart transfers from scratch after an intermittent error.
- **DNS Load Balancing**: DNS resolver continuously harvests Amazon S3 IP addresses. When load is spread across the S3 fleet, overall throughput is better than if all connections were hammering the same IP simultaneously.
- **DNS Load Balancing**: DNS resolver continuously harvests Amazon S3 IP addresses. When load is spread across the S3 fleet, overall throughput is more reliable than if all connections are going to a single IP.
- **Advanced Network Management**: The client incorporates automatic request parallelization, effective timeouts and retries, and efficient connection reuse. This approach helps to maximize throughput and network utilization, and to avoid network overloads.
- **Thread Pools and Async I/O**: Avoids bottlenecks associated with single-thread processing.
- **Parallel Reads**: When uploading a large file from disk, reads from multiple parts of the file in parallel. This is faster than reading the file sequentially from beginning to end.
9 changes: 3 additions & 6 deletions include/aws/s3/private/s3_client_impl.h
@@ -242,8 +242,8 @@ struct aws_s3_client {
/* Throughput target in Gbps that we are trying to reach. */
const double throughput_target_gbps;

/* The calculated ideal number of VIP's based on throughput target and throughput per vip. */
const uint32_t ideal_vip_count;
/* The calculated ideal number of HTTP connections, based on throughput target and throughput per connection. */
const uint32_t ideal_connection_count;

/**
* For multi-part upload, content-md5 will be calculated if the AWS_MR_CONTENT_MD5_ENABLED is specified
@@ -484,10 +484,7 @@ struct aws_s3_endpoint *aws_s3_endpoint_acquire(struct aws_s3_endpoint *endpoint
void aws_s3_endpoint_release(struct aws_s3_endpoint *endpoint);

AWS_S3_API
extern const uint32_t g_max_num_connections_per_vip;

AWS_S3_API
extern const uint32_t g_num_conns_per_vip_meta_request_look_up[];
extern const uint32_t g_min_num_connections;

AWS_S3_API
extern const size_t g_expect_timeout_offset_ms;
66 changes: 24 additions & 42 deletions source/s3_client.c
@@ -51,21 +51,22 @@ struct aws_s3_meta_request_work {

static const enum aws_log_level s_log_level_client_stats = AWS_LL_INFO;

/* max-requests-in-flight = ideal-num-connections * s_max_requests_multiplier */
static const uint32_t s_max_requests_multiplier = 4;

/* TODO Provide analysis on origins of this value. */
static const double s_throughput_per_vip_gbps = 4.0;

/* Preferred amount of active connections per meta request type. */
const uint32_t g_num_conns_per_vip_meta_request_look_up[AWS_S3_META_REQUEST_TYPE_MAX] = {
10, /* AWS_S3_META_REQUEST_TYPE_DEFAULT */
10, /* AWS_S3_META_REQUEST_TYPE_GET_OBJECT */
10, /* AWS_S3_META_REQUEST_TYPE_PUT_OBJECT */
10 /* AWS_S3_META_REQUEST_TYPE_COPY_OBJECT */
};
/* This is used to determine the ideal number of HTTP connections. Algorithm is roughly:
* num-connections-max = throughput-target-gbps / s_throughput_per_connection_gbps
*
* Magic value based on: match results of the previous algorithm,
* where throughput-target-gbps of 100 resulted in 250 connections.
*
* TODO: Improve this algorithm (expect higher throughput for S3 Express,
* expect lower throughput for small objects, etc)
*/
static const double s_throughput_per_connection_gbps = 100.0 / 250;

/* Should be max of s_num_conns_per_vip_meta_request_look_up */
const uint32_t g_max_num_connections_per_vip = 10;
/* After throughput math, clamp the min/max number of connections */
const uint32_t g_min_num_connections = 10; /* Magic value based on: 10 was old behavior */

/**
* Default part size is 8 MiB to reach the best performance from the experiments we had.
@@ -151,32 +152,9 @@ uint32_t aws_s3_client_get_max_active_connections(
struct aws_s3_client *client,
struct aws_s3_meta_request *meta_request) {
AWS_PRECONDITION(client);
(void)meta_request;

uint32_t num_connections_per_vip = g_max_num_connections_per_vip;
uint32_t num_vips = client->ideal_vip_count;

if (meta_request != NULL) {
num_connections_per_vip = g_num_conns_per_vip_meta_request_look_up[meta_request->type];

struct aws_s3_endpoint *endpoint = meta_request->endpoint;
AWS_ASSERT(endpoint != NULL);

AWS_ASSERT(client->vtable->get_host_address_count);
size_t num_known_vips = client->vtable->get_host_address_count(
client->client_bootstrap->host_resolver, endpoint->host_name, AWS_GET_HOST_ADDRESS_COUNT_RECORD_TYPE_A);

/* If the number of known vips is less than our ideal VIP count, clamp it. */
if (num_known_vips < (size_t)num_vips) {
num_vips = (uint32_t)num_known_vips;
}
}

/* We always want to allow for at least one VIP worth of connections. */
if (num_vips == 0) {
num_vips = 1;
}

uint32_t max_active_connections = num_vips * num_connections_per_vip;
uint32_t max_active_connections = client->ideal_connection_count;

if (client->max_active_connections_override > 0 &&
client->max_active_connections_override < max_active_connections) {
@@ -530,7 +508,7 @@ struct aws_s3_client *aws_s3_client_new(
}
/* Setup cannot fail after this point. */

if (client_config->throughput_target_gbps != 0.0) {
if (client_config->throughput_target_gbps > 0.0) {
*((double *)&client->throughput_target_gbps) = client_config->throughput_target_gbps;
} else {
*((double *)&client->throughput_target_gbps) = s_default_throughput_target_gbps;
@@ -539,10 +517,14 @@ struct aws_s3_client *aws_s3_client_new(
*((enum aws_s3_meta_request_compute_content_md5 *)&client->compute_content_md5) =
client_config->compute_content_md5;

/* Determine how many vips are ideal by dividing target-throughput by throughput-per-vip. */
/* Determine how many connections are ideal by dividing target-throughput by throughput-per-connection. */
{
double ideal_vip_count_double = client->throughput_target_gbps / s_throughput_per_vip_gbps;
*((uint32_t *)&client->ideal_vip_count) = (uint32_t)ceil(ideal_vip_count_double);
double ideal_connection_count_double = client->throughput_target_gbps / s_throughput_per_connection_gbps;
/* round up and clamp */
ideal_connection_count_double = ceil(ideal_connection_count_double);
ideal_connection_count_double = aws_max_double(g_min_num_connections, ideal_connection_count_double);
ideal_connection_count_double = aws_min_double(UINT32_MAX, ideal_connection_count_double);
*(uint32_t *)&client->ideal_connection_count = (uint32_t)ideal_connection_count_double;
}

client->cached_signing_config = aws_cached_signing_config_new(client, client_config->signing_config);
@@ -1687,7 +1669,7 @@ static bool s_s3_client_should_update_meta_request(
size_t num_known_vips = client->vtable->get_host_address_count(
client->client_bootstrap->host_resolver, endpoint->host_name, AWS_GET_HOST_ADDRESS_COUNT_RECORD_TYPE_A);
if (num_known_vips == 0 && (client->threaded_data.num_requests_being_prepared +
client->threaded_data.request_queue_size) >= g_max_num_connections_per_vip) {
client->threaded_data.request_queue_size) >= g_min_num_connections) {
return false;
}

74 changes: 19 additions & 55 deletions tests/s3_data_plane_tests.c
@@ -259,70 +259,46 @@ static int s_test_s3_client_get_max_active_connections(struct aws_allocator *all

struct aws_s3_client *mock_client = aws_s3_tester_mock_client_new(&tester);
*((uint32_t *)&mock_client->max_active_connections_override) = 0;
*((uint32_t *)&mock_client->ideal_vip_count) = 10;
*((uint32_t *)&mock_client->ideal_connection_count) = 100;
mock_client->client_bootstrap = &mock_client_bootstrap;
mock_client->vtable->get_host_address_count = s_test_get_max_active_connections_host_address_count;

struct aws_s3_meta_request *mock_meta_requests[AWS_S3_META_REQUEST_TYPE_MAX];

for (size_t i = 0; i < AWS_S3_META_REQUEST_TYPE_MAX; ++i) {
/* Verify that g_max_num_connections_per_vip and g_num_conns_per_vip_meta_request_look_up are set up
* correctly.*/
ASSERT_TRUE(g_max_num_connections_per_vip >= g_num_conns_per_vip_meta_request_look_up[i]);

/* Setup test data. */
mock_meta_requests[i] = aws_s3_tester_mock_meta_request_new(&tester);
mock_meta_requests[i]->type = i;
mock_meta_requests[i]->endpoint = aws_s3_tester_mock_endpoint_new(&tester);
}

/* With host count at 0, we should allow for one VIP worth of max-active-connections. */
{
s_test_max_active_connections_host_count = 0;

ASSERT_TRUE(
aws_s3_client_get_max_active_connections(mock_client, NULL) ==
mock_client->ideal_vip_count * g_max_num_connections_per_vip);

for (size_t i = 0; i < AWS_S3_META_REQUEST_TYPE_MAX; ++i) {
ASSERT_TRUE(
aws_s3_client_get_max_active_connections(mock_client, mock_meta_requests[i]) ==
g_num_conns_per_vip_meta_request_look_up[i]);
}
}

s_test_max_active_connections_host_count = 2;

/* Behavior should not be affected by max_active_connections_override since it is 0, and should just be in relation
* to ideal-vip-count and host-count. */
* to ideal-connection-count. */
{
ASSERT_TRUE(
aws_s3_client_get_max_active_connections(mock_client, NULL) ==
mock_client->ideal_vip_count * g_max_num_connections_per_vip);
ASSERT_TRUE(aws_s3_client_get_max_active_connections(mock_client, NULL) == mock_client->ideal_connection_count);

for (size_t i = 0; i < AWS_S3_META_REQUEST_TYPE_MAX; ++i) {
ASSERT_TRUE(
aws_s3_client_get_max_active_connections(mock_client, mock_meta_requests[i]) ==
s_test_max_active_connections_host_count * g_num_conns_per_vip_meta_request_look_up[i]);
mock_client->ideal_connection_count);
}
}

/* Max active connections override should now cap the calculated amount of active connections. */
{
*((uint32_t *)&mock_client->max_active_connections_override) = 3;

ASSERT_TRUE(
mock_client->max_active_connections_override <
mock_client->ideal_vip_count * g_max_num_connections_per_vip);
/* Assert that override is low enough to have effect */
ASSERT_TRUE(mock_client->max_active_connections_override < mock_client->ideal_connection_count);

ASSERT_TRUE(
aws_s3_client_get_max_active_connections(mock_client, NULL) ==
mock_client->max_active_connections_override);

for (size_t i = 0; i < AWS_S3_META_REQUEST_TYPE_MAX; ++i) {
ASSERT_TRUE(
mock_client->max_active_connections_override <
s_test_max_active_connections_host_count * g_num_conns_per_vip_meta_request_look_up[i]);
ASSERT_TRUE(mock_client->max_active_connections_override < mock_client->ideal_connection_count);

ASSERT_TRUE(
aws_s3_client_get_max_active_connections(mock_client, mock_meta_requests[i]) ==
@@ -334,22 +310,17 @@ static int s_test_s3_client_get_max_active_connections(struct aws_allocator *all
{
*((uint32_t *)&mock_client->max_active_connections_override) = 100000;

ASSERT_TRUE(
mock_client->max_active_connections_override >
mock_client->ideal_vip_count * g_max_num_connections_per_vip);
/* Assert that override is NOT low enough to have effect */
ASSERT_TRUE(mock_client->max_active_connections_override > mock_client->ideal_connection_count);

ASSERT_TRUE(
aws_s3_client_get_max_active_connections(mock_client, NULL) ==
mock_client->ideal_vip_count * g_max_num_connections_per_vip);
ASSERT_TRUE(aws_s3_client_get_max_active_connections(mock_client, NULL) == mock_client->ideal_connection_count);

for (size_t i = 0; i < AWS_S3_META_REQUEST_TYPE_MAX; ++i) {
ASSERT_TRUE(
mock_client->max_active_connections_override >
s_test_max_active_connections_host_count * g_num_conns_per_vip_meta_request_look_up[i]);
ASSERT_TRUE(mock_client->max_active_connections_override > mock_client->ideal_connection_count);

ASSERT_TRUE(
aws_s3_client_get_max_active_connections(mock_client, mock_meta_requests[i]) ==
s_test_max_active_connections_host_count * g_num_conns_per_vip_meta_request_look_up[i]);
mock_client->ideal_connection_count);
}
}

@@ -822,12 +793,12 @@ static int s_test_s3_update_meta_requests_trigger_prepare(struct aws_allocator *
struct aws_client_bootstrap mock_bootstrap;
AWS_ZERO_STRUCT(mock_bootstrap);

const uint32_t ideal_vip_count = 10;
const uint32_t ideal_connection_count = 100;

struct aws_s3_client *mock_client = aws_s3_tester_mock_client_new(&tester);
mock_client->client_bootstrap = &mock_bootstrap;
mock_client->vtable->get_host_address_count = s_test_s3_update_meta_request_trigger_prepare_get_host_address_count;
*((uint32_t *)&mock_client->ideal_vip_count) = ideal_vip_count;
*((uint32_t *)&mock_client->ideal_connection_count) = ideal_connection_count;
aws_linked_list_init(&mock_client->threaded_data.request_queue);
aws_linked_list_init(&mock_client->threaded_data.meta_requests);

@@ -872,27 +843,20 @@ static int s_test_s3_update_meta_requests_trigger_prepare(struct aws_allocator *
&mock_meta_request_with_work->client_process_work_threaded_data.node);
aws_s3_meta_request_acquire(mock_meta_request_with_work);

/* With no known addresses, the amount of requests that can be prepared should only be enough for one VIP. */
/* With no known addresses, the amount of requests that can be prepared should be lower. */
{
s_test_s3_update_meta_request_trigger_prepare_host_address_count = 0;
aws_s3_client_update_meta_requests_threaded(mock_client);

ASSERT_SUCCESS(s_validate_prepared_requests(
mock_client, g_max_num_connections_per_vip, mock_meta_request_with_work, mock_meta_request_without_work));
mock_client, g_min_num_connections, mock_meta_request_with_work, mock_meta_request_without_work));
}

/* When the number of known addresses is greater than or equal to the ideal vip count, the max number of requests
* should be reached. */
/* When the number of known addresses is 1+, the max number of requests should be reached. */
{
const uint32_t max_requests_prepare = aws_s3_client_get_max_requests_prepare(mock_client);

s_test_s3_update_meta_request_trigger_prepare_host_address_count = (size_t)(ideal_vip_count);
aws_s3_client_update_meta_requests_threaded(mock_client);

ASSERT_SUCCESS(s_validate_prepared_requests(
mock_client, max_requests_prepare, mock_meta_request_with_work, mock_meta_request_without_work));

s_test_s3_update_meta_request_trigger_prepare_host_address_count = (size_t)(ideal_vip_count + 1);
s_test_s3_update_meta_request_trigger_prepare_host_address_count = 1;
aws_s3_client_update_meta_requests_threaded(mock_client);

ASSERT_SUCCESS(s_validate_prepared_requests(
@@ -980,7 +944,7 @@ static int s_test_s3_client_update_connections_finish_result(struct aws_allocato
mock_client->vtable->create_connection_for_request =
s_s3_test_meta_request_has_finish_result_client_create_connection_for_request;

*((uint32_t *)&mock_client->ideal_vip_count) = 1;
*((uint32_t *)&mock_client->ideal_connection_count) = 1;

aws_linked_list_init(&mock_client->threaded_data.request_queue);
