Several performance optimizations #100

Open: wants to merge 4 commits into base: develop

8 changes: 6 additions & 2 deletions Cargo.toml
@@ -10,13 +10,13 @@ repository = "https://github.com/awslabs/coldsnap"
keywords = ["AWS", "Amazon", "EBS", "snapshot"]

[features]
-default = ["rusoto-native-tls"]
+default = ["rusoto-rustls"]
rusoto-native-tls = ["rusoto_core/native-tls", "rusoto_ebs/native-tls", "rusoto_ec2/native-tls"]
rusoto-rustls = ["rusoto_core/rustls", "rusoto_ebs/rustls", "rusoto_ec2/rustls"]

[dependencies]
argh = "0.1.3"
-tokio = { version = "~1.8", features = ["fs", "io-util", "time"] } # LTS
+tokio = { version = "~1.11", features = ["fs", "io-util", "time"] } # LTS
Contributor

Also, we're hoping to stay on tokio 1.8 as it's an LTS release.

sha2 = "0.9.6"
bytes = "1"
base64 = "0.13.0"
@@ -31,3 +31,7 @@ snafu = "0.6.9"
indicatif = "0.16.2"
tempfile = "3.1.0"
async-trait = "0.1.50"
+
+[profile.release]
+opt-level = 3
+lto = true
4 changes: 3 additions & 1 deletion src/bin/coldsnap/client.rs
@@ -7,7 +7,9 @@ this should cover most scenarios.
/// Create a rusoto client of the given type using the (optional) given region, endpoint, and credentials.
macro_rules! build_client {
($client_type:ty, $region_name:expr, $endpoint:expr, $profile:expr) => {{
-let http_client = HttpClient::new().context(error::CreateHttpClient)?;
+let mut http_config_with_bigger_buffer = HttpConfig::new();
+http_config_with_bigger_buffer.read_buf_size(520 * 1024); // 512K chunk + some overhead
+let http_client = HttpClient::new_with_config(http_config_with_bigger_buffer).context(error::CreateHttpClient)?;
let profile_provider = match $profile {
Some(profile) => {
let mut p = ProfileProvider::new().context(error::CreateProfileProvider)?;
5 changes: 2 additions & 3 deletions src/bin/coldsnap/main.rs
@@ -12,17 +12,16 @@ mod client;
use argh::FromArgs;
use coldsnap::{SnapshotDownloader, SnapshotUploader, SnapshotWaiter, WaitParams};
use indicatif::{ProgressBar, ProgressStyle};
-use rusoto_core::{HttpClient, Region};
+use rusoto_core::{HttpClient,HttpConfig, Region};
use rusoto_credential::{ChainProvider, ProfileProvider};
use rusoto_ebs::EbsClient;
use rusoto_ec2::Ec2Client;
use snafu::{ensure, ResultExt};
use std::path::PathBuf;
use std::time::Duration;

type Result<T> = std::result::Result<T, error::Error>;

-#[tokio::main]
+#[tokio::main(flavor = "multi_thread", worker_threads = 512)]
Contributor

@zmrow zmrow Sep 14, 2021

Curious how you calculated or determined this number!
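
One way to make the number explainable is to derive it from the machine instead of hard-coding it. The following is a minimal sketch using only the standard library; `suggested_worker_threads`, the per-core multiplier of 4, and the cap of 512 are illustrative assumptions and do not come from this PR. For comparison, tokio's multi-thread runtime defaults to one worker thread per CPU core when `worker_threads` is not set.

```rust
use std::thread;

/// Hypothetical helper (not part of coldsnap): choose a worker-thread count
/// from the CPU count rather than a fixed constant.
fn suggested_worker_threads(per_core: usize, cap: usize) -> usize {
    // available_parallelism() can fail (for example in restricted
    // environments), so fall back to a single core in that case.
    let cores = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    // Always return at least 1 and never more than the cap.
    cores.saturating_mul(per_core).clamp(1, cap.max(1))
}

fn main() {
    // Assumed values for illustration only: 4 threads per core, capped at 512.
    let threads = suggested_worker_threads(4, 512);
    println!("worker threads: {}", threads);
}
```

A value computed this way could be passed to `tokio::runtime::Builder::new_multi_thread().worker_threads(n)` in place of the attribute macro argument; whether more threads than cores helps an I/O-bound download would still need to be measured.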

// Returning a Result from main makes it print a Debug representation of the error, but with Snafu
// we have nice Display representations of the error, so we wrap "main" (run) and print any error.
// https://github.com/shepmaster/snafu/issues/110
43 changes: 24 additions & 19 deletions src/download.rs
@@ -28,9 +28,10 @@ pub struct Error(error::Error);
type Result<T> = std::result::Result<T, Error>;

const GIBIBYTE: i64 = 1024 * 1024 * 1024;
-const SNAPSHOT_BLOCK_WORKERS: usize = 64;
-const SNAPSHOT_BLOCK_ATTEMPTS: u8 = 3;
+const SNAPSHOT_BLOCK_WORKERS: usize = 2000;
+const SNAPSHOT_BLOCK_ATTEMPTS: u8 = 5;
const SHA256_ALGORITHM: &str = "SHA256";
+const DISABLE_SHA: bool = true;
Contributor

We probably don't want to turn off the SHA hash checking by default. I could maybe see providing users a flag to turn this off, but a constant isn't really configurable, and isn't currently documented or particularly discoverable.
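
The suggestion above could look roughly like the following sketch, in which verification is a runtime option that defaults to on. Everything named here (`DownloadOptions`, `parse_options`, `check_block`, and the `--no-verify` flag) is hypothetical and not part of coldsnap; the hash is passed in as a string so the example needs no external crates.

```rust
/// Hypothetical options struct; coldsnap does not define this type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DownloadOptions {
    verify_checksums: bool,
}

impl Default for DownloadOptions {
    fn default() -> Self {
        // Verification stays on unless the user explicitly opts out.
        Self { verify_checksums: true }
    }
}

/// Parse a hypothetical `--no-verify` flag from raw CLI arguments.
fn parse_options<I: IntoIterator<Item = String>>(args: I) -> DownloadOptions {
    let mut opts = DownloadOptions::default();
    for arg in args {
        if arg == "--no-verify" {
            opts.verify_checksums = false;
        }
    }
    opts
}

/// Compare a computed block hash with the expected one, unless disabled.
fn check_block(
    opts: DownloadOptions,
    block_hash: &str,
    expected_hash: &str,
) -> Result<(), String> {
    if opts.verify_checksums && block_hash != expected_hash {
        return Err(format!(
            "bad checksum: got {}, expected {}",
            block_hash, expected_hash
        ));
    }
    Ok(())
}

fn main() {
    let opts = parse_options(std::env::args().skip(1));
    println!("{:?}", opts);
    println!("{:?}", check_block(opts, "abc", "abc"));
}
```

Since the binary already uses `argh`, a real implementation would more likely declare the option as an `#[argh(switch)]` field on the existing argument struct, which also makes it show up in `--help`.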


// ListSnapshotBlocks allows us to specify how many blocks are returned in each
// query, from the default of 100 to the maximum of 10000. Since we fetch all
@@ -302,24 +303,26 @@ impl SnapshotDownloader {
data_length,
}
);
-
-let mut block_digest = Sha256::new();
-block_digest.update(&block_data);
-let hash_bytes = block_digest.finalize();
-let block_hash = base64::encode(&hash_bytes);
-
-ensure!(
-block_hash == expected_hash,
-error::BadBlockChecksum {
-snapshot_id,
-block_index,
-block_hash,
-expected_hash,
-}
-);
+if !DISABLE_SHA {
+let mut block_digest = Sha256::new();
+block_digest.update(&block_data);
+let hash_bytes = block_digest.finalize();
+let block_hash = base64::encode(&hash_bytes);
+
+ensure!(
+block_hash == expected_hash,
+error::BadBlockChecksum {
+snapshot_id,
+block_index,
+block_hash,
+expected_hash,
+}
+);
+}

// Blocks of all zeroes can be omitted from the file.
-let sparse = block_data.iter().all(|&byte| byte == 0u8);
+// let sparse = block_data.iter().all(|&byte| byte == 0u8);
+let sparse = expected_hash.eq("B4VNL+8pega6gWheZgwzLeNtXRjVRpJ9MNqtbX/aFUE="); // Known checksum for a sparse 512K block.
if sparse {
if let Some(ref progress_bar) = *context.progress_bar {
progress_bar.inc(1);
@@ -355,7 +358,9 @@ impl SnapshotDownloader {
.await
.context(error::WriteFileBytes { path, count })?;

-f.flush().await.context(error::FlushFile { path })?;
+if !DISABLE_SHA {
Contributor

The formatting looks a bit off here - can we make sure to run all the code through rustfmt?

+f.flush().await.context(error::FlushFile { path })?;
+}

if let Some(ref progress_bar) = *context.progress_bar {
progress_bar.inc(1);
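
A note on the sparse-block change in the `-302,24 +303,26` hunk of src/download.rs: comparing `expected_hash` against a fixed checksum only identifies an all-zero block if the expected hash is trusted (which is harder to argue once verification is disabled) and if the block is exactly the size the constant was computed for. If the byte-by-byte scan was the cost being avoided, one alternative is to keep inspecting the data but compare it in larger windows. The sketch below is an assumption-laden illustration, not the PR's method: `is_all_zero`, `WINDOW`, and `ZEROS` are hypothetical names, and no benchmark backs the idea.

```rust
/// Size of the comparison window; an arbitrary choice for this sketch.
const WINDOW: usize = 4096;
static ZEROS: [u8; WINDOW] = [0u8; WINDOW];

/// Returns true if every byte in `block` is zero.
fn is_all_zero(block: &[u8]) -> bool {
    // Each window is compared against a zero slice of the same length
    // (byte-slice equality is typically a bulk memory comparison), and
    // `all` stops at the first window that contains a non-zero byte.
    block
        .chunks(WINDOW)
        .all(|chunk| chunk == &ZEROS[..chunk.len()])
}

fn main() {
    let mut block = vec![0u8; 512 * 1024];
    println!("{}", is_all_zero(&block)); // prints "true"
    block[300_000] = 1;
    println!("{}", is_all_zero(&block)); // prints "false"
}
```

Unlike the checksum comparison, this works for a block of any length and does not depend on the hash returned by the service.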