Commit

Support of P3 is added. (awslabs#9)
* Support of P3 is added.

* Regions list is updated.
b0noI authored and nswamy committed Mar 26, 2018
1 parent 48f3b18 commit d69c365
Showing 5 changed files with 45 additions and 19 deletions.
10 changes: 9 additions & 1 deletion README.md
@@ -6,6 +6,14 @@ With this template, we continue with our mission to make [distributed deep learn
## What's New?
We've updated the AWS CloudFormation Deep Learning template to add some exciting new features and capabilities.

+### Mar 22 2018
+
+* We now support 10 AWS regions - us-east-1, us-west-2, eu-west-1, us-east-2, ap-southeast-2, ap-northeast-1, ap-northeast-2, ap-south-1, eu-central-1, ap-southeast-1.
+
+* We now support p3 instances.
+
+### Older Release Notes
+
* We now support 5 AWS regions - us-east-1, us-east-2, us-west-2, eu-west-1 and ap-southeast-2.

* We've enhanced the AWS CloudFormation Deep Learning template with automation that continues stack creation even if the provisioned number of worker instances falls short of the desired count. In the previous version of the template, if one of the worker instances failed to be provisioned, for example, if it hit an account limit, AWS CloudFormation rolled back the stack and required you to adjust your desired count and restart the stack creation process. The new template includes a function that automatically adjusts the count down and proceeds with setting up the rest of the cluster (stack).
@@ -17,7 +25,7 @@ We've updated the AWS CloudFormation Deep Learning template to add some exciting
* Amazon EFS allows sharing of code, data, and results across worker instances.
* Using Amazon EFS doesn't degrade performance for densely packed files (for example, .rec files containing image data).

-* We now support creating a cluster of instances running Ubuntu. See the [Ubuntu Deep Learning AMI](https://aws.amazon.com/marketplace/pp/B06VSPXKDX).
+* We now support creating a cluster of instances running Ubuntu. See the [Ubuntu Deep Learning AMI](https://aws.amazon.com/marketplace/pp/B076TGJHY1).

## EC2 Cluster Architecture
The following architecture diagram shows the EC2 cluster infrastructure.
4 changes: 2 additions & 2 deletions cfn-bootstrap/dl_cfn_setup.py
@@ -47,7 +47,7 @@
AWS_DL_DEFAULT_USER = None
EFS_MOUNT = None

-AWS_GPU_INSTANCE_TYPES = [ "g2.2xlarge", "g2.8xlarge", "p2.xlarge", "p2.8xlarge", "p2.16xlarge" ]
+AWS_GPU_INSTANCE_TYPES = [ "g3.4xlarge", "g3.8xlarge", "g3.16xlarge", "p2.xlarge", "p2.8xlarge", "p2.16xlarge", "p3.2xlarge", "p3.8xlarge", "p3.16xlarge" ]

'''
Setup Logger and LogLevel
@@ -433,4 +433,4 @@ def main():
sys.exit(1)

if __name__ =='__main__':
main()
main()
2 changes: 1 addition & 1 deletion cfn-bootstrap/dl_cfn_setup_v2.py
@@ -48,7 +48,7 @@
EFS_MOUNT = None
CFN_PATH = None

-AWS_GPU_INSTANCE_TYPES = [ "g2.2xlarge", "g2.8xlarge", "p2.xlarge", "p2.8xlarge", "p2.16xlarge" ]
+AWS_GPU_INSTANCE_TYPES = [ "g3.4xlarge", "g3.8xlarge", "g3.16xlarge", "p2.xlarge", "p2.8xlarge", "p2.16xlarge", "p3.2xlarge", "p3.8xlarge", "p3.16xlarge" ]

'''
Setup Logger and LogLevel
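Both bootstrap scripts now carry the same GPU list. The diff does not show how the scripts consume it, so the following is only an illustration: a membership test against the new value, using a hypothetical helper named `is_gpu_instance` that is not part of the repository.

```python
# New value of the list, copied from the added line in this commit.
AWS_GPU_INSTANCE_TYPES = [
    "g3.4xlarge", "g3.8xlarge", "g3.16xlarge",
    "p2.xlarge", "p2.8xlarge", "p2.16xlarge",
    "p3.2xlarge", "p3.8xlarge", "p3.16xlarge",
]


def is_gpu_instance(instance_type):
    """Hypothetical helper: True if the instance type is in the GPU list."""
    return instance_type.strip().lower() in AWS_GPU_INSTANCE_TYPES


if __name__ == "__main__":
    # Print the classification for one new, one dropped, and one CPU type.
    for name in ("p3.2xlarge", "g2.2xlarge", "t2.small"):
        print(name, is_gpu_instance(name))
```

Under this list, `g2.2xlarge` and `g2.8xlarge` no longer match, although the template's `AllowedValues` in this commit still lists the g2 types.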
2 changes: 1 addition & 1 deletion cfn-template/StackSetup.md
@@ -30,7 +30,7 @@ If you need to scale the number of instances beyond the [default limit](https://

7. Choose an **ImageType**, Amazon Linux or Ubuntu.

-8. Choose an **InstanceType**, such as [P2.16xlarge](https://aws.amazon.com/ec2/instance-types/p2/).
+8. Choose an **InstanceType**, such as [p3.2xlarge](https://aws.amazon.com/ec2/instance-types/p3/).

9. For **KeyName**, choose an EC2 key pair.

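The choices in steps 7 to 9 correspond to stack parameters. A minimal sketch, assuming the stack is created programmatically rather than through the console: only the parameter names `ImageType`, `InstanceType` and `KeyName` come from the steps above, while the helper `build_parameters` and the key pair name `my-key-pair` are hypothetical.

```python
def build_parameters(image_type, instance_type, key_name):
    """Build the ParameterKey/ParameterValue list that the
    CloudFormation CreateStack API accepts for stack parameters."""
    values = {
        "ImageType": image_type,
        "InstanceType": instance_type,
        "KeyName": key_name,
    }
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in values.items()
    ]


# "my-key-pair" is a placeholder; substitute an existing EC2 key pair.
params = build_parameters("Ubuntu", "p3.2xlarge", "my-key-pair")

# Assumption: with boto3 installed and credentials configured, this list
# could be passed as the Parameters argument of the CloudFormation
# client's create_stack call. That call is left out so the sketch runs offline.
if __name__ == "__main__":
    for entry in params:
        print(entry["ParameterKey"], "=", entry["ParameterValue"])
```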
46 changes: 32 additions & 14 deletions cfn-template/deeplearning.template
@@ -15,11 +15,14 @@
"InstanceType" : {
"Description" : "The EC2 instance type for workers.For GPUs choose g2.xx or p2.xx",
"Type" : "String",
-"Default" : "p2.xlarge",
+"Default" : "p3.2xlarge",
"AllowedValues" : [
"p2.16xlarge",
"p2.8xlarge",
"p2.xlarge",
+"p3.2xlarge",
+"p3.8xlarge",
+"p3.16xlarge",
"g2.8xlarge",
"g2.2xlarge",
"t2.small",
@@ -107,18 +110,28 @@
},
"Mappings" : {
"AmazonLinux" : {
-"us-east-1" : { "AMI" : "ami-4b44745d" },
-"us-east-2" : { "AMI" : "ami-305d7c55" },
-"us-west-2" : { "AMI" : "ami-296e7850" },
-"eu-west-1" : { "AMI" : "ami-d36386aa" },
-"ap-southeast-2" : { "AMI" : "ami-52332031" }
+"us-east-1" : { "AMI" : "ami-9706e5ea" },
+"us-west-2" : { "AMI" : "ami-dc70ffa4" },
+"eu-west-1" : { "AMI" : "ami-8caad3f5" },
+"us-east-2" : { "AMI" : "ami-f4586f91" },
+"ap-southeast-2" : { "AMI" : "ami-bbd710d9" },
+"ap-northeast-1" : { "AMI" : "ami-5ba3d93d" },
+"ap-northeast-2" : { "AMI" : "ami-d0d67bbe" },
+"ap-south-1" : { "AMI" : "ami-359ec25a" },
+"eu-central-1" : { "AMI" : "ami-ca3351a5" },
+"ap-southeast-1" : { "AMI" : "ami-ded39da2" }
},
"Ubuntu" : {
-"us-east-1" : { "AMI" : "ami-2edccb38" },
-"us-east-2" : { "AMI" : "ami-2797b642" },
-"us-west-2" : { "AMI" : "ami-7fd7c906" },
-"eu-west-1" : { "AMI" : "ami-19896660" },
-"ap-southeast-2" : { "AMI" : "ami-b32b37d0" }
+"us-east-1" : { "AMI" : "ami-173bd86a" },
+"us-west-2" : { "AMI" : "ami-5a77f822" },
+"eu-west-1" : { "AMI" : "ami-2fb0c956" },
+"us-east-2" : { "AMI" : "ami-295b6c4c" },
+"ap-southeast-2" : { "AMI" : "ami-64d51206" },
+"ap-northeast-1" : { "AMI" : "ami-bcafd5da" },
+"ap-northeast-2" : { "AMI" : "ami-1ad17c74" },
+"ap-south-1" : { "AMI" : "ami-959fc3fa" },
+"eu-central-1" : { "AMI" : "ami-3a254755" },
+"ap-southeast-1" : { "AMI" : "ami-63d9971f" }
},
"SubnetConfig" : {
"VPC" : { "CIDR" : "10.0.0.0/16" },
@@ -127,10 +140,15 @@
},
"S3" : {
"us-east-1" : { "URL" : "https://s3.amazonaws.com/" },
-"us-east-2" : { "URL" : "https://s3-us-east-2.amazonaws.com/" },
"us-west-2" : { "URL" : "https://s3-us-west-2.amazonaws.com/" },
"eu-west-1" : { "URL" : "https://s3-eu-west-1.amazonaws.com/" },
-"ap-southeast-2" : { "URL" : "https://s3-ap-southeast-2.amazonaws.com/" }
+"us-east-2" : { "URL" : "https://s3-us-east-2.amazonaws.com/" },
+"ap-southeast-2" : { "URL" : "https://s3-ap-southeast-2.amazonaws.com/" },
+"ap-northeast-1" : { "URL" : "https://s3-ap-northeast-1.amazonaws.com/" },
+"ap-northeast-2" : { "URL" : "https://s3-ap-northeast-2.amazonaws.com/" },
+"ap-south-1" : { "URL" : "https://s3-ap-south-1.amazonaws.com/" },
+"eu-central-1" : { "URL" : "https://s3-eu-central-1.amazonaws.com/" },
+"ap-southeast-1" : { "URL" : "https://s3-ap-southeast-1.amazonaws.com/" }
},
"Other" : {
"S3SourceBucket" : { "BucketNameSuffix" : "-aws-dl-cfn" },
@@ -150,7 +168,7 @@
"Role": { "Fn::GetAtt" : ["LambdaExecutionRole", "Arn"] },
"Code": {
"S3Bucket": {"Fn::Join" : ["", [{ "Ref" : "AWS::Region" }, { "Fn::FindInMap" : [ "Other", "S3SourceBucket", "BucketNameSuffix" ]} ] ]},
-"S3Key": { "Fn::FindInMap" : [ "Other", "LambdaFunction", "FileName" ]},
+"S3Key": { "Fn::FindInMap" : [ "Other", "LambdaFunction", "FileName" ]}
},
"MemorySize" : "256",
"Timeout": "60",
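The Lambda code location above is resolved through the mappings: `Fn::Join` concatenates the region with the `BucketNameSuffix` value returned by `Fn::FindInMap`. The sketch below is a rough Python emulation of that lookup, using a few entries copied from this diff. The names `find_in_map` and `source_bucket` are hypothetical and belong neither to the template nor to any AWS SDK.

```python
# A small subset of the template's Mappings section, copied from the diff.
MAPPINGS = {
    "Ubuntu": {
        "us-east-1": {"AMI": "ami-173bd86a"},
        "eu-central-1": {"AMI": "ami-3a254755"},
    },
    "S3": {
        "us-east-1": {"URL": "https://s3.amazonaws.com/"},
        "ap-south-1": {"URL": "https://s3-ap-south-1.amazonaws.com/"},
    },
    "Other": {
        "S3SourceBucket": {"BucketNameSuffix": "-aws-dl-cfn"},
    },
}


def find_in_map(mappings, map_name, top_key, second_key):
    """Emulate Fn::FindInMap: a two-level lookup inside a named mapping."""
    return mappings[map_name][top_key][second_key]


def source_bucket(region):
    """Emulate the Fn::Join with an empty delimiter used for S3Bucket."""
    suffix = find_in_map(MAPPINGS, "Other", "S3SourceBucket", "BucketNameSuffix")
    return "".join([region, suffix])


if __name__ == "__main__":
    print(source_bucket("ap-south-1"))
    print(find_in_map(MAPPINGS, "Ubuntu", "eu-central-1", "AMI"))
```

In this emulation a region missing from a mapping raises `KeyError`, which loosely mirrors why each newly supported region needs an entry in every region-keyed mapping.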
