---
subcategory: "Deployment"
---

# databricks_aws_bucket_policy Data Source
This data source generates a simple access policy for an AWS S3 bucket, so that Databricks can access data in it:
```hcl
resource "aws_s3_bucket" "this" {
  bucket        = "<unique_bucket_name>"
  force_destroy = true
}

data "databricks_aws_bucket_policy" "this" {
  bucket = aws_s3_bucket.this.bucket
}

resource "aws_s3_bucket_policy" "this" {
  bucket = aws_s3_bucket.this.id
  policy = data.databricks_aws_bucket_policy.this.json
}
```
Bucket policy with full access:
```hcl
resource "aws_s3_bucket" "ds" {
  bucket        = "${var.prefix}-ds"
  force_destroy = true
  tags = merge(var.tags, {
    Name = "${var.prefix}-ds"
  })
}

resource "aws_s3_bucket_versioning" "ds_versioning" {
  bucket = aws_s3_bucket.ds.id
  versioning_configuration {
    status = "Disabled"
  }
}

data "aws_iam_policy_document" "assume_role_for_ec2" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]
    principals {
      identifiers = ["ec2.amazonaws.com"]
      type        = "Service"
    }
  }
}

resource "aws_iam_role" "data_role" {
  name               = "${var.prefix}-first-ec2s3"
  description        = "(${var.prefix}) EC2 Assume Role role for S3 access"
  assume_role_policy = data.aws_iam_policy_document.assume_role_for_ec2.json
  tags               = var.tags
}

data "databricks_aws_bucket_policy" "ds" {
  provider         = databricks.mws
  full_access_role = aws_iam_role.data_role.arn
  bucket           = aws_s3_bucket.ds.bucket
}

// allow databricks to access this bucket
resource "aws_s3_bucket_policy" "ds" {
  bucket = aws_s3_bucket.ds.id
  policy = data.databricks_aws_bucket_policy.ds.json
}
```
## Argument Reference

* `bucket` - (Required) AWS S3 Bucket name for which to generate the policy document.
* `aws_partition` - (Optional) AWS partition. The options are `aws` or `aws-us-gov`. Defaults to `aws`.
* `full_access_role` - (Optional) Data access role that can have full access for this bucket.
* `databricks_e2_account_id` - (Optional) Your Databricks account ID. Used to generate restrictive IAM policies that will increase the security of your root bucket.
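The optional arguments can be combined. As a sketch (the account ID below is a placeholder, not a real value), pinning the partition and restricting the generated policy to a single Databricks account might look like:

```hcl
data "databricks_aws_bucket_policy" "restricted" {
  bucket           = aws_s3_bucket.ds.bucket
  full_access_role = aws_iam_role.data_role.arn
  aws_partition    = "aws"

  # Placeholder value; replace with your own Databricks account ID
  # to tighten the bucket policy to just that account.
  databricks_e2_account_id = "00000000-0000-0000-0000-000000000000"
}
```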
## Attribute Reference

In addition to all arguments above, the following attributes are exported:

* `json` - (Read-only) AWS IAM policy JSON document granting Databricks full access to the bucket.
## Related Resources

The following resources are used in the same context:

* Provisioning AWS Databricks workspaces with a Hub & Spoke firewall for data exfiltration protection guide.
* End to end workspace management guide.
* `databricks_instance_profile` to manage AWS EC2 instance profiles with which users can launch `databricks_cluster` and access data, for example through `databricks_mount`.
* `databricks_mount` to mount your cloud storage on `dbfs:/mnt/name`.