feat(glue): introduction of AWS Glue Workflow L2 Construct #31014

Status: Open. Wants to merge 19 commits into base `main`.
233 changes: 233 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/README.md
@@ -594,3 +594,236 @@ new glue.DataQualityRuleset(this, 'MyDataQualityRuleset', {
```

For more information, see [AWS Glue Data Quality](https://docs.aws.amazon.com/glue/latest/dg/glue-data-quality.html).

## Workflow

A `Workflow` is a collection of Glue jobs and crawlers that run as a directed acyclic graph (DAG). For example, to create a workflow:

```ts
new glue.Workflow(this, 'MyWorkflow', {
  workflowName: 'my_workflow',
  description: 'description',
  maxConcurrentRuns: 5,
  defaultRunProperties: {
    key1: 'value1',
    key2: 'value2',
  },
});
```
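
To make the DAG idea concrete, here is a minimal, library-free sketch that models a workflow as nodes with dependencies and computes one valid execution order. It is not part of the construct's API: `WorkflowNode` and `executionOrder` are hypothetical names used for illustration only, and Glue performs this scheduling itself.

```ts
// Hypothetical model: each node is a job or crawler that runs after its dependencies.
interface WorkflowNode {
  readonly name: string;
  readonly dependsOn: readonly string[];
}

// Kahn's algorithm: returns one valid execution order, or throws on a cycle
// or on a dependency that is not part of the graph.
function executionOrder(nodes: readonly WorkflowNode[]): string[] {
  const remaining = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const node of nodes) {
    remaining.set(node.name, node.dependsOn.length);
    dependents.set(node.name, []);
  }
  for (const node of nodes) {
    for (const dep of node.dependsOn) {
      const list = dependents.get(dep);
      if (list === undefined) {
        throw new Error(`Unknown dependency: ${dep}`);
      }
      list.push(node.name);
    }
  }

  // Start with the nodes that have no dependencies.
  const ready = nodes.filter((n) => n.dependsOn.length === 0).map((n) => n.name);
  const order: string[] = [];
  while (ready.length > 0) {
    const current = ready.shift() as string;
    order.push(current);
    for (const next of dependents.get(current) ?? []) {
      const left = (remaining.get(next) ?? 0) - 1;
      remaining.set(next, left);
      if (left === 0) {
        ready.push(next);
      }
    }
  }
  if (order.length !== nodes.length) {
    throw new Error('Workflow graph contains a cycle');
  }
  return order;
}

const order = executionOrder([
  { name: 'load', dependsOn: ['transform'] },
  { name: 'crawl', dependsOn: [] },
  { name: 'transform', dependsOn: ['crawl'] },
]);
console.log(order.join(' -> ')); // crawl -> transform -> load
```

In a real workflow the edges come from triggers: each trigger's actions are the nodes it starts, and a conditional trigger's predicates are the nodes it waits on.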

### Add Triggers

A Glue `Workflow` requires triggers to be added to the DAG. These triggers fire at different stages of the DAG, depending on the trigger type.

#### On Demand Triggers

On Demand triggers are executed manually by the user, or by a third-party service. For example, to add an On Demand trigger to a workflow:

```ts
declare const myWorkflow: glue.Workflow;
declare const myJob: glue.IJob;
declare const myCrawler: glueCfn.CfnCrawler;
declare const securityConfiguration: glue.ISecurityConfiguration;

myWorkflow.addOnDemandTrigger('OnDemandTrigger', {
  triggerName: 'on_demand_trigger',
  description: 'description',
  actions: [
    {
      job: myJob,
      delayCloudwatchEvent: cdk.Duration.minutes(5),
      arguments: {
        key1: 'value1',
        key2: 'value2',
      },
      securityConfiguration,
      timeout: cdk.Duration.minutes(10),
    },
    {
      crawler: myCrawler,
    }
  ],
});
```

#### Schedule Triggers

Schedule triggers are executed at a specified time or interval. For example, to add a Schedule trigger to a workflow:

```ts
declare const myWorkflow: glue.Workflow;
declare const myJob: glue.IJob;
declare const myCrawler: glueCfn.CfnCrawler;
declare const securityConfiguration: glue.ISecurityConfiguration;
declare const schedule: events.Schedule;

myWorkflow.addCustomScheduleTrigger('ScheduleTrigger', {
  triggerName: 'schedule_trigger',
  description: 'description',
  enabled: true,
  schedule,
  actions: [
    {
      job: myJob,
      delayCloudwatchEvent: cdk.Duration.minutes(5),
      arguments: {
        key1: 'value1',
        key2: 'value2',
      },
      securityConfiguration,
      timeout: cdk.Duration.minutes(10),
    },
    {
      crawler: myCrawler,
    }
  ],
});
```

Convenience methods are available for adding triggers with daily, weekly, and monthly schedules:

```ts
declare const myWorkflow: glue.Workflow;
declare const myJob: glue.IJob;
declare const myCrawler: glueCfn.CfnCrawler;
declare const securityConfiguration: glue.ISecurityConfiguration;

myWorkflow.addDailyScheduleTrigger('DailyScheduleTrigger', {
  triggerName: 'daily_schedule_trigger',
  description: 'description',
  enabled: true,
  actions: [
    {
      job: myJob,
      delayCloudwatchEvent: cdk.Duration.minutes(5),
      arguments: {
        key1: 'value1',
        key2: 'value2',
      },
      securityConfiguration,
      timeout: cdk.Duration.minutes(10),
    },
    {
      crawler: myCrawler,
    }
  ],
});

myWorkflow.addWeeklyScheduleTrigger('WeeklyScheduleTrigger', {
  triggerName: 'weekly_schedule_trigger',
  description: 'description',
  enabled: true,
  actions: [
    {
      job: myJob,
      delayCloudwatchEvent: cdk.Duration.minutes(5),
      arguments: {
        key1: 'value1',
        key2: 'value2',
      },
      securityConfiguration,
      timeout: cdk.Duration.minutes(10),
    },
    {
      crawler: myCrawler,
    }
  ],
});

myWorkflow.addMonthlyScheduleTrigger('MonthlyScheduleTrigger', {
  triggerName: 'monthly_schedule_trigger',
  description: 'description',
  enabled: true,
  actions: [
    {
      job: myJob,
      delayCloudwatchEvent: cdk.Duration.minutes(5),
      arguments: {
        key1: 'value1',
        key2: 'value2',
      },
      securityConfiguration,
      timeout: cdk.Duration.minutes(10),
    },
    {
      crawler: myCrawler,
    }
  ],
});
```
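
The exact schedules used by the daily, weekly, and monthly helpers are not shown in this section. As an illustration only, the sketch below builds expressions in the six-field AWS cron format, `cron(Minutes Hours Day-of-month Month Day-of-week Year)`, of the kind such helpers could plausibly produce. The `awsCron` function, the `CronFields` type, and the chosen times (midnight UTC, Sundays, the first of the month) are assumptions, not the construct's actual defaults.

```ts
// Hypothetical helper; field names and default times are assumptions for illustration.
interface CronFields {
  minute?: string;
  hour?: string;
  day?: string; // day of month
  month?: string;
  weekDay?: string; // day of week
  year?: string;
}

// In the AWS cron format, day-of-month and day-of-week cannot both carry a
// value: whichever one is not used must be '?'.
function awsCron(fields: CronFields): string {
  if (fields.day !== undefined && fields.weekDay !== undefined) {
    throw new Error("Specify either 'day' or 'weekDay', not both");
  }
  const minute = fields.minute ?? '*';
  const hour = fields.hour ?? '*';
  const day = fields.weekDay !== undefined ? '?' : (fields.day ?? '*');
  const month = fields.month ?? '*';
  const weekDay = fields.weekDay ?? '?';
  const year = fields.year ?? '*';
  return `cron(${minute} ${hour} ${day} ${month} ${weekDay} ${year})`;
}

const dailyAtMidnight = awsCron({ minute: '0', hour: '0' });
const weeklyOnSunday = awsCron({ minute: '0', hour: '0', weekDay: 'SUN' });
const monthlyOnFirst = awsCron({ minute: '0', hour: '0', day: '1' });

console.log(dailyAtMidnight); // cron(0 0 * * ? *)
console.log(weeklyOnSunday); // cron(0 0 ? * SUN *)
console.log(monthlyOnFirst); // cron(0 0 1 * ? *)
```

For any other cadence, `addCustomScheduleTrigger` with an explicit `schedule` remains the way to state the expression directly.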

#### Event Triggers

Event triggers fire once a batch of events has been received: either `batchSize` events have arrived, or the `batchWindow` has elapsed since the first event. For example, to add an Event trigger to a workflow:

```ts
declare const myWorkflow: glue.Workflow;
declare const myJob: glue.IJob;
declare const myCrawler: glueCfn.CfnCrawler;
declare const securityConfiguration: glue.ISecurityConfiguration;

myWorkflow.addNotifyEventTrigger('EventTrigger', {
  triggerName: 'event_trigger',
  description: 'description',
  batchSize: 10,
  batchWindow: cdk.Duration.minutes(5),
  actions: [
    {
      job: myJob,
      delayCloudwatchEvent: cdk.Duration.minutes(5),
      arguments: {
        key1: 'value1',
        key2: 'value2',
      },
      securityConfiguration,
      timeout: cdk.Duration.minutes(10),
    },
    {
      crawler: myCrawler,
    }
  ],
});
```
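
As a rough mental model of how `batchSize` and `batchWindow` interact, the library-free sketch below simulates a batcher that fires once enough events have arrived, or once the window that opened with the first event has elapsed. `EventBatcher` and its methods are hypothetical and only approximate the service-side behavior; timestamps are passed in explicitly so the example is deterministic.

```ts
// Hypothetical simulation of event batching; not part of the Glue construct API.
class EventBatcher {
  private readonly batchSize: number;
  private readonly batchWindowSeconds: number;
  private pending = 0;
  private windowStart: number | undefined = undefined;

  constructor(batchSize: number, batchWindowSeconds: number) {
    this.batchSize = batchSize;
    this.batchWindowSeconds = batchWindowSeconds;
  }

  // Records an event at time `now` (in seconds). Returns true if the trigger fires.
  public onEvent(now: number): boolean {
    if (this.windowStart === undefined) {
      this.windowStart = now; // the window opens with the first event
    }
    this.pending += 1;
    if (this.pending >= this.batchSize) {
      this.reset();
      return true;
    }
    return false;
  }

  // Checks the clock at time `now`. Returns true if the window has elapsed
  // while events were pending.
  public onTick(now: number): boolean {
    if (this.windowStart !== undefined && now - this.windowStart >= this.batchWindowSeconds) {
      this.reset();
      return true;
    }
    return false;
  }

  private reset(): void {
    this.pending = 0;
    this.windowStart = undefined;
  }
}

// Fires on the third event because batchSize is 3.
const bySize = new EventBatcher(3, 300);
const sizeResults = [bySize.onEvent(0), bySize.onEvent(10), bySize.onEvent(20)];
console.log(sizeResults); // [ false, false, true ]

// Fires on the clock because only one event arrived within the 300 s window.
const byWindow = new EventBatcher(3, 300);
const windowResults = [byWindow.onEvent(100), byWindow.onTick(200), byWindow.onTick(400)];
console.log(windowResults); // [ false, false, true ]
```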

#### Conditional Triggers

Conditional triggers fire when the jobs or crawlers they watch reach the specified states. For example, to add a Conditional trigger that fires after both a job and a crawler have succeeded:

```ts
declare const myWorkflow: glue.Workflow;
declare const myJob: glue.IJob;
declare const myCrawler: glueCfn.CfnCrawler;
declare const predicateJob: glue.IJob;
declare const predicateCrawler: glueCfn.CfnCrawler;
declare const securityConfiguration: glue.ISecurityConfiguration;

myWorkflow.addConditionalTrigger('ConditionalTrigger', {
  triggerName: 'conditional_trigger',
  description: 'description',
  enabled: true,
  predicateCondition: glue.TriggerPredicateCondition.AND,
  jobPredicates: [{
    job: predicateJob,
    state: glue.PredicateState.SUCCEEDED,
  }],
  crawlerPredicates: [{
    crawler: predicateCrawler,
    state: glue.PredicateState.SUCCEEDED,
  }],
  actions: [
    {
      job: myJob,
      delayCloudwatchEvent: cdk.Duration.minutes(5),
      arguments: {
        key1: 'value1',
        key2: 'value2',
      },
      securityConfiguration,
      timeout: cdk.Duration.minutes(10),
    },
    {
      crawler: myCrawler,
    }
  ],
});
```
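
The effect of `predicateCondition` can be sketched without any AWS dependency. The underlying Glue trigger predicate supports the logical operators `AND` and `ANY`; whether this construct exposes both is not shown in this section. The types and the `shouldFire` function below are hypothetical and exist only to illustrate how the two operators combine job and crawler predicates.

```ts
// Hypothetical types for illustration; they do not mirror the construct's API.
type Logical = 'AND' | 'ANY';
type RunState = 'SUCCEEDED' | 'FAILED';

interface Predicate {
  readonly target: string; // name of the watched job or crawler
  readonly state: RunState; // state the target must reach
}

// Decides whether a conditional trigger would fire, given the latest observed
// state of each watched job or crawler.
function shouldFire(
  logical: Logical,
  predicates: readonly Predicate[],
  observed: ReadonlyMap<string, RunState>,
): boolean {
  if (predicates.length === 0) {
    return false; // nothing to wait on, so nothing to fire for
  }
  const matches = (p: Predicate): boolean => observed.get(p.target) === p.state;
  return logical === 'AND' ? predicates.every(matches) : predicates.some(matches);
}

const predicates: Predicate[] = [
  { target: 'predicateJob', state: 'SUCCEEDED' },
  { target: 'predicateCrawler', state: 'SUCCEEDED' },
];
const observed = new Map<string, RunState>([
  ['predicateJob', 'SUCCEEDED'],
  ['predicateCrawler', 'FAILED'],
]);

const firesWithAnd = shouldFire('AND', predicates, observed);
const firesWithAny = shouldFire('ANY', predicates, observed);
console.log(firesWithAnd); // false: the crawler did not succeed
console.log(firesWithAny); // true: the job succeeded
```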
1 change: 1 addition & 0 deletions packages/@aws-cdk/aws-glue-alpha/lib/index.ts
@@ -14,3 +14,4 @@ export * from './security-configuration';
export * from './storage-parameter';
export * from './table-base';
export * from './table-deprecated';
export * from './workflow';