The Kinesis Scaling Utility is designed to give you the ability to scale Amazon Kinesis streams in the same way that you scale EC2 Auto Scaling groups – up or down by a count or as a percentage of the total count. You can also simply scale to an exact number of Shards. There is no requirement for you to manage the allocation of the keyspace to Shards when using this API, as it is done automatically.
You can also deploy the Web Archive (WAR) to a Java application server, and allow the Scaling Utility to automatically manage the number of Shards in the Stream based on the observed PUT or GET rate of the stream.
##Manually Managing your Stream##
You can manually scale a stream from the command line by calling the ScalingClient with the following syntax:
```
java -cp KinesisScalingUtils-complete.jar -Dstream-name=MyStream -Dscaling-action=scaleUp -Dcount=10 -Dregion=eu-west-1 ScalingClient
```
The command line client takes several options, specified with `-Doption-name=option-value`:

Options:

- `stream-name` - The name of the stream to be scaled
- `scaling-action` - The action to be taken to scale. Must be one of `scaleUp`, `scaleDown` or `resize`
- `count` - Number of shards by which to absolutely scale up or down, or the target number of shards if the scaling action is `resize`
- `pct` - Percentage of the existing number of shards by which to scale up or down
- `region` - The AWS region where the stream exists, such as `us-east-1` or `eu-west-1` (default `us-east-1`)
- `shard-id` - The shard which you want to target for scaling. NOTE: this will create an imbalanced partitioning of the keyspace and is not required.

The client must be invoked with one of `count` or `pct`, as shown in the examples below.
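For example, using the same client and the options described above, a scale-down by percentage and a resize to an exact Shard count might look like this (the stream name, values and region are illustrative):

```
# Scale the stream down by 25% of its current Shard count
java -cp KinesisScalingUtils-complete.jar -Dstream-name=MyStream -Dscaling-action=scaleDown -Dpct=25 -Dregion=eu-west-1 ScalingClient

# Resize the stream to exactly 6 Shards
java -cp KinesisScalingUtils-complete.jar -Dstream-name=MyStream -Dscaling-action=resize -Dcount=6 -Dregion=eu-west-1 ScalingClient
```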
You can also integrate the `StreamScaler` class with existing control systems.
##Kinesis Autoscaling##
The Kinesis Autoscaling WAR can be deployed as an Elastic Beanstalk application, or to any Java application server. Once configured, it will monitor the CloudWatch statistics for your Stream and scale up and down as you configure it, keeping adequate Shard capacity available to deal with PUT or GET demand.
To get started, create a new Elastic Beanstalk application as a Web Server with the Tomcat predefined configuration. Deploy the WAR by uploading it from your local build; see below for build instructions.
Once deployed, you must configure the autoscaling engine by providing a JSON configuration file on an HTTP or S3 URL. The structure of this configuration file is as follows:
```
[streamMonitor1, streamMonitor2 ... streamMonitorN]
```

A `streamMonitor` object defines the autoscaling policy applied to a Kinesis stream, and this array allows a single autoscaling application to monitor multiple streams. A `streamMonitor` object is configured by:
```
{
    "streamName": "String - name of the Stream to be Monitored",
    "region": "String - a valid AWS Region Code, such as us-east-1 or eu-west-1",
    "scaleOnOperation": "String - the type of metric to be monitored, either PUT or GET. Both PutRecord and PutRecords are monitored with PUT",
    "scaleUp": {
        "scaleThresholdPct": Integer - at what threshold we should scale up,
        "scaleAfterMins": Integer - how many minutes above the scaleThresholdPct we should wait before scaling up,
        "scaleCount": Integer - number of Shards to scale up by,
        "scalePct": Integer - % of current Stream capacity to scale up by
    },
    "scaleDown": {
        "scaleThresholdPct": Integer - at what threshold we should scale down,
        "scaleAfterMins": Integer - how many minutes below the scaleThresholdPct we should wait before scaling down,
        "scaleCount": Integer - number of Shards to scale down by,
        "scalePct": Integer - % of current Stream capacity to scale down by,
        "coolOffMins": Integer - number of minutes to wait after a Stream scale down before we scale down again
    }
}
```
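For example, a configuration file that monitors two streams might look like the following; the stream names, thresholds and scaling amounts are illustrative only:

```
[
    {
        "streamName": "orders-stream",
        "region": "eu-west-1",
        "scaleOnOperation": "PUT",
        "scaleUp": { "scaleThresholdPct": 80, "scaleAfterMins": 10, "scalePct": 25 },
        "scaleDown": { "scaleThresholdPct": 25, "scaleAfterMins": 60, "scaleCount": 1, "coolOffMins": 60 }
    },
    {
        "streamName": "clickstream-stream",
        "region": "us-east-1",
        "scaleOnOperation": "GET",
        "scaleUp": { "scaleThresholdPct": 75, "scaleAfterMins": 5, "scaleCount": 2 },
        "scaleDown": { "scaleThresholdPct": 20, "scaleAfterMins": 120, "scalePct": 10, "coolOffMins": 120 }
    }
]
```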
Once you've built the required Autoscaling configuration, save it to an HTTP file server or to Amazon S3. Then open your Elastic Beanstalk application, select 'Configuration' from the left-hand navigation menu, open the 'Software Configuration' panel, and add a new configuration item called 'config-file-url' that points to the URL of the configuration file. Acceptable formats are 'http://path to file' or 's3://bucket/path to file'. Save the configuration, and then check the application logs for correct operation.
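If you prefer to script that console step, the same 'config-file-url' value can typically be set as an Elastic Beanstalk environment property with the AWS CLI; the environment name and URL below are placeholders:

```
aws elasticbeanstalk update-environment \
    --environment-name my-autoscaling-env \
    --option-settings Namespace=aws:elasticbeanstalk:application:environment,OptionName=config-file-url,Value=s3://config-bucket-goes-here/config-file-name.json
```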
When deploying the application, it must have access to Kinesis, CloudWatch and S3, as well as to SNS if you are using the SNS scaling notifications. A policy such as the following grants the required permissions:
```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kinesis:Get*",
                "kinesis:List*",
                "kinesis:Describe*",
                "kinesis:MergeShards",
                "kinesis:SplitShard"
            ],
            "Resource": "*"
        },
        {
            "Action": [
                "sns:Publish"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:sns:us-east-1:01234567890:scaling-test-notification"
        },
        {
            "Action": [
                "cloudwatch:GetMetricStatistics"
            ],
            "Effect": "Allow",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::config-bucket-goes-here/*"
        }
    ]
}
```
To receive SNS notifications when scaling actions are executed, provide the ARN of an SNS topic in the scaling configuration object, as below:
```
{
    "streamName": "scaling-test-stream",
    "region": "us-east-1",
    "scaleOnOperation": "PUT",
    "scaleUp": {
        "scaleThresholdPct": 80,
        "scaleAfterMins": 10,
        "scalePct": 20
    },
    "scaleDown": {
        "scaleThresholdPct": 20,
        "scaleAfterMins": 60,
        "scaleCount": 1,
        "coolOffMins": 120
    },
    "snsArn": "arn:aws:sns:us-east-1:01234567890:scaling-test-notification"
}
```
The topic will receive a notification upon the completion of each scaling action. For this feature to work, the application must have the `sns:Publish` permission for the provided SNS topic.
To build the project locally, run:
```
$ mvn war:war
```

The WAR at `target/kinesis-scaling-utils-<version>.war` can be deployed to S3 and used with Elastic Beanstalk or elsewhere.
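For example, the built WAR could be uploaded to S3 with the AWS CLI; the bucket name below is a placeholder:

```
aws s3 cp target/kinesis-scaling-utils-<version>.war s3://your-deployment-bucket/
```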
To run the project locally, run:
```
$ mvn jetty:run -Dconfig-file-url=s3://config-bucket-goes-here/config-file-name.json
```

This will start the application locally. You will need to have AWS credentials set as environment variables or in `~/.aws/config` for it to work correctly.
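For example, credentials could be supplied through the standard AWS environment variables before starting the local server; the values below are placeholders:

```
export AWS_ACCESS_KEY_ID=your-access-key-id
export AWS_SECRET_ACCESS_KEY=your-secret-access-key
mvn jetty:run -Dconfig-file-url=s3://config-bucket-goes-here/config-file-name.json
```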