Kinesis Streams

There once was a Kinesis readable stream without a home, and a Kinesis writable stream without a home, so now they're roommates.

NOTE: Kinesis was a bad idea, and we're switching to Kafka. So I won't be using my own library for much longer.

Installing

npm install kinesis-streams

Writable stream

const AWS = require('aws-sdk')
const { KinesisWritable } = require('kinesis-streams')
const client = new AWS.Kinesis()
client.config.update({ maxRetries: 10 })
const writable = new KinesisWritable(client, 'streamName', options)
inputStream.pipe(writable)

Options

  • options.logger (optional) bunyan, winston, or any logger with debug, error, and info methods
  • options.highWaterMark (default: 16) Buffer this many records before writing to Kinesis. Equivalent to CollectionMaxCount
  • options.wait (default: 500) Flush buffered records at least this often, in milliseconds. Equivalent to RecordMaxBufferedTime

Some of these options have equivalents in the official KPL.

Custom events

These events are emitted:

  • kinesis.putRecords Fires after records are put and the response is processed. You'll get the original response from AWS. See demo.js for an example of how to interpret it

      writable.on('kinesis.putRecords', (response: {FailedRecordCount: number, Records: Record[]}) => {})
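
As a rough illustration of interpreting that response, the sketch below counts failures and collects their error codes. It relies only on the documented shape of a PutRecords response, where failed entries carry ErrorCode and ErrorMessage and successful ones carry SequenceNumber and ShardId. The helper name summarizePutRecords and the sample response are made up for this example and are not part of kinesis-streams.

```javascript
// Hypothetical helper, not part of kinesis-streams.
// Walks the Records array of a PutRecords response and collects the entries
// that failed (those carrying an ErrorCode), keeping their position so they
// can be matched back to the records that were sent.
function summarizePutRecords (response) {
  const failures = []
  response.Records.forEach((record, index) => {
    if (record.ErrorCode) {
      failures.push({ index, code: record.ErrorCode, message: record.ErrorMessage })
    }
  })
  return {
    total: response.Records.length,
    failed: response.FailedRecordCount,
    failures
  }
}

// Hand-written sample response, for illustration only.
const exampleResponse = {
  FailedRecordCount: 1,
  Records: [
    { SequenceNumber: '49545115243490985018280067714973144582180062593244200961', ShardId: 'shardId-000000000000' },
    { ErrorCode: 'ProvisionedThroughputExceededException', ErrorMessage: 'Rate exceeded for shard' }
  ]
}
const summary = summarizePutRecords(exampleResponse)
```

A listener could then be attached with something like writable.on('kinesis.putRecords', (response) => console.log(summarizePutRecords(response))).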
    

Setting the partition key

By default, the partition key is set to a dummy value, '0'. If you have multiple shards, you need to set a partition key in a way that makes sense for your data. Here are two ways to do this:

  1. Set the getPartitionKey method of the writable stream instance:
const AWS = require('aws-sdk')
const { KinesisWritable } = require('kinesis-streams')
const client = new AWS.Kinesis()
const writable = new KinesisWritable(client, 'streamName', options)
writable.getPartitionKey = (data) => data.foo.substr(5)
inputStream.pipe(writable)
  2. Subclass KinesisWritable and provide your own getPartitionKey. See the source for reference.
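
Either approach needs a key function. Here is a minimal sketch of one, assuming each record is an object with a userId field (a hypothetical field name, so substitute whatever identifies related records in your data). Records that share a key land on the same shard. Records without the field get a random key so they spread across shards.

```javascript
const crypto = require('crypto')

// Hypothetical key function: `userId` is an assumed field name.
function getPartitionKey (data) {
  if (data && data.userId != null) {
    // Kinesis caps partition keys at 256 characters.
    return String(data.userId).slice(0, 256)
  }
  // No natural key: fall back to a random 32-character hex string.
  return crypto.randomBytes(16).toString('hex')
}
```

It can be assigned directly, as in writable.getPartitionKey = getPartitionKey, or used as the method body in a subclass.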

Readable stream

const AWS = require('aws-sdk')
const { KinesisReadable } = require('kinesis-streams')
const client = new AWS.Kinesis()
const reader = new KinesisReadable(client, streamName, options)
reader.pipe(yourDestinationHere)

Options

  • options.logger (optional) bunyan, winston, or any logger with debug, error, and info methods

  • options.interval: number (default: 2000) Milliseconds between each Kinesis read. The AWS limit is 5 reads / second / shard

  • options.parser: Function If this is set, this function is applied to the data. Example:

      const reader = new KinesisReadable(client, streamName, {parser: JSON.parse})
      reader.on('data', (data) => console.log(data.id))
    
  • options.restartOnClose: boolean (default: false) Rediscover new shards once all current shards have been closed

  • And any getShardIterator parameter
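
Passing JSON.parse directly as the parser will throw on a malformed record. A more defensive parser might look like the sketch below. It assumes the parser receives the raw record payload as a Buffer or a string, which is worth checking against the source. The function name safeJsonParser and the shape of the fallback object are made up for this example.

```javascript
// Hypothetical parser, not part of kinesis-streams.
// Assumes the payload arrives as a Buffer or a string.
function safeJsonParser (payload) {
  const text = Buffer.isBuffer(payload) ? payload.toString('utf8') : String(payload)
  try {
    return JSON.parse(text)
  } catch (err) {
    // Return a marker object instead of throwing, so one bad record
    // does not interrupt the consumer.
    return { parseError: err.message, raw: text }
  }
}
```

It would be passed the same way as in the option example: new KinesisReadable(client, streamName, {parser: safeJsonParser}).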

Custom events

These events are emitted:

  • checkpoint This fires when data is received so you can keep track of the last sequence number that was read successfully:

      reader.on('checkpoint', (sequenceNumber: string) => {})
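
One way to use the checkpoint is to persist it and resume from it on restart. The sketch below stores the sequence number in a file and turns it back into getShardIterator parameters (ShardIteratorType and StartingSequenceNumber), which the reader accepts as options. The helper names and the file-based storage are assumptions for illustration. Sequence numbers are per shard, so this only makes sense as written for a single-shard stream.

```javascript
const fs = require('fs')

// Hypothetical helpers, not part of kinesis-streams.
function saveCheckpoint (file, sequenceNumber) {
  fs.writeFileSync(file, sequenceNumber, 'utf8')
}

function loadCheckpoint (file) {
  try {
    return fs.readFileSync(file, 'utf8').trim() || null
  } catch (err) {
    if (err.code === 'ENOENT') return null // no checkpoint saved yet
    throw err
  }
}

// Builds getShardIterator parameters from a saved sequence number.
// With no checkpoint, returns {} so the reader's defaults apply.
function iteratorOptions (sequenceNumber) {
  if (!sequenceNumber) return {}
  return {
    ShardIteratorType: 'AFTER_SEQUENCE_NUMBER',
    StartingSequenceNumber: sequenceNumber
  }
}
```

A consumer could build its options with iteratorOptions(loadCheckpoint(file)) at startup and call saveCheckpoint(file, sequenceNumber) from the checkpoint listener.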
    

Loggers

KinesisWritable and KinesisReadable both take an optional logger option. If this is omitted, the debug logger will be used instead. To see output, set DEBUG=kinesis-streams:* in your environment.
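
If you don't use bunyan or winston, any object with debug, info, and error methods should do. Here is a minimal sketch of one that writes to the console. The factory name createConsoleLogger and the prefix format are made up for this example, and the sink parameter exists only so the output can be redirected in tests.

```javascript
// Hypothetical logger factory, not part of kinesis-streams.
// Returns an object exposing the three methods the streams expect.
function createConsoleLogger (prefix, sink = console) {
  const format = (level, args) => [`[${prefix}] ${level}:`, ...args]
  return {
    debug: (...args) => sink.log(...format('debug', args)),
    info: (...args) => sink.log(...format('info', args)),
    error: (...args) => sink.error(...format('error', args))
  }
}
```

It would be passed as {logger: createConsoleLogger('kinesis')} in the options of either stream.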

Prior art

The writable stream is based on the interface of kinesis-write-stream. The checkpoint event in readable stream is based on kinesis-readable. The readable stream was originally written as a proof of concept in kinesis-console-consumer.

kinesis-write-stream was forked because at the time, it didn't support periodic flushes. Since then, the configuration of the readable and writable streams has been rewritten to be consistent, and both now emit lots of events that consumers can use for instrumentation.

License

This package is licensed under the Apache License 2.0. The tests/writable.spec.js and test/fixture/* files originally come from kinesis-write-stream, which is MIT licensed by Espen Volden.