OptimizedRNNStack

Implements the optimized CuDNN5 RNN stack of one or more recurrent network layers.

OptimizedRNNStack (weights, input,
                   hiddenDims, numLayers = 1,
                   bidirectional = false,
                   recurrentOp='lstm')

Parameters

  • weights: a single weight matrix containing all model parameters. Use dimension inference (see the example below).
  • input: data to apply the stack of one or more recurrent networks to. Must be a sequence, and must not be sparse.
  • hiddenDims: dimension of the hidden state in each layer and, if bidirectional, of each of the two directions
  • numLayers (default: 1): number of layers
  • bidirectional (default: false): if true, the model is bidirectional
  • recurrentOp (default: lstm): select the RNN type. Allowed values: lstm, gru, rnnTanh, rnnReLU
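
For concreteness, here is a minimal sketch of a call with every parameter spelled out. The input dimension (80), hidden dimension (256), and variable names are illustrative assumptions, not values prescribed by the API:

x = Input {80}                                                 # input must be a non-sparse sequence
W = ParameterTensor {(Inferred:Inferred), initOutputRank=-1}   # all parameters in one matrix; dimensions inferred
g = OptimizedRNNStack (W, x, 256, numLayers=1, bidirectional=false, recurrentOp='gru')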

Description

This function gives access to the CuDNN5 RNN, which implements a stack of one or more layers of recurrent networks. The networks can be uni- or bidirectional, and of the following kinds (selected by the recurrentOp parameter):

  • lstm: Long Short-Term Memory (Hochreiter and Schmidhuber)
  • gru: Gated Recurrent Unit
  • rnnTanh: plain RNN with a tanh non-linearity
  • rnnReLU: plain RNN with a rectified linear non-linearity

If you use the lstm operation, we recommend using this primitive through RecurrentLSTMLayerStack{}; see the sketch below.
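
For reference, a minimal sketch of the layers-library route, assuming RecurrentLSTMLayerStack{} accepts a colon-separated list of per-layer output dimensions (its exact signature is documented in the CNTK Layers Reference):

h = RecurrentLSTMLayerStack {(512 : 512 : 512)} (features)   # assumed form: 3-layer unidirectional LSTM stack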

Example

Speech-recognition model consisting of a 3-layer bidirectional LSTM with a hidden-state dimension of 512 per layer and direction:

features = Input {40}                                                               # 40-dimensional acoustic feature vectors
W = ParameterTensor {(Inferred:Inferred), initOutputRank=-1, initValueScale=1/10}   # all RNN parameters in one matrix; dimensions inferred
h = OptimizedRNNStack (W, features, 512, numLayers=3, bidirectional=true)
p = DenseLayer {9000, activation=Softmax, init='heUniform', initValueScale=1/3} (h)  # apply the layer to h
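
Note that W is declared with both dimensions Inferred; CNTK deduces the total parameter count from hiddenDims, numLayers, bidirectional, and the input's dimension. Because the model is bidirectional, the output h concatenates the forward and backward states, giving a dimension of 2 × 512 = 1024 per frame.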