Parameters And Constants

Frank Seide edited this page Aug 27, 2016 · 38 revisions

ParameterTensor{}

Creates a scalar, vector, matrix, or tensor of learnable parameters.

ParameterTensor {shape,
                 init='uniform'/*|gaussian|...*/, initOutputRank=1, initValueScale=1.0, randomSeed=-1,
                 initValue=0.0, initFromFilePath='',
                 learningRateMultiplier=1.0}

Parameters

  • shape: shape (dimensions) of the parameter as an array, e.g. (13:42) to create a matrix with 13 rows and 42 columns. For some operations, dimensions given as 0 are automatically inferred (see "Automatic dimension inference" below).
  • init (default 'uniform'): specifies random initialization, e.g. init='heNormal' (see "Random initialization" below).
  • initOutputRank (default 1): specifies the number of leading fan-out axes. If negative, its absolute value is the number of trailing fan-out axes.
  • initValueScale (default 1): additional scaling factor applied to random initialization values.
  • randomSeed (default -1): if positive, use this random seed for random initialization. If negative, use a counter that is increased for each ParameterTensor{}.
  • initValue: specifies initialization with a constant value, e.g. initValue=0.
  • initFromFilePath: specifies initialization by loading initial values from a file, e.g. initFromFilePath="my_init_vals.txt".
  • learningRateMultiplier (default 1.0): the system learning rate is scaled by this factor (0 disables learning) (see "Parameter-specific learning rate" below).

Return Value

A tensor of learnable parameters.

Description

This factory function creates a scalar, vector, matrix or tensor of learnable parameters, that is, a tensor that is recognized by the "train" action as containing parameters that shall be updated during training.

The values will be initialized, depending on which optional parameter is given, to

  • random numbers, if init is given;
  • a constant if initValue is given; or
  • a tensor read from an external input file if initFromFilePath is given.

The default is init="uniform".
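
As a minimal sketch of the three initialization modes, the following BrainScript fragment creates one parameter per mode. The variable names, the dimensions, and the file name are illustrative assumptions, not values prescribed by CNTK.

```brainscript
# Hypothetical names and dimensions; each line picks one initialization mode.
W = ParameterTensor {(256:128), init='heNormal'}                   # random initialization
b = ParameterTensor {(256), initValue=0}                           # constant initialization
E = ParameterTensor {(2:3), initFromFilePath="my_init_vals.txt"}   # values read from a file
U = ParameterTensor {(256:128)}                                    # nothing given: init='uniform'
```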

To create a scalar, vector, matrix, or tensor with rank>2, pass the following as the shape parameter:

  • (1) for a scalar;
  • (M) for a column vector with M elements;
  • (1:N) for a row vector with N elements. Row vectors are one-row matrices;
  • (M:N) for a matrix with M rows and N columns; and
  • (I:J:K...) for a tensor of arbitrary rank>2 (note: the maximum allowed rank is 12).
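
The shape conventions listed above could be written as follows. This is only a sketch: the variable names and the concrete dimensions are made up for illustration.

```brainscript
# Hypothetical names; the shapes follow the conventions listed above.
s = ParameterTensor {(1)}           # scalar
v = ParameterTensor {(13)}          # column vector with 13 elements
r = ParameterTensor {(1:42)}        # row vector with 42 elements (a one-row matrix)
M = ParameterTensor {(13:42)}       # matrix with 13 rows and 42 columns
T = ParameterTensor {(3:3:16:32)}   # tensor of rank 4
```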

Automatic dimension inference

When a ParameterTensor is used for weights as an immediate input of specific operations, some of its dimensions may be specified as 0. For example, the matrix product (ParameterTensor{42:0} * x) will automatically infer the second dimension to be equal to the dimension of x.

This is extremely handy for inputs of layers, as it frees the user's BrainScript code from the burden of passing around the input dimensions. Further, in some situations it is very difficult to know the precise input dimensions of a layer, for example for the first fully connected layer on top of a pyramid of convolution/pooling combinations without padding, where each convolution and pooling operation may drop rows or columns of boundary pixels, and strides scale the dimensions.

This feature is what allows CNTK's predefined layers to be specified by their output dimension only (e.g. DenseLayer{1024}).
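
A minimal sketch of dimension inference in a hand-written layer is shown below. It assumes an input declared with Input{}, which this section does not describe, and the names x, W, b, and z are hypothetical.

```brainscript
# Assumption: x is declared with Input{}; its dimension of 784 is illustrative.
x = Input {784}
W = ParameterTensor {(42:0)}          # second dimension left as 0, to be inferred from x
b = ParameterTensor {(42), initValue=0}
z = W * x + b                         # W is an immediate input of the matrix product
```

Only the output dimension (42) appears in the parameter definitions, so the same lines would keep working if the dimension of x changed.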

Random initialization

Fan-in and fan-out for random initialization

Reading initial values from files

The initial values can be read from a text file. To do this, pass a pathname for the optional parameter initFromFilePath. The text file is expected to contain one line per matrix row, each consisting of space-separated numbers, one per column. The row and column dimensions in the file must match shape.
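
As an illustration of the expected file layout, the sketch below pairs a hypothetical two-row, three-column file with a matching shape. The file contents and the name W are invented for this example.

```brainscript
# Hypothetical contents of my_init_vals.txt for shape (2:3):
# one line per row, space-separated numbers, one per column.
#   0.1 0.2 0.3
#   0.4 0.5 0.6
W = ParameterTensor {(2:3), initFromFilePath="my_init_vals.txt"}
```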

Parameter-specific learning rate

Parameter-specific learning rates can be realized with the optional learningRateMultiplier parameter. This factor is multiplied with the actual learning rate when performing parameter updates. For example, if it is specified as 0, the parameter will not be updated, that is, it remains constant.
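
The following sketch shows how the multiplier might be used to freeze one parameter and slow down another. The names and dimensions are hypothetical.

```brainscript
# Hypothetical names and dimensions.
Wfrozen = ParameterTensor {(64:32), learningRateMultiplier=0}     # never updated during training
Wslow   = ParameterTensor {(64:32), learningRateMultiplier=0.5}   # updated with half the learning rate
Wnormal = ParameterTensor {(64:32)}                               # default multiplier of 1.0
```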

Examples

Constant{}

Create a constant tensor.

Constant {scalarValue, rows = 1, cols = 1}

Parameters

Return Value

Description

Examples