API Documentation¶

This is a construction site...

Network¶

class brainstorm.structure.network.Network(layers, buffer_manager, architecture, seed=None, handler=<brainstorm.handlers.numpy_handler.NumpyHandler object>)[source]¶

backward_pass()[source]¶: Perform a backward pass on all provided data and targets.

Note

All targets used during this backward pass must be passed to the network beforehand using provide_external_data(). The backward pass also depends on the internal state produced by a forward pass, so always run forward_pass() first.

forward_pass(training_pass=False, context=None)[source]¶

Perform a forward pass on all the provided data.

Note

All input data used during this forward pass must be passed to the network beforehand using provide_external_data().

Parameters:

training_pass (Optional[bool]) – Indicates whether this forward pass belongs to training or not. This might change the behaviour of some layers.
context (Optional[dict]) – An optional network state as created by net.get_context(). If provided, the network will treat this as if it were the state of the network at time step t=-1. This is useful for continuing the computations of a recurrent neural network. Defaults to None.
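The notes on forward_pass and backward_pass imply a fixed call order: provide the data, run the forward pass, then run the backward pass. The toy class below is a hypothetical stand-in, not brainstorm's Network, and performs no real computation. It only illustrates that ordering contract.

```python
class ToyNetwork:
    """Hypothetical stand-in that enforces the documented call order."""

    def __init__(self):
        self._data = None
        self._forward_done = False

    def provide_external_data(self, data):
        # In this toy, new data discards the state of any earlier forward pass.
        self._data = dict(data)
        self._forward_done = False

    def forward_pass(self, training_pass=False, context=None):
        if self._data is None:
            raise RuntimeError("call provide_external_data() first")
        self._forward_done = True

    def backward_pass(self):
        if not self._forward_done:
            raise RuntimeError("call forward_pass() first")
        return "deltas computed"


net = ToyNetwork()
net.provide_external_data({"default": [[0.0]], "targets": [[1]]})
net.forward_pass(training_pass=True)
result = net.backward_pass()
```

Calling backward_pass() on a fresh ToyNetwork raises a RuntimeError, which mirrors the requirement that a forward pass comes first.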

classmethod from_architecture(architecture)[source]¶

Create Network instance from given architecture.

Parameters:	architecture (dict) – JSON serializable Architecture description.
Returns:	A fully functional Network instance.
Return type:	Network

classmethod from_hdf5(filename)[source]¶

Load network from HDF5 file.

Parameters:	filename (str) – Name of the file that the network should be loaded from.
Returns:	The loaded network.
Return type:	Network

Trainer¶

class brainstorm.training.trainer.Trainer(stepper, verbose=True)[source]¶

Trainer objects organize the process of training a network. They can employ different training methods (Steppers) and call Hooks.

__init__(stepper, verbose=True)[source]¶

Create a new Trainer.

Parameters:

stepper (brainstorm.training.steppers.TrainingStepper) –
verbose (bool) –

add_hook(hook)[source]¶

Add a hook to this trainer.

Hooks add a variety of functionality to the trainer and can be called after every specified number of parameter updates or epochs. See the documentation for Hook for more details.

Note

During training, hooks will be called in the same order that they were added. This should be kept in mind when using a hook which relies on another hook having been called.

Parameters:	hook (brainstorm.hooks.Hook) – Any Hook object that should be called by this trainer.
Raises:	`ValueError` – If a hook with the same name has already been added.
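The ordering and duplicate-name rules for add_hook can be pictured with a small stand-in. ToyTrainer and ToyHook are hypothetical names used only for this sketch. They are not the library's classes, and the real Trainer may store hooks differently.

```python
from collections import OrderedDict


class ToyHook:
    """Hypothetical hook that records its name when called."""

    def __init__(self, name):
        self.name = name

    def __call__(self, calls):
        calls.append(self.name)


class ToyTrainer:
    """Hypothetical trainer keeping hooks in insertion order."""

    def __init__(self):
        self.hooks = OrderedDict()

    def add_hook(self, hook):
        # Reject a second hook with a name that is already registered.
        if hook.name in self.hooks:
            raise ValueError("Hook '%s' already exists" % hook.name)
        self.hooks[hook.name] = hook

    def call_hooks(self):
        calls = []
        for hook in self.hooks.values():  # same order as they were added
            hook(calls)
        return calls


trainer = ToyTrainer()
trainer.add_hook(ToyHook("validation"))
trainer.add_hook(ToyHook("EarlyStopper"))
order = trainer.call_hooks()

try:
    trainer.add_hook(ToyHook("validation"))
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Because the monitoring hook is added first, a later hook that reads its log entry sees an up-to-date value.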

train(net, training_data_iter, **named_data_iters)[source]¶: Train a network using a data iterator and further named data iterators.

Tools¶

brainstorm.tools.draw_network(network, file_name='network.png')[source]¶

Write a diagram for a network to a file.

Parameters:

network (brainstorm.structure.Network) – Network to be drawn.
file_name (Optional[str]) – Defaults to ‘network.png’.

Note

This tool requires the pygraphviz library to be installed.

Raises:	`ImportError` – If pygraphviz cannot be imported.

brainstorm.tools.evaluate(network, iter, scorers=(), out_name='', targets_name='targets', mask_name=None)[source]¶

Evaluate one or more scores for a network.

This tool can be used to evaluate scores of a trained network on test data.

Parameters:

network (brainstorm.structure.Network) – Network to be evaluated.
iter (brainstorm.DataIterator) – A data iterator which produces the data on which the scores are computed.
scorers (tuple[brainstorm.scorers.Scorer]) – A list or tuple of Scorers.
out_name (Optional[str]) – Name of the network output which is scored against the targets.
targets_name (Optional[str]) – Name of the targets data provided by the data iterator (iter).
mask_name (Optional[str]) – Name of the mask data provided by the data iterator (iter).

brainstorm.tools.extract(network, iter, buffer_names)[source]¶

Apply the network to some data and return the requested buffers.

Batches are returned as a dictionary, with one entry for each requested buffer, with the data in (T, B, ...) order.

Parameters:

network (brainstorm.structure.Network) – Network used to generate the features.
iter (brainstorm.DataIterator) – A data iterator which produces the data on which the features are computed.
buffer_names (list[unicode]) – Names of the buffer views to be saved (in dotted notation).
Returns:	dict[unicode, np.ndarray]

brainstorm.tools.extract_and_save(network, iter, buffer_names, file_name)[source]¶

Save the desired buffer values of a network to an HDF5 file.

In particular, this tool can be used to save the predictions of a network on a dataset. In general, any number of internal, input or output buffers of the network can be extracted.

Examples

>>> getter = Minibatches(100, default=x_test)
>>> extract_and_save(network,
...                  getter,
...                  ['Output.outputs.predictions',
...                   'Hid1.internals.H'],
...                  'network_features.hdf5')

Parameters:

network (brainstorm.structure.Network) – Network used to generate the features.
iter (brainstorm.DataIterator) – A data iterator which produces the data on which the features are computed.
buffer_names (list[unicode]) – Name of the buffer views to be saved (in dotted notation). See example.
file_name (unicode) – Name of the hdf5 file (including extension) in which the features should be saved.

brainstorm.tools.print_network_info(network)[source]¶

Print detailed information about the network.

This tool prints the input, output and parameter shapes for all the layers. It also prints the total number of parameters in each layer and in the full network.

Parameters:	network (brainstorm.structure.Network) – A network for which the details are printed.

brainstorm.tools.get_in_out_layers(task_type, in_shape, out_shape, data_name='default', targets_name='targets', projection_name=None, outlayer_name=None, mask_name=None, use_conv=None)[source]¶

Prepare input and output layers for building a network.

This is a helper function for quickly building networks. It returns an Input layer and a projection layer which is a FullyConnected or Convolution2D layer depending on the shape of the targets. It creates a mask layer if a mask name is provided, and connects it appropriately.

An appropriate layer to compute the matching loss is connected, depending on the task_type:

classification: The projection layer is connected to a SoftmaxCE layer, which receives targets from the input layer. This is suitable for a single-label classification task.

multi-label: The projection layer is connected to a SigmoidCE layer, which receives targets from the input layer. This is suitable for a multi-label classification task.

regression: The projection layer is connected to a SquaredError layer, which receives targets from the input layer. This is suitable for least squares regression.

Note

The projection layer uses parameters, so it should be initialized after network creation. Check argument descriptions to understand how it will be named.

Example

>>> from brainstorm import tools, Network, layers
>>> inp, out = tools.get_in_out_layers('classification', 784, 10)
>>> net = Network.from_layer(inp >> layers.FullyConnected(1000) >> out)

Parameters:

task_type (str) – one of [‘classification’, ‘regression’, ‘multi-label’]
in_shape (int or tuple[int]) – Shape of the input data.
out_shape (int or tuple[int]) – Shape of the network output.
data_name (Optional[str]) – Name of the input data which will be provided by a data iterator. Defaults to ‘default’.
targets_name (Optional[str]) – Name of the ground-truth target data which will be provided by a data iterator. Defaults to ‘targets’.
projection_name (Optional[str]) – Name for the projection layer which connects to the softmax layer. If unspecified, will be set to outlayer_name + ‘_projection’ if outlayer_name is provided, and ‘Output_projection’ otherwise.
outlayer_name (Optional[str]) – Name for the output layer. If unspecified, it is named ‘Output’.
mask_name (Optional[str]) – Name of the mask data which will be provided by a data iterator. Defaults to None. The mask is needed if error should be injected only at certain time steps (for sequential data).
use_conv (Optional[bool]) – Specify whether the projection layer should be convolutional. If true the projection layer will use 1x1 convolutions otherwise it will be fully connected. Default is to autodetect this based on the output shape.

Returns:

tuple[Layer]
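The task-to-loss mapping and the default naming rules described for get_in_out_layers can be restated as a few lines of plain Python. This is only a sketch of the documented rules. LOSS_LAYER_FOR_TASK and default_names are hypothetical names, and the library's own implementation is not shown here.

```python
# Loss layer connected to the projection layer for each task_type,
# as listed in the description of get_in_out_layers.
LOSS_LAYER_FOR_TASK = {
    "classification": "SoftmaxCE",
    "multi-label": "SigmoidCE",
    "regression": "SquaredError",
}


def default_names(projection_name=None, outlayer_name=None):
    """Return (projection_name, outlayer_name) after applying the defaults."""
    if projection_name is None:
        if outlayer_name is not None:
            projection_name = outlayer_name + "_projection"
        else:
            projection_name = "Output_projection"
    if outlayer_name is None:
        outlayer_name = "Output"
    return projection_name, outlayer_name


names = default_names(outlayer_name="Loss")
```

The projection layer name matters in practice because that layer has parameters and therefore needs to be addressed when the network is initialized.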

brainstorm.tools.create_net_from_spec(task_type, in_shape, out_shape, spec, data_name='default', targets_name='targets', mask_name=None, use_conv=None)[source]¶

Create a complete network from a spec line like this “F50 F20 F50”.

Spec:

Capital letters specify the layer type and are followed by arguments to the layer. Supported layers are:

F : FullyConnected

R : Recurrent

L : Lstm

B : BatchNorm

D : Dropout

C : Convolution2D

P : Pooling2D

Where applicable, the optional first argument is the activation function from the set {l, r, s, t}, corresponding to ‘linear’, ‘relu’, ‘sigmoid’ and ‘tanh’ respectively.

FullyConnected, Recurrent and Lstm take their size as mandatory arguments (after the optional activation function argument).

Dropout takes the dropout probability as an optional argument.

Convolution2D takes two mandatory arguments: num_filters and kernel_size like this: ‘C32:3’ or with activation ‘Cs32:3’ meaning 32 filters with a kernel size of 3x3. They can be followed by ‘p1’ for padding and/or ‘s2’ for a stride of (2, 2).

Pooling2D takes an optional first argument for the type of pooling: ‘m’ for max and ‘a’ for average pooling. The next (mandatory) argument is the kernel size. As with Convolution2D it can be followed by ‘p1’ for padding and/or ‘s2’ for setting the stride to (2, 2).

Whitespace is allowed everywhere and will be completely ignored.

Examples

The mnist_pi example can be expressed like this:

>>> net = create_net_from_spec('classification', 784, 10,
...                            'D.2 F1200 D F1200 D')

The cifar10_cnn example can be shortened like this:

>>> net = create_net_from_spec(
...     'classification', (3, 32, 32), 10,
...     'C32:5p2 P3s2 C32:5p2 P3s2 C64:5p2 P3s2 F64')

Parameters:

task_type (str) – one of [‘classification’, ‘regression’, ‘multi-label’]
in_shape (int or tuple[int]) – Shape of the input data.
out_shape (int or tuple[int]) – Output shape / number of classes.
spec (str) – A line describing the network as explained above.
data_name (Optional[str]) – Name of the input data which will be provided by a data iterator. Defaults to ‘default’.
targets_name (Optional[str]) – Name of the ground-truth target data which will be provided by a data iterator. Defaults to ‘targets’.
mask_name (Optional[str]) – Name of the mask data which will be provided by a data iterator. Defaults to None. The mask is needed if error should be injected only at certain time steps (for sequential data).
use_conv (Optional[bool]) – Specify whether the projection layer should be convolutional. If true the projection layer will use 1x1 convolutions otherwise it will be fully connected. Default is to autodetect this based on the output shape.
Returns:	The constructed network, initialized with DenseSqrtFanInOut for layers with an activation function and with a simple Gaussian initializer as the default and fallback.
Return type:	brainstorm.structure.network.Network
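The spec grammar described for create_net_from_spec can be illustrated with a small tokenizer. This is a sketch under stated assumptions and not the library's parser: split_spec and LAYER_TYPES are hypothetical names, and the argument string of each layer is returned unparsed.

```python
import re

# Layer letters as listed in the spec description.
LAYER_TYPES = {
    "F": "FullyConnected",
    "R": "Recurrent",
    "L": "Lstm",
    "B": "BatchNorm",
    "D": "Dropout",
    "C": "Convolution2D",
    "P": "Pooling2D",
}


def split_spec(spec):
    """Split a spec line into (layer_type, raw_argument_string) pairs."""
    compact = re.sub(r"\s+", "", spec)  # whitespace is ignored everywhere
    # Each token starts with a capital letter and runs to the next capital.
    tokens = re.findall(r"[A-Z][^A-Z]*", compact)
    if "".join(tokens) != compact:
        raise ValueError("spec must start with a capital layer letter")
    layers = []
    for token in tokens:
        letter, args = token[0], token[1:]
        if letter not in LAYER_TYPES:
            raise ValueError("unknown layer type: %r" % letter)
        layers.append((LAYER_TYPES[letter], args))
    return layers


mnist_layers = split_spec("D.2 F1200 D F1200 D")
```

Splitting on capital letters works here because activation codes and the padding and stride markers are all lower case.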

Data Iterators¶

class brainstorm.data_iterators.AddGaussianNoise(iter, std_dict, mean_dict=None)[source]¶

Adds Gaussian noise to data generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only Numpy data is supported.

Supports usage of different means and standard deviations for different named data items.

class brainstorm.data_iterators.AddSaltNPepper(iter, prob_dict, ratio_dict=None)[source]¶

Adds Salt&Pepper noise to data generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only Numpy data is supported.

Supports usage of different amounts and ratios of salt vs. pepper for different named data items.

class brainstorm.data_iterators.DataIterator(data_shapes, length)[source]¶

Base class for Data Iterators.

data_shapes¶

dict[str, tuple[int]]

Names of the inputs that this iterator provides, mapped to their shapes.

length¶

int | None

Number of iterations that this iterator will run.

class brainstorm.data_iterators.Flip(iter, prob_dict=None)[source]¶

Randomly flip images horizontally. Images are generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only 5D Numpy data in TNHWC format is supported.

Defaults to flipping the ‘default’ named data item with a probability of 0.5. Note that the last dimension is flipped, which typically corresponds to flipping images horizontally.

class brainstorm.data_iterators.Minibatches(batch_size=1, shuffle=True, cut_according_to='mask', **named_data)[source]¶

Minibatch iterator for inputs and targets.

If either a ‘mask’ is given or some other means of determining sequence length is specified by cut_according_to, this iterator also cuts the sequences in each minibatch to their maximum length (which can be less than the maximum length over the whole dataset).

Note

When shuffling is enabled, this iterator only randomizes the order of minibatches, but doesn’t re-shuffle instances across batches.
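The shuffling behaviour described in this note can be sketched on plain index lists. minibatch_indices is a hypothetical helper for illustration, not part of the library: it fixes the contents of each batch first and then shuffles only the order of the batches.

```python
import random


def minibatch_indices(num_instances, batch_size, shuffle=True, seed=None):
    """Return lists of instance indices, one list per minibatch."""
    # Batch membership is fixed by position in the dataset.
    batches = [
        list(range(start, min(start + batch_size, num_instances)))
        for start in range(0, num_instances, batch_size)
    ]
    if shuffle:
        # Only the order of whole batches is randomized.
        random.Random(seed).shuffle(batches)
    return batches


batches = minibatch_indices(10, 4, seed=0)
```

Whatever order comes out, instances 0 to 3 always share a batch, so shuffling the dataset itself beforehand is needed if batch composition should vary.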

class brainstorm.data_iterators.MultiHot(iter, vocab_size_dict)[source]¶

Convert data to multi hot vectors, according to provided vocabulary sizes. If vocabulary size is not provided for some data item, it is yielded as is.

Currently this iterator only supports 3D data.

class brainstorm.data_iterators.OneHot(iter, vocab_size_dict)[source]¶

Convert data to one hot vectors, according to provided vocabulary sizes. If vocabulary size is not provided for some data item, it is yielded as is.

Currently this iterator only supports 3D data where the last (right-most) dimension is sized 1.

class brainstorm.data_iterators.Pad(iter, size_dict, value_dict=None)[source]¶

Pads images equally on all sides. Images are generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only 5D Numpy data in TNHWC format is supported.

5D data corresponds to sequences of multi-channel images, which is the typical use case. Zero-padding is used unless specified otherwise.

class brainstorm.data_iterators.RandomCrop(iter, shape_dict)[source]¶

Randomly crops image data. Images are generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only 5D Numpy data in TNHWC format is supported.

5D data corresponds to sequences of multi-channel images, which is the typical use case.

class brainstorm.data_iterators.Undivided(**named_data)[source]¶: Processes the entire data in one block (only one iteration).

Initializers¶

class brainstorm.initializers.ArrayInitializer(array)[source]¶: Initializes the parameters as the values of the input array.

class brainstorm.initializers.DenseSqrtFanIn(scale='rel')[source]¶

Initializes the parameters randomly according to a uniform distribution over the interval [-scale/sqrt(n), scale/sqrt(n)] where n is the number of inputs to each unit. Uses scale=sqrt(6) by default which is appropriate for rel units.

When the number of inputs and outputs is the same, this is equivalent to using DenseSqrtFanInOut.

Scaling:

rel: sqrt(6)
tanh: sqrt(3)
sigmoid: 4 * sqrt(3)
linear: 1

Parameters:	scale (Optional[float or str]) – The activation function dependent scaling factor. Can be either float or one of [‘rel’, ‘tanh’, ‘sigmoid’, ‘linear’]. Defaults to ‘rel’.
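As a rough illustration of the DenseSqrtFanIn interval, the sketch below computes the bound scale/sqrt(n) from the scaling table and draws uniform samples inside it. fan_in_bound and sample_weights are hypothetical helpers, and the (inputs x units) layout of the returned nested list is an assumption made for this example only.

```python
import math
import random

# Scaling factors from the table for DenseSqrtFanIn.
FAN_IN_SCALES = {
    "rel": math.sqrt(6),
    "tanh": math.sqrt(3),
    "sigmoid": 4 * math.sqrt(3),
    "linear": 1.0,
}


def fan_in_bound(n_inputs, scale="rel"):
    """Return scale / sqrt(n) for a named or numeric scale."""
    factor = FAN_IN_SCALES[scale] if isinstance(scale, str) else float(scale)
    return factor / math.sqrt(n_inputs)


def sample_weights(n_inputs, n_units, scale="rel", seed=None):
    """Draw an n_inputs x n_units nested list uniformly from [-bound, bound]."""
    rng = random.Random(seed)
    bound = fan_in_bound(n_inputs, scale)
    return [[rng.uniform(-bound, bound) for _ in range(n_units)]
            for _ in range(n_inputs)]


bound = fan_in_bound(6)            # sqrt(6) / sqrt(6)
weights = sample_weights(6, 3, seed=1)
```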

class brainstorm.initializers.DenseSqrtFanInOut(scale='rel')[source]¶

Initializes the parameters randomly according to a uniform distribution over the interval [-scale/sqrt(n1+n2), scale/sqrt(n1+n2)] where n1 is the number of inputs to each unit and n2 is the number of units in the current layer. Uses scale=sqrt(12) by default which is appropriate for rel units.

Scaling:

rel: sqrt(12)
tanh: sqrt(6)
sigmoid: 4 * sqrt(6)
linear: 1

Parameters:	scale (Optional[float or str]) – The activation function dependent scaling factor. Can be either float or one of [‘rel’, ‘tanh’, ‘sigmoid’, ‘linear’]. Defaults to ‘rel’.

Reference: Glorot, Xavier, and Yoshua Bengio. “Understanding the difficulty of training deep feedforward neural networks.” International conference on artificial intelligence and statistics. 2010.
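The relationship between the DenseSqrtFanIn and DenseSqrtFanInOut intervals can be checked numerically from the two scaling tables. The helper names below are hypothetical and the code only evaluates the documented formulas. With the listed values, the bounds coincide for ‘rel’, ‘tanh’ and ‘sigmoid’ when n1 equals n2, while the ‘linear’ entries differ by a factor of sqrt(2).

```python
import math

FAN_IN_SCALES = {
    "rel": math.sqrt(6),
    "tanh": math.sqrt(3),
    "sigmoid": 4 * math.sqrt(3),
    "linear": 1.0,
}
FAN_IN_OUT_SCALES = {
    "rel": math.sqrt(12),
    "tanh": math.sqrt(6),
    "sigmoid": 4 * math.sqrt(6),
    "linear": 1.0,
}


def fan_in_bound(n_inputs, scale="rel"):
    """Bound of the DenseSqrtFanIn interval: scale / sqrt(n)."""
    return FAN_IN_SCALES[scale] / math.sqrt(n_inputs)


def fan_in_out_bound(n_inputs, n_units, scale="rel"):
    """Bound of the DenseSqrtFanInOut interval: scale / sqrt(n1 + n2)."""
    return FAN_IN_OUT_SCALES[scale] / math.sqrt(n_inputs + n_units)


# Compare both bounds for a square layer with 50 inputs and 50 units.
differences = {
    name: abs(fan_in_bound(50, name) - fan_in_out_bound(50, 50, name))
    for name in FAN_IN_SCALES
}
```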

class brainstorm.initializers.EchoState(spectral_radius=1.0)[source]¶

Classic echo state initialization. Creates a matrix with a fixed spectral radius (default=1.0). The spectral radius should be < 1 to satisfy the echo state property. Only works for square matrices.

Example

>>> net.initialize(default=Gaussian(),
...                Recurrent={'R': EchoState(0.77)})

class brainstorm.initializers.Gaussian(std=0.1, mean=0.0)[source]¶: Initializes the parameters randomly according to a normal distribution of given mean and standard deviation.

class brainstorm.initializers.Identity(scale=1.0, std=0.01, enforce_square=True)[source]¶: Initialize a matrix to the (scaled) identity matrix + some noise.

class brainstorm.initializers.LstmOptInit(input_block=0.0, input_gate=0.0, forget_gate=0.0, output_gate=0.0)[source]¶

Used to initialize an LstmOpt layer. This is useful because in an LstmOpt layer all the parameters are concatenated for efficiency.

The parameters (input_block, input_gate, forget_gate, and output_gate) can be scalars or Initializers themselves.

class brainstorm.initializers.Orthogonal(scale=1.0)[source]¶

Orthogonal initialization.

Reference: Saxe, Andrew M., James L. McClelland, and Surya Ganguli. “Exact solutions to the nonlinear dynamics of learning in deep linear neural networks.” arXiv preprint arXiv:1312.6120 (2013).

class brainstorm.initializers.RandomWalk(act_func='linear', scale=None)[source]¶

Initializes a (square) weight matrix with the random walk scheme proposed by:

Sussillo, David, and L. F. Abbott. “Random Walk Initialization for Training Very Deep Feedforward Networks.” arXiv:1412.6558 [cs, Stat], December 19, 2014. http://arxiv.org/abs/1412.6558.

class brainstorm.initializers.SparseInputs(sub_initializer, connections=15)[source]¶

Makes sure every unit only gets activation from a certain number of input units and the rest of the parameters are 0. The connections are initialized by evaluating the passed sub_initializer.

Example

>>> net.initialize(FullyConnected=SparseInputs(Gaussian(),
...                                            connections=10))

class brainstorm.initializers.SparseOutputs(sub_initializer, connections=15)[source]¶

Makes sure every unit is propagating its activation only to a certain number of output units, and the rest of the parameters are 0. The connections are initialized by evaluating the passed sub_initializer.

Example

>>> net.initialize(FullyConnected=SparseOutputs(Gaussian(),
...                                             connections=10))

class brainstorm.initializers.Uniform(low=0.1, high=None)[source]¶: Initializes the parameters randomly according to a uniform distribution over the interval [low, high].

Hooks¶

class brainstorm.hooks.EarlyStopper(log_name, patience=1, criterion='min', name=None, timescale='epoch', interval=1, verbose=None)[source]¶

Stop the training if a log entry does not improve for some time.

Can stop training when the log entry is at its minimum (such as an error) or maximum (such as accuracy) according to the criterion argument.

The timescale and interval should be the same as those for the monitoring hook which logs the quantity of interest.

Parameters:

log_name – Name of the log entry to be checked for improvement. It should be in the form <monitorname>.<log_name> where log_name itself may be a nested dictionary key in dotted notation.
patience – Number of log updates to wait before stopping training. Default is 1.
criterion – Indicates whether training should be stopped when the log entry is at its minimum or maximum value. Must be either ‘min’ or ‘max’. Defaults to ‘min’.
name (Optional[str]) – Name of this monitor. This name is used as a key in the trainer logs. Default is ‘EarlyStopper’.
timescale (Optional[str]) – Specifies whether the Monitor should be called after each epoch or after each update. Default is ‘epoch’.
interval (Optional[int]) – This monitor should be called every interval epochs/updates. Default is 1.
verbose (Optional[bool]) – Specifies whether the logs of this monitor should be printed, and acts as a fallback verbosity for the used data iterator. If not set it defaults to the verbosity setting of the trainer.

Examples

Add a hook to monitor a quantity of interest:

>>> scorer = bs.scorers.Accuracy()
>>> trainer.add_hook(bs.hooks.MonitorScores('valid_getter', [scorer],
...                                         name='validation'))

Stop training if validation set accuracy does not rise for 10 epochs:

>>> trainer.add_hook(bs.hooks.EarlyStopper('validation.Accuracy',
...                                        patience=10,
...                                        criterion='max'))

Stop training if loss on validation set does not drop for 5 epochs:

>>> trainer.add_hook(bs.hooks.EarlyStopper('validation.total_loss',
...                                        patience=5,
...                                        criterion='min'))
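The patience rule of EarlyStopper can be pictured as a check on the logged values. The function below is a sketch under an assumption: it stops once `patience` log updates have passed since the best value was recorded. should_stop is a hypothetical name, and the hook's exact stopping condition may differ.

```python
def should_stop(values, patience=1, criterion="min"):
    """Return True if the best value is at least `patience` updates old."""
    if criterion not in ("min", "max"):
        raise ValueError("criterion must be 'min' or 'max'")
    if not values:
        return False
    best = min(values) if criterion == "min" else max(values)
    best_idx = values.index(best)  # first occurrence of the best value
    updates_since_best = len(values) - 1 - best_idx
    return updates_since_best >= patience


# Validation loss improved at the second update, then got worse twice.
stop_now = should_stop([0.5, 0.4, 0.45, 0.41], patience=2, criterion="min")
```

With criterion='max' the same logic applies to quantities such as accuracy, where larger values count as improvement.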

class brainstorm.hooks.InfoUpdater(run, name=None, timescale='epoch', interval=1)[source]¶: Save the information from logs to the Sacred custom info dict.

class brainstorm.hooks.ModifyStepperAttribute(schedule, attr_name='learning_rate', timescale='epoch', interval=1, name=None, verbose=None)[source]¶: Modify an attribute of the training stepper.

class brainstorm.hooks.MonitorLayerDeltas(layer_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶: Monitor some statistics about all the deltas of a layer.

class brainstorm.hooks.MonitorLayerGradients(layer_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶: Monitor some statistics about all the gradients of a layer.

class brainstorm.hooks.MonitorLayerInOuts(layer_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶: Monitor some statistics about all the inputs and outputs of a layer.

class brainstorm.hooks.MonitorLayerParameters(layer_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶: Monitor some statistics about all the parameters of a layer.

class brainstorm.hooks.MonitorLoss(iter_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶: Monitor the losses computed by the network on a dataset using a given data iterator.

class brainstorm.hooks.MonitorScores(iter_name, scorers, name=None, timescale='epoch', interval=1, verbose=None)[source]¶

Monitor the losses and optionally several scores using a given data iterator.

Parameters:

iter_name (str) – Name of the data iterator to use (as specified in the train() call).
scorers (List[brainstorm.scorers.Scorer]) – List of Scorers to evaluate.
name (Optional[str]) – Name of this monitor. This name is used as a key in the trainer logs. Default is ‘MonitorScores’.
timescale (Optional[str]) – Specifies whether the Monitor should be called after each epoch or after each update. Default is ‘epoch’.
interval (Optional[int]) – This monitor should be called every interval epochs/updates. Default is 1.
verbose (Optional[bool]) – Specifies whether the logs of this monitor should be printed, and acts as a fallback verbosity for the used data iterator. If not set it defaults to the verbosity setting of the trainer.

Value Modifiers¶

class brainstorm.value_modifiers.ClipValues(low=-1.0, high=1.0)[source]¶

Clips (limits) the weights to be between low and high. Defaults to low=-1 and high=1.

Should be added to the network via the set_weight_modifiers method like so:

>>> net.set_weight_modifiers(RnnLayer={'HR': ClipValues()})

See Network.set_weight_modifiers for more information on how to control which weights to affect.

class brainstorm.value_modifiers.ConstrainL2Norm(limit)[source]¶

Constrains the L2 norm of the incoming weights to every neuron/unit to be less than or equal to a limit. If the L2 norm for any unit exceeds the limit, the weights are rescaled such that the squared L2 norm equals the limit. Ignores biases.

Should be added to the network via the set_weight_modifiers method like so:

>>> net.set_weight_modifiers(RnnLayer={'HX': ConstrainL2Norm(limit)})

See Network.set_weight_modifiers for more information on how to control which weights to affect.

class brainstorm.value_modifiers.FreezeValues(weights=None)[source]¶

Prevents the weights from changing at all.

If the weights argument is left at None, it will remember the first weights it sees and reset them to those values every time.

Should be added to the network via the set_constraints method like so:

>>> net.set_constraints(RnnLayer={'HR': FreezeValues()})

See Network.set_constraints for more information on how to control which weights to affect.

class brainstorm.value_modifiers.L1Decay(factor)[source]¶

Applies L1 weight decay.

New gradients = gradients + factor * sign(parameters)

class brainstorm.value_modifiers.L2Decay(factor)[source]¶

Applies L2 weight decay.

New gradients = gradients + factor * parameters
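The two update rules stated for L1Decay and L2Decay can be written out on flat Python lists. This is a plain restatement of the formulas, not the library's implementation. l1_decay, l2_decay and sign are hypothetical helper names.

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def l1_decay(gradients, parameters, factor):
    """New gradients = gradients + factor * sign(parameters)."""
    return [g + factor * sign(p) for g, p in zip(gradients, parameters)]


def l2_decay(gradients, parameters, factor):
    """New gradients = gradients + factor * parameters."""
    return [g + factor * p for g, p in zip(gradients, parameters)]


params = [1.0, -2.0, 0.0]
grads = [0.5, 0.5, 0.5]
l1_grads = l1_decay(grads, params, 0.1)
l2_grads = l2_decay(grads, params, 0.1)
```

The L1 term has constant magnitude regardless of how large a parameter is, while the L2 term grows with the parameter itself.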

class brainstorm.value_modifiers.MaskValues(mask)[source]¶

Multiplies the weights elementwise with the mask.

This can be used to clamp some of the weights to zero.

Should be added to the network via the set_weight_modifiers method like so:

>>> net.set_weight_modifiers(RnnLayer={'HR': MaskValues(M)})

See Network.set_weight_modifiers for more information on how to control which weights to affect.

class brainstorm.value_modifiers.ValueModifier[source]¶: ValueModifiers can be installed in a Network to affect either the parameters or the gradients.

Scorers¶

Handler¶

class brainstorm.handlers.base_handler.Handler[source]¶

Abstract base class for all handlers.

This base is used mainly to ensure a common interface and provide documentation for derived handlers. When implementing new methods one should adhere to the naming scheme. Most mathematical operations should have a suffix or suffixes indicating the shapes of inputs it expects:

s for scalar, v for vector (a 2D array with at least one dimension equal to 1), m for matrix (a 2D array), t for tensor (which means arbitrary shape, synonym for array).

Note that these shapes are not checked by each handler itself. However, the DebugHandler can be used to perform these checks to ensure that operations are not abused.
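The suffix convention can be made concrete with a small decoder for method names. describe_suffix is a hypothetical helper and only a heuristic: it reads the part after the last underscore and gives up if any letter is not one of s, v, m or t.

```python
SHAPE_CODES = {"s": "scalar", "v": "vector", "m": "matrix", "t": "tensor"}


def describe_suffix(method_name):
    """Return the expected input kinds, or None if there is no shape suffix."""
    _, sep, suffix = method_name.rpartition("_")
    if not sep or not suffix:
        return None
    if any(ch not in SHAPE_CODES for ch in suffix):
        return None
    return [SHAPE_CODES[ch] for ch in suffix]


kinds = describe_suffix("add_mv")
```

Names such as copy_to or allocate carry no shape suffix, so the helper returns None for them.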

dtype¶: Data type that this handler works with.

context¶: Context which may be used by this handler for operation.

EMPTY¶: An empty array matching this handler’s type.

rnd¶: A random state maintained by this handler.

array_type¶: The type of array object that this handler works with.

__describe__()¶

Returns a description of this object. That is a dictionary containing the name of the class as @type and all members of the class. This description is json-serializable.

If a sub-class of Describable contains non-describable members, it has to override this method to specify how it should be described.

Returns:	Description of this object
Return type:	dict

__new_from_description__(description)¶

Creates a new object from a given description.

If a sub-class of Describable contains non-describable fields, it has to override this method to specify how they should be initialized from their description.

Parameters:	description (dict) – description of this object
Returns:	A new instance of this class according to the description.

abs_t(a, out)[source]¶

Compute the element-wise absolute value.

Parameters:

a (array_type) – Array whose absolute values are to be computed.
out (array_type) – Array into which the output is placed. Must have the same shape as `a`.
Returns:	None

add_into_if(a, out, cond)[source]¶

Add each element of a to the corresponding element of out if the corresponding element of cond is non-zero.

Parameters:

a (array_type) – Array whose elements may be added to out.
out (array_type) – Output array, whose values might be increased by values from a.
cond (array_type) – The condition array. Only those entries from a are added into out whose corresponding cond elements are non-zero.
Returns:	None

add_mv(m, v, out)[source]¶

Add a matrix to a vector with broadcasting.

Add an (M, N) matrix to a (1, N) or (M, 1) vector using broadcasting such that the output is (M, N).

Parameters:

m (array_type) – The first array to be added. Must be 2D.
v (array_type) – The second array to be added. Must be 2D with at least one dimension of size 1 and the other dimension matching the corresponding size of `m`.
out (array_type) – Array into which the output is placed. Must have the same shape as `m`.
Returns:	None
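The broadcasting rule of add_mv can be sketched with nested lists standing in for the handler's array type. This is a pure-Python illustration of the documented semantics, not a handler implementation. The explicit shape check is an addition for clarity, since the documentation notes that handlers themselves do not check shapes.

```python
def add_mv(m, v, out):
    """Add an (M, N) matrix and a (1, N) or (M, 1) vector into out."""
    rows, cols = len(m), len(m[0])
    v_rows, v_cols = len(v), len(v[0])
    if (v_rows, v_cols) not in ((1, cols), (rows, 1)):
        raise ValueError("v must have shape (1, N) or (M, 1)")
    for i in range(rows):
        for j in range(cols):
            # A dimension of size 1 is reused for every index (broadcast).
            out[i][j] = m[i][j] + v[i % v_rows][j % v_cols]


m = [[1, 2, 3], [4, 5, 6]]
out_row = [[0] * 3 for _ in range(2)]
out_col = [[0] * 3 for _ in range(2)]
add_mv(m, [[10, 20, 30]], out_row)   # (1, N) row vector
add_mv(m, [[100], [200]], out_col)   # (M, 1) column vector
```

As in the handler interface, the result is written into out and nothing is returned.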

add_st(s, t, out)[source]¶

Add a scalar to each element of a tensor.

Parameters:

s (dtype) – The scalar value to be added.
t (array_type) – The array to be added.
out (array_type) – Array into which the output is placed. Must have the same shape as `t`.
Returns:	None

add_tt(a, b, out)[source]¶

Add two tensors element-wise.

Parameters:

a (array_type) – First array.
b (array_type) – Second array.
out (array_type) – Array into which the output is placed. Must have the same shape as `a` and `b`.
Returns:	None

allocate(shape)[source]¶

Allocate new memory with given shape but arbitrary content.

Parameters:	shape (tuple[int]) – Shape of the array.
Returns:	New array with given shape.
Return type:	object

avgpool2d_backward_batch(inputs, window, outputs, padding, stride, in_deltas, out_deltas)[source]¶

Computes the gradients for 2D average-pooling on a batch of images.

Parameters:

inputs (array_type) –
window (tuple[int]) –
outputs (array_type) –
padding (int) –
stride (tuple[int]) –
in_deltas (array_type) –
out_deltas (array_type) –
Returns:	None

avgpool2d_forward_batch(inputs, window, outputs, padding, stride)[source]¶

Performs 2D average-pooling on a batch of images.

Parameters:

inputs (array_type) –
window (tuple[int]) –
outputs (array_type) –
padding (int) –
stride (tuple[int]) –
Returns:	None

binarize_v(v, out)[source]¶

Convert a column vector into a matrix of one-hot row vectors.

Usually used to convert class IDs into one-hot vectors. Therefore,

out[i, j] = 1, if j equals v[i, 0]
out[i, j] = 0, otherwise.

Note that out must have enough columns such that all indices in v are valid.

Parameters:

v (array_type) – Column vector (2D array with a single column).
out (array_type) – Matrix (2D array) into which the output is placed. The number of rows must be the same as `v` and number of columns must be greater than the maximum value in `v`.
Returns:	None
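The one-hot rule of binarize_v can be sketched with nested lists in place of the handler's arrays. This is a pure-Python illustration of the documented semantics only. The range check is an addition that makes the "enough columns" requirement explicit.

```python
def binarize_v(v, out):
    """Write one one-hot row into out for every class ID in column vector v."""
    for i, row in enumerate(v):
        label = int(row[0])
        if not 0 <= label < len(out[i]):
            raise IndexError("out needs more columns than the largest ID")
        for j in range(len(out[i])):
            out[i][j] = 1.0 if j == label else 0.0


class_ids = [[2], [0], [1]]                 # column vector of shape (3, 1)
one_hot = [[9.0] * 3 for _ in range(3)]     # previous content is overwritten
binarize_v(class_ids, one_hot)
```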

broadcast_t(a, axis, out)[source]¶

Broadcast the given axis of an array by copying elements.

This function provides a numpy-broadcast-like operation for the dimension given by axis. E.g. for axis=3 an array with shape (2, 3, 4, 1) may be broadcasted to shape (2, 3, 4, 5), by copying all the elements 5 times.

Parameters:

a (array_type) – Array whose elements should be broadcasted. The dimension corresponding to axis must be of size 1.
axis (int) – The axis along which to broadcast.
out (array_type) – Array into which the output is placed. Must have the same number of dimensions as a. Only the dimension corresponding to axis can differ from a.
Returns:	None

clip_t(a, a_min, a_max, out)[source]¶

Clip (limit) the values in an array.

Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.

Parameters:

a (array_type) – Array containing the elements to clip.
a_min (dtype) – Minimum value.
a_max (dtype) – Maximum value.
out (array_type) – Array into which the output is placed. Must have the same shape as `a`.
Returns:	None

conv2d_backward_batch(inputs, weights, padding, stride, in_deltas, out_deltas, weight_deltas, bias_deltas)[source]¶

Computes the gradients for a 2D convolution on a batch of images.

Parameters:	inputs (array_type) – weights (array_type) – padding (int) – stride (tuple[int]) – in_deltas (array_type) – out_deltas (array_type) – weight_deltas (array_type) – bias_deltas (array_type) –
Returns:	None

conv2d_forward_batch(inputs, weights, bias, outputs, padding, stride)[source]¶

Performs a 2D convolution on a batch of images.

Parameters:	inputs (array_type) – weights (array_type) – bias (array_type) – outputs (array_type) – padding (int) – stride (tuple[int]) –
Returns:	None

copy_to(src, dest)[source]¶

Copy the contents of one array to another.

Both source and destination arrays must be of this handler’s supported type and have the same shape.

Parameters:	src (array_type) – Source array. dest (array_type) – Destination array.
Returns:	None

copy_to_if(src, dest, cond)[source]¶

Copy each element of src to the corresponding element of dest wherever cond is not equal to 0.

Parameters:	src (array_type) – Source array whose elements might be copied into dest. dest (array_type) – Destination array. cond (array_type) – The condition array. Only those src elements whose corresponding cond elements are non-zero get copied to dest.
Returns:	None
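The conditional copy can be pictured with a boolean mask in NumPy. This is a sketch of the documented behaviour under that assumption; copy_to_if_sketch is a hypothetical name.

```python
import numpy as np


def copy_to_if_sketch(src, dest, cond):
    """Copy src into dest only where cond is non-zero (illustration)."""
    mask = cond != 0
    dest[mask] = src[mask]


src = np.array([1.0, 2.0, 3.0, 4.0])
dest = np.zeros(4)
cond = np.array([0.0, 1.0, 0.0, 2.0])   # any non-zero value counts as true
copy_to_if_sketch(src, dest, cond)
```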

create_from_numpy(arr)[source]¶

Create a new array with the same entries as a Numpy array.

Parameters:	arr (numpy.ndarray) – Numpy array whose elements should be used to fill the new array.
Returns:	New array with same shape and entries as the given Numpy array.
Return type:	array_type

divide_mv(m, v, out)[source]¶

Divide a matrix by a vector.

Divide an (M, N) matrix element-wise by a (1, N) vector using broadcasting such that the output is (M, N).

Parameters:	m (array_type) – First array (dividend). Must be 2D. v (array_type) – Second array (divisor). Must be 2D with at least one dimension of size 1 and second dimension matching the corresponding size of `m`. out (array_type) – Array into which the output is placed. Must have the same shape as `m`.
Returns:	None

divide_tt(a, b, out)[source]¶

Divide two tensors element-wise.

Parameters:	a (array_type) – First array (dividend). b (array_type) – Second array (divisor). Must have the same shape as `a`. out (array_type) – Array into which the output is placed. Must have the same shape as `a` and `b`.
Returns:	None

dot_add_mm(a, b, out, transa=False, transb=False)[source]¶

Multiply two matrices and add to a matrix.

Only 2D arrays (matrices) are supported.

Parameters:	a (array_type) – First matrix. b (array_type) – Second matrix. Must have compatible shape to be right-multiplied with `a`. out (array_type) – Array into which the output is added. Must have correct shape for the product of the two matrices. transa (bool) – Whether to transpose `a` before multiplying. transb (bool) – Whether to transpose `b` before multiplying.
Returns:	None
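A minimal NumPy sketch of the accumulate-into-out behaviour. It assumes that transa and transb mean "transpose the operand before multiplying", which the signature suggests but the text above does not spell out; dot_add_mm_sketch is a hypothetical name.

```python
import numpy as np


def dot_add_mm_sketch(a, b, out, transa=False, transb=False):
    """out += a @ b, optionally transposing the operands (assumption)."""
    x = a.T if transa else a
    y = b.T if transb else b
    out += x @ y


a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.eye(2)

acc = np.ones((2, 2))
dot_add_mm_sketch(a, b, acc)                 # adds to the existing ones

acc_t = np.zeros((2, 2))
dot_add_mm_sketch(a, b, acc_t, transa=True)  # uses a.T instead of a
```

Unlike dot_mm, which overwrites out, this variant adds to whatever out already holds, so callers typically zero it first when they want a plain product.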

dot_mm(a, b, out, transa=False, transb=False)[source]¶

Multiply two matrices.

Only 2D arrays (matrices) are supported.

Parameters:	a (array_type) – First matrix. b (array_type) – Second matrix. Must have compatible shape to be right-multiplied with `a`. out (array_type) – Array into which the output is placed. Must have correct shape for the product of the two matrices.
Returns:	None

el(x, y)[source]¶

Compute exponential linear activation function.

f(x) = x if x > 0 else exp(x) - 1

Note that alpha is fixed to 1.

Parameters:	x (array_type) – Input Array. y (array_type) – Output Array
Returns:	None

References

Clevert, D. A., Unterthiner, T., & Hochreiter, S. (2015). Fast and Accurate Deep Network Learning by Exponential Linear Units. arXiv preprint arXiv:1511.07289.

el_deriv(x, y, dy, dx)[source]¶

Backpropagate derivatives through the exponential linear function.

f’(x) = 1 if x > 0 else f(x) + 1

Note that alpha is fixed to 1.

Parameters:	x (array_type) – Inputs to the exponential linear function. This argument is not used and is present only to conform with other activation functions. y (array_type) – Outputs of the exponential linear function. dy (array_type) – Derivatives with respect to the outputs. dx (array_type) – Array in which the derivatives with respect to the inputs are placed.
Returns:	None

References

Clevert, D. A., Unterthiner, T., & Hochreiter, S. (2015). Fast and Accurate Deep Network Learning by Exponential Linear Units. arXiv preprint arXiv:1511.07289.

fill(mem, val)[source]¶

Fill an array with a given value.

Parameters:	mem (array_type) – Array to be filled. val (dtype) – Value to fill.
Returns:	None

fill_gaussian(mean, std, out)[source]¶

Fill an array with values drawn from a Gaussian distribution.

Parameters:	mean (float) – Mean of the Gaussian Distribution. std (float) – Standard deviation of the Gaussian distribution. out (array_type) – Target array to fill with values.
Returns:	None

fill_if(mem, val, cond)[source]¶

Set the elements of mem to val if corresponding cond element is non-zero.

Parameters:	mem (array_type) – Array to be filled. val (dtype) – The scalar to which the elements of mem might be set. cond (array_type) – The condition array. Only those mem elements whose corresponding cond elements are non-zero are set to val.
Returns:	None

generate_probability_mask(mask, probability)[source]¶

Fill an array with zeros and ones.

Fill an array with zeros and ones such that the probability of an element being one is equal to probability.

Parameters:	mask (array_type) – Array that will be filled. probability (float) – Probability of an element of `mask` being equal to one.
Returns:	None
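One way to picture this operation is to threshold uniform random numbers, as in the NumPy sketch below. How the handler actually draws its random numbers is not stated here, so the generator and the helper name are assumptions.

```python
import numpy as np


def generate_probability_mask_sketch(mask, probability, rng):
    """Set each element to 1 with the given probability, else 0."""
    mask[...] = rng.random(mask.shape) < probability


rng = np.random.default_rng(0)      # seeded only to make the demo repeatable
mask = np.empty((100, 100))
generate_probability_mask_sketch(mask, 0.3, rng)
fraction_of_ones = mask.mean()      # close to 0.3 for a large array
```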

get_numpy_copy(mem)[source]¶

Return a copy of the given data as a numpy array.

Parameters:	mem (array_type) – Source array to be copied.
Returns:	Numpy array with same content as mem.
Return type:	numpy.ndarray

index_m_by_v(m, v, out)[source]¶

Get elements from a matrix using indices from a vector.

v and out must be column vectors of the same size. Elements from the matrix m are copied using the indices given by a column vector. From row i of the matrix, the element from column v[i, 0] is copied to out, such that out[i, 0] = m[i, v[i, 0]].

Note that m must have enough columns such that all indices in v are valid.

Parameters:	m (array_type) – Matrix (2D array) whose elements should be copied. v (array_type) – Column vector (2D array with a single column) whose values are used as indices into `m`. The number of rows must be the same as `m`. out (array_type) – Array into which the output is placed. Its shape must be the same as `v`.
Returns:	None
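The indexing rule out[i, 0] = m[i, v[i, 0]] corresponds to NumPy's paired integer indexing. The sketch below illustrates that reading; index_m_by_v_sketch is a hypothetical name.

```python
import numpy as np


def index_m_by_v_sketch(m, v, out):
    """Pick one element per row of m, using column indices from v."""
    rows = np.arange(m.shape[0])
    out[:, 0] = m[rows, v[:, 0].astype(int)]


m = np.array([[10.0, 11.0, 12.0],
              [20.0, 21.0, 22.0]])
v = np.array([[2], [0]])          # column 2 from row 0, column 0 from row 1
out = np.empty((2, 1))
index_m_by_v_sketch(m, v, out)
```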

is_fully_finite(a)[source]¶

Check if all entries of the array are finite (no nans or infs).

Parameters:	a (array_type) – Input array to check.
Returns:	True if there are no infs or nans, False otherwise.
Return type:	bool

log_t(a, out)[source]¶

Compute the element-wise natural logarithm.

The natural logarithm log is the inverse of the exponential function, so that log(exp(x)) = x.

Parameters:	a (array_type) – Array whose logarithm is to be computed. out (array_type) – Array into which the output is placed. Must have the same shape as `a`.
Returns:	None

maxpool2d_backward_batch(inputs, window, outputs, padding, stride, argmax, in_deltas, out_deltas)[source]¶

Computes the gradients for 2D max-pooling on a batch of images.

Parameters:	inputs (array_type) – window (tuple[int]) – outputs (array_type) – padding (int) – stride (tuple[int]) – argmax (array_type) – in_deltas (array_type) – out_deltas (array_type) –
Returns:	None

maxpool2d_forward_batch(inputs, window, outputs, padding, stride, argmax)[source]¶

Performs a 2D max-pooling on a batch of images.

Parameters:	inputs (array_type) – window (tuple[int]) – outputs (array_type) – padding (int) – stride (tuple[int]) – argmax (array_type) –
Returns:	None

merge_tt(a, b, out)[source]¶

Merge arrays a and b along their last axis.

Parameters:	a (array_type) – Array to be merged. b (array_type) – Array to be merged. out (array_type) – Array into which the output is placed.
Returns:	None

modulo_tt(a, b, out)[source]¶

Take the modulo between two arrays elementwise. (out = a % b)

Parameters:	a (array_type) – First array (dividend). b (array_type) – Second array (divisor). Must have the same shape as a. out (array_type) – Array into which the remainder is placed. Must have the same shape as `a` and `b`.
Returns:	None

mult_add_st(s, t, out)[source]¶

Multiply a scalar with each element of a tensor and add to a tensor.

Parameters:	s (dtype) – The scalar value to be multiplied. t (array_type) – The array to be multiplied. out (array_type) – Array into which the product is added. Must have the same shape as `t`.
Returns:	None

mult_add_tt(a, b, out)[source]¶

Multiply two tensors element-wise and add to a tensor.

Parameters:	a (array_type) – First array. b (array_type) – Second array. Must have the same shape as `a`. out (array_type) – Array into which the output is added. Must have the same shape as `a` and `b`.
Returns:	None

mult_mv(m, v, out)[source]¶

Multiply a matrix with a vector.

Multiply an (M, N) matrix with a (1, N) or (M, 1) vector using broadcasting such that the output is (M, N). Also allows the “vector” to have the same dimension as the matrix in which case it behaves the same as mult_tt().

Parameters:	m (array_type) – The first array. Must be 2D. v (array_type) – The second array, to be multiplied with `m`. Must be 2D with at least one dimension of size 1 and the other dimension matching the corresponding size of `m`. out (array_type) – Array into which the output is placed. Must have the same shape as `m`.
Returns:	None
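Both accepted vector shapes behave like ordinary NumPy broadcasting. The sketch below shows the (1, N) and (M, 1) cases side by side; mult_mv_sketch is a hypothetical stand-in, not the handler method.

```python
import numpy as np


def mult_mv_sketch(m, v, out):
    """Element-wise product with broadcasting of the size-1 dimension."""
    np.multiply(m, v, out=out)


m = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
row = np.array([[10.0, 100.0, 1000.0]])   # shape (1, N): scales columns
col = np.array([[2.0], [3.0]])            # shape (M, 1): scales rows

by_row = np.empty_like(m)
by_col = np.empty_like(m)
mult_mv_sketch(m, row, by_row)
mult_mv_sketch(m, col, by_col)
```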

mult_st(s, t, out)[source]¶

Multiply a scalar with each element of a tensor.

Parameters:	s (dtype) – The scalar value to be multiplied. t (array_type) – The array to be multiplied. out (array_type) – Array into which the output is placed. Must have the same shape as `t`.
Returns:	None

mult_tt(a, b, out)[source]¶

Multiply two tensors of the same shape element-wise.

Parameters:	a (array_type) – First array. b (array_type) – Second array. Must have the same shape as `a`. out (array_type) – Array into which the output is placed. Must have the same shape as `a` and `b`.
Returns:	None

ones(shape)[source]¶

Allocate new memory with given shape and filled with ones.

Parameters:	shape (tuple[int]) – Shape of the array.
Returns:	New array with given shape filled with ones.
Return type:	object

rel(x, y)[source]¶

Compute the rel (rectified linear) function.

y = rel(x) = max(0, x)

Parameters:	x (array_type) – Input array. y (array_type) – Output array.
Returns:	None

rel_deriv(x, y, dy, dx)[source]¶

Backpropagate derivatives through the rectified linear function.

Parameters:	x (array_type) – Inputs to the rel function. This argument is not used and is present only to conform with other activation functions. y (array_type) – Outputs of the rel function. dy (array_type) – Derivatives with respect to the outputs. dx (array_type) – Array in which the derivatives with respect to the inputs are placed.
Returns:	None

set_from_numpy(mem, arr)[source]¶

Set the content of an array from a given numpy array.

Parameters:	mem (array_type) – Destination array that should be set. arr (numpy.ndarray) – Source numpy array.
Returns:	None

sigmoid(x, y)[source]¶

Compute the sigmoid function.

y = sigmoid(x) = 1 / (1 + exp(-x))

Parameters:	x (array_type) – Input array. y (array_type) – Output array.
Returns:	None

sigmoid_deriv(x, y, dy, dx)[source]¶

Backpropagate derivatives through the sigmoid function.

Parameters:	x (array_type) – Inputs to the sigmoid function. This argument is not used and is present only to conform with other activation functions. y (array_type) – Outputs of the sigmoid function. dy (array_type) – Derivatives with respect to the outputs. dx (array_type) – Array in which the derivatives with respect to the inputs are placed.
Returns:	None
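The reason x is unused is that the sigmoid derivative can be written purely in terms of its output: sigmoid'(x) = y * (1 - y). A NumPy sketch of the backward step under that standard identity follows; the function name is hypothetical.

```python
import numpy as np


def sigmoid_deriv_sketch(x, y, dy, dx):
    """dx = dy * y * (1 - y); x is accepted but not needed."""
    dx[...] = dy * y * (1.0 - y)


x = np.array([[0.0, 2.0]])
y = 1.0 / (1.0 + np.exp(-x))      # forward pass
dy = np.ones_like(y)              # pretend the upstream gradient is 1
dx = np.empty_like(y)
sigmoid_deriv_sketch(x, y, dy, dx)
```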

sign_t(a, out)[source]¶

Compute an element-wise indication of the sign of a number.

Output has the value 1.0 if an element is positive, 0 if it is zero, and -1.0 if it is negative.

Parameters:	a (array_type) – Array whose sign is to be computed. out (array_type) – Array into which the output is placed. Must have the same shape as `a`.
Returns:	None

softmax_m(m, out)[source]¶

Compute the softmax function over last dimension of a matrix.

Parameters:	m (array_type) – Input array. out (array_type) – Output array.
Returns:	None
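A minimal NumPy sketch of a row-wise softmax over the last dimension. Subtracting the row maximum is a common numerical-stability choice made here for the illustration; whether the handler does the same is not stated above. The name softmax_m_sketch is hypothetical.

```python
import numpy as np


def softmax_m_sketch(m, out):
    """Softmax over the last dimension, written into out."""
    shifted = m - m.max(axis=-1, keepdims=True)   # avoids overflow in exp
    np.exp(shifted, out=out)
    out /= out.sum(axis=-1, keepdims=True)


m = np.array([[0.0, 0.0],
              [1.0, 3.0]])
out = np.empty_like(m)
softmax_m_sketch(m, out)
```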

split_add_tt(x, out_a, out_b)[source]¶

Split array x along the last axis and add the parts to out_a and out_b.

Parameters:	x (array_type) – Array to be split. out_a (array_type) – Array to which 1st part of x is added. out_b (array_type) – Array to which 2nd part of x is added.
Returns:	None
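Splitting is the counterpart of merge_tt. The sketch below pairs a concatenation with a split-and-add in NumPy, assuming the split point is given by the size of the last axis of out_a; that detail is not stated above, and both helper names are hypothetical.

```python
import numpy as np


def merge_tt_sketch(a, b, out):
    """Concatenate a and b along their last axis into out."""
    out[...] = np.concatenate([a, b], axis=-1)


def split_add_tt_sketch(x, out_a, out_b):
    """Add the two parts of x to out_a and out_b (split point assumed)."""
    k = out_a.shape[-1]
    out_a += x[..., :k]
    out_b += x[..., k:]


a = np.ones((2, 2))
b = 2.0 * np.ones((2, 3))
merged = np.empty((2, 5))
merge_tt_sketch(a, b, merged)

part_a = np.zeros((2, 2))
part_b = np.zeros((2, 3))
split_add_tt_sketch(merged, part_a, part_b)   # recovers a and b
```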

sqrt_t(a, out)[source]¶

Compute the positive square-root of an array, element-wise.

Parameters:	a (array_type) – Array whose square root is to be computed. out (array_type) – Array into which the output is placed. Must have the same shape as `a`.
Returns:	None

subtract_mv(m, v, out)[source]¶

Subtract a vector from a matrix with broadcasting.

Parameters:	m (array_type) – The first array. Must be 2D. v (array_type) – The second array, to be subtracted from `m`. Must be 2D with at least one dimension of size 1 and second dimension matching the corresponding size of `m`. out (array_type) – Array into which the output is placed. Must have the same shape as `m`.
Returns:	None

subtract_tt(a, b, out)[source]¶

Subtract a tensor from another element-wise.

Parameters:	a (array_type) – First array. b (array_type) – Second array, to be subtracted from `a`. Must have the same shape as `a`. out (array_type) – Array into which the output (`a` - `b`) is placed. Must have the same shape as `a` and `b`.
Returns:	None

sum_t(a, axis, out)[source]¶

Sum the elements of an array along a given axis.

If axis is None, the sum is computed over all elements of the array. Otherwise, it is computed along the specified axis.

Note

Only 1D and 2D arrays are currently supported.

Parameters:	a (array_type) – Array to be summed. axis (int) – Axis over which the summation should be done. out (array_type) – Array into which the output is placed.
Returns:	None
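The axis rule can be illustrated in NumPy as follows. The required shape of out is not specified above, so this sketch simply reshapes the result to whatever shape out has; sum_t_sketch is a hypothetical name.

```python
import numpy as np


def sum_t_sketch(a, axis, out):
    """Sum over everything (axis=None) or along one axis, into out."""
    if axis is None:
        out[...] = a.sum()
    else:
        out[...] = a.sum(axis=axis).reshape(out.shape)


a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
col_sums = np.empty((1, 2))
row_sums = np.empty((2, 1))
total = np.empty((1, 1))
sum_t_sketch(a, 0, col_sums)
sum_t_sketch(a, 1, row_sums)
sum_t_sketch(a, None, total)
```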

tanh(x, y)[source]¶

Compute the tanh (hyperbolic tangent) function.

y = tanh(x) = (e^x - e^-x) / (e^x + e^-x)

Parameters:	x (array_type) – Input array. y (array_type) – Output array.
Returns:	None

tanh_deriv(x, y, dy, dx)[source]¶

Backpropagate derivatives through the tanh function.

Parameters:	x (array_type) – Inputs to the tanh function. This argument is not used and is present only to conform with other activation functions. y (array_type) – Outputs of the tanh function. dy (array_type) – Derivatives with respect to the outputs. dx (array_type) – Array in which the derivatives with respect to the inputs are placed.
Returns:	None

zeros(shape)[source]¶

Allocate new memory with given shape and filled with zeros.

Parameters:	shape (tuple[int]) – Shape of the array.
Returns:	New array with given shape filled with zeros.
Return type:	object

Describables¶

This module provides the core functionality for describing objects.

Description¶

In brainstorm, most objects can be converted into a so-called description using the get_description() function. A description is a JSON-serializable data structure that contains all the static information needed to re-create that object using the create_from_description() function. It does not, however, contain any dynamic information. This means that an object created from a description is in the same state as if it had been freshly instantiated, without any later modifications of its internal state.

The descriptions for the basic types int, float, bool, str, list and dict are these values themselves. Other objects need to inherit from Describable and their description is always a dict containing the '@type': 'ClassName' key possibly along with other properties.

Conversion to and from descriptions¶

brainstorm.describable.create_from_description(description)[source]¶

Instantiate a new object from a description.

Parameters:	description (dict) – A description of the object.
Returns:	A new instance of the described object

brainstorm.describable.get_description(this)[source]¶

Create a JSON-serializable description of this object.

This description can be used to create a new instance of this object by calling create_from_description().

Parameters:	this (Describable) – An object to be described. Must either be a basic datatype or inherit from Describable.
Returns:	A JSON-serializable description of this object.

Describable Base Class¶

class brainstorm.describable.Describable[source]¶

Base class for all objects that can be described and initialized from a description.

Derived classes can specify the __undescribed__ field to prevent certain attributes from being described. This field can be either a set of attribute names, or a dictionary mapping attribute names to their initialization value (used when a new object is created from the description).

Derived classes can also specify a __default_values__ dict. This dict allows for omitting certain attributes from the description if their value is equal to that default value.

__undescribed__ = {}¶: Set or dict of attributes that should not be part of the description. If specified as a dict, then these attributes are initialized to the specified values.

__default_values__ = {}¶: Dict of attributes with their corresponding default values. They will only be part of the description if their value differs from their default value.

__init_from_description__(description)[source]¶

Subclasses can override this to provide additional initialization when created from a description.

This method will be called AFTER the object has already been created from the description.

Parameters:	description (dict) – the description of this object

__describe__()[source]¶

Returns a description of this object, that is, a dictionary containing the name of the class as @type and all members of the class. This description is JSON-serializable.

If a sub-class of Describable contains non-describable members, it has to override this method to specify how it should be described.

Returns:	Description of this object
Return type:	dict

classmethod __new_from_description__(description)[source]¶

Creates a new object from a given description.

If a sub-class of Describable contains non-describable fields, it has to override this method to specify how they should be initialized from their description.

Parameters:	description (dict) – description of this object
Returns:	A new instance of this class according to the description.
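To make the interplay of __undescribed__, __default_values__, __describe__ and __new_from_description__ concrete, here is a toy re-implementation of the pattern in plain Python. It is a sketch written for this page and not brainstorm's code: ToyDescribable and ToyStepper are hypothetical classes, and the real Describable handles nested objects and other cases this sketch ignores.

```python
import json


class ToyDescribable:
    """Toy stand-in for the describable pattern (not brainstorm's class)."""
    __undescribed__ = {}
    __default_values__ = {}

    def __describe__(self):
        description = {"@type": type(self).__name__}
        for name, value in self.__dict__.items():
            if name in self.__undescribed__:
                continue      # dynamic state never enters the description
            defaults = self.__default_values__
            if name in defaults and defaults[name] == value:
                continue      # omitted because it equals its default
            description[name] = value
        return description

    @classmethod
    def __new_from_description__(cls, description):
        obj = cls.__new__(cls)                      # skip __init__
        obj.__dict__.update(cls.__default_values__)
        if isinstance(cls.__undescribed__, dict):
            obj.__dict__.update(cls.__undescribed__)
        obj.__dict__.update(
            {k: v for k, v in description.items() if k != "@type"})
        return obj


class ToyStepper(ToyDescribable):
    __undescribed__ = {"steps_taken": 0}     # reset on re-creation
    __default_values__ = {"momentum": 0.0}   # left out when unchanged

    def __init__(self, learning_rate, momentum=0.0):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.steps_taken = 0


original = ToyStepper(0.1)
original.steps_taken = 42                    # dynamic state
description = original.__describe__()
clone = ToyStepper.__new_from_description__(
    json.loads(json.dumps(description)))     # survives a JSON round trip
```

The clone carries the static configuration of the original, while the dynamic counter is back at its initial value, which mirrors the "freshly instantiated" guarantee described in the Description section.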
