API Documentation¶
This is a construction site...
Network¶
-
class
brainstorm.structure.network.
Network
(layers, buffer_manager, architecture, seed=None, handler=<brainstorm.handlers.numpy_handler.NumpyHandler object>)[source]¶ -
backward_pass
()[source]¶ Perform a backward pass on all provided data and targets.
Note
All the targets to be used during this backward pass have to be passed to the network beforehand using provide_external_data. The backward pass also depends on the internal state produced by a forward pass, so you always have to run forward_pass first.
-
forward_pass
(training_pass=False, context=None)[source]¶ Perform a forward pass on all the provided data.
Note
All the input data to be used during this forward pass have to be passed to the network beforehand using provide_external_data().
Parameters: - training_pass (Optional[bool]) – Indicates whether this forward pass belongs to training or not. This might change the behaviour of some layers.
- context (Optional[dict]) – An optional network state as created by net.get_context(). If provided, the network will treat it as the state of the network at time t=-1. This is useful for continuing the computations of a recurrent neural network. Defaults to None.
-
classmethod
from_architecture
(architecture)[source]¶ Create Network instance from given architecture.
Parameters: architecture (dict) – JSON serializable Architecture description. Returns: A fully functional Network instance. Return type: Network
-
classmethod
from_hdf5
(filename)[source]¶ Load network from HDF5 file.
Parameters: filename (str) – Name of the file that the network should be loaded from.
Returns: The loaded network.
Return type: Network
-
classmethod
from_layer
(some_layer)[source]¶ Create Network instance from a construction layer.
Parameters: some_layer (brainstorm.construction.ConstructionWrapper) – Some layer used to wire up an architecture with the >> operator.
Returns: A fully functional Network instance.
Return type: Network
-
get
(buffer_path)[source]¶ Get a numpy copy of the buffer corresponding to buffer_path.
Examples
>>> parameters = net.get('parameters')
>>> outputs = net.get('OutputLayer.outputs.probabilities')
>>> forget_gates = net.get('Lstm.internals.Fb')
Parameters: buffer_path (str) – A dotted path to the buffer that should be copied and returned.
Returns: A numpy array copy of the specified buffer.
Return type: numpy.ndarray
Raises: KeyError
If no buffer is found for the given path.
-
get_context
()[source]¶ Get the internal state of this network at the last timestep after a forward pass. This can be passed to the forward_pass method as context to continue a batch of sequences.
Returns: Internal state of this network at the last timestep. Return type: dict
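The role of the context can be illustrated without brainstorm itself. The sketch below uses a toy one-unit linear recurrence; the `step` and `run` helpers and the `{"h": ...}` state layout are hypothetical and not part of the library. It shows why treating the last-timestep state of one chunk as the t=-1 state of the next chunk gives the same result as one pass over the whole sequence.

```python
def step(prev_state, x, w_rec=0.5, w_in=1.0):
    """One timestep of a toy recurrence: h_t = w_rec * h_{t-1} + w_in * x_t."""
    return w_rec * prev_state + w_in * x


def run(sequence, context=None):
    """Run the toy recurrence over a sequence.

    `context` plays the role of the state at t=-1; None means a zero state.
    Returns the per-timestep outputs and the state at the last timestep.
    """
    state = 0.0 if context is None else context["h"]
    outputs = []
    for x in sequence:
        state = step(state, x)
        outputs.append(state)
    return outputs, {"h": state}


sequence = [1.0, 2.0, 3.0, 4.0]

# One pass over the full sequence.
full_outputs, _ = run(sequence)

# The same sequence in two chunks, carrying the state across the cut.
first_outputs, context = run(sequence[:2])
second_outputs, _ = run(sequence[2:], context=context)

chunked_outputs = first_outputs + second_outputs
```

Without the carried context, the second chunk would start from a zero state and its outputs would differ from the full pass.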
-
get_input
(input_name)[source]¶ Get a numpy copy of one of the named inputs that are currently used.
Parameters: input_name (str) – The name of the input that should be retrieved. Returns: A numpy array copy of the specified input. Return type: numpy.ndarray
-
get_loss_values
()[source]¶ Get a dictionary of all the loss values that resulted from a forward pass.
For simple networks with just one loss the dictionary has only a single entry called ‘total_loss’.
If there are multiple Loss layers the dictionary will also contain an entry for each Loss layer mapping its name to its loss, and the ‘total_loss’ entry will contain the sum of all of them.
Returns: A dictionary of all loss values that this network produced. Return type: dict[str, float]
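The shape of the returned dictionary can be sketched in plain Python. The helper below is hypothetical (it is not how the network computes its losses) and only mirrors the two cases described above: a single entry for one loss, and per-layer entries plus their sum for several Loss layers.

```python
def collect_loss_values(layer_losses):
    """Build a loss dictionary following the rules described above.

    `layer_losses` maps the name of each Loss layer to its loss value.
    """
    total = sum(layer_losses.values())
    if len(layer_losses) == 1:
        # A single loss: only the 'total_loss' entry is reported.
        return {"total_loss": total}
    # Several losses: one entry per Loss layer plus their sum.
    result = dict(layer_losses)
    result["total_loss"] = total
    return result


single = collect_loss_values({"Loss": 0.25})
multiple = collect_loss_values({"LossA": 0.25, "LossB": 0.5})
```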
-
initialize
(default_or_init_dict=None, seed=None, **kwargs)[source]¶ Initialize the weights of the network.
Initialization can be specified in three equivalent ways:
1. just a default initializer:
>>> net.initialize(Gaussian())
Note that this is equivalent to:
>>> net.initialize(default=Gaussian())
2. by passing a dictionary:
>>> net.initialize({'RegularLayer': Uniform(),
...                 'LstmLayer': Gaussian()})
3. by using keyword arguments:
>>> net.initialize(RegularLayer=Uniform(),
...                LstmLayer=Uniform())
All following explanations will be with regards to the dictionary style of initialization, because it is the most general one.
Note
It is not recommended to combine the dictionary style and the keyword-argument style, but if both are given, the keyword arguments take precedence.
Each initialization consists of a layer-pattern that maps to an initializer or a weight-pattern dictionary.
Layer patterns can take the following forms:
{'layer_name': INIT_OR_SUBDICT}
Matches all the weights of the layer named layer_name.
{'layer_*': INIT_OR_SUBDICT}
Matches all layers with a name that starts with layer_.
The wild-card * can appear at arbitrary positions and even multiple times in one path.
There are two special layer patterns:
{'default': INIT}
Matches all weights that are not matched by any other path-pattern.
{'fallback': INIT}
Set a fallback initializer for every weight. It will only be evaluated for the weights for which the regular initializer failed with an InitializationError. This is useful for initializers that require a certain shape of weights and will not work otherwise. The fallback will then be used for all cases when that initializer failed.
The weight-pattern sub-dictionary follows the same form as the layer-pattern:
{'layer_pattern': {'a': INIT_A, 'b': INIT_B}}
{'layer_pattern': {'a*': INIT}}
{'layer_pattern': {'default': INIT}}
{'layer_pattern': {'fallback': INIT}}
An initializer can either be a scalar, something that converts to a numpy array of the correct shape, or an Initializer object. So for example:
>>> net.initialize(default=0,
...                RnnLayer={'b': [1, 2, 3, 4, 5]},
...                ForwardLayer=bs.Gaussian())
Note
Each view must match exactly one initialization and up to one fallback to be unambiguous. Otherwise the initialization will fail.
You can specify a seed to make the initialization reproducible:
>>> net.initialize({'default': bs.Gaussian()}, seed=1234)
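The layer-pattern rules above can be illustrated with a small stand-alone resolver. This is only a sketch under stated assumptions: `resolve_layer_patterns` is a hypothetical helper, it works on layer names only (not on individual weights or fallbacks), initializers are represented by plain strings, and the wild-card matching is delegated to the standard `fnmatch` module, which also gives special meaning to `?` and `[...]`. It is not brainstorm's actual resolution code.

```python
from fnmatch import fnmatchcase

SPECIAL_KEYS = ("default", "fallback")


def resolve_layer_patterns(layer_names, init_dict):
    """Map each layer name to the value of the single pattern that matches it.

    Layers not matched by any regular pattern get the 'default' entry.
    A layer matched by more than one pattern is treated as ambiguous.
    """
    patterns = [p for p in init_dict if p not in SPECIAL_KEYS]
    resolved = {}
    for name in layer_names:
        matches = [p for p in patterns if fnmatchcase(name, p)]
        if len(matches) > 1:
            raise ValueError("%s is matched by several patterns: %s"
                             % (name, matches))
        if matches:
            resolved[name] = init_dict[matches[0]]
        elif "default" in init_dict:
            resolved[name] = init_dict["default"]
        else:
            raise ValueError("no initializer found for %s" % name)
    return resolved


layers = ["LstmLayer_1", "LstmLayer_2", "OutputLayer"]
resolved = resolve_layer_patterns(
    layers, {"Lstm*": "Gaussian", "default": "Uniform"})
```

Here both Lstm layers pick up the pattern entry, while OutputLayer falls through to the default.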
-
provide_external_data
(data, all_inputs=True)[source]¶ Provide the data for this network to perform its forward and backward passes on.
Parameters: - data (dict) – A dictionary of the data that will be copied to the outputs of the Input layer.
- all_inputs (bool) – If set to False this method will NOT check that all inputs are provided. Defaults to True.
-
save_as_hdf5
(filename, comment='')[source]¶ Save this network as an HDF5 file. The file will contain a description of this network and the parameters.
Parameters: - filename (str) – Name of the file this network should be saved to. All directories have to exist already.
- comment (Optional[str]) – An optional comment that will be saved inside the file.
-
set_gradient_modifiers
(default_or_mod_dict=None, **kwargs)[source]¶ Install ValueModifiers in the network to change the gradient.
They can be run manually using apply_gradient_modifiers(), but they will also be called by the network after each backward pass.
Gradient modifiers can be set for specific weights in the same way as initializers can, but there is no fallback (see initialize() for details).
A modifier can be a ValueModifiers object or a list of them. So for example:
>>> net.set_gradient_modifiers(
...     default=bs.value_modifiers.ClipValues(-1, 1),
...     FullyConnectedLayer={'W': [bs.value_modifiers.ClipValues(),
...                                bs.value_modifiers.MaskValues(MASK)]}
... )
Note
The order in which ValueModifiers appear in the list matters, because it is the same order in which they will be executed.
-
set_handler
(new_handler)[source]¶ Change the handler of this network.
Examples
Use this to run a network on the GPU using PyCUDA:
>>> from brainstorm.handlers import PyCudaHandler
>>> net.set_handler(PyCudaHandler())
Parameters: new_handler (brainstorm.handlers.base_handler.Handler) – The new handler this network should use.
-
set_weight_modifiers
(default_or_mod_dict=None, **kwargs)[source]¶ Install ValueModifiers in the network to change the weights.
They can be run manually using apply_weight_modifiers(), but they will also be called by the trainer after each weight update.
Value modifiers can be set for specific weights in the same way as initializers can, but there is no fallback (see initialize() for details).
A modifier can be a ValueModifiers object or a list of them. So for example:
>>> net.set_weight_modifiers(
...     default=bs.ClipValues(-1, 1),
...     FullyConnectedLayer={'W': [bs.RescaleIncomingWeights(),
...                                bs.MaskValues(my_mask)]}
... )
Note
The order in which ValueModifiers appear in the list matters, because it is the same order in which they will be executed.
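Why the order matters can be seen with two toy modifiers. The `clip_values`, `scale_values` and `apply_modifiers` helpers below are hypothetical stand-ins written for this illustration; they are not brainstorm's ValueModifiers and operate on plain Python lists.

```python
def clip_values(low, high):
    """Return a modifier that clips every value into [low, high]."""
    def modifier(values):
        return [min(max(v, low), high) for v in values]
    return modifier


def scale_values(factor):
    """Return a modifier that multiplies every value by `factor`."""
    def modifier(values):
        return [v * factor for v in values]
    return modifier


def apply_modifiers(values, modifiers):
    """Apply the modifiers one after another, in list order."""
    for modifier in modifiers:
        values = modifier(values)
    return values


weights = [-3.0, 0.5, 2.0]
clip_then_scale = apply_modifiers(
    weights, [clip_values(-1.0, 1.0), scale_values(2.0)])
scale_then_clip = apply_modifiers(
    weights, [scale_values(2.0), clip_values(-1.0, 1.0)])
```

Clipping first and scaling second leaves values outside [-1, 1], whereas scaling first and clipping second guarantees the final values stay inside that range.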
-
Trainer¶
-
class
brainstorm.training.trainer.
Trainer
(stepper, verbose=True)[source]¶ Trainer objects organize the process of training a network. They can employ different training methods (Steppers) and call Hooks.
-
__init__
(stepper, verbose=True)[source]¶ Create a new Trainer.
Parameters: - stepper (brainstorm.training.steppers.TrainingStepper) –
- verbose (bool) –
-
add_hook
(hook)[source]¶ Add a hook to this trainer.
Hooks add a variety of functionality to the trainer and can be called after a specified number of parameter updates or epochs. See the documentation for Hook for more details.
Note
During training, hooks will be called in the same order that they were added. This should be kept in mind when using a hook which relies on another hook having been called.
Parameters: hook (brainstorm.hooks.Hook) – Any Hook object that should be called by this trainer.
Raises: ValueError – If a hook with the same name has already been added.
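The two rules stated for add_hook (hooks are called in the order they were added, and duplicate names are rejected with a ValueError) can be sketched with a toy registry. The `HookRegistry` class is hypothetical and much simpler than the real Trainer; hooks are modelled as plain callables that receive the log dictionary.

```python
class HookRegistry(object):
    """Toy stand-in for the hook bookkeeping of a trainer."""

    def __init__(self):
        self.hooks = []  # a list keeps the insertion order

    def add_hook(self, name, hook):
        if any(existing == name for existing, _ in self.hooks):
            raise ValueError("a hook named %r has already been added" % name)
        self.hooks.append((name, hook))

    def call_hooks(self, log):
        # Hooks run in the order they were added, so a later hook can
        # rely on entries written to `log` by an earlier one.
        for name, hook in self.hooks:
            log[name] = hook(log)
        return log


registry = HookRegistry()
registry.add_hook("validation", lambda log: {"Accuracy": 0.9})
registry.add_hook("stopper", lambda log: log["validation"]["Accuracy"] > 0.95)
log = registry.call_hooks({})

try:
    registry.add_hook("validation", lambda log: None)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Adding the "stopper" hook before the "validation" hook would fail with a KeyError in this sketch, which is the kind of ordering dependency the note above warns about.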
-
Tools¶
-
brainstorm.tools.
draw_network
(network, file_name='network.png')[source]¶ Write a diagram for a network to a file.
Parameters: - network (brainstorm.structure.Network) – Network to be drawn.
- file_name (Optional[str]) – Defaults to ‘network.png’.
Note
This tool requires the pygraphviz library to be installed.
Raises: ImportError
– If pygraphviz can not be imported.
-
brainstorm.tools.
evaluate
(network, iter, scorers=(), out_name='', targets_name='targets', mask_name=None)[source]¶ Evaluate one or more scores for a network.
This tool can be used to evaluate scores of a trained network on test data.
Parameters: - network (brainstorm.structure.Network) – Network to be evaluated.
- iter (brainstorm.DataIterator) – A data iterator which produces the data on which the scores are computed.
- scorers (tuple[brainstorm.scorers.Scorer]) – A list or tuple of Scorers.
- out_name (Optional[str]) – Name of the network output which is scored against the targets.
- targets_name (Optional[str]) – Name of the targets data provided by the data iterator (iter).
- mask_name (Optional[str]) – Name of the mask data provided by the data iterator (iter).
-
brainstorm.tools.
extract
(network, iter, buffer_names)[source]¶ Apply the network to some data and return the requested buffers.
Batches are returned as a dictionary, with one entry for each requested buffer, with the data in (T, B, ...) order.
Parameters: - network (brainstorm.structure.Network) – Network used to generate the features.
- iter (brainstorm.DataIterator) – A data iterator which produces the data on which the features are computed.
- buffer_names (list[unicode]) – Names of the buffer views to be extracted (in dotted notation).
Returns: dict[unicode, np.ndarray]
-
brainstorm.tools.
extract_and_save
(network, iter, buffer_names, file_name)[source]¶ Save the desired buffer values of a network to an HDF5 file.
In particular, this tool can be used to save the predictions of a network on a dataset. In general, any number of internal, input or output buffers of the network can be extracted.
Examples
>>> getter = Minibatches(100, default=x_test)
>>> extract_and_save(network,
...                  getter,
...                  ['Output.outputs.predictions',
...                   'Hid1.internals.H'],
...                  'network_features.hdf5')
Parameters: - network (brainstorm.structure.Network) – Network used to generate the features.
- iter (brainstorm.DataIterator) – A data iterator which produces the data on which the features are computed.
- buffer_names (list[unicode]) – Names of the buffer views to be saved (in dotted notation). See example.
- file_name (unicode) – Name of the HDF5 file (including extension) in which the features should be saved.
-
brainstorm.tools.
print_network_info
(network)[source]¶ Print detailed information about the network.
This tool prints the input, output and parameter shapes for all the layers. It also prints the total number of parameters in each layer and in the full network.
Parameters: network (brainstorm.structure.Network) – A network for which the details are printed.
-
brainstorm.tools.
get_in_out_layers
(task_type, in_shape, out_shape, data_name='default', targets_name='targets', projection_name=None, outlayer_name=None, mask_name=None, use_conv=None)[source]¶ Prepare input and output layers for building a network.
This is a helper function for quickly building networks. It returns an Input layer and a projection layer which is a FullyConnected or Convolution2D layer depending on the shape of the targets. It creates a mask layer if a mask name is provided, and connects it appropriately.
An appropriate layer to compute the matching loss is connected, depending on the task_type:
classification: The projection layer is connected to a SoftmaxCE layer, which receives targets from the input layer. This is suitable for a single-label classification task.
multi-label: The projection layer is connected to a SigmoidCE layer, which receives targets from the input layer. This is suitable for a multi-label classification task.
regression: The projection layer is connected to a SquaredError layer, which receives targets from the input layer. This is suitable for least squares regression.
Note
The projection layer uses parameters, so it should be initialized after network creation. Check argument descriptions to understand how it will be named.
Example
>>> from brainstorm import tools, Network, layers
>>> inp, out = tools.get_in_out_layers('classification', 784, 10)
>>> net = Network.from_layer(inp >> layers.FullyConnected(1000) >> out)
Parameters: - task_type (str) – one of [‘classification’, ‘regression’, ‘multi-label’]
- in_shape (int or tuple[int]) – Shape of the input data.
- out_shape (int or tuple[int]) – Shape of the network output.
- data_name (Optional[str]) – Name of the input data which will be provided by a data iterator. Defaults to ‘default’.
- targets_name (Optional[str]) – Name of the ground-truth target data which will be provided by a data iterator. Defaults to ‘targets’.
- projection_name (Optional[str]) – Name for the projection layer which connects to the softmax layer. If unspecified, will be set to outlayer_name + ‘_projection’ if outlayer_name is provided, and ‘Output_projection’ otherwise.
- outlayer_name (Optional[str]) – Name for the output layer. If unspecified, it is named ‘Output’.
- mask_name (Optional[str]) –
Name of the mask data which will be provided by a data iterator. Defaults to None.
The mask is needed if error should be injected only at certain time steps (for sequential data).
- use_conv (Optional[bool]) – Specify whether the projection layer should be convolutional. If true the projection layer will use 1x1 convolutions otherwise it will be fully connected. Default is to autodetect this based on the output shape.
Returns: tuple[Layer]
-
brainstorm.tools.
create_net_from_spec
(task_type, in_shape, out_shape, spec, data_name='default', targets_name='targets', mask_name=None, use_conv=None)[source]¶ Create a complete network from a spec line like this “F50 F20 F50”.
- Spec:
Capital letters specify the layer type and are followed by arguments to the layer. Supported layers are:
- F : FullyConnected
- R : Recurrent
- L : Lstm
- B : BatchNorm
- D : Dropout
- C : Convolution2D
- P : Pooling2D
Where applicable, the optional first argument is the activation function from the set {l, r, s, t}, corresponding to ‘linear’, ‘relu’, ‘sigmoid’ and ‘tanh’ respectively.
FullyConnected, Recurrent and Lstm take their size as mandatory arguments (after the optional activation function argument).
Dropout takes the dropout probability as an optional argument.
Convolution2D takes two mandatory arguments: num_filters and kernel_size like this: ‘C32:3’ or with activation ‘Cs32:3’ meaning 32 filters with a kernel size of 3x3. They can be followed by ‘p1’ for padding and/or ‘s2’ for a stride of (2, 2).
Pooling2D takes an optional first argument for the type of pooling: ‘m’ for max and ‘a’ for average pooling. The next (mandatory) argument is the kernel size. As with Convolution2D it can be followed by ‘p1’ for padding and/or ‘s2’ for setting the stride to (2, 2).
Whitespace is allowed everywhere and will be completely ignored.
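The first stage of reading such a spec line can be sketched as follows. This is a hypothetical tokenizer written for illustration, not the parser used by create_net_from_spec: it only strips whitespace, splits the line at capital letters, and maps each letter to the layer type from the list above. The argument strings (activation letter, sizes, padding, stride) are returned uninterpreted.

```python
import re

LAYER_TYPES = {
    "F": "FullyConnected",
    "R": "Recurrent",
    "L": "Lstm",
    "B": "BatchNorm",
    "D": "Dropout",
    "C": "Convolution2D",
    "P": "Pooling2D",
}


def split_spec(spec):
    """Split a spec line into (layer type, raw argument string) pairs."""
    compact = "".join(spec.split())  # whitespace is ignored everywhere
    # Each token is one capital letter plus everything up to the next capital.
    tokens = re.findall(r"([A-Z])([^A-Z]*)", compact)
    # An unsupported letter raises a KeyError here.
    return [(LAYER_TYPES[letter], args) for letter, args in tokens]


dense_layers = split_spec("D.2 F1200 D F1200 D")
conv_layers = split_spec("C32:5p2 P3s2 Fs64")
```

Splitting at capital letters works because the activation, padding and stride markers are all lower-case.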
Examples
The mnist_pi example can be expressed like this:
>>> net = create_net_from_spec('classification', 784, 10,
...                            'D.2 F1200 D F1200 D')
The cifar10_cnn example can be shortened like this:
>>> net = create_net_from_spec(
...     'classification', (3, 32, 32), 10,
...     'C32:5p2 P3s2 C32:5p2 P3s2 C64:5p2 P3s2 F64')
Parameters: - task_type (str) – one of [‘classification’, ‘regression’, ‘multi-label’]
- in_shape (int or tuple[int]) – Shape of the input data.
- out_shape (int or tuple[int]) – Output shape / nr of classes
- spec (str) – A line describing the network as explained above.
- data_name (Optional[str]) – Name of the input data which will be provided by a data iterator. Defaults to ‘default’.
- targets_name (Optional[str]) – Name of the ground-truth target data which will be provided by a data iterator. Defaults to ‘targets’.
- mask_name (Optional[str]) –
Name of the mask data which will be provided by a data iterator. Defaults to None.
The mask is needed if error should be injected only at certain time steps (for sequential data).
- use_conv (Optional[bool]) – Specify whether the projection layer should be convolutional. If true the projection layer will use 1x1 convolutions otherwise it will be fully connected. Default is to autodetect this based on the output shape.
Returns: The constructed network, initialized with DenseSqrtFanInOut for layers with an activation function and with a simple Gaussian as default and fallback.
Return type:
Data Iterators¶
-
class
brainstorm.data_iterators.
AddGaussianNoise
(iter, std_dict, mean_dict=None)[source]¶ Adds Gaussian noise to data generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only Numpy data is supported.
Supports usage of different means and standard deviations for different named data items.
-
class
brainstorm.data_iterators.
AddSaltNPepper
(iter, prob_dict, ratio_dict=None)[source]¶ Adds Salt&Pepper noise to data generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only Numpy data is supported.
Supports usage of different amounts and ratios of salt vs. pepper for different named data items.
-
class
brainstorm.data_iterators.
DataIterator
(data_shapes, length)[source]¶ Base class for Data Iterators.
-
data_shapes
¶ dict[str, tuple[int]]
Shapes of the named inputs that this iterator provides, keyed by input name.
-
length
¶ int | None
Number of iterations that this iterator will run.
-
-
class
brainstorm.data_iterators.
Flip
(iter, prob_dict=None)[source]¶ Randomly flip images horizontally. Images are generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only 5D Numpy data in TNHWC format is supported.
Defaults to flipping the ‘default’ named data item with a probability of 0.5. Note that the last dimension is flipped, which typically corresponds to flipping images horizontally.
-
class
brainstorm.data_iterators.
Minibatches
(batch_size=1, shuffle=True, cut_according_to='mask', **named_data)[source]¶ Minibatch iterator for inputs and targets.
If either a ‘mask’ is given or some other means of determining sequence length is specified by cut_according_to, this iterator also cuts the sequences in each minibatch to their maximum length (which can be less than the maximum length over the whole dataset).
Note
When shuffling is enabled, this iterator only randomizes the order of minibatches, but doesn’t re-shuffle instances across batches.
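The difference between shuffling the order of minibatches and shuffling instances can be sketched with index lists. The `minibatch_indices` helper is hypothetical and only mirrors the behaviour described in the note; it is not the iterator's actual implementation and ignores masks and sequence cutting.

```python
import random


def minibatch_indices(num_items, batch_size, shuffle=True, seed=None):
    """Return one list of item indices per minibatch.

    Items are grouped into consecutive batches once; shuffling only
    permutes the order in which those fixed batches are visited.
    """
    batches = [list(range(start, min(start + batch_size, num_items)))
               for start in range(0, num_items, batch_size)]
    if shuffle:
        random.Random(seed).shuffle(batches)
    return batches


ordered = minibatch_indices(10, 4, shuffle=False)
shuffled = minibatch_indices(10, 4, shuffle=True, seed=0)
```

Whatever order `shuffled` comes out in, each batch still contains the same neighbouring items as in `ordered`, so instances that share a batch once will share it in every epoch.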
-
class
brainstorm.data_iterators.
MultiHot
(iter, vocab_size_dict)[source]¶ Convert data to multi hot vectors, according to provided vocabulary sizes. If vocabulary size is not provided for some data item, it is yielded as is.
Currently this iterator only supports 3D data.
-
class
brainstorm.data_iterators.
OneHot
(iter, vocab_size_dict)[source]¶ Convert data to one hot vectors, according to provided vocabulary sizes. If vocabulary size is not provided for some data item, it is yielded as is.
Currently this iterator only supports 3D data where the last (right-most) dimension is sized 1.
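As an illustration of the conversion, the sketch below turns index data whose last dimension has size 1 into one-hot vectors. It uses nested Python lists instead of Numpy arrays, assumes the first two dimensions are time and batch, and the `one_hot` helper is hypothetical rather than the iterator's own code.

```python
def one_hot(data, vocab_size):
    """Convert nested lists of shape (T, B, 1) holding integer indices
    into nested lists of shape (T, B, vocab_size)."""
    result = []
    for timestep in data:
        rows = []
        for (index,) in timestep:  # the last dimension has size 1
            row = [0.0] * vocab_size
            row[index] = 1.0
            rows.append(row)
        result.append(rows)
    return result


# Two timesteps, two sequences per batch, vocabulary of three symbols.
indices = [[[0], [2]],
           [[1], [1]]]
encoded = one_hot(indices, vocab_size=3)
```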
-
class
brainstorm.data_iterators.
Pad
(iter, size_dict, value_dict=None)[source]¶ Pads images equally on all sides. Images are generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only 5D Numpy data in TNHWC format is supported.
5D data corresponds to sequences of multi-channel images, which is the typical use case. Zero-padding is used unless specified otherwise.
-
class
brainstorm.data_iterators.
RandomCrop
(iter, shape_dict)[source]¶ Randomly crops image data. Images are generated by another iterator, which must provide named data items (such as Online, Minibatches, Undivided). Only 5D Numpy data in TNHWC format is supported.
5D data corresponds to sequences of multi-channel images, which is the typical use case.
Initializers¶
-
class
brainstorm.initializers.
ArrayInitializer
(array)[source]¶ Initializes the parameters as the values of the input array.
-
class
brainstorm.initializers.
DenseSqrtFanIn
(scale='rel')[source]¶ Initializes the parameters randomly according to a uniform distribution over the interval [-scale/sqrt(n), scale/sqrt(n)] where n is the number of inputs to each unit. Uses scale=sqrt(6) by default which is appropriate for rel units.
When the number of inputs and outputs is the same, this is equivalent to using DenseSqrtFanInOut.
- Scaling:
- rel: sqrt(6)
- tanh: sqrt(3)
- sigmoid: 4 * sqrt(3)
- linear: 1
Parameters: scale (Optional[float or str]) – The activation function dependent scaling factor. Can be either float or one of [‘rel’, ‘tanh’, ‘sigmoid’, ‘linear’]. Defaults to ‘rel’.
-
class
brainstorm.initializers.
DenseSqrtFanInOut
(scale='rel')[source]¶ Initializes the parameters randomly according to a uniform distribution over the interval [-scale/sqrt(n1+n2), scale/sqrt(n1+n2)] where n1 is the number of inputs to each unit and n2 is the number of units in the current layer. Uses scale=sqrt(12) by default which is appropriate for rel units.
- Scaling:
- rel: sqrt(12)
- tanh: sqrt(6)
- sigmoid: 4 * sqrt(6)
- linear: 1
Parameters: scale (Optional[float or str]) – The activation function dependent scaling factor. Can be either float or one of [‘rel’, ‘tanh’, ‘sigmoid’, ‘linear’]. Defaults to ‘rel’.
- Reference:
- Glorot, Xavier, and Yoshua Bengio. “Understanding the difficulty of training deep feedforward neural networks.” International conference on artificial intelligence and statistics. 2010.
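The interval bounds of DenseSqrtFanIn and DenseSqrtFanInOut follow directly from the formulas and scaling tables given for the two initializers. The sketch below only computes the half-width of the uniform interval; the function names are hypothetical and no weights are sampled. It also checks the stated equivalence of the two schemes for the ‘rel’ scaling when the number of inputs equals the number of units.

```python
import math

FAN_IN_SCALES = {"rel": math.sqrt(6), "tanh": math.sqrt(3),
                 "sigmoid": 4 * math.sqrt(3), "linear": 1.0}
FAN_IN_OUT_SCALES = {"rel": math.sqrt(12), "tanh": math.sqrt(6),
                     "sigmoid": 4 * math.sqrt(6), "linear": 1.0}


def fan_in_bound(n_inputs, scale="rel"):
    """Half-width scale/sqrt(n) of the fan-in interval."""
    factor = FAN_IN_SCALES[scale] if isinstance(scale, str) else float(scale)
    return factor / math.sqrt(n_inputs)


def fan_in_out_bound(n_inputs, n_units, scale="rel"):
    """Half-width scale/sqrt(n1+n2) of the fan-in/fan-out interval."""
    factor = (FAN_IN_OUT_SCALES[scale] if isinstance(scale, str)
              else float(scale))
    return factor / math.sqrt(n_inputs + n_units)


bound_in = fan_in_bound(100)                # sqrt(6) / sqrt(100)
bound_in_out = fan_in_out_bound(100, 100)   # sqrt(12) / sqrt(200)
```

With n1 = n2 = n, sqrt(12)/sqrt(2n) simplifies to sqrt(6)/sqrt(n), so the two bounds agree up to floating-point rounding.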
-
class
brainstorm.initializers.
EchoState
(spectral_radius=1.0)[source]¶ Classic echo state initialization. Creates a matrix with a fixed spectral radius (default=1.0). The spectral radius should be < 1 to satisfy the echo state property. Only works for square matrices.
Example
>>> net.initialize(default=Gaussian(), Recurrent={'R': EchoState(0.77)})
-
class
brainstorm.initializers.
Gaussian
(std=0.1, mean=0.0)[source]¶ Initializes the parameters randomly according to a normal distribution of given mean and standard deviation.
-
class
brainstorm.initializers.
Identity
(scale=1.0, std=0.01, enforce_square=True)[source]¶ Initialize a matrix to the (scaled) identity matrix + some noise.
-
class
brainstorm.initializers.
LstmOptInit
(input_block=0.0, input_gate=0.0, forget_gate=0.0, output_gate=0.0)[source]¶ Used to initialize an LstmOpt layer. This is useful because in an LstmOpt layer all the parameters are concatenated for efficiency.
The parameters (input_block, input_gate, forget_gate, and output_gate) can be scalars or Initializers themselves.
-
class
brainstorm.initializers.
Orthogonal
(scale=1.0)[source]¶ Orthogonal initialization.
Reference: Saxe, Andrew M., James L. McClelland, and Surya Ganguli. “Exact solutions to the nonlinear dynamics of learning in deep linear neural networks.” arXiv preprint arXiv:1312.6120 (2013).
-
class
brainstorm.initializers.
RandomWalk
(act_func='linear', scale=None)[source]¶ Initializes a (square) weight matrix with the random walk scheme proposed by:
Sussillo, David, and L. F. Abbott. “Random Walk Initialization for Training Very Deep Feedforward Networks.” arXiv:1412.6558 [cs, Stat], December 19, 2014. http://arxiv.org/abs/1412.6558.
-
class
brainstorm.initializers.
SparseInputs
(sub_initializer, connections=15)[source]¶ Makes sure every unit only gets activation from a certain number of input units and the rest of the parameters are 0. The connections are initialized by evaluating the passed sub_initializer.
Example
>>> net.initialize(FullyConnected=SparseInputs(Gaussian(),
...                                            connections=10))
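The idea behind SparseInputs can be sketched in plain Python. Everything here is an assumption made for illustration: the weight matrix is laid out with one row per unit and one column per input, the sub-initializer is modelled as a function that draws one value from a random generator, and `sparse_inputs` is a hypothetical helper rather than the library's code.

```python
import random


def sparse_inputs(n_units, n_inputs, connections, sub_initializer, seed=None):
    """Build a weight matrix (one row per unit, one column per input)
    in which each unit has exactly `connections` input weights drawn
    from `sub_initializer`; all other weights are 0."""
    rng = random.Random(seed)
    weights = []
    for _ in range(n_units):
        row = [0.0] * n_inputs
        # Pick which inputs this unit is connected to.
        for column in rng.sample(range(n_inputs), connections):
            row[column] = sub_initializer(rng)
        weights.append(row)
    return weights


def gaussian(rng):
    # Stand-in for a Gaussian sub-initializer with mean 0.0 and std 0.1.
    return rng.gauss(0.0, 0.1)


weights = sparse_inputs(n_units=4, n_inputs=20, connections=10,
                        sub_initializer=gaussian, seed=1)
nonzero_per_unit = [sum(1 for w in row if w != 0.0) for row in weights]
```

SparseOutputs would apply the same selection along the other axis, limiting how many units each input feeds into.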
-
class
brainstorm.initializers.
SparseOutputs
(sub_initializer, connections=15)[source]¶ Makes sure every unit is propagating its activation only to a certain number of output units, and the rest of the parameters are 0. The connections are initialized by evaluating the passed sub_initializer.
Example
>>> net.initialize(FullyConnected=SparseOutputs(Gaussian(), connections=10))
Hooks¶
-
class
brainstorm.hooks.
EarlyStopper
(log_name, patience=1, criterion='min', name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Stop the training if a log entry does not improve for some time.
Can stop training when the log entry is at its minimum (such as an error) or maximum (such as accuracy) according to the criterion argument.
The timescale and interval should be the same as those for the monitoring hook which logs the quantity of interest.
Parameters: - log_name – Name of the log entry to be checked for improvement. It should be in the form <monitorname>.<log_name> where log_name itself may be a nested dictionary key in dotted notation.
- patience – Number of log updates to wait before stopping training. Default is 1.
- criterion – Indicates whether training should be stopped when the log entry is at its minimum or maximum value. Must be either ‘min’ or ‘max’. Defaults to ‘min’.
- name (Optional[str]) – Name of this monitor. This name is used as a key in the trainer logs. Default is ‘EarlyStopper’.
- timescale (Optional[str]) – Specifies whether the Monitor should be called after each epoch or after each update. Default is ‘epoch’.
- interval (Optional[int]) – This monitor should be called every interval epochs/updates. Default is 1.
- verbose (Optional[bool]) – Specifies whether the logs of this monitor should be printed, and acts as a fallback verbosity for the used data iterator. If not set it defaults to the verbosity setting of the trainer.
Examples
Add a hook to monitor a quantity of interest:
>>> scorer = bs.scorers.Accuracy()
>>> trainer.add_hook(bs.hooks.MonitorScores('valid_getter', [scorer],
...                                         name='validation'))
Stop training if validation set accuracy does not rise for 10 epochs:
>>> trainer.add_hook(bs.hooks.EarlyStopper('validation.Accuracy',
...                                        patience=10,
...                                        criterion='max'))
Stop training if loss on validation set does not drop for 5 epochs:
>>> trainer.add_hook(bs.hooks.EarlyStopper('validation.total_loss',
...                                        patience=5,
...                                        criterion='min'))
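The stopping rule can be sketched as a function of the logged values. This is one plausible reading of patience, assumed for illustration: training stops once the best value seen so far is at least `patience` log updates old. The `should_stop` helper is hypothetical, and the hook itself may count updates slightly differently.

```python
def should_stop(values, patience=1, criterion="min"):
    """Decide whether to stop, given all values logged so far."""
    if criterion not in ("min", "max"):
        raise ValueError("criterion must be 'min' or 'max'")
    if not values:
        return False
    best = min(values) if criterion == "min" else max(values)
    best_index = values.index(best)  # first occurrence of the best value
    updates_since_best = len(values) - 1 - best_index
    return updates_since_best >= patience


# Accuracy peaked two updates ago: stop with patience=2.
stop_on_accuracy = should_stop([0.80, 0.85, 0.84, 0.83],
                               patience=2, criterion="max")

# The loss is still improving: keep training.
stop_on_loss = should_stop([1.0, 0.8, 0.7], patience=2, criterion="min")
```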
-
class
brainstorm.hooks.
InfoUpdater
(run, name=None, timescale='epoch', interval=1)[source]¶ Save the information from the logs to the Sacred custom info dict.
-
class
brainstorm.hooks.
ModifyStepperAttribute
(schedule, attr_name='learning_rate', timescale='epoch', interval=1, name=None, verbose=None)[source]¶ Modify an attribute of the training stepper.
-
class
brainstorm.hooks.
MonitorLayerDeltas
(layer_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Monitor some statistics about all the deltas of a layer.
-
class
brainstorm.hooks.
MonitorLayerGradients
(layer_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Monitor some statistics about all the gradients of a layer.
-
class
brainstorm.hooks.
MonitorLayerInOuts
(layer_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Monitor some statistics about all the inputs and outputs of a layer.
-
class
brainstorm.hooks.
MonitorLayerParameters
(layer_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Monitor some statistics about all the parameters of a layer.
-
class
brainstorm.hooks.
MonitorLoss
(iter_name, name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Monitor the losses computed by the network on a dataset using a given data iterator.
-
class
brainstorm.hooks.
MonitorScores
(iter_name, scorers, name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Monitor the losses and optionally several scores using a given data iterator.
Parameters: - iter_name (str) – name of the data iterator to use (as specified in the train() call)
- scorers (List[brainstorm.scorers.Scorer]) – List of Scorers to evaluate.
- name (Optional[str]) – Name of this monitor. This name is used as a key in the trainer logs. Default is ‘MonitorScores’
- timescale (Optional[str]) – Specifies whether the Monitor should be called after each epoch or after each update. Default is ‘epoch’.
- interval (Optional[int]) – This monitor should be called every interval epochs/updates. Default is 1.
- verbose (Optional[bool]) – Specifies whether the logs of this monitor should be printed, and acts as a fallback verbosity for the used data iterator. If not set it defaults to the verbosity setting of the trainer.
See also
MonitorLoss: monitor the overall loss of the network.
-
class
brainstorm.hooks.
SaveBestNetwork
(log_name, filename=None, criterion='max', name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Check to see if the specified log entry is at its best value and if so, save the network to a specified file.
Can save the network when the log entry is at its minimum (such as an error) or maximum (such as accuracy) according to the criterion argument.
The timescale and interval should be the same as those for the monitoring hook which logs the quantity of interest.
Parameters: - log_name – Name of the log entry to be checked for improvement. It should be in the form <monitorname>.<log_name> where log_name itself may be a nested dictionary key in dotted notation.
- filename – Name of the HDF5 file to which the network should be saved.
- criterion – Indicates whether the network should be saved when the log entry is at its minimum or maximum value. Must be either ‘min’ or ‘max’. Defaults to ‘max’.
- name (Optional[str]) – Name of this monitor. This name is used as a key in the trainer logs. Default is ‘SaveBestNetwork’.
- timescale (Optional[str]) – Specifies whether the Monitor should be called after each epoch or after each update. Default is ‘epoch’.
- interval (Optional[int]) – This monitor should be called every interval epochs/updates. Default is 1.
- verbose (Optional[bool]) – Specifies whether the logs of this monitor should be printed, and acts as a fallback verbosity for the used data iterator. If not set it defaults to the verbosity setting of the trainer.
Examples
Add a hook to monitor a quantity of interest:
>>> scorer = bs.scorers.Accuracy()
>>> trainer.add_hook(bs.hooks.MonitorScores('valid_getter', [scorer],
...                                         name='validation'))
Check every epoch and save the network if validation accuracy rises:
>>> trainer.add_hook(bs.hooks.SaveBestNetwork('validation.Accuracy',
...                                           filename='best_acc.h5',
...                                           criterion='max'))
Check every epoch and save the network if validation loss drops:
>>> trainer.add_hook(bs.hooks.SaveBestNetwork('validation.total_loss',
...                                           filename='best_loss.h5',
...                                           criterion='min'))
-
class
brainstorm.hooks.
SaveLogs
(filename, name=None, timescale='epoch', interval=1)[source]¶ Periodically save the trainer logs dictionary to an HDF5 file. Default behavior is to save every epoch.
-
class
brainstorm.hooks.
SaveNetwork
(filename, name=None, timescale='epoch', interval=1)[source]¶ Periodically save the weights of the network to the given file. Default behavior is to save the network after every training epoch.
-
class
brainstorm.hooks.
StopAfterEpoch
(max_epochs, name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Stop the training after a specified number of epochs.
Parameters: - max_epochs (int) – The number of epochs to train.
- name (Optional[str]) – Name of this monitor. This name is used as a key in the trainer logs. Default is ‘StopAfterEpoch’.
- timescale (Optional[str]) – Specifies whether the Monitor should be called after each epoch or after each update. Default is ‘epoch’.
- interval (Optional[int]) – This monitor should be called every interval epochs/updates. Default is 1.
- verbose (Optional[bool]) – Specifies whether the logs of this monitor should be printed, and acts as a fallback verbosity for the used data iterator. If not set it defaults to the verbosity setting of the trainer.
-
class
brainstorm.hooks.
StopAfterThresholdReached
(log_name, threshold, criterion='min', name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Stop the training if a log entry reaches the given threshold.
Can stop training when the log entry becomes sufficiently small (such as an error) or sufficiently large (such as accuracy) according to the threshold.
Parameters: - log_name – Name of the log entry to be checked for improvement. It should be in the form <monitorname>.<log_name> where log_name itself may be a nested dictionary key in dotted notation.
- threshold – The threshold value to reach
- criterion – Indicates whether training should be stopped when the log entry becomes sufficiently small (‘min’) or sufficiently large (‘max’) relative to the threshold. Must be either ‘min’ or ‘max’. Defaults to ‘min’.
- name (Optional[str]) – Name of this monitor. This name is used as a key in the trainer logs. Default is ‘StopAfterThresholdReached’.
- timescale (Optional[str]) – Specifies whether the Monitor should be called after each epoch or after each update. Default is ‘epoch’.
- interval (Optional[int]) – This monitor should be called every interval epochs/updates. Default is 1.
- verbose (Optional[bool]) – Specifies whether the logs of this monitor should be printed, and acts as a fallback verbosity for the used data iterator. If not set it defaults to the verbosity setting of the trainer.
Examples
Stop training if validation set accuracy is at least 97 %:
>>> trainer.add_hook(StopAfterThresholdReached('validation.Accuracy',
...                                            threshold=0.97,
...                                            criterion='max'))
Stop training if loss on validation set goes below 0.2:
>>> trainer.add_hook(StopAfterThresholdReached('validation.total_loss',
...                                            threshold=0.2,
...                                            criterion='min'))
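As a rough illustration of how the threshold and criterion interact, the check might look like the sketch below. `threshold_reached` is a hypothetical helper, not the hook's actual code, and treating a value exactly equal to the threshold as "reached" is an assumption.

```python
def threshold_reached(value, threshold, criterion='min'):
    """Decide whether training should stop for the given log value.

    Hypothetical helper; equality counting as "reached" is an assumption.
    """
    if criterion == 'min':
        return value <= threshold   # e.g. a loss that became small enough
    if criterion == 'max':
        return value >= threshold   # e.g. an accuracy that became large enough
    raise ValueError("criterion must be either 'min' or 'max'")


print(threshold_reached(0.975, 0.97, criterion='max'))  # True
print(threshold_reached(0.25, 0.2, criterion='min'))    # False
```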
-
class
brainstorm.hooks.
StopOnNan
(logs_to_check=(), check_parameters=True, check_training_loss=True, name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Stop the training if infinite or NaN values are found in parameters.
This hook can also check a list of logs for invalid values.
Parameters: - logs_to_check (Optional[list, tuple]) – A list of trainer logs to check in dotted notation. Defaults to ().
- check_parameters (Optional[bool]) – Indicates whether the parameters should be checked for NaN. Defaults to True.
- name (Optional[str]) – Name of this monitor. This name is used as a key in the trainer logs. Default is ‘StopOnNan’.
- timescale (Optional[str]) – Specifies whether the Monitor should be called after each epoch or after each update. Default is ‘epoch’.
- interval (Optional[int]) – This monitor should be called every interval epochs/updates. Default is 1.
- verbose (Optional[bool]) – Specifies whether the logs of this monitor should be printed, and acts as a fallback verbosity for the used data iterator. If not set it defaults to the verbosity setting of the trainer.
-
class
brainstorm.hooks.
StopOnSigQuit
(name=None, timescale='epoch', interval=1, verbose=None)[source]¶ Stop training after the next call if it received a SIGQUIT (Ctrl + \).
This hook makes it possible to exit the training loop and continue with the rest of the program execution.
Parameters: - name (Optional[str]) – Name of this monitor. This name is used as a key in the trainer logs. Default is ‘StopOnSigQuit’.
- timescale (Optional[str]) – Specifies whether the Monitor should be called after each epoch or after each update. Default is ‘epoch’.
- interval (Optional[int]) – This monitor should be called every interval epochs/updates. Default is 1.
- verbose (Optional[bool]) – Specifies whether the logs of this monitor should be printed, and acts as a fallback verbosity for the used data iterator. If not set it defaults to the verbosity setting of the trainer.
Value Modifiers¶
-
class
brainstorm.value_modifiers.
ClipValues
(low=-1.0, high=1.0)[source]¶ Clips (limits) the weights to be between low and high. Defaults to low=-1 and high=1.
Should be added to the network via the set_weight_modifiers method like so:
>>> net.set_weight_modifiers(RnnLayer={'HR': ClipValues()})
See Network.set_weight_modifiers for more information on how to control which weights to affect.
-
class
brainstorm.value_modifiers.
ConstrainL2Norm
(limit)[source]¶ Constrains the L2 norm of the incoming weights to every neuron/unit to be less than or equal to a limit. If the L2 norm for any unit exceeds the limit, the weights are rescaled such that the squared L2 norm equals the limit. Ignores biases.
Should be added to the network via the set_weight_modifiers method like so:
>>> net.set_weight_modifiers(RnnLayer={'HX': ConstrainL2Norm(limit)})
See Network.set_weight_modifiers for more information on how to control which weights to affect.
-
class
brainstorm.value_modifiers.
FreezeValues
(weights=None)[source]¶ Prevents the weights from changing at all.
If the weights argument is left at None, it will remember the first weights it sees and reset the weights to those values every time.
Should be added to the network via the set_constraints method like so:
>>> net.set_constraints(RnnLayer={'HR': FreezeValues()})
See Network.set_constraints for more information on how to control which weights to affect.
-
class
brainstorm.value_modifiers.
L1Decay
(factor)[source]¶ Applies L1 weight decay.
New gradients = gradients + factor * sign(parameters)
-
class
brainstorm.value_modifiers.
L2Decay
(factor)[source]¶ Applies L2 weight decay.
New gradients = gradients + factor * parameters
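The two decay formulas above (L1Decay and L2Decay) can be compared side by side with a minimal sketch. Plain Python lists stand in for the handler's arrays, and the function names `l1_decay` and `l2_decay` are hypothetical; the real modifiers operate through a handler.

```python
def sign(x):
    """Return 1, 0 or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def l1_decay(gradients, parameters, factor):
    # New gradients = gradients + factor * sign(parameters)
    return [g + factor * sign(p) for g, p in zip(gradients, parameters)]


def l2_decay(gradients, parameters, factor):
    # New gradients = gradients + factor * parameters
    return [g + factor * p for g, p in zip(gradients, parameters)]


params = [2.0, -4.0, 0.0]
grads = [1.0, 1.0, 1.0]
print(l1_decay(grads, params, 0.5))  # [1.5, 0.5, 1.0]
print(l2_decay(grads, params, 0.5))  # [2.0, -1.0, 1.0]
```

The L1 term has the same magnitude for every non-zero parameter, while the L2 term grows with the size of the parameter.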
-
class
brainstorm.value_modifiers.
MaskValues
(mask)[source]¶ Multiplies the weights elementwise with the mask.
This can be used to clamp some of the weights to zero.
Should be added to the network via the set_weight_modifiers method like so:
>>> net.set_weight_modifiers(RnnLayer={'HR': MaskValues(M)})
See Network.set_weight_modifiers for more information on how to control which weights to affect.
Scorers¶
Handler¶
-
class
brainstorm.handlers.base_handler.
Handler
[source]¶ Abstract base class for all handlers.
This base is used mainly to ensure a common interface and provide documentation for derived handlers. When implementing new methods one should adhere to the naming scheme. Most mathematical operations should have a suffix or suffixes indicating the shapes of inputs it expects:
s for scalar, v for vector (a 2D array with at least one dimension equal to 1), m for matrix (a 2D array), t for tensor (which means arbitrary shape, synonym for array).
Note that these shapes are not checked by each handler itself. However, the DebugHandler can be used to perform these checks to ensure that operations are not abused.
-
dtype
¶ Data type that this handler works with.
-
context
¶ Context which may be used by this handler for operation.
-
EMPTY
¶ An empty array matching this handler’s type.
-
rnd
¶ A random state maintained by this handler.
-
array_type
¶ The type of array object that this handler works with.
-
__describe__
()¶ Returns a description of this object. That is a dictionary containing the name of the class as @type and all members of the class. This description is JSON-serializable.
If a sub-class of Describable contains non-describable members, it has to override this method to specify how it should be described.
Returns: Description of this object Return type: dict
-
__new_from_description__
(description)¶ Creates a new object from a given description.
If a sub-class of Describable contains non-describable fields, it has to override this method to specify how they should be initialized from their description.
Parameters: description (dict) – description of this object Returns: A new instance of this class according to the description.
-
abs_t
(a, out)[source]¶ Compute the element-wise absolute value.
Parameters: - a (array_type) – Array whose absolute values are to be computed.
- out (array_type) – Array into which the output is placed. Must have the same shape as a.
Returns: None
-
add_into_if
(a, out, cond)[source]¶ Add element of a to element of out if corresponding element in cond is non-zero.
Parameters: - a (array_type) – Array whose elements (might) be added to out.
- out (array_type) – Output array, whose values might be increased by values from a.
- cond (array_type) – The condition array. Only those entries from a are added into out whose corresponding cond elements are non-zero.
Returns: None
-
add_mv
(m, v, out)[source]¶ Add a matrix to a vector with broadcasting.
Add an (M, N) matrix to a (1, N) or (M, 1) vector using broadcasting such that the output is (M, N).
Parameters: - m (array_type) – The first array to be added. Must be 2D.
- v (array_type) – The second array to be added. Must be 2D with at least one dimension of size 1 and the other dimension matching the corresponding size of m.
- out (array_type) – Array into which the output is placed. Must have the same shape as m.
Returns: None
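The broadcasting rule described for add_mv can be sketched in plain Python. Nested lists stand in for the handler's array_type here, which is an assumption made to keep the example self-contained; a concrete handler would work on its own array objects.

```python
def add_mv(m, v, out):
    """Add a (1, N) or (M, 1) "vector" v to an (M, N) matrix m, writing to out."""
    v_rows, v_cols = len(v), len(v[0])
    for i, row in enumerate(m):
        for j, value in enumerate(row):
            # A dimension of size 1 is broadcast: always read index 0 along it.
            vi = i if v_rows > 1 else 0
            vj = j if v_cols > 1 else 0
            out[i][j] = value + v[vi][vj]


m = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
row_vec = [[10.0, 20.0, 30.0]]  # shape (1, 3)
col_vec = [[100.0], [200.0]]    # shape (2, 1)
out = [[0.0] * 3 for _ in m]

add_mv(m, row_vec, out)
print(out)  # [[11.0, 22.0, 33.0], [14.0, 25.0, 36.0]]
add_mv(m, col_vec, out)
print(out)  # [[101.0, 102.0, 103.0], [204.0, 205.0, 206.0]]
```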
-
add_st
(s, t, out)[source]¶ Add a scalar to each element of a tensor.
Parameters: - s (dtype) – The scalar value to be added.
- t (array_type) – The array to be added.
- out (array_type) – Array into which the output is placed. Must have the same shape as t.
Returns: None
-
add_tt
(a, b, out)[source]¶ Add two tensors element-wise.
Parameters: - a (array_type) – First array.
- b (array_type) – Second array.
- out (array_type) – Array into which the output is placed. Must have the same shape as a and b.
Returns: None
-
allocate
(shape)[source]¶ Allocate new memory with given shape but arbitrary content.
Parameters: shape (tuple[int]) – Shape of the array. Returns: New array with given shape. Return type: object
-
avgpool2d_backward_batch
(inputs, window, outputs, padding, stride, in_deltas, out_deltas)[source]¶ Computes the gradients for 2D average-pooling on a batch of images.
Parameters: - inputs (array_type) –
- window (tuple[int]) –
- outputs (array_type) –
- padding (int) –
- stride (tuple[int]) –
- in_deltas (array_type) –
- out_deltas (array_type) –
Returns: None
-
avgpool2d_forward_batch
(inputs, window, outputs, padding, stride)[source]¶ Performs 2D average-pooling on a batch of images.
Parameters: - inputs (array_type) –
- window (tuple[int]) –
- outputs (array_type) –
- padding (int) –
- stride (tuple[int]) –
Returns: None
-
binarize_v
(v, out)[source]¶ Convert a column vector into a matrix of one-hot row vectors.
Usually used to convert class IDs into one-hot vectors. Therefore,
out[i, j] = 1, if j equals v[i, 0]
out[i, j] = 0, otherwise.
Note that out must have enough columns such that all indices in v are valid.
Parameters: - v (array_type) – Column vector (2D array with a single column).
- out (array_type) – Matrix (2D array) into which the output is placed. The number of rows must be the same as v and the number of columns must be greater than the maximum value in v.
Returns: None
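A minimal sketch of the one-hot rule above, using nested Python lists in place of the handler's arrays (an assumption for the sake of a self-contained example):

```python
def binarize_v(v, out):
    """Write one-hot rows into out: out[i][j] is 1 where j equals v[i][0]."""
    for i, row in enumerate(v):
        class_id = int(row[0])
        for j in range(len(out[i])):
            out[i][j] = 1.0 if j == class_id else 0.0


v = [[2], [0], [1]]                  # column vector of class IDs
out = [[0.0] * 3 for _ in v]         # 3 columns > maximum class ID (2)
binarize_v(v, out)
print(out)  # [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```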
-
broadcast_t
(a, axis, out)[source]¶ Broadcast the given axis of an array by copying elements.
This function provides a numpy-broadcast-like operation for the dimension given by axis. E.g. for axis=3 an array with shape (2, 3, 4, 1) may be broadcast to shape (2, 3, 4, 5) by copying all the elements 5 times.
Parameters: - a (array_type) – Array whose elements should be broadcast. The dimension corresponding to axis must be of size 1.
- axis (int) – The axis along which to broadcast.
- out (array_type) – Array into which the output is placed. Must have the same number of dimensions as a. Only the dimension corresponding to axis can differ from a.
Returns: None
-
clip_t
(a, a_min, a_max, out)[source]¶ Clip (limit) the values in an array.
Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.
Parameters: - a (array_type) – Array containing the elements to clip.
- a_min (dtype) – Minimum value.
- a_max (dtype) – Maximum value.
- out (array_type) – Array into which the output is placed. Must have the same shape as a.
Returns: None
-
conv2d_backward_batch
(inputs, weights, padding, stride, in_deltas, out_deltas, weight_deltas, bias_deltas)[source]¶ Computes the gradients for a 2D convolution on a batch of images.
Parameters: - inputs (array_type) –
- weights (array_type) –
- padding (int) –
- stride (tuple[int]) –
- in_deltas (array_type) –
- out_deltas (array_type) –
- weight_deltas (array_type) –
- bias_deltas (array_type) –
Returns: None
-
conv2d_forward_batch
(inputs, weights, bias, outputs, padding, stride)[source]¶ Performs a 2D convolution on a batch of images.
Parameters: - inputs (array_type) –
- weights (array_type) –
- bias (array_type) –
- outputs (array_type) –
- padding (int) –
- stride (tuple[int]) –
Returns: None
-
copy_to
(src, dest)[source]¶ Copy the contents of one array to another.
Both source and destination arrays must be of this handler’s supported type and have the same shape.
Parameters: - dest (array_type) – Destination array.
- src (array_type) – Source array.
Returns: None
-
copy_to_if
(src, dest, cond)[source]¶ Copy element of ‘src’ to element of ‘dest’ if cond is not equal to 0.
Parameters: - src (array_type) – Source array whose elements (might) be copied into dest.
- dest (array_type) – Destination array.
- cond (array_type) – The condition array. Only those src elements get copied to dest whose corresponding cond elements are non-zero.
Returns: None
-
create_from_numpy
(arr)[source]¶ Create a new array with the same entries as a Numpy array.
Parameters: arr (numpy.ndarray) – Numpy array whose elements should be used to fill the new array. Returns: New array with same shape and entries as the given Numpy array. Return type: array_type
-
divide_mv
(m, v, out)[source]¶ Divide a matrix by a vector.
Divide a (M, N) matrix element-wise by a (1, N) vector using broadcasting such that the output is (M, N).
Parameters: - m (array_type) – First array (dividend). Must be 2D.
- v (array_type) – Second array (divisor). Must be 2D with at least one dimension of size 1 and the other dimension matching the corresponding size of m.
- out (array_type) – Array into which the output is placed. Must have the same shape as m.
Returns: None
-
divide_tt
(a, b, out)[source]¶ Divide two tensors element-wise.
Parameters: - a (array_type) – First array (dividend).
- b (array_type) – Second array (divisor). Must have the same shape as a.
- out (array_type) – Array into which the output is placed. Must have the same shape as a and b.
Returns: None
-
dot_add_mm
(a, b, out, transa=False, transb=False)[source]¶ Multiply two matrices and add to a matrix.
Only 2D arrays (matrices) are supported.
Parameters: - a (array_type) – First matrix.
- b (array_type) – Second matrix. Must have compatible shape to be right-multiplied with a.
- out (array_type) – Array into which the output is added. Must have correct shape for the product of the two matrices.
Returns: None
-
dot_mm
(a, b, out, transa=False, transb=False)[source]¶ Multiply two matrices.
Only 2D arrays (matrices) are supported.
Parameters: - a (array_type) – First matrix.
- b (array_type) – Second matrix. Must have compatible shape to be right-multiplied with a.
- out (array_type) – Array into which the output is placed. Must have correct shape for the product of the two matrices.
Returns: None
-
el
(x, y)[source]¶ Compute exponential linear activation function.
f(x) = x if x > 0 else exp(x) - 1
Note that we chose to fix alpha to 1
Parameters: - x (array_type) – Input Array.
- y (array_type) – Output Array
Returns: None
References
Clevert, D. A., Unterthiner, T., & Hochreiter, S. (2015). Fast and Accurate Deep Network Learning by Exponential Linear Units. arXiv preprint arXiv:1511.07289.
-
el_deriv
(x, y, dy, dx)[source]¶ Backpropagate derivatives through the exponential linear function.
f’(x) = 1 if x > 0 else f(x) + 1
Note that we chose to fix alpha to 1
Parameters: - x (array_type) – Inputs to the exponential linear function. This argument is not used and is present only to conform with other activation functions.
- y (array_type) – Outputs of the exponential linear function.
- dy (array_type) – Derivatives with respect to the outputs.
- dx (array_type) – Array in which the derivatives with respect to the inputs are placed.
Returns: None
References
Clevert, D. A., Unterthiner, T., & Hochreiter, S. (2015). Fast and Accurate Deep Network Learning by Exponential Linear Units. arXiv preprint arXiv:1511.07289.
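The formulas for el and el_deriv can be checked with a small sketch. It uses plain Python lists and returns new lists instead of writing into an output array, which is a simplification of the handler interface; alpha is fixed to 1 as stated above. Because the output y is positive exactly when the input x is positive, the derivative can be computed from y alone, which is consistent with x being unused.

```python
import math


def el(x):
    # f(x) = x if x > 0 else exp(x) - 1
    return [xi if xi > 0 else math.exp(xi) - 1.0 for xi in x]


def el_deriv(y, dy):
    # f'(x) = 1 if x > 0 else f(x) + 1, expressed through the outputs y.
    return [dyi * (1.0 if yi > 0 else yi + 1.0) for yi, dyi in zip(y, dy)]


x = [1.5, 0.0, -1.0]
y = el(x)
dx = el_deriv(y, [1.0, 1.0, 1.0])
print(y)
print(dx)
```

For the negative input, the derivative equals exp(-1), since f(x) + 1 = exp(x).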
-
fill
(mem, val)[source]¶ Fill an array with a given value.
Parameters: - mem (array_type) – Array to be filled.
- val (dtype) – Value to fill.
Returns: None
-
fill_gaussian
(mean, std, out)[source]¶ Fill an array with values drawn from a Gaussian distribution.
Parameters: - mean (float) – Mean of the Gaussian Distribution.
- std (float) – Standard deviation of the Gaussian distribution.
- out (array_type) – Target array to fill with values.
Returns: None
-
fill_if
(mem, val, cond)[source]¶ Set the elements of mem to val if corresponding cond element is non-zero.
Parameters: - mem (array_type) – Array to be filled.
- val (dtype) – The scalar which the elements of mem (might) be set to.
- cond (array_type) – The condition array. Only those mem elements are set to val whose corresponding cond elements are non-zero.
Returns: None
-
generate_probability_mask
(mask, probability)[source]¶ Fill an array with zeros and ones.
Fill an array with zeros and ones such that the probability of an element being one is equal to probability.
Parameters: - mask (array_type) – Array that will be filled.
- probability (float) – Probability of an element of mask being equal to one.
Returns: None
-
get_numpy_copy
(mem)[source]¶ Return a copy of the given data as a numpy array.
Parameters: mem (array_type) – Source array to be copied. Returns: Numpy array with same content as mem. Return type: numpy.ndarray
-
index_m_by_v
(m, v, out)[source]¶ Get elements from a matrix using indices from a vector.
v and out must be column vectors of the same size. Elements from the matrix m are copied using the indices given by a column vector. From row i of the matrix, the element from column v[i, 0] is copied to out, such that out[i, 0] = m[i, v[i, 0]].
Note that m must have enough columns such that all indices in v are valid.
Parameters: - m (array_type) – Matrix (2D array) whose elements should be copied.
- v (array_type) – Column vector (2D array with a single column) whose values are used as indices into m. The number of rows must be the same as m.
- out (array_type) – Array into which the output is placed. Its shape must be the same as v.
Returns: None
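The indexing rule out[i, 0] = m[i, v[i, 0]] can be illustrated as follows. Nested lists stand in for the handler's arrays, which is an assumption for the sake of a runnable example.

```python
def index_m_by_v(m, v, out):
    """From each row i of m, copy the element in column v[i][0] to out[i][0]."""
    for i, row in enumerate(m):
        out[i][0] = row[int(v[i][0])]


m = [[0.1, 0.7, 0.2],
     [0.8, 0.1, 0.1]]
v = [[1], [0]]            # one column index per row of m
out = [[0.0], [0.0]]
index_m_by_v(m, v, out)
print(out)  # [[0.7], [0.8]]
```

A typical use is picking out the predicted probability of the target class for each sample in a batch.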
-
is_fully_finite
(a)[source]¶ Check if all entries of the array are finite (no nans or infs).
Parameters: a (array_type) – Input array to check. Returns: True if there are no infs or nans, False otherwise. Return type: bool
-
log_t
(a, out)[source]¶ Compute the element-wise natural logarithm.
The natural logarithm log is the inverse of the exponential function, so that log(exp(x)) = x.
Parameters: - a (array_type) – Array whose logarithm is to be computed.
- out (array_type) – Array into which the output is placed. Must have the same shape as a.
Returns: None
-
maxpool2d_backward_batch
(inputs, window, outputs, padding, stride, argmax, in_deltas, out_deltas)[source]¶ Computes the gradients for 2D max-pooling on a batch of images.
Parameters: - inputs (array_type) –
- window (tuple[int]) –
- outputs (array_type) –
- padding (int) –
- stride (tuple[int]) –
- argmax (array_type) –
- in_deltas (array_type) –
- out_deltas (array_type) –
Returns: None
-
maxpool2d_forward_batch
(inputs, window, outputs, padding, stride, argmax)[source]¶ Performs a 2D max-pooling on a batch of images.
Parameters: - inputs (array_type) –
- window (tuple[int]) –
- outputs (array_type) –
- padding (int) –
- stride (tuple[int]) –
- argmax (array_type) –
Returns: None
-
merge_tt
(a, b, out)[source]¶ Merge arrays a and b along their last axis.
Parameters: - a (array_type) – Array to be merged.
- b (array_type) – Array to be merged.
- out (array_type) – Array into which the output is placed.
Returns: None
-
modulo_tt
(a, b, out)[source]¶ Take the modulo between two arrays elementwise. (out = a % b)
Parameters: - a (array_type) – First array (dividend).
- b (array_type) – Second array (divisor). Must have the same shape as a.
- out (array_type) – Array into which the remainder is placed. Must have the same shape as a and b.
Returns: None
-
mult_add_st
(s, t, out)[source]¶ Multiply a scalar with each element of a tensor and add to a tensor.
Parameters: - s (dtype) – The scalar value to be multiplied.
- t (array_type) – The array to be multiplied.
- out (array_type) – Array into which the product is added. Must have the same shape as t.
Returns: None
-
mult_add_tt
(a, b, out)[source]¶ Multiply two tensors element-wise and add to a tensor.
Parameters: - a (array_type) – First array.
- b (array_type) – Second array. Must have the same shape as a.
- out (array_type) – Array into which the output is added. Must have the same shape as a and b.
Returns: None
-
mult_mv
(m, v, out)[source]¶ Multiply a matrix with a vector.
Multiply an (M, N) matrix with a (1, N) or (M, 1) vector using broadcasting such that the output is (M, N). Also allows the “vector” to have the same dimension as the matrix in which case it behaves the same as mult_tt().
Parameters: - m (array_type) – The first array. Must be 2D.
- v (array_type) – The second array, to be multiplied with m. Must be 2D with at least one dimension of size 1 and the other dimension matching the corresponding size of m.
- out (array_type) – Array into which the output is placed. Must have the same shape as m.
Returns: None
-
mult_st
(s, t, out)[source]¶ Multiply a scalar with each element of a tensor.
Parameters: - s (dtype) – The scalar value to be multiplied.
- t (array_type) – The array to be multiplied.
- out (array_type) – Array into which the output is placed. Must have the same shape as t.
Returns: None
-
mult_tt
(a, b, out)[source]¶ Multiply two tensors of the same shape element-wise.
Parameters: - a (array_type) – First array.
- b (array_type) – Second array. Must have the same shape as a.
- out (array_type) – Array into which the output is placed. Must have the same shape as a and b.
Returns: None
-
ones
(shape)[source]¶ Allocate new memory with given shape and filled with ones.
Parameters: shape (tuple[int]) – Shape of the array. Returns: New array with given shape filled with ones. Return type: object
-
rel
(x, y)[source]¶ Compute the rel (rectified linear) function.
y = rel(x) = max(0, x)
Parameters: - x (array_type) – Input array.
- y (array_type) – Output array.
Returns: None
-
rel_deriv
(x, y, dy, dx)[source]¶ Backpropagate derivatives through the rectified linear function.
Parameters: - x (array_type) – Inputs to the rel function. This argument is not used and is present only to conform with other activation functions.
- y (array_type) – Outputs of the rel function.
- dy (array_type) – Derivatives with respect to the outputs.
- dx (array_type) – Array in which the derivatives with respect to the inputs are placed.
Returns: None
-
set_from_numpy
(mem, arr)[source]¶ Set the content of an array from a given numpy array.
Parameters: - mem (array_type) – Destination array that should be set.
- arr (numpy.ndarray) – Source numpy array.
Returns: None
-
sigmoid
(x, y)[source]¶ Compute the sigmoid function.
y = sigmoid(x) = 1 / (1 + exp(-x))
Parameters: - x (array_type) – Input array.
- y (array_type) – Output array.
Returns: None
-
sigmoid_deriv
(x, y, dy, dx)[source]¶ Backpropagate derivatives through the sigmoid function.
Parameters: - x (array_type) – Inputs to the sigmoid function. This argument is not used and is present only to conform with other activation functions.
- y (array_type) – Outputs of the sigmoid function.
- dy (array_type) – Derivatives with respect to the outputs.
- dx (array_type) – Array in which the derivatives with respect to the inputs are placed.
Returns: None
-
sign_t
(a, out)[source]¶ Compute an element-wise indication of the sign of a number.
Output has the value 1.0 if an element is positive, 0 if it is zero, and -1.0 if it is negative.
Parameters: - a (array_type) – Array whose sign is to be computed.
- out (array_type) – Array into which the output is placed. Must have the same shape as a.
Returns: None
-
softmax_m
(m, out)[source]¶ Compute the softmax function over the last dimension of a matrix.
Parameters: - m (array_type) – Input array.
- out (array_type) – Output array.
Returns: None
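For reference, a row-wise softmax over a matrix can be sketched as below. Nested lists stand in for the handler's arrays, and subtracting the row maximum before exponentiating is a common numerical-stability choice made for this sketch; whether a given handler does the same is not stated here.

```python
import math


def softmax_m(m, out):
    """Write the softmax of every row of m into the matching row of out."""
    for i, row in enumerate(m):
        shift = max(row)  # keeps exp() from overflowing for large inputs
        exps = [math.exp(x - shift) for x in row]
        total = sum(exps)
        for j, e in enumerate(exps):
            out[i][j] = e / total


m = [[0.0, 0.0],
     [1000.0, 1000.0],
     [1.0, 2.0]]
out = [[0.0, 0.0] for _ in m]
softmax_m(m, out)
print(out[0])  # [0.5, 0.5]
print(out[1])  # [0.5, 0.5]
print(out[2])
```

Each output row is non-negative and sums to one.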
-
split_add_tt
(x, out_a, out_b)[source]¶ Split array x along the last axis and add the parts to out_a and out_b.
Parameters: - x (array_type) – Array to be split.
- out_a (array_type) – Array to which 1st part of x is added.
- out_b (array_type) – Array to which 2nd part of x is added.
Returns: None
-
sqrt_t
(a, out)[source]¶ Compute the positive square-root of an array, element-wise.
Parameters: - a (array_type) – Array whose square root is to be computed.
- out (array_type) – Array into which the output is placed. Must have the same shape as a.
Returns: None
-
subtract_mv
(m, v, out)[source]¶ Subtract a vector from a matrix with broadcasting.
Parameters: - m (array_type) – The first array. Must be 2D.
- v (array_type) – The second array, to be subtracted from m. Must be 2D with at least one dimension of size 1 and the other dimension matching the corresponding size of m.
- out (array_type) – Array into which the output is placed. Must have the same shape as m.
Returns: None
-
subtract_tt
(a, b, out)[source]¶ Subtract a tensor from another element-wise.
Parameters: - a (array_type) – First array.
- b (array_type) – Second array, to be subtracted from a. Must have the same shape as a.
- out (array_type) – Array into which the output (a - b) is placed. Must have the same shape as a and b.
Returns: None
-
sum_t
(a, axis, out)[source]¶ Sum the elements of an array along a given axis.
If axis is None, the sum is computed over all elements of the array. Otherwise, it is computed along the specified axis.
Note
Only 1D and 2D arrays are currently supported.
Parameters: - a (array_type) – Array to be summed.
- axis (int) – Axis over which the summation should be done.
- out (array_type) – Array into which the output is placed.
Returns: None
-
tanh
(x, y)[source]¶ Compute the tanh (hyperbolic tangent) function.
y = tanh(x) = (e^x - e^-x) / (e^x + e^-x)
Parameters: - x (array_type) – Input array.
- y (array_type) – Output array.
Returns: None
-
tanh_deriv
(x, y, dy, dx)[source]¶ Backpropagate derivatives through the tanh function.
Parameters: - x (array_type) – Inputs to the tanh function. This argument is not used and is present only to conform with other activation functions.
- y (array_type) – Outputs of the tanh function.
- dy (array_type) – Derivatives with respect to the outputs.
- dx (array_type) – Array in which the derivatives with respect to the inputs are placed.
Returns: None
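The remark that x is unused in sigmoid_deriv and tanh_deriv follows from the standard identities sigmoid'(x) = y (1 - y) and tanh'(x) = 1 - y^2, where y is the output of the respective function. The sketch below applies them to plain Python lists and returns new lists rather than filling a dx array, which is a simplification of the handler interface.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_deriv(y, dy):
    # d/dx sigmoid(x) = y * (1 - y), where y = sigmoid(x)
    return [dyi * yi * (1.0 - yi) for yi, dyi in zip(y, dy)]


def tanh_deriv(y, dy):
    # d/dx tanh(x) = 1 - y**2, where y = tanh(x)
    return [dyi * (1.0 - yi * yi) for yi, dyi in zip(y, dy)]


x = [-1.0, 0.0, 2.0]
dy = [1.0, 1.0, 1.0]
print(sigmoid_deriv([sigmoid(xi) for xi in x], dy))
print(tanh_deriv([math.tanh(xi) for xi in x], dy))
```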
-
Describables¶
This module provides the core functionality for describing objects.
Description¶
In brainstorm most objects can be converted into a so-called description using the get_description() function.
A description is a JSON-serializable data structure that contains all static information needed to re-create that object using the create_from_description() function. It does not, however, contain any dynamic information. This means that an object created from a description is in the same state as if it had been freshly instantiated, without any later modifications of its internal state.
The descriptions for the basic types int, float, bool, str, list and dict are these values themselves. Other objects need to inherit from Describable and their description is always a dict containing the '@type': 'ClassName' key, possibly along with other properties.
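To make the idea of a description concrete, here is a self-contained toy version of the round trip. It does not import or reproduce brainstorm: `ToyDescribable`, `Uniform`, `KNOWN_CLASSES`, `toy_get_description` and `toy_create_from_description` are all hypothetical names, and the class lookup through a dictionary is an assumption made for the sketch.

```python
import json


class ToyDescribable:
    """Toy stand-in for illustration; not brainstorm's Describable class."""

    def __describe__(self):
        description = {'@type': type(self).__name__}
        for name, value in vars(self).items():
            description[name] = toy_get_description(value)
        return description


class Uniform(ToyDescribable):
    """Hypothetical describable object with two static settings."""

    def __init__(self, low=-0.1, high=0.1):
        self.low = low
        self.high = high


KNOWN_CLASSES = {'Uniform': Uniform}  # hypothetical lookup table


def toy_get_description(this):
    # Basic types describe themselves; containers are described recursively.
    if isinstance(this, (bool, int, float, str)):
        return this
    if isinstance(this, list):
        return [toy_get_description(item) for item in this]
    if isinstance(this, dict):
        return {key: toy_get_description(val) for key, val in this.items()}
    return this.__describe__()


def toy_create_from_description(description):
    if isinstance(description, list):
        return [toy_create_from_description(item) for item in description]
    if isinstance(description, dict):
        fields = {key: toy_create_from_description(val)
                  for key, val in description.items() if key != '@type'}
        if '@type' in description:
            return KNOWN_CLASSES[description['@type']](**fields)
        return fields
    return description


description = toy_get_description(Uniform(low=-1.0, high=1.0))
print(json.dumps(description, sort_keys=True))
# {"@type": "Uniform", "high": 1.0, "low": -1.0}
clone = toy_create_from_description(description)
print(type(clone).__name__, clone.low, clone.high)  # Uniform -1.0 1.0
```

The description survives a trip through JSON unchanged, and the re-created object carries only the static settings, not any state accumulated later.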
Conversion to and from descriptions¶
-
brainstorm.describable.
create_from_description
(description)[source]¶ Instantiate a new object from a description.
Parameters: description (dict) – A description of the object. Returns: A new instance of the described object
-
brainstorm.describable.
get_description
(this)[source]¶ Create a JSON-serializable description of this object.
This description can be used to create a new instance of this object by calling create_from_description().
Parameters: this (Describable) – An object to be described. Must either be a basic datatype or inherit from Describable. Returns: A JSON-serializable description of this object.
Describable Base Class¶
-
class
brainstorm.describable.
Describable
[source]¶ Base class for all objects that can be described and initialized from a description.
Derived classes can specify the __undescribed__ field to prevent certain attributes from being described. This field can be either a set of attribute names, or a dictionary mapping attribute names to their initialization value (used when a new object is created from the description).
Derived classes can also specify a __default_values__ dict. This dict allows for omitting certain attributes from the description if their value is equal to that default value.
-
__undescribed__
= {}¶ Set or dict of attributes that should not be part of the description. If specified as a dict, then these attributes are initialized to the specified values.
-
__default_values__
= {}¶ Dict of attributes with their corresponding default values. They will only be part of the description if their value differs from their default value
-
__init_from_description__
(description)[source]¶ Subclasses can override this to provide additional initialization when created from a description.
This method will be called AFTER the object has already been created from the description.
Parameters: description (dict) – the description of this object
-
__describe__
()[source]¶ Returns a description of this object. That is a dictionary containing the name of the class as @type and all members of the class. This description is JSON-serializable.
If a sub-class of Describable contains non-describable members, it has to override this method to specify how it should be described.
Returns: Description of this object Return type: dict
-
classmethod
__new_from_description__
(description)[source]¶ Creates a new object from a given description.
If a sub-class of Describable contains non-describable fields, it has to override this method to specify how they should be initialized from their description.
Parameters: description (dict) – description of this object Returns: A new instance of this class according to the description.
-