API reference for DenseLayer.

DenseLayer

class vanillanets.layers.DenseLayer(n_inputs, n_neurons, *, activation='relu',
                                     init='auto', distribution='normal',
                                     bias_init='zeros', seed=None)

Applies a linear transformation: output = inputs @ weights + biases.

DenseLayer is a pure linear transform. activation is only used to resolve the weight-initialization scheme when init='auto' - it has no effect on forward/backward. Non-linearities must be added as separate layers:

model.add(DenseLayer(784, 128, activation='relu'))
model.add(ReLU())  # <- required separately

Parameters

  • n_inputs (int) - number of input features (fan_in).
  • n_neurons (int) - number of output neurons (fan_out).
  • activation (str, default='relu') - 'relu', 'leaky_relu', 'tanh', 'sigmoid', 'softmax', or 'linear'. Used only when init='auto'.
  • init (str, default='auto') - 'auto', 'he', or 'xavier'. Other values raise ValueError.
  • distribution (str, default='normal') - 'normal' or 'uniform'.
  • bias_init (str | int | float, default='zeros') - 'zeros' or a numeric constant. Other values raise ValueError.
  • seed (int, optional) - passed to np.random.default_rng(). If omitted, fresh entropy is drawn on each construction, so two layers built with identical arguments start from different weights.
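The effect of seed can be seen with NumPy alone. The sketch below only exercises np.random.default_rng(); how DenseLayer draws from the generator internally is not specified on this page, so treat it as an illustration of the generator's behaviour rather than of the layer itself.

```python
import numpy as np

# Same seed -> identical stream of draws.
a = np.random.default_rng(42).normal(size=(3, 2))
b = np.random.default_rng(42).normal(size=(3, 2))

# No seed -> each generator is seeded from fresh OS entropy.
c = np.random.default_rng().normal(size=(3, 2))
d = np.random.default_rng().normal(size=(3, 2))

same_seed_equal = np.array_equal(a, b)
unseeded_equal = np.array_equal(c, d)

print(same_seed_equal)  # True
print(unseeded_equal)
```

Passing a fixed seed is the simplest way to make an experiment repeatable across runs.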

init='auto' resolution

if activation in ('relu', 'leaky_relu'):
    init = 'he'
else:
    init = 'xavier'

Any activation value outside ('relu', 'leaky_relu') - including 'tanh', 'sigmoid', 'softmax', 'linear', or a typo - resolves to Xavier. This is not validated against the activation layer you actually add.
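A small standalone sketch makes the fallback behaviour concrete. resolve_init is a hypothetical helper written for this illustration, not part of vanillanets; it mirrors the rule above and assumes an explicit init is passed through unchanged.

```python
def resolve_init(activation, init="auto"):
    """Hypothetical helper mirroring the documented 'auto' rule."""
    if init != "auto":
        # Assumption: an explicit scheme overrides the activation hint.
        return init
    return "he" if activation in ("relu", "leaky_relu") else "xavier"


# 'rleu' is a deliberate typo: it silently falls through to Xavier.
for name in ("relu", "leaky_relu", "tanh", "softmax", "rleu"):
    print(name, "->", resolve_init(name))
```

Because a misspelled activation is not rejected, passing init='he' or init='xavier' explicitly is the safer choice when the scheme matters.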

Initialization formulas

fan_in = n_inputs, fan_out = n_neurons.

init       distribution          Formula
'he'       'normal' (default)    weights ~ N(0, sqrt(2 / fan_in))
'he'       'uniform'             weights ~ U(-limit, limit), limit = sqrt(6 / fan_in)
'xavier'   'normal'              weights ~ N(0, sqrt(2 / (fan_in + fan_out)))
'xavier'   'uniform'             weights ~ U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))
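The four formulas can be sketched in NumPy as follows. init_weights is a hypothetical function, not the library's implementation. It assumes the second argument of N(0, ·) is a standard deviation, which is the usual He/Xavier convention but is not stated explicitly on this page.

```python
import numpy as np


def init_weights(fan_in, fan_out, init="he", distribution="normal", seed=None):
    """Hypothetical sketch of the documented initialization formulas."""
    rng = np.random.default_rng(seed)

    # He scales by fan_in only; Xavier scales by fan_in + fan_out.
    if init == "he":
        denom = fan_in
    elif init == "xavier":
        denom = fan_in + fan_out
    else:
        raise ValueError(f"unknown init: {init!r}")

    shape = (fan_in, fan_out)
    if distribution == "normal":
        std = np.sqrt(2.0 / denom)  # assumption: N(0, x) means std = x
        return rng.normal(0.0, std, size=shape)
    if distribution == "uniform":
        limit = np.sqrt(6.0 / denom)
        return rng.uniform(-limit, limit, size=shape)
    raise ValueError(f"unknown distribution: {distribution!r}")


w_he = init_weights(784, 128, init="he", distribution="normal", seed=0)
w_xu = init_weights(784, 128, init="xavier", distribution="uniform", seed=0)
print(w_he.shape, w_he.std())      # std should be close to sqrt(2 / 784)
print(np.abs(w_xu).max())          # bounded by sqrt(6 / (784 + 128))
```

Both uniform variants have variance limit² / 3, which equals 2 / fan_in for He and 2 / (fan_in + fan_out) for Xavier, so under the assumption above each uniform scheme matches the variance of its normal counterpart.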

bias_init

Value               Result
'zeros' (default)   np.zeros((1, n_neurons))
int / float         np.full((1, n_neurons), value)
anything else       ValueError
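The dispatch above could look roughly like the following. init_biases is a hypothetical helper for illustration; in particular, rejecting bool values is an assumption of this sketch, since the page does not say how True or False are treated.

```python
import numpy as np


def init_biases(n_neurons, bias_init="zeros"):
    """Hypothetical sketch of the documented bias_init rules."""
    if isinstance(bias_init, str):
        if bias_init == "zeros":
            return np.zeros((1, n_neurons))
        raise ValueError(f"unknown bias_init: {bias_init!r}")
    # bool is a subclass of int; excluding it is an assumption of this sketch.
    if isinstance(bias_init, (int, float)) and not isinstance(bias_init, bool):
        return np.full((1, n_neurons), bias_init)
    raise ValueError(f"unsupported bias_init: {bias_init!r}")


print(init_biases(3))          # zeros, shape (1, 3)
print(init_biases(3, 0.1))     # every entry 0.1, shape (1, 3)
```

Note that np.full infers its dtype from the fill value, so an int constant produces an integer array unless a dtype is given. Whether DenseLayer casts the result to float is not stated here, so passing a float such as 1.0 avoids the question.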

Forward

self.inputs = inputs
self.output = inputs @ self.weights + self.biases
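As a worked example of the forward pass, the standalone NumPy snippet below uses small hand-picked arrays (all values are illustrative). It shows how the (1, n_neurons) bias row is broadcast across every sample in the batch.

```python
import numpy as np

inputs = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])     # (batch_size=2, n_inputs=3)
weights = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 1.0]])         # (n_inputs=3, n_neurons=2)
biases = np.array([[0.5, -0.5]])         # (1, n_neurons=2)

# inputs @ weights gives [[4, 5], [10, 11]]; the bias row is added to each.
output = inputs @ weights + biases
print(output)        # rows: [4.5, 4.5] and [10.5, 10.5]
print(output.shape)  # (2, 2)
```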

Backward

self.dweights = self.inputs.T @ dvalues
self.dbiases  = np.sum(dvalues, axis=0, keepdims=True)
self.dinputs  = dvalues @ self.weights.T
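One way to convince yourself that these expressions are right is a finite-difference check. The sketch below is standalone NumPy, not library code: it defines a scalar loss whose gradient with respect to the layer output is exactly dvalues, then compares the analytic dweights against central differences.

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.normal(size=(4, 3))     # (batch_size, n_inputs)
weights = rng.normal(size=(3, 2))    # (n_inputs, n_neurons)
biases = np.zeros((1, 2))            # (1, n_neurons)
dvalues = rng.normal(size=(4, 2))    # upstream gradient dL/d(output)

# Analytic gradients, as in the backward pass above.
dweights = inputs.T @ dvalues
dbiases = np.sum(dvalues, axis=0, keepdims=True)
dinputs = dvalues @ weights.T


def loss(w):
    # Scalar loss chosen so that dL/d(output) equals dvalues.
    return np.sum((inputs @ w + biases) * dvalues)


# Central differences, one weight at a time.
eps = 1e-6
numeric = np.zeros_like(weights)
for i in range(weights.shape[0]):
    for j in range(weights.shape[1]):
        step = np.zeros_like(weights)
        step[i, j] = eps
        numeric[i, j] = (loss(weights + step) - loss(weights - step)) / (2 * eps)

max_err = np.max(np.abs(numeric - dweights))
print(max_err)  # small; limited only by floating-point rounding
```

The same pattern, perturbing biases or inputs instead of weights, checks dbiases and dinputs.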

Shapes

Tensor     Shape
Input      (batch_size, n_inputs)
weights    (n_inputs, n_neurons)
biases     (1, n_neurons)
Output     (batch_size, n_neurons)
dweights   same as weights
dbiases    same as biases
dinputs    (batch_size, n_inputs)

Attributes

  • weights (ndarray) - shape (n_inputs, n_neurons).
  • biases (ndarray) - shape (1, n_neurons).
  • dweights, dbiases, dinputs - populated after backward().

These exact attribute names are read by the optimizer via hasattr(layer, 'weights') - a custom layer must use this naming to receive parameter updates.
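To illustrate why the naming matters, here is a minimal sketch of a custom layer and an optimizer step built around the hasattr check. ScaledDense, Identity and sgd_step are hypothetical names invented for this example; the real optimizer's update rule and signature are not documented on this page.

```python
import numpy as np


class ScaledDense:
    """Hypothetical custom layer; attribute names match the list above."""

    def __init__(self, n_inputs, n_neurons, seed=None):
        rng = np.random.default_rng(seed)
        self.weights = 0.01 * rng.normal(size=(n_inputs, n_neurons))
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        self.inputs = inputs
        self.output = inputs @ self.weights + self.biases

    def backward(self, dvalues):
        self.dweights = self.inputs.T @ dvalues
        self.dbiases = np.sum(dvalues, axis=0, keepdims=True)
        self.dinputs = dvalues @ self.weights.T


class Identity:
    """Parameter-free layer: it has no 'weights', so the step skips it."""

    def forward(self, inputs):
        self.output = inputs

    def backward(self, dvalues):
        self.dinputs = dvalues


def sgd_step(layers, learning_rate=0.1):
    """Hypothetical plain-SGD step using the hasattr check."""
    for layer in layers:
        if hasattr(layer, "weights"):
            layer.weights -= learning_rate * layer.dweights
            layer.biases -= learning_rate * layer.dbiases


dense, ident = ScaledDense(3, 2, seed=0), Identity()
dense.forward(np.ones((4, 3)))
ident.forward(dense.output)
ident.backward(np.ones((4, 2)))
dense.backward(ident.dinputs)

sgd_step([dense, ident])
print(dense.biases)  # each entry is -0.1 * 4 = -0.4
```

A layer that stored its parameters under other names, for example W and b, would pass through such a loop untouched and never train.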

Example

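import numpy as np
from vanillanets.layers import DenseLayer

# Placeholder batch: 32 samples with 784 features each.
X_batch = np.random.default_rng(0).normal(size=(32, 784))

layer = DenseLayer(n_inputs=784, n_neurons=128, activation='relu', seed=42)
layer.forward(X_batch)
print(layer.output.shape)  # (32, 128), i.e. (batch_size, 128)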
