API reference for DenseLayer.
DenseLayer
class vanillanets.layers.DenseLayer(n_inputs, n_neurons, *, activation='relu',
                                    init='auto', distribution='normal',
                                    bias_init='zeros', seed=None)
Applies a linear transformation: output = inputs @ weights + biases.
DenseLayer is a pure linear transform. activation is only used to resolve the weight-initialization scheme when init='auto' - it has no effect on forward/backward. Non-linearities must be added as separate layers:
model.add(DenseLayer(784, 128, activation='relu'))
model.add(ReLU())  # <- required separately
Parameters
- n_inputs (int) - number of input features (fan_in).
- n_neurons (int) - number of output neurons (fan_out).
- activation (str, default='relu') - 'relu', 'leaky_relu', 'tanh', 'sigmoid', 'softmax', or 'linear'. Used only when init='auto'.
- init (str, default='auto') - 'auto', 'he', or 'xavier'. Other values raise ValueError.
- distribution (str, default='normal') - 'normal' or 'uniform'.
- bias_init (str | int | float, default='zeros') - 'zeros' or a numeric constant. Other values raise ValueError.
- seed (int, optional) - passed to np.random.default_rng(). If omitted, fresh entropy is used on each construction.
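The seed behaviour can be illustrated with NumPy alone. This sketch does not call DenseLayer; it only shows how np.random.default_rng() behaves with and without a seed, which is what the constructor is documented to rely on. The array shape and variable names are illustrative assumptions.

```python
import numpy as np

# Same seed -> same generator state -> identical draws.
w_a = np.random.default_rng(42).normal(0.0, 1.0, size=(4, 3))
w_b = np.random.default_rng(42).normal(0.0, 1.0, size=(4, 3))
same_seed_matches = np.array_equal(w_a, w_b)

# No seed -> each generator is seeded from fresh OS entropy,
# so two independent draws will not coincide in practice.
w_c = np.random.default_rng().normal(0.0, 1.0, size=(4, 3))
w_d = np.random.default_rng().normal(0.0, 1.0, size=(4, 3))
unseeded_matches = np.array_equal(w_c, w_d)

print(same_seed_matches, unseeded_matches)
```

For reproducible experiments, pass an explicit seed to every layer rather than relying on global NumPy state.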
init='auto' resolution
if activation in ('relu', 'leaky_relu'):
    init = 'he'
else:
    init = 'xavier'
Any activation value outside ('relu', 'leaky_relu') - including 'tanh', 'sigmoid', 'softmax', 'linear', or a typo - resolves to Xavier. This is not validated against the activation layer you actually add.
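The resolution rule is small enough to reproduce as a standalone function. The helper below is a hypothetical sketch (resolve_init is not part of vanillanets); it mirrors the documented rule, including the ValueError for unknown init values, and makes the silent fallback for a misspelled activation easy to see.

```python
def resolve_init(activation, init="auto"):
    """Hypothetical helper mirroring the documented init resolution."""
    if init not in ("auto", "he", "xavier"):
        raise ValueError(f"unknown init: {init!r}")
    if init != "auto":
        return init  # explicit choice: activation is not consulted
    # Only these two names select He; everything else falls through.
    return "he" if activation in ("relu", "leaky_relu") else "xavier"


for name in ("relu", "leaky_relu", "tanh", "softmax", "rleu"):
    print(f"{name!r} -> {resolve_init(name)}")
```

Note that the misspelled 'rleu' quietly resolves to Xavier. If that matters, pass init='he' explicitly instead of relying on 'auto'.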
Initialization formulas
fan_in = n_inputs, fan_out = n_neurons.
| init | distribution | Formula |
|---|---|---|
| 'he' | 'normal' (default) | weights ~ N(0, std), std = sqrt(2 / fan_in) |
| 'he' | 'uniform' | weights ~ U(-limit, limit), limit = sqrt(6 / fan_in) |
| 'xavier' | 'normal' | weights ~ N(0, std), std = sqrt(2 / (fan_in + fan_out)) |
| 'xavier' | 'uniform' | weights ~ U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out)) |
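A minimal NumPy sketch of the four formulas, assuming the second argument of N(0, ·) in the table is a standard deviation rather than a variance. init_weights is a hypothetical helper, not the library's internal function.

```python
import numpy as np


def init_weights(fan_in, fan_out, init="he", distribution="normal", seed=None):
    """Hypothetical helper: draw a (fan_in, fan_out) weight matrix."""
    rng = np.random.default_rng(seed)
    # He scales by fan_in only; Xavier scales by fan_in + fan_out.
    if init == "he":
        denom = fan_in
    elif init == "xavier":
        denom = fan_in + fan_out
    else:
        raise ValueError(f"unknown init: {init!r}")

    if distribution == "normal":
        std = np.sqrt(2.0 / denom)  # assumption: table gives std, not variance
        return rng.normal(0.0, std, size=(fan_in, fan_out))
    if distribution == "uniform":
        limit = np.sqrt(6.0 / denom)
        return rng.uniform(-limit, limit, size=(fan_in, fan_out))
    raise ValueError(f"unknown distribution: {distribution!r}")


w_normal = init_weights(784, 128, "he", "normal", seed=0)
w_uniform = init_weights(784, 128, "xavier", "uniform", seed=0)
print(w_normal.shape, w_normal.std(), np.abs(w_uniform).max())
```

Under that assumption the two distributions for a given scheme have the same variance, since U(-limit, limit) has variance limit² / 3 = 2 / denom.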
bias_init
| Value | Result |
|---|---|
| 'zeros' (default) | np.zeros((1, n_neurons)) |
| int / float | np.full((1, n_neurons), value) |
| anything else | ValueError |
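The table can be read as the following dispatch. make_biases is a hypothetical helper written for illustration. The dtype=float argument is an addition of this sketch, not something the table states: without it, np.full with an int constant would produce an integer array.

```python
import numpy as np


def make_biases(n_neurons, bias_init="zeros"):
    """Hypothetical helper following the bias_init table."""
    if isinstance(bias_init, str):
        if bias_init == "zeros":
            return np.zeros((1, n_neurons))
        raise ValueError(f"unknown bias_init: {bias_init!r}")
    if isinstance(bias_init, (int, float)):
        # dtype=float is this sketch's choice so that int constants
        # still yield a float array that supports in-place updates.
        return np.full((1, n_neurons), bias_init, dtype=float)
    raise ValueError(f"unsupported bias_init type: {type(bias_init).__name__}")


print(make_biases(3))
print(make_biases(3, 0.01))
```

A small positive constant such as 0.01 is sometimes used with ReLU so that units start in the active region; 'zeros' remains the documented default.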
Forward
self.inputs = inputs
self.output = inputs @ self.weights + self.biases
Backward
self.dweights = self.inputs.T @ dvalues
self.dbiases = np.sum(dvalues, axis=0, keepdims=True)
self.dinputs = dvalues @ self.weights.T
Shapes
| Tensor | Shape |
|---|---|
| Input | (batch_size, n_inputs) |
| weights | (n_inputs, n_neurons) |
| biases | (1, n_neurons) |
| Output | (batch_size, n_neurons) |
| dweights | same as weights |
| dbiases | same as biases |
| dinputs | (batch_size, n_inputs) |
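The forward and backward rules and the resulting shapes can be checked with plain NumPy. The sketch below reimplements the three gradient expressions outside the class and compares one weight gradient against a finite-difference estimate. The scalar loss sum(output * dvalues) is an assumption chosen only because its gradient with respect to output is exactly dvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, n_inputs, n_neurons = 5, 4, 3
inputs = rng.normal(size=(batch_size, n_inputs))
weights = rng.normal(size=(n_inputs, n_neurons))
biases = np.zeros((1, n_neurons))

# Forward: biases broadcast from (1, n_neurons) across the batch.
output = inputs @ weights + biases

# Backward: dvalues is the upstream gradient, same shape as output.
dvalues = rng.normal(size=output.shape)
dweights = inputs.T @ dvalues
dbiases = np.sum(dvalues, axis=0, keepdims=True)
dinputs = dvalues @ weights.T


def loss(w):
    # Illustrative scalar loss whose gradient w.r.t. output is dvalues.
    return np.sum((inputs @ w + biases) * dvalues)


# Finite-difference estimate for a single weight entry.
eps = 1e-6
bumped = weights.copy()
bumped[1, 2] += eps
numeric = (loss(bumped) - loss(weights)) / eps

print(output.shape, dweights.shape, dbiases.shape, dinputs.shape)
print(abs(numeric - dweights[1, 2]))
```

The keepdims=True in dbiases is what keeps the gradient at (1, n_neurons) so it lines up with biases during the update.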
Attributes
- weights (ndarray) - shape (n_inputs, n_neurons).
- biases (ndarray) - shape (1, n_neurons).
- dweights, dbiases, dinputs - populated after backward().
These exact attribute names are read by the optimizer via hasattr(layer, 'weights') - a custom layer must use this naming to receive parameter updates.
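To show why the naming matters, here is a rough sketch of an optimizer that discovers parameters with hasattr. TinySGD, GoodLayer, and MisnamedLayer are hypothetical classes written for this example; the actual optimizer in vanillanets may differ in everything except the hasattr(layer, 'weights') check described above.

```python
import numpy as np


class TinySGD:
    """Hypothetical optimizer: updates any layer exposing `weights`."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate

    def update(self, layer):
        if not hasattr(layer, "weights"):
            return False  # treated as parameter-free and skipped
        layer.weights -= self.learning_rate * layer.dweights
        layer.biases -= self.learning_rate * layer.dbiases
        return True


class GoodLayer:
    """Uses the attribute names the optimizer looks for."""

    def __init__(self):
        self.weights = np.ones((2, 2))
        self.biases = np.zeros((1, 2))
        self.dweights = np.ones((2, 2))
        self.dbiases = np.ones((1, 2))


class MisnamedLayer:
    """Has parameters, but under names the optimizer never reads."""

    def __init__(self):
        self.W = np.ones((2, 2))
        self.dW = np.ones((2, 2))


sgd = TinySGD(learning_rate=0.1)
good, bad = GoodLayer(), MisnamedLayer()
print(sgd.update(good), good.weights[0, 0])
print(sgd.update(bad), bad.W[0, 0])
```

In this sketch the misnamed layer is skipped without any error, so its parameters never change; a real training run would show that layer failing to learn.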
Example
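import numpy as np
from vanillanets.layers import DenseLayer

X_batch = np.random.default_rng(0).normal(size=(32, 784))  # placeholder batch: 32 samples, 784 features
layer = DenseLayer(n_inputs=784, n_neurons=128, activation='relu', seed=42)
layer.forward(X_batch)
print(layer.output.shape)  # (32, 128), i.e. (batch_size, n_neurons)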