Getting Started

Learn the core sequential workflow of VanillaNets, from building an architecture to training and evaluating your model.

The VanillaNets API is designed around a straightforward sequential workflow. If you have used modern deep learning frameworks before, this process will feel immediately familiar.

Building a model from end to end involves four distinct phases: defining the architecture, compiling the training parameters, executing the training loop, and evaluating the results.

The Sequential Workflow

1. Define the Architecture

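Start by creating an empty Model, then stack dense layers and non-linear activation functions in order using model.add(). Each DenseLayer's n_inputs must equal the n_neurons of the dense layer before it, and the first layer's n_inputs is the number of features in your data (30 in this example).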

from vanillanets import Model, DenseLayer, Optimizer_Adam
from vanillanets.activations import ReLU, Sigmoid
from vanillanets.losses import BinaryCrossEntropy
from vanillanets.metrics import Accuracy

# Initialize an empty sequential model
model = Model()

# Stack layers and activations
model.add(DenseLayer(n_inputs=30, n_neurons=64))
model.add(ReLU())
model.add(DenseLayer(n_inputs=64, n_neurons=1))
model.add(Sigmoid())

2. Compile the Model

Once the architecture is defined, configure how the model will learn. Use model.set() to specify the loss function (the training objective), the optimizer, and the metrics you want to track. Then call model.finalize() to lock the computational graph and prepare it for training.

model.set(
    loss=BinaryCrossEntropy(),
    optimizer=Optimizer_Adam(learning_rate=0.01),
    metrics={'accuracy': Accuracy()}
)

# Lock the architecture and prepare the backward pass components
model.finalize()
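To make the objective concrete, here is a minimal standalone sketch of the quantity a binary cross-entropy loss measures. It is written in plain Python and is not the VanillaNets implementation; the clipping constant eps is an assumption added to keep log() finite.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels (0 or 1) and probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        # Clip so log() never receives exactly 0 or 1 (eps is an assumed value)
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Confident, correct predictions give a lower loss than 50/50 guesses
confident_loss = binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8])
uncertain_loss = binary_cross_entropy([1, 0, 1], [0.5, 0.5, 0.5])
```

A model that always outputs 0.5 scores ln 2 (about 0.693), which is a useful baseline when reading training logs.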

3. Train the Model

Call fit() to run the training loop. To monitor performance on held-out data while training runs, pass an (X_val, y_val) tuple as the validation_data argument.

model.fit(
    X_train,
    y_train,
    epochs=100,
    print_every=10,
    validation_data=(X_val, y_val)
)
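The call above assumes the data has already been divided into training, validation, and test portions. One way to do that is sketched below with the standard library only. The function name split_indices, the default fractions, and the seed are illustrative choices, not part of VanillaNets.

```python
import random


def split_indices(n_samples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle range(n_samples) and return (train, val, test) index lists."""
    indices = list(range(n_samples))
    # A seeded generator makes the split reproducible
    random.Random(seed).shuffle(indices)
    n_val = round(n_samples * val_fraction)
    n_test = round(n_samples * test_fraction)
    val_idx = indices[:n_val]
    test_idx = indices[n_val:n_val + n_test]
    train_idx = indices[n_val + n_test:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(200)
```

If the features and labels are held in NumPy arrays, indexing with these lists (for example X[train_idx]) selects the matching rows. Keeping the three index sets disjoint is what makes the validation and test numbers meaningful.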

4. Evaluate and Predict

After training, quantify the model's performance on a held-out test set using evaluate(). To generate predictions on new, unseen data, use predict().

# Compute final loss and chosen metrics
loss, metrics = model.evaluate(X_test, y_test)
print(f"Test Loss: {loss:.4f}, Accuracy: {metrics['accuracy']:.4f}")

# Generate raw model outputs for new data
predictions = model.predict(X_new)
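For a binary classifier ending in a sigmoid, raw outputs are probabilities, so turning them into class labels usually means applying a threshold. The sketch below assumes the outputs arrive as plain Python lists, either flat or as one-element rows matching an (n_samples, 1) output layer. The helper to_class_labels and the 0.5 default are illustrative assumptions.

```python
def to_class_labels(probabilities, threshold=0.5):
    """Map each probability to 1 if it is >= threshold, else 0.

    Accepts a flat sequence of floats or a sequence of one-element rows.
    """
    labels = []
    for row in probabilities:
        # Unwrap one-element rows such as [0.93]
        p = row[0] if isinstance(row, (list, tuple)) else row
        labels.append(1 if p >= threshold else 0)
    return labels


example_outputs = [[0.93], [0.12], [0.50], [0.49]]
labels = to_class_labels(example_outputs)
```

Raising the threshold trades recall for precision, so 0.5 is a starting point rather than a rule.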

Task-Specific Configurations

The most critical step in building your network is ensuring the final layer's dimension, the final activation function, and the loss function all align with your specific machine learning task.

Use this reference table to configure your network's output correctly:

Task Type	Final Layer Size	Final Activation	Appropriate Loss Function
Binary Classification	`DenseLayer(..., 1)`	`Sigmoid()`	`BinaryCrossEntropy()`
Multiclass Classification	`DenseLayer(..., num_classes)`	`Softmax()`	`CategoricalCrossEntropy()`
Regression	`DenseLayer(..., 1)`	`Linear()`	`MeanSquaredError()`

SparseCategoricalCrossEntropy is a subclass of CategoricalCrossEntropy that accepts integer-encoded labels (y of shape (n,)) instead of one-hot vectors. Both losses use the fused Softmax backward pass when the final activation is Softmax().
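The two label formats describe the same targets, and the fused backward pass relies on a standard identity: the gradient of softmax followed by cross-entropy with respect to the logits is the softmax output minus the one-hot target. The plain-Python sketch below illustrates both points. It is not VanillaNets code, and dividing the gradient by the batch size assumes the loss is averaged over samples.

```python
import math


def softmax(logits):
    """Numerically stable softmax for one row of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(label, num_classes):
    """Integer class index -> one-hot list."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def categorical_ce(probs, targets_one_hot):
    """Mean cross-entropy with one-hot targets."""
    losses = [
        -sum(t * math.log(p) for t, p in zip(target, row))
        for row, target in zip(probs, targets_one_hot)
    ]
    return sum(losses) / len(losses)


def sparse_categorical_ce(probs, labels):
    """Mean cross-entropy with integer labels: pick the true-class probability."""
    losses = [-math.log(row[label]) for row, label in zip(probs, labels)]
    return sum(losses) / len(losses)


def fused_grad(probs, labels):
    """Gradient of the mean loss w.r.t. the logits: (softmax - one_hot) / n."""
    n = len(labels)
    return [
        [(p - t) / n for p, t in zip(row, one_hot(label, len(row)))]
        for row, label in zip(probs, labels)
    ]


logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
labels = [0, 1]
probs = [softmax(row) for row in logits]
targets = [one_hot(label, 3) for label in labels]

dense_loss = categorical_ce(probs, targets)       # one-hot targets
sparse_loss = sparse_categorical_ce(probs, labels)  # integer targets
grad = fused_grad(probs, labels)
```

Both loss values agree, so the choice between the two classes comes down to how your labels are stored.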

Tracking Performance Metrics

To track several metrics at once, pass a dictionary to the metrics argument of model.set(). Each key is the name under which that metric is reported, as with metrics['accuracy'] in the evaluate() example above.

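# Precision and Recall are assumed to be importable from vanillanets.metrics, like Accuracy
from vanillanets.metrics import Accuracy, Precision, Recall

model.set(
    loss=BinaryCrossEntropy(),
    optimizer=Optimizer_Adam(learning_rate=0.01),
    metrics={'accuracy': Accuracy(), 'precision': Precision(), 'recall': Recall()}
)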

Available metrics by task type:

Classification: Accuracy, Precision, Recall, F1Score, ConfusionMatrix
Regression: R2Score, MAE, RMSE
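For reference, the textbook definitions behind several of these metrics can be sketched in a few lines of plain Python. This is not the VanillaNets implementation. The binary-only treatment (class 1 as positive) and the convention of returning 0.0 when a denominator is zero are assumptions, and the library's classes may handle multiclass labels or edge cases differently.

```python
import math


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels, treating 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: a metric with an empty denominator is reported as 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def regression_metrics(y_true, y_pred):
    """MAE, RMSE and R^2 for paired targets and predictions.

    R^2 is undefined when every target is identical (zero total variance).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


precision, recall, f1 = precision_recall_f1([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
mae, rmse, r2 = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 8.0])
```

Computing a metric by hand on a handful of samples is a quick way to confirm that labels and predictions are in the format you expect.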

Where to Go Next

Architecture - what model.fit() does internally, and the forward/backward contract.
Layers - DenseLayer initialization strategies and gradients.
Activations - forward/backward formulas for each activation.
Losses & Optimizers - loss gradients, the fused Softmax+CrossEntropy path, and SGD/Adam update rules.
Metrics - label conversion and metric formulas.