Getting Started
Learn the core sequential workflow of VanillaNets, from building an architecture to training and evaluating your model.
The VanillaNets API is designed around a straightforward sequential workflow. If you have used modern deep learning frameworks before, this process will feel immediately familiar.
Building a model from end to end involves four distinct phases: defining the architecture, compiling the training parameters, executing the training loop, and evaluating the results.
The Sequential Workflow
1. Define the Architecture
Start by initializing a base Model and sequentially stacking your dense layers and non-linear activation functions using model.add().
from vanillanets import Model, DenseLayer, Optimizer_Adam
from vanillanets.activations import ReLU, Sigmoid
from vanillanets.losses import BinaryCrossEntropy
from vanillanets.metrics import Accuracy
# Initialize an empty sequential model
model = Model()
# Stack layers and activations
model.add(DenseLayer(n_inputs=30, n_neurons=64))
model.add(ReLU())
model.add(DenseLayer(n_inputs=64, n_neurons=1))
model.add(Sigmoid())
2. Compile the Model
Once the architecture is defined, you must configure how the model will learn. Use model.set() to define the objective (loss function), the optimization strategy, and the metrics you want to track. Finally, call model.finalize() to lock the computational graph and prepare it for training.
model.set(
    loss=BinaryCrossEntropy(),
    optimizer=Optimizer_Adam(learning_rate=0.01),
    metrics={'accuracy': Accuracy()}
)
# Lock the architecture and prepare the backward pass components
model.finalize()
3. Train the Model
Use the fit() method to run the training loop. To monitor the model's performance on held-out data while it trains, pass an (X_val, y_val) tuple to the validation_data argument.
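The fit() call takes X_train, y_train, X_val, and y_val as already prepared. As a minimal sketch, assuming your samples are held in plain Python lists (this guide does not specify the container type VanillaNets expects), a shuffled three-way split could look like the following. The helper train_val_test_split is hypothetical and is not part of VanillaNets.

```python
import random


def train_val_test_split(X, y, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle (X, y) together and split into train / validation / test parts.

    Hypothetical helper for illustration; not a VanillaNets function.
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of samples")

    # Shuffle indices rather than the data so X and y stay aligned.
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)

    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]

    def take(idx):
        return [X[i] for i in idx], [y[i] for i in idx]

    return take(train_idx), take(val_idx), take(test_idx)


# Toy data: 20 samples with 30 features each (matching n_inputs=30 in the first layer).
X = [[float(i)] * 30 for i in range(20)]
y = [i % 2 for i in range(20)]

(X_train, y_train), (X_val, y_val), (X_test, y_test) = train_val_test_split(X, y)
print(len(X_train), len(X_val), len(X_test))  # 12 4 4
```

Keeping a separate test part aside at this point means the data passed to evaluate() later has not influenced training or validation.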
model.fit(
    X_train,
    y_train,
    epochs=100,
    print_every=10,
    validation_data=(X_val, y_val)
)
4. Evaluate and Predict
After training, quantify the model's performance on a held-out test set using evaluate(). To generate predictions on new, unseen data, use predict().
# Compute final loss and chosen metrics
loss, metrics = model.evaluate(X_test, y_test)
print(f"Test Loss: {loss:.4f}, Accuracy: {metrics['accuracy']:.4f}")
# Generate raw model outputs for new data
predictions = model.predict(X_new)
Task-Specific Configurations
The most critical step in building your network is ensuring the final layer's dimension, the final activation function, and the loss function all align with your specific machine learning task.
Use this reference table to configure your network's output correctly:
| Task Type | Final Layer Size | Final Activation | Appropriate Loss Function |
|---|---|---|---|
| Binary Classification | DenseLayer(..., 1) | Sigmoid() | BinaryCrossEntropy() |
| Multiclass Classification | DenseLayer(..., num_classes) | Softmax() | CategoricalCrossEntropy() |
| Regression | DenseLayer(..., 1) | Linear() | MeanSquaredError() |
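For the two classification rows, the final activation produces probabilities rather than class labels, so the raw outputs usually need one more step before you can read them as predictions. The sketch below shows the conventional conversions: a 0.5 threshold for a single Sigmoid output and an argmax over each row of Softmax outputs. Both function names are hypothetical, the inputs are assumed to be plain Python lists, and this guide does not say whether VanillaNets ships an equivalent helper.

```python
def sigmoid_outputs_to_labels(probabilities, threshold=0.5):
    """Map each sigmoid output (a probability in [0, 1]) to class 0 or 1."""
    return [1 if p >= threshold else 0 for p in probabilities]


def softmax_outputs_to_labels(probability_rows):
    """Pick the index of the largest probability in each row (argmax)."""
    return [max(range(len(row)), key=row.__getitem__) for row in probability_rows]


binary_outputs = [0.91, 0.08, 0.50, 0.49]
print(sigmoid_outputs_to_labels(binary_outputs))  # [1, 0, 1, 0]

multiclass_outputs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
print(softmax_outputs_to_labels(multiclass_outputs))  # [0, 2]
```

The 0.5 threshold is only a default. When false positives and false negatives have different costs, a different threshold may suit the task better.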
SparseCategoricalCrossEntropy is a subclass of CategoricalCrossEntropy that accepts integer-encoded labels (y shape (n,)) instead of one-hot labels. Both get the fused Softmax backward pass when paired with Softmax().
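To make the two label formats concrete, here is a small sketch that converts between integer-encoded labels of shape (n,) and one-hot rows of shape (n, num_classes). The helpers to_one_hot and from_one_hot are hypothetical illustrations written with plain Python lists. They are not VanillaNets functions.

```python
def to_one_hot(labels, num_classes):
    """Integer labels of shape (n,) -> one-hot rows of shape (n, num_classes)."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside [0, {num_classes})")
        row = [0] * num_classes
        row[label] = 1
        rows.append(row)
    return rows


def from_one_hot(rows):
    """One-hot rows -> integer labels (the index of the 1 in each row)."""
    return [row.index(1) for row in rows]


sparse_labels = [2, 0, 1]  # the integer-encoded format
one_hot_labels = to_one_hot(sparse_labels, num_classes=3)  # the one-hot format

print(one_hot_labels)                # [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
print(from_one_hot(one_hot_labels))  # [2, 0, 1]
```

Both formats carry the same information. The integer form is more compact when the number of classes is large.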
Tracking Performance Metrics
VanillaNets allows you to track multiple metrics simultaneously by passing a dictionary to the metrics argument in model.set().
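# Assumes Precision and Recall are importable from vanillanets.metrics, like Accuracy.
# loss and optimizer are instances, e.g. BinaryCrossEntropy() and Optimizer_Adam(learning_rate=0.01).
model.set(
    loss=loss,
    optimizer=optimizer,
    metrics={'accuracy': Accuracy(), 'precision': Precision(), 'recall': Recall()}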
)
- Classification: Accuracy, Precision, Recall, F1Score, ConfusionMatrix
- Regression: R2Score, MAE, RMSE
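As a rough guide to what these names conventionally measure, the sketch below computes the textbook definitions for binary labels and for regression targets using plain Python lists. The function names are hypothetical, and VanillaNets' own implementations may differ in details such as label conversion or multiclass averaging.

```python
import math


def classification_metrics(y_true, y_pred):
    """Textbook binary metrics, treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion_matrix": [[tn, fp], [fn, tp]],  # rows: true class, columns: predicted
    }


def regression_metrics(y_true, y_pred):
    """Mean absolute error, root mean squared error, and R^2."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R^2 is undefined when all targets are identical.
        "r2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
    }


clf = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(clf["confusion_matrix"])  # [[2, 1], [1, 2]]
print({k: round(v, 3) for k, v in clf.items() if k != "confusion_matrix"})

reg = regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 3.0])
print({k: round(v, 3) for k, v in reg.items()})
```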
Where to Go Next
- Architecture - what model.fit() does internally, and the forward/backward contract.
- Layers - DenseLayer initialization strategies and gradients.
- Activations - forward/backward formulas for each activation.
- Losses & Optimizers - loss gradients, the fused Softmax+CrossEntropy path, and SGD/Adam update rules.
- Metrics - label conversion and metric formulas.