ML-As-4
Problem 1: Neural Networks (20 pts)
Consider a 3-layer fully connected neural network with the following architecture:
- Input layer: n = 4 neurons
- Hidden layer: m = 3 neurons using a custom activation function
- Output layer: k = 2 neurons using a softmax activation function
The network parameters (weights and biases) are given as:
a weight matrix and a bias vector for the hidden layer, and a weight matrix and a bias vector for the output layer.
Given the input vector
where
Q1
1. Derive the equations for the forward pass through the network, including both the hidden and output layers. (3 pts)
Forward pass through the network:
- Hidden layer:
- Output layer:
- Loss:
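As a rough illustration of the forward pass outlined above, the following is a minimal sketch in NumPy. Several things here are assumptions and not taken from the assignment: the parameter names `W1`, `b1`, `W2`, `b2`, the column-vector convention (`W @ x + b`), the use of `tanh` as a stand-in for the custom hidden activation, the cross-entropy loss, and the random placeholder values for the input and parameters.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(x, W1, b1, W2, b2, act=np.tanh):
    """Forward pass for a 4-3-2 network (column-vector convention assumed)."""
    Z1 = W1 @ x + b1      # net input to the hidden layer, shape (3,)
    H = act(Z1)           # hidden activation; tanh is only a stand-in
    Z2 = W2 @ H + b2      # net input to the output layer, shape (2,)
    y_hat = softmax(Z2)   # output probabilities, shape (2,)
    return Z1, H, Z2, y_hat

def cross_entropy(y_hat, y):
    # Assumed loss: L = -sum_i y_i * log(y_hat_i) for a one-hot target y.
    return -np.sum(y * np.log(y_hat))

# Placeholder input and parameters (NOT the assignment's values).
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(2, 3)), np.zeros(2)
y = np.array([1.0, 0.0])  # hypothetical one-hot target

Z1, H, Z2, y_hat = forward(x, W1, b1, W2, b2)
loss = cross_entropy(y_hat, y)
```

Swapping in the actual custom activation only requires passing a different `act` function; the rest of the sketch is unaffected.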
2. Calculate the outputs Z₁, H, Z₂, and ŷ explicitly for a given input and the following initial weights and biases:
- Note that Z₁ is the net input to the hidden layer, H is the activation output of the hidden layer, and Z₂ is the net input to the output layer. (3 pts)
Q2
Derive the gradient of the loss with respect to each parameter
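Error term for the output layer
Gradient of the loss with respect to the output-layer weights
Gradient of the loss with respect to the output-layer biases
Error term for the hidden layer
Gradient of the loss with respect to the hidden-layer weights
Gradient of the loss with respect to the hidden-layer biases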
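The backward pass listed above (two error terms and four parameter gradients) could be sketched as follows and checked against a finite-difference estimate. This is an illustration under stated assumptions, not the assignment's own derivation: it assumes a cross-entropy loss with a one-hot target, uses `tanh` in place of the custom activation, and the names `W1`, `b1`, `W2`, `b2`, `delta1`, `delta2` and all numeric values are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def loss_fn(params, x, y):
    # Assumed loss: cross-entropy on softmax outputs.
    W1, b1, W2, b2 = params
    H = np.tanh(W1 @ x + b1)
    y_hat = softmax(W2 @ H + b2)
    return -np.sum(y * np.log(y_hat))

def gradients(params, x, y):
    W1, b1, W2, b2 = params
    Z1 = W1 @ x + b1
    H = np.tanh(Z1)
    y_hat = softmax(W2 @ H + b2)
    delta2 = y_hat - y                       # output error term (softmax + cross-entropy)
    dW2 = np.outer(delta2, H)                # gradient w.r.t. output-layer weights
    db2 = delta2                             # gradient w.r.t. output-layer biases
    delta1 = (W2.T @ delta2) * (1.0 - H**2)  # hidden error term; tanh' = 1 - tanh^2
    dW1 = np.outer(delta1, x)                # gradient w.r.t. hidden-layer weights
    db1 = delta1                             # gradient w.r.t. hidden-layer biases
    return dW1, db1, dW2, db2

# Placeholder data (NOT the assignment's values).
rng = np.random.default_rng(1)
x = rng.normal(size=4)
y = np.array([0.0, 1.0])
params = [rng.normal(size=(3, 4)), rng.normal(size=3),
          rng.normal(size=(2, 3)), rng.normal(size=2)]
dW1, db1, dW2, db2 = gradients(params, x, y)

# Central-difference check on a single entry of W1.
eps = 1e-6
plus = [p.copy() for p in params]
minus = [p.copy() for p in params]
plus[0][0, 0] += eps
minus[0][0, 0] -= eps
numeric = (loss_fn(plus, x, y) - loss_fn(minus, x, y)) / (2 * eps)
```

If the analytic and numeric values disagree after substituting the real activation, the derivative factor in `delta1` is the first place to look, since it is the only term that depends on the activation.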
Q3
Suppose the learning rate
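The general update rule for each parameter θ is θ ← θ − η · ∂L/∂θ,
where η is the learning rate and ∂L/∂θ is the gradient of the loss with respect to that parameter.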
Updating
Updating
Updating
Updating
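A single gradient-descent update, in which each parameter moves against its gradient scaled by the learning rate, might look like the sketch below. The helper name `sgd_step`, the learning rate of 0.1, and the parameter and gradient values are hypothetical and chosen only for illustration; they are not the assignment's numbers.

```python
import numpy as np

def sgd_step(params, grads, lr):
    """Return updated parameters: theta <- theta - lr * dL/dtheta."""
    return [p - lr * g for p, g in zip(params, grads)]

# Hypothetical values for illustration only.
params = [np.array([[0.5, -0.2]]), np.array([0.1])]
grads = [np.array([[0.1, 0.4]]), np.array([-0.3])]
new_params = sgd_step(params, grads, lr=0.1)
```

The same call applies unchanged to all four parameters of the network by passing them, with their gradients, as parallel lists.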