EXPERIMENT 2 Implementing a Neural Network with Hidden Layer
AIM:
To train a Neural Network with hidden layers on labeled training data.               [CO1] [BTL4, 5]
DESCRIPTION:
A Neural Network with hidden layers can also solve cases that are not linearly separable. The
first hidden layer transforms the non-linearly separable case into a linearly separable one.
Consider the XOR gate Truth Table and the corresponding plot shown in Figure 1.
 x1         x2          Y
 input      input       output
 0          0           0
 0          1           1
 1          0           1
 1          1           0
                (a)                                                (b)
                      Figure 1. (a) Truth Table of XOR gate. (b) plot of XOR gate.
A neural network for the XOR gate needs at least two layers (one hidden layer and one output
layer), as the XOR function is not linearly separable. The architecture, shown in Figure 2, would include:
     1. Input Layer: 2 neurons for the XOR inputs.
     2. Hidden Layer: At least 2 neurons with a non-linear activation function.
     3. Output Layer: 1 neuron with a sigmoid activation to produce the XOR output.
     4. Training the network:
            o    Define the input-output pairs for the XOR gate.
            o    Perform forward propagation to compute predictions.
            o    Use backpropagation to update weights and biases using gradient descent.
          Figure 2. A Neural network with one hidden layer to implement XOR gate.
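The claim that the hidden layer turns XOR into a linearly separable problem can be checked by hand before any training. The sketch below is only an illustration: the weights are hand-picked (not learned), and a hard threshold `step` is assumed in place of the sigmoid so that the outputs are exact. The names `W_hidden`, `b_hidden`, `w_out` and `b_out` are hypothetical and do not appear in the procedure.

```python
import numpy as np

def step(z):
    # Hard threshold: 1 where z > 0, else 0 (assumed here instead of sigmoid)
    return (z > 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hand-picked hidden layer: h1 behaves like OR(x1, x2), h2 like AND(x1, x2)
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])
H = step(X @ W_hidden + b_hidden)

# Output neuron: fires when h1 is on and h2 is off.
# This is a single straight line in (h1, h2) space.
w_out = np.array([1.0, -1.0])
b_out = -0.5
y = step(H @ w_out + b_out)

for x_row, h_row, y_val in zip(X, H, y):
    print(f"x={x_row}, hidden={h_row}, output={y_val}")
```

In the hidden representation the four inputs collapse onto three points, (0, 0), (1, 0) and (1, 1), and one line separates (1, 0) from the other two. A trained network with sigmoid units may find a different but equivalent transformation.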
PROCEDURE:
Implement the following in Google Colab.
import numpy as np
# Sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
# Derivative of the sigmoid function.
# Note: x is expected to be the sigmoid output s = sigmoid(z), so s * (1 - s) is the derivative.
def sigmoid_derivative(x):
    return x * (1 - x)
# XOR gate inputs and outputs
inputs = np.array([[0, 0],
                   [0, 1],
                   [1, 0],
                   [1, 1]])
outputs = np.array([[0], [1], [1], [0]])
# Initialize weights and biases
np.random.seed(42)
# 2 inputs to 2 hidden neurons
input_layer_weights = np.random.rand(2, 2)
# 2 hidden neurons to 1 output
hidden_layer_weights = np.random.rand(2, 1)
input_layer_bias = np.random.rand(1, 2)
hidden_layer_bias = np.random.rand(1, 1)
learning_rate = 0.1
# Training the network
epochs = 10000
for epoch in range(epochs):
    # Forward propagation
    # Hidden layer
    hidden_layer_input = (np.dot(inputs, input_layer_weights)
                          + input_layer_bias)
    hidden_layer_output = sigmoid(hidden_layer_input)
    # Output layer
    output_layer_input = np.dot(hidden_layer_output,
                    hidden_layer_weights) + hidden_layer_bias
    predictions = sigmoid(output_layer_input)
    # Calculate error
    error = outputs - predictions
    # Backpropagation
    # Output layer adjustments
    output_layer_delta = (error *
                          sigmoid_derivative(predictions))
    hidden_layer_error = np.dot(output_layer_delta,
                             hidden_layer_weights.T)
    # Hidden layer adjustments
    hidden_layer_delta = (hidden_layer_error *
                          sigmoid_derivative(hidden_layer_output))
    # Update weights and biases
    hidden_layer_weights += np.dot(hidden_layer_output.T,
                   output_layer_delta) * learning_rate
    hidden_layer_bias += np.sum(output_layer_delta, axis=0,
              keepdims=True) * learning_rate
    input_layer_weights += np.dot(inputs.T,
              hidden_layer_delta) * learning_rate
    input_layer_bias += np.sum(hidden_layer_delta, axis=0,
              keepdims=True) * learning_rate
# Testing the network
print("Trained input layer weights:\n", input_layer_weights)
print("Trained hidden layer weights:\n", hidden_layer_weights)
print("Trained input layer bias:\n", input_layer_bias)
print("Trained hidden layer bias:\n", hidden_layer_bias)
# Testing on inputs
for input_data in inputs:
      hidden_layer_input = np.dot(input_data,
                input_layer_weights) + input_layer_bias
      hidden_layer_output = sigmoid(hidden_layer_input)
      output_layer_input = np.dot(hidden_layer_output,
                hidden_layer_weights) + hidden_layer_bias
      result = sigmoid(output_layer_input)
      print(f"Input: {input_data}, Output: {int(np.round(result[0, 0]))}")
Tasks:
    1. Provide the Colab link to the implemented code. Change the activation function to a
         linear function and analyze the results.          [2 marks] [CO 1] [BTL 4, 5]
    2. Write similar code for another non-linearly separable function with three inputs and
         one output (other than logic gates)               [3 marks] [CO 1] [BTL 4]
 EXPERIMENT 3 To Train a Neural Network with Back Propagation
AIM:
To train a Neural Network using the backpropagation method with regularization.
                                                                [CO1] [BTL4, 5]
DESCRIPTION:
A Neural Network is trained for a binary classification task using the backpropagation method. To
avoid overfitting, a regularization term is included in the objective function. For classification
tasks, cross-entropy is a commonly used loss function. Training data and testing data are
generated randomly. The architecture, shown in Figure 1, consists of:
Input Layer: 2 nodes for the features.
Hidden Layer: 10 nodes in a single hidden layer.
Output Layer: 1 node for binary classification.
                        Figure 1. A Neural network with one hidden layer.
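For reference, the regularized objective described above can be written out explicitly. The block below is a sketch under stated assumptions: $m$ is the number of training samples, $\lambda$ is the regularization factor, $W_1$ and $W_2$ are the hidden and output weight matrices, $A_1$ is the sigmoid output of the hidden layer and $\hat{y}$ is the sigmoid output of the network. These symbols are chosen to mirror the variable names in the procedure and are not defined elsewhere in this manual.

```latex
J = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \Big]
    + \frac{\lambda}{2m}\Big( \lVert W_1 \rVert_F^2 + \lVert W_2 \rVert_F^2 \Big)

\frac{\partial J}{\partial W_2} = \frac{1}{m} A_1^{\top} (\hat{y} - y) + \frac{\lambda}{m} W_2

\frac{\partial J}{\partial W_1} = \frac{1}{m} X^{\top} \Big[ (\hat{y} - y)\, W_2^{\top} \odot A_1 \odot (1 - A_1) \Big] + \frac{\lambda}{m} W_1
```

The extra $\frac{\lambda}{m} W$ term in each gradient is what shrinks the weights at every update; the biases are not penalized.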
PROCEDURE:
 Implement the following in Google Colab.
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
# Activation functions
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
def sigmoid_derivative(z):
    return sigmoid(z) * (1 - sigmoid(z))
# Loss function
def compute_loss(y, y_hat, W1, W2, lambda_):
    m = y.shape[0]
    cross_entropy = -np.mean(y * np.log(y_hat) + (1 - y) *
                                  np.log(1 - y_hat))
    l2_regularization = (lambda_ / (2 * m)) * (
        np.sum(np.square(W1)) + np.sum(np.square(W2)))
    return cross_entropy + l2_regularization
# Forward propagation
def forward_propagation(X, W1, b1, W2, b2):
    Z1 = np.dot(X, W1) + b1
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2) + b2
    A2 = sigmoid(Z2)
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache
# Backpropagation
def backward_propagation(X, y, cache, W1, W2, lambda_):
    m = X.shape[0]
    A1, A2 = cache["A1"], cache["A2"]
    # Gradients for output layer
    dZ2 = A2 - y
    dW2 = (1 / m) * np.dot(A1.T, dZ2) + (lambda_ / m) * W2
    db2 = (1 / m) * np.sum(dZ2, axis=0, keepdims=True)
    # Gradients for hidden layer
    dA1 = np.dot(dZ2, W2.T)
    dZ1 = dA1 * sigmoid_derivative(cache["Z1"])
    dW1 = (1 / m) * np.dot(X.T, dZ1) + (lambda_ / m) * W1
    db1 = (1 / m) * np.sum(dZ1, axis=0, keepdims=True)
    gradients = {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}
    return gradients
# Update weights
def update_weights(W1, b1, W2, b2, gradients, learning_rate):
    W1 -= learning_rate * gradients["dW1"]
    b1 -= learning_rate * gradients["db1"]
    W2 -= learning_rate * gradients["dW2"]
    b2 -= learning_rate * gradients["db2"]
    return W1, b1, W2, b2
# Training loop
def train(X, y, hidden_units, learning_rate, lambda_, iterations):
    input_units = X.shape[1]
    output_units = 1
    # Initialize weights and biases
    W1 = np.random.randn(input_units, hidden_units) * 0.01
    b1 = np.zeros((1, hidden_units))
    W2 = np.random.randn(hidden_units, output_units) * 0.01
    b2 = np.zeros((1, output_units))
    for i in range(iterations):
        # Forward propagation
        y_hat, cache = forward_propagation(X, W1, b1, W2, b2)
        # Compute loss
        loss = compute_loss(y, y_hat, W1, W2, lambda_)
        # Backpropagation
        gradients = backward_propagation(X, y, cache, W1, W2, lambda_)
        # Update weights
        W1, b1, W2, b2 = update_weights(W1, b1, W2, b2,
                                        gradients, learning_rate)
        # Print loss every 100 iterations
        if i % 100 == 0:
            print(f"Iteration {i}, Loss: {loss:.4f}")
    return W1, b1, W2, b2
# Prediction
def predict(X, W1, b1, W2, b2):
    A2, _ = forward_propagation(X, W1, b1, W2, b2)
    return (A2 > 0.5).astype(int)
# Accuracy calculation
def calculate_accuracy(y_true, y_pred):
    return np.mean(y_true.flatten() == y_pred.flatten())
# Confusion matrix visualization
def plot_confusion_matrix(y_true, y_pred):
    cm = confusion_matrix(y_true, y_pred)
    disp = ConfusionMatrixDisplay(confusion_matrix=cm,
                   display_labels=[0, 1])
    disp.plot(cmap=plt.cm.Blues)
    plt.title("Confusion Matrix")
    plt.show()
# Example usage
np.random.seed(0)
X = np.random.randn(100, 2) # 100 samples, 2 features
# Binary labels
y = (np.sum(X, axis=1, keepdims=True) > 0).astype(int)
# Normalize input data
X_normalized = (X - np.mean(X, axis=0)) / np.std(X, axis=0)
# Train the model
W1, b1, W2, b2 = train(X_normalized, y, hidden_units=10,
                       learning_rate=0.05, lambda_=0.01, iterations=2000)
# Calculate training accuracy
train_predictions = predict(X_normalized, W1, b1, W2, b2)
train_accuracy = calculate_accuracy(y, train_predictions)
print(f"Training Accuracy: {train_accuracy:.2f}")
# Complex test data
X_test = (np.random.randn(20, 2) * 2
          + np.sin(np.random.randn(20, 2)))
y_test = ((X_test[:, 0] ** 2 + X_test[:, 1] ** 2) > 5).astype(int).reshape(-1, 1)
# Normalize test data with the training-set statistics
X_test_normalized = (X_test - np.mean(X, axis=0)) / np.std(X, axis=0)
# Calculate testing accuracy
test_predictions = predict(X_test_normalized, W1, b1, W2, b2)
test_accuracy = calculate_accuracy(y_test, test_predictions)
print(f"Testing Accuracy: {test_accuracy:.2f}")
# Plot confusion matrix
plot_confusion_matrix(y_test.flatten(),
test_predictions.flatten())
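One way to sanity-check the backpropagation formulas used above is a finite-difference gradient check. The sketch below is self-contained and only illustrative: `loss_fn`, `analytic_grads` and `numeric_grad` are hypothetical helper names that restate the same loss and gradient formulas in compact form, and the small network size (2-3-1) and the eight random samples are assumptions chosen to keep the check fast.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss_fn(X, y, W1, b1, W2, b2, lambda_):
    # Cross-entropy plus L2 penalty, same form as compute_loss in the listing
    m = X.shape[0]
    A1 = sigmoid(X @ W1 + b1)
    A2 = sigmoid(A1 @ W2 + b2)
    ce = -np.mean(y * np.log(A2) + (1 - y) * np.log(1 - A2))
    return ce + (lambda_ / (2 * m)) * (np.sum(W1 ** 2) + np.sum(W2 ** 2))

def analytic_grads(X, y, W1, b1, W2, b2, lambda_):
    # Same weight-gradient formulas as backward_propagation in the listing
    m = X.shape[0]
    A1 = sigmoid(X @ W1 + b1)
    A2 = sigmoid(A1 @ W2 + b2)
    dZ2 = A2 - y
    dW2 = A1.T @ dZ2 / m + (lambda_ / m) * W2
    dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)
    dW1 = X.T @ dZ1 / m + (lambda_ / m) * W1
    return dW1, dW2

def numeric_grad(f, W, eps=1e-6):
    # Central differences, perturbing one weight at a time (in place)
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + eps
        plus = f()
        W[idx] = old - eps
        minus = f()
        W[idx] = old
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 2))
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)
W1 = rng.standard_normal((2, 3)) * 0.5
b1 = np.zeros((1, 3))
W2 = rng.standard_normal((3, 1)) * 0.5
b2 = np.zeros((1, 1))
lambda_ = 0.01

f = lambda: loss_fn(X, y, W1, b1, W2, b2, lambda_)
dW1, dW2 = analytic_grads(X, y, W1, b1, W2, b2, lambda_)
max_diff = max(np.abs(dW1 - numeric_grad(f, W1)).max(),
               np.abs(dW2 - numeric_grad(f, W2)).max())
print(f"Largest gradient mismatch: {max_diff:.2e}")
```

A mismatch many orders of magnitude smaller than the gradients themselves suggests the analytic formulas agree with the loss. The same check can be repeated after changing the regularization term, since the penalty in the loss and the extra term in the gradient must change together.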
Tasks:
Improve the accuracy by making any changes to the code. Include the confusion matrix for each
case, with a proper figure number and caption.
    1. Provide the Colab link to the implemented code. Run the code without L2 regularization
       (make the necessary changes in the weight update section). Based on the final weight
       values obtained with and without L2 regularization, write your inference.
                                                         [2 marks] [CO 1] [BTL 4, 5]
    2. Run the code with L1 regularization (make the necessary changes in the weight
       update section). Based on the final weight values obtained with and without L1
       regularization, write your inference. Also compare the weights obtained with L2
       regularization and write your inference.          [3 marks] [CO 1] [BTL 4, 5]
   EXPERIMENT 4 To Understand Neural network architecture for
                    Multiclass Models
AIM:
To implement a simple Neural Network for predicting handwritten numerical digits.
DESCRIPTION:
The Neural Network architecture shown in Figure 1 will be used for the implementation. The
architecture has two hidden layers with 128 and 64 neurons respectively. The output layer has
10 neurons with softmax activation.
                   Figure 1. Neural network for 10 class classification.
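As a quick check on the size of this architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes 784 inputs (the 28x28 MNIST images flattened to a vector, as done in the procedure) and fully connected layers with one bias per neuron; `layer_sizes` and `dense_params` are hypothetical names used only for this illustration.

```python
# Layer sizes: 784 inputs, hidden layers of 128 and 64 neurons, 10 outputs
layer_sizes = [784, 128, 64, 10]

def dense_params(n_in, n_out):
    # One weight per input-output pair plus one bias per output neuron
    return n_in * n_out + n_out

per_layer = [dense_params(a, b)
             for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [100480, 8256, 650] 109386
```

Most of the parameters sit in the first layer, because every one of the 784 pixels connects to all 128 hidden neurons.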
PROCEDURE:
Here’s how you can implement a simple shallow neural network (NN) with two dense hidden layers
for MNIST digit recognition in Google Colab. We will normalize the input data by
subtracting the mean and dividing by the standard deviation.
### Step 1: Set up your environment
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
This code imports the necessary libraries:
- `numpy` for numerical operations.
- `tensorflow` for building and training the neural network.
- `mnist` to load the MNIST dataset.
- `Sequential` to define a linear stack of layers for the model.
- `Dense` to add fully connected layers.
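- `Flatten` to flatten the input images inside a model (imported here, although the code below flattens the images with `reshape` instead).
- `to_categorical` to one-hot encode the target labels.
- `plt` (`matplotlib.pyplot`) to plot graphs.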
### Step 2: Load and preprocess the data
# Load data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Normalize the data
mean = np.mean(x_train)
std = np.std(x_train)
x_train = (x_train - mean) / std
x_test = (x_test - mean) / std
# Reshape the data to fit the model
x_train = x_train.reshape(-1, 28*28)
x_test = x_test.reshape(-1, 28*28)
# One-hot encode the target labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
**Explanation:**
1. `mnist.load_data()` loads the dataset, splitting it into training and test sets.
2. We calculate the mean and standard deviation of the training data and normalize both
training and test data.
3. The images are reshaped from 28x28 pixels to a 784-dimensional vector (28*28) since the
neural network expects input in this form.
4. The labels are one-hot encoded. For example, a label `3` becomes
`[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]`.
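The preprocessing steps above can be tried on a tiny array without downloading MNIST. The sketch below is a NumPy-only illustration: the three 2x2 "images" and their labels are made-up stand-in data, and indexing an identity matrix is used as an assumed equivalent of one-hot encoding (it is not the `to_categorical` implementation).

```python
import numpy as np

# Made-up stand-in data: 3 "images" of 2x2 pixels with labels 3, 0 and 9
images = np.array([[[0, 255], [128, 64]],
                   [[10, 20], [30, 40]],
                   [[255, 255], [0, 0]]], dtype=np.float64)
labels = np.array([3, 0, 9])

# Standardize with a single global mean and standard deviation
mean, std = images.mean(), images.std()
standardized = (images - mean) / std

# Flatten each 2x2 image into a 4-dimensional vector
flat = standardized.reshape(-1, 2 * 2)

# One-hot encode: row i of the 10x10 identity matrix encodes class i
one_hot = np.eye(10)[labels]

print(flat.shape)   # (3, 4)
print(one_hot[0])   # the row for label 3 has its single 1 at index 3
```

On real data the mean and standard deviation should come from the training set only and then be reused for the test set, as the procedure does.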
### Step 3: Define the model
# Create the model
model = Sequential()
# Add the layers
model.add(Dense(128, activation='relu', input_shape=(784,)))
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))
# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
**Explanation:**
1. `Sequential()` creates a linear stack of layers.
2. The first `Dense` layer has 128 neurons and uses the ReLU activation function. The
`input_shape` is set to 784, matching our flattened image vectors.
3. The second `Dense` layer has 64 neurons with ReLU activation.
4. The output layer has 10 neurons with `softmax` activation, giving probabilities for each
of the 10 digit classes.
5. The model is compiled using the Adam optimizer and categorical cross-entropy as the loss
function. Accuracy is used as the evaluation metric.
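To make the output layer and the loss concrete, the sketch below computes softmax probabilities and categorical cross-entropy by hand in NumPy. It is a minimal illustration under assumptions, not the Keras implementation: three classes are used instead of ten for brevity, the logits are made-up numbers, and the helper names `softmax` and `categorical_crossentropy` are defined locally.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum before exponentiating for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-12):
    # Mean over samples of -sum(y_true * log(y_prob)); eps guards against log(0)
    return -np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1))

# Made-up raw scores for two samples and three classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
# One-hot targets: sample 0 is class 0, sample 1 is class 2
y_true = np.array([[1, 0, 0],
                   [0, 0, 1]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs)
print(loss)
```

Each row of `probs` sums to 1, and equal logits give equal probabilities. The loss only looks at the probability assigned to the true class, so it falls as that probability approaches 1.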
### Step 4: Train the model
# Train the model
history = model.fit(x_train, y_train, epochs=10,
                    batch_size=32, validation_data=(x_test, y_test))
**Explanation:**
- The model is trained for 10 epochs with a batch size of 32. The `validation_data` parameter
allows us to monitor the model’s performance on the test set during training.
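The epoch and batch-size settings determine how many weight updates are performed. The short calculation below assumes the standard MNIST training split of 60,000 images; the variable names are chosen only for this illustration.

```python
import math

n_train = 60000   # size of the standard MNIST training split
batch_size = 32
epochs = 10

# One weight update per batch; the last batch may be smaller if sizes do not divide evenly
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 1875 18750
```

The 1875 figure is the step count Keras shows in its progress bar for each epoch with these settings.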
### Step 5: Evaluate the model
# Evaluate the model on test data
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_acc}')
**Explanation:**
- This code evaluates the trained model on the test set and prints the test accuracy.
### Step 6: Plot the training performance
# We plot both training and validation accuracy and loss over
# epochs to see how the model performs.
plt.figure(figsize=(14, 5))
# Accuracy plot
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Accuracy over Epochs')
plt.legend()
# Loss plot
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Loss over Epochs')
plt.legend()
plt.show()
Tasks:
    1. Provide Colab link to the above code.                        [1 mark] [CO2] [BTL 4]
    2. Change the default optimizer to anything other than Adam. Describe the optimizer in
         detail, in at least one full page. Provide Colab link.     [4 marks] [CO2] [BTL 4]