Neural Network XOR Gate Implementation

Uploaded by

claudle200415
Copyright
© All Rights Reserved

EXPERIMENT 2 Implementing a Neural Network with Hidden Layer

AIM:

To train a Neural Network with hidden layers on labeled training data. [CO1] [BTL4, 5]

DESCRIPTION:

A neural network with hidden layers can also solve cases that are not linearly separable. The
first hidden layer maps the non-linearly separable case into a linearly separable one.
Consider the XOR gate Truth Table and the corresponding plot shown in Figure 1.

x1 (input)   x2 (input)   Y (output)
0            0            0
0            1            1
1            0            1
1            1            0

Figure 1. (a) Truth table of the XOR gate. (b) Plot of the XOR gate.
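
To make the separability claim concrete, the following sketch searches a coarse grid of weights and biases for a single threshold unit and reports the best accuracy it can reach on XOR, with AND shown for comparison. It is only an illustration: the grid range and step, the AND comparison, and the helper name `best_linear_accuracy` are assumptions made here, and a grid search is not a proof.

```python
import itertools
import numpy as np

# The four input combinations of a two-input gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])
y_and = np.array([0, 0, 0, 1])

# Illustrative search grid for w1, w2 and b (an assumption, not from the experiment)
GRID = np.linspace(-2, 2, 21)


def best_linear_accuracy(X, y, grid=GRID):
    """Best accuracy of a single threshold unit (w1*x1 + w2*x2 + b > 0) on the grid."""
    best = 0.0
    for w1, w2, b in itertools.product(grid, repeat=3):
        pred = (X @ np.array([w1, w2]) + b > 0).astype(int)
        best = max(best, float(np.mean(pred == y)))
    return best


xor_acc = best_linear_accuracy(X, y_xor)
and_acc = best_linear_accuracy(X, y_and)
print("Best single-unit accuracy on XOR:", xor_acc)
print("Best single-unit accuracy on AND:", and_acc)
```

On this grid the single unit classifies at most 3 of the 4 XOR rows correctly, while AND is fit exactly, which is the behaviour expected from a linearly separable versus a non-linearly separable function.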

A simple neural network for the XOR gate needs at least two layers (one hidden layer and one
output layer), because the XOR function is not linearly separable. The architecture, shown in
Figure 2, includes:
1. Input Layer: 2 neurons for the XOR inputs.
2. Hidden Layer: At least 2 neurons with a non-linear activation function.
3. Output Layer: 1 neuron with a sigmoid activation to produce the XOR output.
4. Training the network:
   - Define the input-output pairs for the XOR gate.
   - Perform forward propagation to compute predictions.
   - Use backpropagation to update weights and biases using gradient descent.

Figure 2. A neural network with one hidden layer to implement the XOR gate.

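
As a sketch of how a hidden layer makes XOR linearly separable, the example below uses hand-picked weights rather than trained ones. It assumes a step activation instead of the sigmoid used in the procedure, and the weight values and the names `W_hidden`, `b_hidden`, `w_out` and `b_out` are chosen here purely for illustration: the first hidden neuron behaves like OR, the second like AND, and the output fires when OR is true and AND is false.

```python
import numpy as np


def step(z):
    """Threshold activation: 1 where z > 0, else 0 (an assumption for this sketch)."""
    return (z > 0).astype(int)


X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hand-picked hidden layer: column 0 acts like OR, column 1 acts like AND
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])

# Hand-picked output neuron: OR and not AND
w_out = np.array([1.0, -1.0])
b_out = -0.5

H = step(X @ W_hidden + b_hidden)   # hidden representation of each input row
y = step(H @ w_out + b_out)         # network output

for x_row, h_row, y_val in zip(X, H, y):
    print(f"input={x_row}, hidden={h_row}, output={y_val}")
```

In the hidden space the four inputs collapse to three points, (0, 0), (1, 0) and (1, 1), and a single line separates (1, 0) from the other two, so one output neuron is enough.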
PROCEDURE:
Implement the following in Google Colab.
import numpy as np


# Sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))


# Derivative of the sigmoid function.
# Note: x is expected to be the sigmoid output s = sigmoid(z),
# so this returns s * (1 - s).
def sigmoid_derivative(x):
    return x * (1 - x)


# XOR gate inputs and outputs
inputs = np.array([[0, 0],
                   [0, 1],
                   [1, 0],
                   [1, 1]])
outputs = np.array([[0], [1], [1], [0]])

# Initialize weights and biases
np.random.seed(42)
# 2 inputs to 2 hidden neurons
input_layer_weights = np.random.rand(2, 2)
# 2 hidden neurons to 1 output
hidden_layer_weights = np.random.rand(2, 1)
input_layer_bias = np.random.rand(1, 2)
hidden_layer_bias = np.random.rand(1, 1)
learning_rate = 0.1

# Training the network
epochs = 10000
for epoch in range(epochs):
    # Forward propagation
    # Hidden layer
    hidden_layer_input = np.dot(inputs, input_layer_weights) + input_layer_bias
    hidden_layer_output = sigmoid(hidden_layer_input)

    # Output layer
    output_layer_input = np.dot(hidden_layer_output, hidden_layer_weights) + hidden_layer_bias
    predictions = sigmoid(output_layer_input)

    # Calculate error
    error = outputs - predictions

    # Backpropagation
    # Output layer adjustments
    output_layer_delta = error * sigmoid_derivative(predictions)
    hidden_layer_error = np.dot(output_layer_delta, hidden_layer_weights.T)

    # Hidden layer adjustments
    hidden_layer_delta = hidden_layer_error * sigmoid_derivative(hidden_layer_output)

    # Update weights and biases
    hidden_layer_weights += np.dot(hidden_layer_output.T, output_layer_delta) * learning_rate
    hidden_layer_bias += np.sum(output_layer_delta, axis=0, keepdims=True) * learning_rate
    input_layer_weights += np.dot(inputs.T, hidden_layer_delta) * learning_rate
    input_layer_bias += np.sum(hidden_layer_delta, axis=0, keepdims=True) * learning_rate

# Testing the network
print("Trained input layer weights:\n", input_layer_weights)
print("Trained hidden layer weights:\n", hidden_layer_weights)
print("Trained input layer bias:\n", input_layer_bias)
print("Trained hidden layer bias:\n", hidden_layer_bias)

# Testing on inputs
for input_data in inputs:
    hidden_layer_input = np.dot(input_data, input_layer_weights) + input_layer_bias
    hidden_layer_output = sigmoid(hidden_layer_input)
    output_layer_input = np.dot(hidden_layer_output, hidden_layer_weights) + hidden_layer_bias
    result = sigmoid(output_layer_input)
    # result has shape (1, 1); take the scalar before rounding to 0 or 1
    print(f"Input: {input_data}, Output: {round(result[0, 0])}")
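
The `sigmoid_derivative` helper in this listing is applied to values that have already passed through `sigmoid` (`predictions` and `hidden_layer_output`), not to the raw weighted sums. The sketch below checks that convention against a central finite difference; the test points `z`, the step size `h` and the variable names are assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def sigmoid_derivative(s):
    # Same formula as in the listing; s is assumed to be a sigmoid output
    return s * (1 - s)


z = np.linspace(-4, 4, 9)   # illustrative pre-activation values
h = 1e-5                    # finite-difference step (assumption)

# Central-difference estimate of d sigmoid / dz
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

# Correct usage: pass the sigmoid output
analytic = sigmoid_derivative(sigmoid(z))

# Incorrect usage: pass the raw pre-activation
misused = sigmoid_derivative(z)

max_err = np.max(np.abs(numeric - analytic))
print("Max error with sigmoid output as argument:", max_err)
print("Matches when given raw z instead:", np.allclose(misused, numeric))
```

If the activation function is changed, the derivative helper has to be changed with it, and the same kind of check can be repeated for the new pair.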

Tasks:
1. Provide the Colab link to the implemented code. Change the activation function to a linear
function and analyze the result. [2 marks] [CO 1] [BTL 4, 5]
2. Write similar code for another non-linearly separable function with three inputs and
one output (other than logic gates). [3 marks] [CO 1] [BTL 4]
EXPERIMENT 3 To Train a Neural Network with Back Propagation
AIM:

To train a neural network with the backpropagation method, including regularization. [CO1] [BTL4, 5]

DESCRIPTION:

A neural network is trained for a binary classification task using the backpropagation method. To
avoid overfitting, a regularization term is included in the objective function. For classification
tasks, cross-entropy is a commonly used loss function. Training data and testing data are
generated randomly. The architecture, shown in Figure 1, consists of:
Input Layer: 2 nodes for the features.
Hidden Layer: 10 nodes in a single hidden layer.
Output Layer: 1 node for binary classification.

Figure 1. A Neural network with one hidden layer.
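
As a sketch of the objective described above, assume $m$ training samples, a single sigmoid output $\hat{y}_i$ for sample $i$, weight matrices $W_1$ (input to hidden) and $W_2$ (hidden to output), hidden activations $A_1$, and a regularization factor $\lambda$. These symbols are chosen here to mirror the variable names in the procedure code. Under those assumptions the L2-regularized cross-entropy and the resulting output-layer gradient can be written as:

```latex
J = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y_i \log \hat{y}_i + (1 - y_i)\log\big(1 - \hat{y}_i\big) \Big]
    + \frac{\lambda}{2m}\Big( \lVert W_1 \rVert_F^2 + \lVert W_2 \rVert_F^2 \Big)

\frac{\partial J}{\partial W_2} = \frac{1}{m} A_1^{\top}\big(\hat{y} - y\big) + \frac{\lambda}{m} W_2
```

The biases are not penalized in this form, which is consistent with `compute_loss` taking only `W1` and `W2`.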


PROCEDURE:
Implement the following in Google Colab.
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay


# Activation functions
def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def sigmoid_derivative(z):
    return sigmoid(z) * (1 - sigmoid(z))


# Loss function
def compute_loss(y, y_hat, W1, W2, lambda_):
    m = y.shape[0]
    cross_entropy = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    l2_regularization = (lambda_ / (2 * m)) * (np.sum(np.square(W1)) + np.sum(np.square(W2)))
    return cross_entropy + l2_regularization


# Forward propagation
def forward_propagation(X, W1, b1, W2, b2):
    Z1 = np.dot(X, W1) + b1
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2) + b2
    A2 = sigmoid(Z2)
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache


# Backpropagation
def backward_propagation(X, y, cache, W1, W2, lambda_):
    m = X.shape[0]
    A1, A2 = cache["A1"], cache["A2"]

    # Gradients for output layer
    dZ2 = A2 - y
    dW2 = (1 / m) * np.dot(A1.T, dZ2) + (lambda_ / m) * W2
    db2 = (1 / m) * np.sum(dZ2, axis=0, keepdims=True)

    # Gradients for hidden layer
    dA1 = np.dot(dZ2, W2.T)
    dZ1 = dA1 * sigmoid_derivative(cache["Z1"])
    dW1 = (1 / m) * np.dot(X.T, dZ1) + (lambda_ / m) * W1
    db1 = (1 / m) * np.sum(dZ1, axis=0, keepdims=True)

    gradients = {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}
    return gradients


# Update weights
def update_weights(W1, b1, W2, b2, gradients, learning_rate):
    W1 -= learning_rate * gradients["dW1"]
    b1 -= learning_rate * gradients["db1"]
    W2 -= learning_rate * gradients["dW2"]
    b2 -= learning_rate * gradients["db2"]
    return W1, b1, W2, b2

# Training loop
def train(X, y, hidden_units, learning_rate, lambda_, iterations):
    input_units = X.shape[1]
    output_units = 1

    # Initialize weights and biases
    W1 = np.random.randn(input_units, hidden_units) * 0.01
    b1 = np.zeros((1, hidden_units))
    W2 = np.random.randn(hidden_units, output_units) * 0.01
    b2 = np.zeros((1, output_units))

    for i in range(iterations):
        # Forward propagation
        y_hat, cache = forward_propagation(X, W1, b1, W2, b2)

        # Compute loss
        loss = compute_loss(y, y_hat, W1, W2, lambda_)

        # Backpropagation
        gradients = backward_propagation(X, y, cache, W1, W2, lambda_)

        # Update weights
        W1, b1, W2, b2 = update_weights(W1, b1, W2, b2, gradients, learning_rate)

        # Print loss every 100 iterations
        if i % 100 == 0:
            print(f"Iteration {i}, Loss: {loss:.4f}")

    return W1, b1, W2, b2


# Prediction
def predict(X, W1, b1, W2, b2):
    A2, _ = forward_propagation(X, W1, b1, W2, b2)
    return (A2 > 0.5).astype(int)


# Accuracy calculation
def calculate_accuracy(y_true, y_pred):
    return np.mean(y_true.flatten() == y_pred.flatten())


# Confusion matrix visualization
def plot_confusion_matrix(y_true, y_pred):
    # labels=[0, 1] keeps the matrix 2x2 even if one class is absent
    cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=[0, 1])
    disp.plot(cmap=plt.cm.Blues)
    plt.title("Confusion Matrix")
    plt.show()

# Example usage
np.random.seed(0)
X = np.random.randn(100, 2)  # 100 samples, 2 features
y = (np.sum(X, axis=1, keepdims=True) > 0).astype(int)  # Binary labels

# Normalize input data
X_normalized = (X - np.mean(X, axis=0)) / np.std(X, axis=0)

# Train the model
W1, b1, W2, b2 = train(X_normalized, y, hidden_units=10,
                       learning_rate=0.05, lambda_=0.01, iterations=2000)

# Calculate training accuracy
train_predictions = predict(X_normalized, W1, b1, W2, b2)
train_accuracy = calculate_accuracy(y, train_predictions)
print(f"Training Accuracy: {train_accuracy:.2f}")

# Complex test data
X_test = np.random.randn(20, 2) * 2 + np.sin(np.random.randn(20, 2))
y_test = ((X_test[:, 0] ** 2 + X_test[:, 1] ** 2) > 5).astype(int).reshape(-1, 1)
# Normalize the test data with the training mean and standard deviation
X_test_normalized = (X_test - np.mean(X, axis=0)) / np.std(X, axis=0)

# Calculate testing accuracy
test_predictions = predict(X_test_normalized, W1, b1, W2, b2)
test_accuracy = calculate_accuracy(y_test, test_predictions)
print(f"Testing Accuracy: {test_accuracy:.2f}")

# Plot confusion matrix
plot_confusion_matrix(y_test.flatten(), test_predictions.flatten())
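
The gradients computed in `backward_propagation` can be sanity-checked numerically. The sketch below re-implements the same loss and the `dW1` formula in compact form and compares the analytic gradient with a central finite difference. The small network size, the random data, `lam`, `eps` and the helper names `loss` and `analytic_dW1` are all assumptions made for this illustration.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def loss(X, y, W1, b1, W2, b2, lam):
    """Cross-entropy plus L2 penalty, same form as compute_loss in the listing."""
    A1 = sigmoid(X @ W1 + b1)
    A2 = sigmoid(A1 @ W2 + b2)
    m = y.shape[0]
    ce = -np.mean(y * np.log(A2) + (1 - y) * np.log(1 - A2))
    return ce + lam / (2 * m) * (np.sum(W1 ** 2) + np.sum(W2 ** 2))


def analytic_dW1(X, y, W1, b1, W2, b2, lam):
    """dW1 following the same steps as backward_propagation in the listing."""
    m = X.shape[0]
    A1 = sigmoid(X @ W1 + b1)
    A2 = sigmoid(A1 @ W2 + b2)
    dZ2 = A2 - y
    dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)
    return X.T @ dZ1 / m + lam / m * W1


# Small illustrative problem (sizes and values are assumptions)
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)
W1 = rng.normal(size=(2, 3))
b1 = np.zeros((1, 3))
W2 = rng.normal(size=(3, 1))
b2 = np.zeros((1, 1))
lam = 0.1
eps = 1e-6

# Central finite difference, one weight at a time
numeric = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        W_plus = W1.copy()
        W_minus = W1.copy()
        W_plus[i, j] += eps
        W_minus[i, j] -= eps
        numeric[i, j] = (loss(X, y, W_plus, b1, W2, b2, lam)
                         - loss(X, y, W_minus, b1, W2, b2, lam)) / (2 * eps)

max_diff = np.max(np.abs(numeric - analytic_dW1(X, y, W1, b1, W2, b2, lam)))
print("Max |numeric - analytic| for dW1:", max_diff)
```

A check like this is useful whenever the regularization term is changed, because the penalty in the loss and the extra term in the gradient must stay consistent.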

Tasks:
Improve the accuracy by making any changes to the code. Include the confusion matrix for each
case, with a proper figure number and caption.

1. Provide the Colab link to the implemented code. Run the code without L2 regularization
(make the necessary changes in the weight update section). Based on the final weight
values obtained with and without L2 regularization, write your inference.
[2 marks] [CO 1] [BTL 4, 5]
2. Run the code with L1 regularization (make the necessary changes in the weight
update section). Based on the final weight values obtained with and without L1
regularization, write your inference. Also compare them with the weights obtained with L2
regularization and write your inference. [3 marks] [CO 1] [BTL 4, 5]
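
For reference, the two penalties differ in how they enter the gradient. The sketch below assumes the same $\lambda$ and $m$ as in the procedure code and a generic weight matrix $W$. The $1/m$ scaling of the L1 term is an assumption chosen here to parallel the L2 term, not something specified in the experiment.

```latex
\text{L2:}\quad R_2(W) = \frac{\lambda}{2m}\sum_{j,k} W_{jk}^2,
\qquad \frac{\partial R_2}{\partial W} = \frac{\lambda}{m}\, W

\text{L1:}\quad R_1(W) = \frac{\lambda}{m}\sum_{j,k} \lvert W_{jk} \rvert,
\qquad \frac{\partial R_1}{\partial W} = \frac{\lambda}{m}\,\operatorname{sign}(W)
\quad (\text{a subgradient where } W_{jk} = 0)
```

The L2 gradient shrinks each weight in proportion to its size, whereas the L1 subgradient has constant magnitude regardless of the weight's size.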
EXPERIMENT 4 To Understand Neural Network Architecture for Multiclass Models
AIM:
To implement a simple neural network for predicting handwritten numerical digits.

DESCRIPTION:
The neural network architecture shown in Figure 1 will be used for the implementation. The
architecture has two hidden layers with 128 and 64 neurons, respectively. The output layer has
10 neurons with softmax activation.

Figure 1. Neural network for 10 class classification.
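
As a small illustration of what the softmax output layer does, the sketch below turns ten raw scores (one per digit class) into probabilities with NumPy. The `softmax` helper and the example `logits` values are assumptions made here. In the procedure itself this step is handled by the Keras `softmax` activation.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


# Illustrative raw scores for digits 0..9 (made-up values)
logits = np.array([[2.0, 1.0, 0.1, 0.0, -1.0, 0.5, 0.3, -0.2, 1.5, 0.0]])

probs = softmax(logits)
predicted_digit = int(np.argmax(probs, axis=-1)[0])

print("Probabilities:", np.round(probs, 3))
print("Sum of probabilities:", probs.sum())
print("Predicted digit:", predicted_digit)
```

The outputs are non-negative and sum to 1, so the class with the largest probability can be read off as the predicted digit.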


PROCEDURE:
Here’s how you can implement a simple shallow neural network (NN) with two hidden dense layers
for MNIST digit recognition in Google Colab. We will normalize the input data by
subtracting the mean and dividing by the standard deviation.

### Step 1: Set up your environment


import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
This code imports the necessary libraries:
- `numpy` for numerical operations.
- `tensorflow` for building and training the neural network.
- `mnist` to load the MNIST dataset.
- `Sequential` to define a linear stack of layers for the model.
- `Dense` to add fully connected layers.
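- `Flatten` to flatten the input images (imported here, although the code below flattens the images with `reshape` instead).
- `to_categorical` to one-hot encode the target labels.
- `plt` (`matplotlib.pyplot`) to plot graphs.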

### Step 2: Load and preprocess the data

# Load data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize the data


mean = np.mean(x_train)
std = np.std(x_train)
x_train = (x_train - mean) / std
x_test = (x_test - mean) / std

# Reshape the data to fit the model


x_train = x_train.reshape(-1, 28*28)
x_test = x_test.reshape(-1, 28*28)

# One-hot encode the target labels


y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

**Explanation:**
1. `mnist.load_data()` loads the dataset, splitting it into training and test sets.
2. We calculate the mean and standard deviation of the training data and normalize both
training and test data.
3. The images are reshaped from 28x28 pixels to a 784-dimensional vector (28*28) since the
neural network expects input in this form.
4. The labels are one-hot encoded. For example, a label `3` becomes `[0, 0, 0, 1, 0, 0, 0, 0, 0,
0]`.
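
The preprocessing steps can be tried on a tiny stand-in array without downloading MNIST. This is only a sketch: the random `images`, the two example `labels`, and the use of `np.eye` in place of `to_categorical` are assumptions made so that the example runs with NumPy alone.

```python
import numpy as np

# Stand-in for two 28x28 grayscale images with pixel values 0..255 (made-up data)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 28, 28)).astype(np.float64)
labels = np.array([3, 7])

# Normalize with a single mean and standard deviation over all pixels
mean = images.mean()
std = images.std()
normalized = (images - mean) / std

# Flatten each 28x28 image into a 784-dimensional vector
flat = normalized.reshape(-1, 28 * 28)

# One-hot encode the labels: row k of the identity matrix encodes class k
one_hot = np.eye(10)[labels]

print("Flattened shape:", flat.shape)
print("Mean and std after normalization:", flat.mean(), flat.std())
print("One-hot for label 3:", one_hot[0])
```

In the procedure the mean and standard deviation come from the training set only and are reused for the test set, so both sets are scaled in the same way.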

### Step 3: Define the model

# Create the model


model = Sequential()

# Add the layers


model.add(Dense(128, activation='relu', input_shape=(784,)))
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

**Explanation:**
1. `Sequential()` creates a linear stack of layers.
2. The first `Dense` layer has 128 neurons and uses the ReLU activation function. The
`input_shape` is set to 784, matching our flattened image vectors.
3. The second `Dense` layer has 64 neurons with ReLU activation.
4. The output layer has 10 neurons with `softmax` activation, giving probabilities for each
of the 10 digit classes.
5. The model is compiled using the Adam optimizer and categorical cross-entropy as the loss
function. Accuracy is used as the evaluation metric.
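
To get a feel for the size of this model, the number of trainable parameters can be worked out by hand: each dense layer has one weight per input-output pair plus one bias per output neuron. The helper name `dense_params` is an assumption for this sketch. The layer sizes are the ones given above.

```python
# Layer widths: 784 inputs, hidden layers of 128 and 64, and 10 outputs
layer_sizes = [784, 128, 64, 10]


def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one fully connected layer."""
    return n_in * n_out + n_out


per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total_params = sum(per_layer)

print("Parameters per layer:", per_layer)   # [100480, 8256, 650]
print("Total parameters:", total_params)    # 109386
```

Most of the parameters sit in the first dense layer, because it connects all 784 input pixels to 128 neurons.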

### Step 4: Train the model

# Train the model
history = model.fit(x_train, y_train, epochs=10,
                    batch_size=32, validation_data=(x_test, y_test))

**Explanation:**
- The model is trained for 10 epochs with a batch size of 32. The `validation_data` parameter
allows us to monitor the model’s performance on the test set during training.
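
The batch size and epoch count determine how many weight updates are performed. The arithmetic below assumes the standard MNIST training split of 60,000 images. The variable names are chosen here for illustration.

```python
import math

num_train_samples = 60000   # size of the standard MNIST training split
batch_size = 32
epochs = 10

# One weight update per mini-batch; the last batch may be smaller
steps_per_epoch = math.ceil(num_train_samples / batch_size)
total_updates = steps_per_epoch * epochs

print("Updates per epoch:", steps_per_epoch)   # 1875
print("Total updates:", total_updates)         # 18750
```

A larger batch size gives fewer updates per epoch, which is one of the settings worth varying when comparing optimizers.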

### Step 5: Evaluate the model

# Evaluate the model on test data


test_loss, test_acc = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_acc}')

**Explanation:**
- This code evaluates the trained model on the test set and prints the test accuracy.

### Step 6: Plot the training performance

# We plot both training and validation accuracy and loss over
# epochs to see how the model performs.
plt.figure(figsize=(14, 5))

# Accuracy plot
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Accuracy over Epochs')
plt.legend()

# Loss plot
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Loss over Epochs')
plt.legend()
plt.show()

Tasks:
1. Provide the Colab link to the above code. [1 mark] [CO2] [BTL 4]
2. Change the default optimizer to anything other than Adam. Describe the optimizer in
detail, in at least one full page. Provide the Colab link. [4 marks] [CO2] [BTL 4]
