Neural Networks
Dataset Description
This dataset comprises 50,000 movie reviews. It is designed for binary sentiment
classification, focusing on predicting whether a review is positive or negative. With a
substantial volume of highly polar reviews, this dataset is suitable for tasks involving
sentiment analysis.
Hyperparameters Used
The hyperparameters used while creating the neural network in the provided code
include:
1. max_words :
Description: Maximum number of unique words considered as features in the
dataset.
Purpose: It limits the vocabulary size and is used in the Tokenizer to build the
word index.
2. embedding_dim :
Description: Dimensionality of the dense vectors representing words in the
embedding layer.
Purpose: Determines the size of the word embeddings, influencing the
complexity and detail in representing words.
3. epochs :
Description: Number of times the entire training dataset is processed by the
neural network during training.
Purpose: Defines the number of training iterations, impacting the model's
learning.
4. batch_size :
Description: Number of samples processed in each iteration during training.
Purpose: Affects how the model's weights are updated; larger batches can speed up
training, while smaller batches give noisier but more frequent gradient updates.
5. validation_split :
Description: Fraction of the training data used for validation during training.
Purpose: Monitors the model's performance on unseen data during training,
helping to detect overfitting.
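For reference, the concrete values these hyperparameters take later in this notebook, gathered here as a single sketch (each value also appears in the corresponding code cell below):
# Hyperparameter values used in this notebook
max_words = 10000        # vocabulary size kept by the Tokenizer
embedding_dim = 128      # size of each word-embedding vector
epochs = 5               # full passes over the training data
batch_size = 32          # samples per gradient update
validation_split = 0.2   # fraction of training data held out for validation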
Description of the Code
Link to the Google Colab notebook:
https://colab.research.google.com/drive/18yCvD9kqRos1vsuktoqbdxJnXU2rO7ao?usp=sharing
Importing all the necessary libraries:
pandas is used for data manipulation.
train_test_split from sklearn.model_selection is used to split the dataset into training
and testing sets.
LabelEncoder from sklearn.preprocessing is used to encode the target labels
(positive/negative sentiments).
Tokenizer from tensorflow.keras.preprocessing.text and pad_sequences from
tensorflow.keras.preprocessing.sequence are used to tokenize and pad the input text data.
Sequential from tensorflow.keras.models and Embedding, Flatten, Dense,
and Dropout from tensorflow.keras.layers are used to define the FNN model.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense, Dropout
Loading the IMDb movie reviews dataset using pd.read_csv
df = pd.read_csv("/content/drive/MyDrive/IMDB Dataset.csv")
This part of the code transforms the target variable into numerical values.
It starts by assigning numerical values to the unique categorical values of the target
label (the encoder is only used on the target labels).
In this case, since there are two unique values in the target variable, positive and
negative, it will assign them 0 and 1.
The fit_transform function fits the label encoder and returns the encoded labels.
# Encode target labels (positive/negative)
le = LabelEncoder()
df['sentiment'] = le.fit_transform(df['sentiment'])
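As a quick illustration (assuming the sentiment column contains only the strings 'positive' and 'negative'), LabelEncoder orders the classes alphabetically, so 'negative' maps to 0 and 'positive' maps to 1:
# Inspect the learned label mapping (classes_ is sorted alphabetically)
print(le.classes_)                              # ['negative' 'positive']
print(le.transform(['negative', 'positive']))   # [0 1]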
This part of the code assigns a fixed integer ID to each word occurring in any
document (any piece of text treated as a single entity) in the training set.
max_words = 10000 :
This sets the maximum number of unique words to consider in the tokenizer. Only the
most frequent max_words words will be kept during tokenization, and less frequent words
will be ignored.
tokenizer = Tokenizer(num_words=max_words, split=' ') :
Initializes a tokenizer from the Tokenizer class. Two parameters are passed:
num_words specifies the maximum number of words to keep.
split=' ' indicates that words will be split on spaces.
All punctuation is removed by default.
tokenizer.fit_on_texts(df['review'].values) :
The fit_on_texts method fits the tokenizer on the review column of the dataframe.
The .values attribute converts the 'review' column into a NumPy array.
This step builds the vocabulary and assigns a unique numerical index to each word in
the corpus (the data).
X = tokenizer.texts_to_sequences(df['review'].values) :
Each review (sentence) is now transformed into a sequence of numbers. If one of the
reviews is 'apple orange banana', it becomes [1, 3, 2], where 1 corresponds to 'apple', 3 to
'orange', and 2 to 'banana'. A small toy demonstration of this follows the code block below.
X = pad_sequences(X) :
Pads the sequences to ensure they all have the same length. This is necessary for
feeding the data into a neural network with fixed input size. If a review has fewer words
than the maximum sequence length, it is padded with zeros at the beginning; if it is
longer, it is truncated.
# Tokenize the text
max_words = 10000
tokenizer = Tokenizer(num_words=max_words, split=' ')
tokenizer.fit_on_texts(df['review'].values)
X = tokenizer.texts_to_sequences(df['review'].values)
X = pad_sequences(X)
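A minimal toy illustration of what texts_to_sequences and pad_sequences do; the example sentences below are made up for demonstration and are not taken from the IMDb data:
# Hypothetical toy corpus, only for illustration
toy_tokenizer = Tokenizer(num_words=50, split=' ')
toy_tokenizer.fit_on_texts(['apple orange banana', 'apple banana'])
toy_seqs = toy_tokenizer.texts_to_sequences(['apple orange banana', 'apple banana'])
print(toy_seqs)                 # [[1, 3, 2], [1, 2]] -- indices depend on word frequency
print(pad_sequences(toy_seqs))  # shorter sequences are left-padded with zeros: [[1 3 2] [0 1 2]]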
This part of the code splits the dataset.
Inputs:
X : The input data, which represents the features or independent variables.
df['sentiment'] : The target variable, which is the sentiment label associated with
each input.
Parameters:
test_size=0.2 : Specifies that 20% of the data will be used as the test set, and the
remaining 80% will be used as the training set.
random_state=42 : Sets a random seed for reproducibility. The same seed ensures
that the random splitting of data is consistent across runs.
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, df['sentiment'], test_size=0.2, random_state=42)
1. Embedding Layer: model.add(Embedding(max_words, embedding_dim,
input_length=X.shape[1]))
The Embedding layer is used for word embedding, which represents each word in
the input sequence as a dense vector of fixed size (embedding_dim).
max_words : The maximum number of words to consider as features.
embedding_dim : The dimension of the dense embedding.
input_length=X.shape[1] : The length of the input sequences (number of features in
each sequence).
2. Flatten Layer: model.add(Flatten())
The Flatten layer is used to flatten the output of the embedding layer into a one-
dimensional array. It prepares the data for the fully connected layers.
3. Dense Layer (ReLU Activation): model.add(Dense(256, activation='relu'))
This dense layer has 256 units and uses the Rectified Linear Unit (ReLU) activation
function. It introduces non-linearity to the model.
4. Dropout Layer: model.add(Dropout(0.5))
The Dropout layer helps prevent overfitting by randomly setting a fraction of input units
to zero during training (here, 50%).
5. Dense Output Layer (Sigmoid Activation): model.add(Dense(1, activation='sigmoid'))
The final dense layer has 1 unit (output node) with a sigmoid activation function. This is
common for binary classification problems.
6. Model Compilation: model.compile(loss='binary_crossentropy', optimizer='adam',
metrics=['accuracy'])
The model is compiled with the specified loss function (binary_crossentropy for binary
classification), optimizer (adam), and metrics (accuracy for evaluation).
# Build the FNN model
embedding_dim = 128
model = Sequential()
model.add(Embedding(max_words, embedding_dim, input_length=X.shape[1]))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
epochs and batch_size :
epochs is the number of times the model will be trained on the entire training
dataset.
batch_size is the number of samples (data points) used in each iteration of training.
model.fit():
The fit method trains the model by adjusting internal parameters based on provided
training data (X_train and y_train) to minimize the specified loss function. It requires
inputs like the number of epochs, batch size, and an optional validation split.
validation_split:
The validation split, set as validation_split=0.2, designates 20% of the training data for
validation. Monitoring the model's performance on this set during training offers insights
into its generalization to unseen data.
history:
The method returns a History object (history) with training process details, including loss
and accuracy over each epoch. This information aids analysis and visualization of the
model's performance during and after training.
# Train the model
epochs = 5
batch_size = 32
history = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, validation_split=0.2)
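One possible way to use the returned History object is to plot the per-epoch metrics; this is a minimal sketch assuming matplotlib is available (as in a standard Colab environment) and that the metric keys follow recent TensorFlow/Keras naming ('accuracy' / 'val_accuracy'):
import matplotlib.pyplot as plt

# Plot training vs. validation accuracy recorded for each epoch
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()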
This code evaluates the trained model on the test dataset and prints the resulting loss
and accuracy metrics. The evaluation provides insights into how well the model
generalizes to new, unseen data.
# Evaluate the model
score = model.evaluate(X_test, y_test, verbose=0)
print(f'Test loss: {score[0]}, Test accuracy: {score[1]}')
This step is crucial for preserving the trained model so that it can be later loaded and
used for making predictions on new data without having to retrain the model from
scratch. The saved model file ('imdb_sentiment_analysis_fnn_model.h5') will contain the
architecture, weights, and configuration of the trained neural network.
# Save the model
model.save('imdb_sentiment_analysis_fnn_model.h5')
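A minimal sketch of how the saved model could later be loaded and used on new text; the example review below is hypothetical, and any new text must be converted with the same tokenizer and padded to the same sequence length used during training:
from tensorflow.keras.models import load_model

# Load the previously saved model from disk
loaded_model = load_model('imdb_sentiment_analysis_fnn_model.h5')

# Prepare a new (hypothetical) review with the same tokenizer and padding length
new_review = ["The movie was absolutely wonderful"]
new_seq = tokenizer.texts_to_sequences(new_review)
new_seq = pad_sequences(new_seq, maxlen=X.shape[1])

# The sigmoid output is close to 1 for positive reviews and close to 0 for negative ones
print(loaded_model.predict(new_seq))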
Results
Test loss: 0.6220757961273193, Test accuracy: 0.87739998
Training Metrics:
Epoch 1: Achieved a training accuracy of approximately 74.4% with a loss of
0.5067.
Epoch 2: Improved to a training accuracy of around 93.9% with a significantly
reduced loss of 0.1683.
Epochs 3 and 4: Achieved even higher training accuracy, reaching 99.0% and
99.7%, respectively. Loss values continued to decrease.
Validation Metrics:
Epoch 1: Validation accuracy was around 88.4%, and the validation loss was
0.2885.
Epoch 2: Validation accuracy remained high at 88.2%, with a slightly increased
loss of 0.2888.
Epochs 3, 4, and 5: Validation accuracy remained stable, ranging from 88.0%
to 88.5%. Validation loss increased slightly.
Test Metrics:
After 5 epochs, the model was evaluated on the test dataset, resulting in a test
loss of approximately 0.6221 and a test accuracy of about 87.7%.