Enrollment No.: 211230107015
[ Big data analysis (3170722) ]
                                  PRACTICAL: 1
AIM: To demonstrate the installation and configuration of the MongoDB client and server.
STEP: 1 — Download the MongoDB MSI Installer Package
Download the current version of MongoDB. Make sure you select MSI as the
package.
STEP: 2 — Install MongoDB with the Installation Wizard
   A. Navigate to your downloads folder and double click on the .msi package you just
      downloaded. This will launch the installation wizard.
   B. Click Next to start installation.
   C. Accept the license agreement, then click Next.
   D. Select the Complete setup.
   E. Select “Run service as Network Service user” and make a note of the data directory;
      we’ll need it later.
   F. We won’t need MongoDB Compass, so deselect it and click Next.
   G. Click Install to begin installation.
   H. Hit Finish to complete installation.
STEP: 3 — Add MongoDB and the mongo shell to the Path environment variable.
   A. Open the Environment Variables dialog, edit the Path variable, and add the paths of the
      MongoDB server and the mongo shell.
   B. After that, open Command Prompt, check the MongoDB version, and run the
      MongoDB server.
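A quick way to verify the setup from Command Prompt; this is a sketch that assumes the default data directory C:\data\db and the classic mongo shell (newer installs ship mongosh instead):

   :: Check the installed server and shell versions
   mongod --version
   mongo --version

   :: Create the default data directory if it does not exist, then start the server
   md C:\data\db
   mongod

   :: In a second Command Prompt window, connect to the running server
   mongo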
                                   PRACTICAL: 2
AIM: Write the MongoDB queries for creating a database and collection,
inserting documents, updating documents, and deleting documents.
Code:
   A. Creating database and collection
   B. Inserting documents
   C. Updating Documents
   D. Deleting Documents
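The following mongo shell queries sketch steps A to D. The database name studentdb and the sample student names are assumptions; the collection name trupti and the _id values 1 to 3 follow the later practicals:

   // A. Switch to (and implicitly create) a database, then create a collection
   use studentdb
   db.createCollection("trupti")

   // B. Insert documents
   db.trupti.insertOne({ _id: 1, name: "Neha" })
   db.trupti.insertMany([
       { _id: 2, name: "Mahek" },
       { _id: 3, name: "Nishe" }
   ])

   // C. Update a document: change the name of the student with _id 2
   db.trupti.updateOne({ _id: 2 }, { $set: { name: "Mahi" } })

   // D. Delete documents
   db.trupti.deleteOne({ _id: 3 })        // delete one matching document
   db.trupti.deleteMany({ name: "Mahi" }) // delete every matching document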
                                      PRACTICAL: 3
AIM: Write the MongoDB queries for the given collection.
     Collection and Inserted Documents:
A. Find the document in which the name field has the value ‘Neha’.
B. Display the names of the students from the trupti collection.
C. Display the name and id of the student whose id is 3.
D. Display the documents with student ids from 1 to 2.
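A sketch of the four queries in the mongo shell, assuming documents with _id and name fields as inserted in Practical 2:

   // A. Find the document where name is 'Neha'
   db.trupti.find({ name: "Neha" })

   // B. Display only the name field for every student (suppress _id)
   db.trupti.find({}, { name: 1, _id: 0 })

   // C. Display the name and _id of the student whose _id is 3
   db.trupti.find({ _id: 3 }, { name: 1 })

   // D. Display documents whose _id lies in the range 1 to 2
   db.trupti.find({ _id: { $gte: 1, $lte: 2 } })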
                                   PRACTICAL: 4
AIM: Write MongoDB queries for aggregate methods such as count, limit,
sort, etc.
   A. Display documents in the ascending order of _id
   B. Display documents in the descending order of name.
   C. Display documents first in the ascending order of _id and then in the descending order of name.
   D. Display all documents except the first two from the trupti collection.
   E. Display the 2nd and 3rd documents from the trupti collection.
   F. Display total number of documents in trupti collection.
   G. Display last two documents from the trupti collection.
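A sketch of queries A to G in the mongo shell (countDocuments() assumes a reasonably recent shell; older shells use find().count() instead):

   // A. Ascending order of _id
   db.trupti.find().sort({ _id: 1 })

   // B. Descending order of name
   db.trupti.find().sort({ name: -1 })

   // C. Ascending _id, then descending name
   db.trupti.find().sort({ _id: 1, name: -1 })

   // D. All documents except the first two
   db.trupti.find().skip(2)

   // E. 2nd and 3rd documents: skip the first one, then take two
   db.trupti.find().skip(1).limit(2)

   // F. Total number of documents in the collection
   db.trupti.countDocuments()

   // G. Last two documents: sort descending by _id and take two
   db.trupti.find().sort({ _id: -1 }).limit(2)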
                                    PRACTICAL: 5
AIM: Write MongoDB queries similar to LIKE predicate in SQL.
   A. Find the ids of students whose name begins with the letter ‘N’.
   B. Display all documents in which the student name ends with the letter ‘e’.
   C. Find all documents in which the student name contains ‘h’ in any position.
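MongoDB expresses the SQL LIKE predicate with regular expressions; a sketch of the three queries:

   // A. ids of students whose name begins with 'N'   (LIKE 'N%')
   db.trupti.find({ name: /^N/ }, { _id: 1 })

   // B. Documents in which the name ends with 'e'    (LIKE '%e')
   db.trupti.find({ name: /e$/ })

   // C. Documents in which the name contains 'h'     (LIKE '%h%')
   db.trupti.find({ name: /h/ })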
                                     PRACTICAL: 6
AIM: To demonstrate the installation and configuration of single-node
Hadoop.
   1. Download Hadoop binaries
The first step is to download Hadoop binaries from the official website.
The binary package size is about 342 MB.
After finishing the file download, we should unpack the package using two steps. First, we
should extract the hadoop-3.2.1.tar.gz library, and then, we should unpack the extracted tar
file.
The tar file extraction may take some minutes to finish. In the end, you may see some
warnings about symbolic link creation; just ignore them, since they are not related to
Windows.
Since we are installing Hadoop 3.2.1, we should download the files located in
https://github.com/cdarlint/winutils/tree/master/hadoop-3.2.1/bin and copy them
into the “hadoop-3.2.1\bin” directory.
After unpacking the package, we should add the Hadoop native IO libraries, which can be
found in the following GitHub repository: https://github.com/cdarlint/winutils.
   2. Setting up environment variables.
After installing Hadoop and its prerequisites, we should configure the environment variables
to define the Hadoop and Java default paths.
To edit environment variables, go to Control Panel > System and Security > System (or right-
click on the My Computer icon and select Properties) and click on the “Advanced system
settings” link.
There are two variables to define:
   •   JAVA_HOME: JDK installation folder path
   •   HADOOP_HOME: Hadoop installation folder path
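As an illustration, the same variables can be set from Command Prompt; the folder paths below are assumptions, so substitute your actual installation folders:

   :: Set JAVA_HOME and HADOOP_HOME (example paths)
   setx JAVA_HOME "C:\Java\jdk1.8.0_201"
   setx HADOOP_HOME "C:\hadoop-env\hadoop-3.2.1"

   :: Add the bin directories to the user PATH
   setx PATH "%PATH%;%JAVA_HOME%\bin;%HADOOP_HOME%\bin"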
   3. Configuring Hadoop cluster.
There are four files we should alter to configure the Hadoop cluster:
   •   %HADOOP_HOME%\etc\hadoop\hdfs-site.xml
   •   %HADOOP_HOME%\etc\hadoop\core-site.xml
   •   %HADOOP_HOME%\etc\hadoop\mapred-site.xml
   •   %HADOOP_HOME%\etc\hadoop\yarn-site.xml
As we know, Hadoop is built using a master-slave paradigm. Before altering the HDFS
configuration file, we should create a directory to store all master node (name node) data and
another one to store data (data node). In this example, we created the following directories:
   •   C:\hadoop-env\hadoop-3.2.1\data\dfs\namenode
   •   C:\hadoop-env\hadoop-3.2.1\data\dfs\datanode
Now we can edit our hdfs-site.xml file for further configuration. Open the file and edit it as
below; note that the directory paths must match the directories created above:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///C:/hadoop-env/hadoop-3.2.1/data/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///C:/hadoop-env/hadoop-3.2.1/data/dfs/datanode</value>
</property>
Core site configuration
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9820</value>
</property>
Map Reduce site configuration:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>MapReduce framework name</description>
</property>
Yarn site configuration:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>Yarn Node Manager Aux Service</description>
</property>
Formatting the name node:
hdfs namenode -format
This command may give you some errors; we must fix those before proceeding. If everything
is configured correctly, you will get a message like the one below:
Let’s start the Hadoop services and check whether they are working.
Just navigate to the “%HADOOP_HOME%\sbin” directory. Then we will run the
following command to start the HDFS nodes:
.\start-dfs.cmd
Two command prompt windows will open (one for the name node and one for the
data node). Next, run the following command to start the YARN services:
.\start-yarn.cmd
To make sure that all services started successfully, we can run the jps command:
jps
14560 DataNode
4960 ResourceManager
5936 NameNode
768 NodeManager
14636 Jps
If it shows the above services running, the single-node Hadoop setup is complete.
                                      PRACTICAL: 7
AIM: To demonstrate the configuration of a multi-node Hadoop
cluster.
Data, data and data. Across every sector, people are dealing with huge amounts of data,
also termed big data. Hadoop is a well-known and widespread distributed framework for
big data processing, but when it comes to Hadoop installation, most of us feel that it is
quite a cumbersome job. This article provides some easy and quick steps for a multi-node
Hadoop cluster setup.
Multi-Node Cluster in Hadoop 3.x (3.1.3)
A multi-node cluster in Hadoop contains two or more data nodes in a distributed Hadoop
environment. It is used by organisations to store and analyse their massive amounts of data,
so knowing how to set up a multi-node Hadoop cluster is an important skill.
Prerequisites
We will need the following software and hardware as prerequisites to perform the
activities:
           ● Ubuntu 18.04.3 LTS (Long Term Support)
           ● Hadoop-3.1.3
           ● JAVA 8
           ● SSH
           ● At least 2 laptops/desktops connected by LAN/Wi-Fi
Installation Steps
STEP: 1 Installation of Ubuntu/OS in the machines
This step is self-explanatory: as a first step, install Ubuntu (or any other flavor of Linux you
have chosen) on both the nodes (laptop/desktop, referred to as nodes from here on). You can
also install a lighter version of Ubuntu, Lubuntu (Lightweight Ubuntu), if you are using old
hardware where you are having difficulty installing Ubuntu.
In my case I was using an old laptop of mine as the slave node and I had to install Lubuntu
and it worked without any issues.
Please create an admin user on both the nodes, preferably with the same username.
STEP: 2 Configuring host names
Once the OS is installed, as a next step we should set the hostname for both the nodes. In my
case I named the nodes as:
                ●     masternode
                ●     slave
Command: sudo vi /etc/hostname
A reboot of the node is required after the hostname is updated.
* This step is optional if you have already put the hostnames during OS installation
STEP: 3 Configuring IP address in the hosts file of the nodes
Next, we need to add the IPs of masternode and slave node in the /etc/hosts file in both the
nodes.
Command: sudo vi /etc/hosts
Comment out all other entries you have in the hosts file in both the nodes.
Command to see the IP of the node:
ip addr show
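For example, the hosts file on both nodes could look like this (192.168.1.4 is the masternode IP used later in this practical; the slave IP is an assumption):

   # /etc/hosts on both nodes
   192.168.1.4   masternode
   192.168.1.5   slave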
STEP: 4 Restart the sshd service in both the nodes
Command: service sshd restart
STEP: 5 Create the SSH Key in the master node and publish it in the slave node.
For this activity follow the below steps:
           ● Command to generate SSH key in masternode: ssh-keygen
           ● It will ask for folder location where it will copy the keys, I entered
             /home/username/.ssh/id_rsa
           ● It will ask for pass phrase, keep it empty for simplicity.
           ● Next copy the newly generated public key to auth file in your users
             home/.ssh directory. Command: cat $HOME/.ssh/id_rsa.pub >>
             $HOME/.ssh/authorized_keys
           ● Next execute — ssh localhost to check if the key is working.
           ● Next, we need to publish the key to the slave node. Command: ssh-
             copy-id -i $HOME/.ssh/id_rsa.pub <username>@slave
           ● First time it will prompt you to enter the password and publish the
             key.
           ● Execute ssh <username>@slave again to check if you are able to
             log in without a password. This is very important: without the public
             key working, the slave node cannot be added to the cluster later.
STEP: 6 Download and install Java
Download and install OpenJDK 8 and set the JAVA_HOME path in the .bashrc
file of the user under which you are installing Hadoop.
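A sketch of this step on Ubuntu; the JAVA_HOME path below is the usual OpenJDK 8 location, but verify it on your system:

   # Install OpenJDK 8
   sudo apt-get update
   sudo apt-get install -y openjdk-8-jdk

   # Append to ~/.bashrc, then run: source ~/.bashrc
   export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
   export PATH=$PATH:$JAVA_HOME/bin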
STEP: 7 Download the Hadoop 3.1.3 package in all nodes.
Login to each node and download and untar the Hadoop package.
wget http://apache.cs.utah.edu/hadoop/common/current/hadoop-3.1.3.tar.gz
tar -xzf hadoop-3.1.3.tar.gz
STEP: 8 Add the Hadoop and Java paths in the bash file (.bashrc) on all nodes.
  Command: sudo vi .bashrc
  Environment Variables to Set in .bashrc
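A sketch of the variables to add, assuming the tarball was extracted in the home directory (later steps refer to this folder as ~/hadoop, so rename or adjust accordingly):

   # Environment variables to set in ~/.bashrc on all nodes
   export HADOOP_HOME=$HOME/hadoop-3.1.3
   export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
   export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin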
STEP: 9 Set NameNode Location
  Update your ~/hadoop/etc/hadoop/core-site.xml file to set the NameNode
  location to masternode on port 9000:
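A minimal sketch of the property to add (the property name matches the one used in Practical 6):

   <configuration>
       <property>
           <name>fs.default.name</name>
           <value>hdfs://masternode:9000</value>
       </property>
   </configuration>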
STEP: 10 Set path for HDFS
  Edit ~/hadoop/etc/hadoop/hdfs-site.xml to add the following for the masternode:
For the data node please put the following:
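A sketch of both variants; the storage directories are assumptions (create them beforehand), and dfs.replication is set to 2 for the two-node cluster:

   <!-- hdfs-site.xml on the masternode (name node) -->
   <configuration>
       <property>
           <name>dfs.namenode.name.dir</name>
           <value>/home/username/data/nameNode</value>
       </property>
       <property>
           <name>dfs.replication</name>
           <value>2</value>
       </property>
   </configuration>

   <!-- hdfs-site.xml on the slave (data node) -->
   <configuration>
       <property>
           <name>dfs.datanode.data.dir</name>
           <value>/home/username/data/dataNode</value>
       </property>
       <property>
           <name>dfs.replication</name>
           <value>2</value>
       </property>
   </configuration>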
Please note the difference between the configuration properties of masternode and
slave.
STEP: 11 Set YARN as Job Scheduler
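Edit ~/hadoop/etc/hadoop/mapred-site.xml to set YARN as the framework for MapReduce jobs; a minimal sketch:

   <configuration>
       <property>
           <name>mapreduce.framework.name</name>
           <value>yarn</value>
       </property>
   </configuration>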
STEP: 12 Configure YARN
Edit ~/hadoop/etc/hadoop/yarn-site.xml, which contains the configuration options for YARN.
In the value field for yarn.resourcemanager.hostname, replace 192.168.1.4 with the IP
address of your masternode:
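A sketch, assuming 192.168.1.4 is the masternode IP:

   <configuration>
       <property>
           <name>yarn.resourcemanager.hostname</name>
           <value>192.168.1.4</value>
       </property>
       <property>
           <name>yarn.nodemanager.aux-services</name>
           <value>mapreduce_shuffle</value>
       </property>
   </configuration>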
STEP: 13 Configure Workers
The workers file is used by startup scripts to start the required daemons on all nodes.
Edit ~/hadoop/etc/hadoop/workers on the masternode to include the hostnames of both of the
nodes.
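With the hostnames used in this practical, the workers file would contain:

   masternode
   slave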
STEP: 14 Update the JAVA_HOME in hadoop-env.sh
Edit ~/hadoop/etc/hadoop/hadoop-env.sh and update the value of JAVA_HOME with
your installation path on both the nodes.
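For example (the path is the usual OpenJDK 8 location; verify yours):

   export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64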
STEP: 15 Format HDFS namenode
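Run the following once on the masternode; as in Practical 6, it initialises the HDFS metadata and erases anything already stored in the name node directory:
Command: hdfs namenode -format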
STEP: 16 Start and Stop HDFS
OK, so now you are almost there. The only thing left is starting the daemons. To start all
the daemons and bring up your Hadoop cluster, use the command below:
Command: start-all.sh
Once the command prompt is back, check the running daemons with the following
command:
Command: jps
This is what you will see in the masternode:
This is what you will see in the slave node:
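Assuming the configuration above, the daemon lists typically look like this (process ids omitted and will differ):

   # masternode
   NameNode
   SecondaryNameNode
   ResourceManager
   Jps

   # slave node
   DataNode
   NodeManager
   Jps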
If you are not seeing the above daemons running, then something has gone wrong in your
configuration, so you need to check the previous steps again.
You can also open the HDFS web UI at the following URL after replacing the IP with that of
your masternode:
http://192.168.1.4:9870/dfshealth.html#tab-overview
STEP: 17 Put and Get Data to HDFS
To start with, you have to create the user directory in your HDFS cluster. This user directory
should have the same username as the one under which you installed and are running the
cluster. Use the following command:
Command: hdfs dfs -mkdir /user/username
Once the user directory is created, you can use any of the hdfs dfs commands and start using
your HDFS cluster.
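For example (file names are illustrative):

   # Create the user directory
   hdfs dfs -mkdir -p /user/username

   # Put a local file into HDFS, read it back, and list the directory
   hdfs dfs -put localfile.txt /user/username/
   hdfs dfs -cat /user/username/localfile.txt
   hdfs dfs -ls /user/username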