The document provides an introduction to cluster computing, defining it as a group of linked computers that work together to enhance performance and availability. It discusses the history, configuration, advantages, types of clusters, and key design challenges associated with cluster computing. Additionally, it covers practical aspects such as secure connections using SSH, file transfer methods, and basic UNIX commands for managing files and directories.

Introduction to Cluster Computing
Dr. Hrachya Astsatryan,
Institute for Informatics and Automation Problems,
National Academy of Sciences of Armenia,
E-mail: hrach@sci.am
Slide 1
1
INTRO TO CLUSTER COMPUTING
Slide 2
Definition

• A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer.

• The components of a cluster are commonly, but not always, connected to each other through fast local area networks.

• Clusters are usually deployed to improve performance and/or availability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.

Slide 3
History
• In the 1960s and 1970s, high-end
mainframes were standard in large
organizations and research
institutions but had limited
processing power and memory.
• In the 1980s, researchers and
engineers began experimenting
with connecting multiple low-cost
computers, often desktop PCs, to
form a cluster, thereby creating a
more powerful computing resource.

Slide 4
Configuration

[Diagram: compute nodes and a file server node connected by a high-speed network; a front-end and a gateway node on a service network; an external network reached through the gateway.]

• Computing nodes - process the user load.
• Front-end - monitor the cluster hardware and software, taking measures to reconfigure it according to any event.
• Service network - where the communication between nodes takes place.
• File server node - where the data is available to all computing nodes.

Slide 5
Key Facts

As of November 2023, there are 60 MPP supercomputers and 440 clusters in the Top500 list.

Slide 6
Advantages

• Cost-Effectiveness - built from commodity hardware, making it a cost-effective alternative to traditional supercomputers.
• Scalability - expand the computing power according to needs.
• Fault Tolerance - if one processor or node fails, the rest of the cluster can continue to function without interruption.
• Performance - combined computational power of multiple nodes working in parallel.
• Easy Maintenance and Upgrades - composed of standard components, so maintenance and upgrades are generally straightforward.
• Flexibility - the hardware and software configuration can be tailored to specific requirements.
• Distributed Data Storage.
• Widely Used Parallel Programming Models.

Slide 7
Design challenges

Which network to use?
• Latency
• Bandwidth
• Price

Which CPU architecture to use?
• Performance (floating point)
• Price

Which node architecture to use?
• Performance: local and remote communication
• Price

Space considerations
• Cooling/ventilation
• Power required

Slide 8
2
CLUSTER TYPES

Slide 9
MAIN TYPES

• High performance computing (HPC)
• Load Balancing
• High Availability

Slide 10
Load balancing

• A load-balancing cluster distributes computational tasks and network traffic evenly across multiple nodes in the cluster.

• It ensures that each node in the cluster receives a fair share of the workload, preventing overloading of specific nodes.

• Load balancing can be implemented at various levels, including application-level, transport-level, and network-level.

• Commonly used with busy FTP and web servers that have a large client base.

• A large number of nodes share the load.


Slide 11
Kinds of clusters – load balancing

[Diagram: a head node distributing requests across several worker nodes.]

• Round Robin - distributes tasks sequentially to each node in a cyclic manner (see the sketch after this list).
• Weighted Round Robin - assigns different weights to nodes based on their processing power, giving more tasks to powerful nodes.
• Least Connections - routes tasks to the node with the fewest active connections, distributing the load evenly.
• Weighted Least Connections - similar to the least connections algorithm, but considers node weights as well.
• ..
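As a concrete illustration of round robin, here is a minimal dispatch sketch in bash; the node names, the task_*.sh scripts, and the assumption of a shared filesystem are all hypothetical, not part of these slides:

nodes=(node1 node2 node3)
i=0
for task in task_*.sh; do
    # pick the next node in the cycle
    target=${nodes[$(( i % ${#nodes[@]} ))]}
    # run the task there in the background (assumes the scripts sit on a shared filesystem)
    ssh "$target" "bash $task" &
    i=$(( i + 1 ))
done
wait   # block until all dispatched tasks have finished

A real load balancer does the same bookkeeping continuously and per request, rather than once over a fixed batch of tasks.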
Slide 12
HPC (Beowulf)

• Started in 1994.

• Donald Becker of NASA assembled the world's first cluster from 16 DX4 PCs and 10 Mb/s Ethernet.

• Also called a Beowulf cluster.

• Built from commodity off-the-shelf hardware.

• Applications like data mining, simulations, parallel processing, weather modelling, computer graphical rendering, etc.

• The Beowulf cluster architecture remains a popular and cost-effective solution for high-performance computing, allowing researchers and organizations to tackle complex computational challenges efficiently.

Slide 13
Kinds of clusters - HPC

[Diagram: a head node connected to several worker nodes.]

Slide 14
Kinds of clusters - HPC

[Diagram: the head node sends data to each worker node.]

Slide 15
HPC clusters

[Diagram: all worker nodes are working on their parts of the job in parallel.]

Slide 16
HPC clusters

[Diagram: each worker node finishes, and the head node gets the results.]

Slide 17
High availability

• Avoid downtime of services

• Avoid single point of failure

• Always with redundancy

• Almost all load-balancing clusters also have HA capability

Slide 18
Menti 1: 1581 5048

Slide 19
HPC user environment

• Operating system: Linux (RedHat/CentOS, Ubuntu, etc.), Unix.


• Access to HPC cluster: ssh
• File transfer: secure ftp (scp)
• Job scheduler: Slurm, PBS, SGE, Loadleveler
• Software management: module
• Compilers: Intel, GNU, PGI
• MPI implementations: OpenMPI, MPICH, MVAPICH, Intel MPI
• Debugging and profiling tools: Totalview, Tau, DDT, Vtune
• Programming Languages: C, C++, Fortran, Python, Perl, R, MATLAB,
Julia
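
To give a feel for how a few of these pieces fit together in a typical session, here is a hedged sketch; the module names and the job script are placeholders, and the exact commands depend on the site's setup:

module avail                # list software provided through the module system
module load gcc openmpi     # load a compiler and an MPI implementation
sbatch my_job.sh            # submit a batch job to the Slurm scheduler
squeue -u $USER             # check the status of your own jobs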

Slide 20
3
ACCESS TO FRONT-END
Slide 21
Secure Connection with SSH
SSH is a cryptographic network protocol for secure remote access and
data exchange.
It's widely used for connecting to HPC clusters, remote servers, and
cloud instances.
SSH is a fundamental tool for maintaining the privacy and integrity of
your interactions with remote machines.
• Encryption: SSH encrypts data in transit, preventing eavesdropping.
• Authentication: It ensures a secure login process using keys or
passwords.
• Secure File Transfer: SSH includes
SCP and SFTP for secure data transfer.

Slide 22
SSH clients

Linux/MacOS
• OpenSSH - most Linux distributions come with OpenSSH pre-installed.

Windows
• PuTTY - a popular open-source SSH client for Windows. It's a lightweight and easy-to-use tool for remote access.
• Cygwin - a large collection of GNU and Open Source tools that provide functionality similar to a Linux distribution.
• PowerShell - a powerful and versatile command-line shell and scripting language developed by Microsoft. PowerShell can also be used as an SSH client on Windows to connect to HPC clusters.
• FileZilla - a graphical SFTP/FTP client, used mainly for file transfer.
Slide 23
Connect to an SSH server
To establish an SSH connection, we can use the ssh command
followed by the remote server's hostname or IP address and your
username:

• ssh username@remote-server

You may be prompted to enter your password or use SSH key-based authentication, depending on the server's configuration.
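
For illustration, a few common variations of the same command; the hostname hpc.example.org, the username jdoe, and the port number are placeholders:

ssh -p 2222 jdoe@hpc.example.org    # connect to a server listening on a non-default port
ssh -X jdoe@hpc.example.org         # enable X11 forwarding for graphical applications
ssh jdoe@hpc.example.org hostname   # run a single command remotely without an interactive shell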

Slide 24
SSH key-based authentication
Key pair generation
• You generate a pair of cryptographic keys - a public key and a
private key.
• The private key should be kept secure on your local machine.
• The public key is placed on the remote server or HPC cluster.

Authentication process
• When you attempt to connect to the remote server, your local SSH
client uses your private key to create a digital signature.
• The server checks if the digital signature matches the public key
stored on the server.
• If they match, you are granted access without the need for a
password.

Slide 25
SSH key-based authentication steps
Generate SSH key pair
• ssh-keygen -t rsa -b 2048 -f ~/.ssh/id_rsa

Copy public key to remote server
• ssh-copy-id user@remote-server

ssh-copy-id appends your public key (usually ~/.ssh/id_rsa.pub) to the
~/.ssh/authorized_keys file on the HPC cluster.

Secure your private key


• chmod 600 ~/.ssh/id_rsa

Test connection
• ssh user@remote-server
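
As an optional convenience, a minimal sketch of a ~/.ssh/config entry; the alias hpc, the hostname, and the username are placeholders:

Host hpc
    HostName hpc.example.org
    User jdoe
    IdentityFile ~/.ssh/id_rsa

With this in place, ssh hpc is enough to connect.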
Slide 26
Best practices for ssh key passwords

Creating a strong passphrase


• Craft a memorable passphrase that's both secure and easy to
remember.
• "Consider using a passphrase with 32 characters or more,
incorporating punctuation marks and number-for-letter
substitutions.

Password managers for convenience


• Utilize a password manager like KeePass or BitWarden with built-in
password generators.

Slide 27
ACCESS
• Install Powershell / Cygwin

• Download pem certificate, https://shorturl.at/acsK8

• ssh -i Private.pem ubuntu@185.127.66.38

Slide 28
4
UNIX COMMANDS
AND HINTS
Slide 29
File system exploration

ls - list current directory contents

cd <directory-to-change-to> - change the current directory


• cd .. - change to “one level higher” in directory tree
• cd (without argument) - change to $HOME
• cd /shared/home

pwd - print full path of the current directory
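
A short example session tying these commands together; the directory names are placeholders:

pwd              # e.g. /shared/home/your_name
ls               # list what is here
cd projects      # step into a sub-directory (assumed to exist)
cd ..            # go back up one level
cd               # jump straight to $HOME from anywhere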

Slide 30
File manipulation
• mkdir <new-directory-name> - create a directory
mkdir your_name

• touch <new-file-name> - create an empty new file

• cp <file-to-copy> <destination> - copy a file

• mv <file-to-move> <destination/new-file-name> - move or


rename a file

• rm <file-to-remove> - delete a file
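
For example, a small sequence using all of these commands on placeholder names:

mkdir results                            # create a directory
touch notes.txt                          # create an empty file
cp notes.txt results/                    # copy it into the directory
mv results/notes.txt results/day1.txt    # rename the copy
rm notes.txt                             # delete the original file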


Slide 31
Permissions

chmod <who><what><which> <file-name> - change file permissions
• who -> u: user , g: group , o: others, a: all
• what -> -:remove permission, +: add permission
• which -> r: read, w: write, x: execute
• example chmod u+x my-batch-job-script.sh adds execution
rights for current user to the file

chgrp - change the group of a file/folder (chown changes the owner)
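
A few illustrative invocations; the file name and group name are placeholders:

chmod u+x my-batch-job-script.sh   # the example above: add execute permission for the user
chmod g+r results.dat              # let members of the group read the file
chmod o-rwx results.dat            # remove all permissions for others
chgrp hpc-users results.dat        # change the file's group to hpc-users (assumed to exist)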


Slide 32
Check file contents

• less <text-file> - see text file (exit with q)

• cat <file-name> - see file content

• head <file-name> - list the first ten lines of the file

• tail -100 <file-name> - show the last 100 lines
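
For example, to inspect the output file of a running job (the file name is a placeholder):

head job-output.log        # show the first ten lines
tail -100 job-output.log   # show the last 100 lines
tail -f job-output.log     # keep following the file as new lines are appended (Ctrl-C to stop)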

Slide 33
Editors Vi

vi <file-name> - (create and) open file with vi

• press i to switch to “edit mode”

• edit your file

• when done, press esc to switch to “normal mode”

• type :wq and press Enter to save (write) the file and exit (quit) the editor

Slide 34
5
FILE TRANSFER TO FRONT-END
Slide 35
FTP
The TCP/IP protocol suite was developed in the late 1970s and early 1980s. The first FTP standard was RFC 114, published in April 1971, before TCP and IP even existed. In 1980 the first standard defining FTP operation over modern TCP/IP was published, at around the same time as the other primary defining standards for TCP/IP.

• FTP was created with the overall goal of allowing indirect use of computers on a network, by making it easy for users to move files from one place to another.
• Like most TCP/IP protocols, FTP is based on a client/server model, with an FTP client on a user machine creating a connection to an FTP server to send and retrieve files to and from the server.
• The main objectives of FTP were to make file transfer simple, and to shield the user from implementation details of how the files are actually moved from one place to another.
Slide 36
FTP model

• FTP server implementations enable simultaneous access by multiple clients.

• Clients use the reliable TCP protocol to connect to a server.

• The FTP server process awaits connections and creates a slave process to handle each connection.

• The slave process accepts and handles a control connection from the client.

Slide 37
FTP: port number and data

• The client uses a random (ephemeral) port number on its side of the initial connection to a server.

• The client contacts the server at a well-known port number (port 21) for the control connection.

• For data transfer, the client listens on another port and sends that port number across the control connection to the server.

• The client waits for the server to form a TCP connection to that specified port. The server uses port 20 for the FTP data transfer.

Slide 38
FTP: connect

ftp <username>@<hostname>

• FTP client

• Web browser

Slide 39
FTP: commands
• CWD - change working directory.

• LIST - list remote files

• MKD - make a remote directory

• PWD - print working directory

• QUIT - terminate the connection

• SIZE - return the size of a file

• USER - send username
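
For orientation, a sketch of how these protocol commands surface in an interactive ftp client session; the hostname, directory, and file names are placeholders:

ftp user@ftp.example.org    # USER (and password) are sent during login
ftp> pwd                    # PWD - print working directory
ftp> ls                     # LIST - list remote files
ftp> cd data                # CWD - change working directory
ftp> mkdir backup           # MKD - make a remote directory
ftp> quit                   # QUIT - terminate the connection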


Slide 40
Copy files to front-end
SCP and SFTP both run over ssh and are thus encrypted.

Linux
• Copy files from your computer to the cluster:
• scp local_filename username@remote_server:
• Copy files from the cluster to your computer:
• scp username@remote_server:/home/username/remote_filename .
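
Two further variants that are often useful; the paths are placeholders:
• scp -r results/ username@remote_server:   (copy a whole directory recursively)
• sftp username@remote_server   (interactive session; transfer files with put and get)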

Windows
• FileZilla

Slide 41
TRANSFER YOUR FILES

Copy the file "file.txt" from the local host to a remote host
• scp -i Private.pem file.txt ubuntu@185.127.66.38:/shared/home/your_dir

Copy the file "file.txt" from a remote host to the local host
• scp -i Private.pem ubuntu@185.127.66.38:/home/ubuntu/file.txt .

Slide 42
