Randomized Algorithms Explained

1. A randomized algorithm uses random numbers to guide its behavior and decisions, making the output unpredictable for a given input.
2. Randomized algorithms are useful for reducing the running time or resource usage of standard algorithms on average. They provide good performance in the average case rather than guaranteeing the best solution.
3. There are two main types: Las Vegas algorithms always produce the correct solution but their running time varies, while Monte Carlo algorithms have a probability of being incorrect or of not finishing within a set time.

Unit-5

1. A deterministic algorithm is an algorithm that, for a given input, always produces the same output, with the underlying machine always passing through the same sequence of states, as shown in figure 1.

1. To overcome the problem of a deterministic algorithm's running time blowing up on particular inputs, a randomized algorithm is used.
2. A randomized algorithm uses uniformly random bits, also called pseudo-random numbers, as an auxiliary input to guide its behavior (output).
3. Randomized algorithms rely on the statistical properties of random numbers. One example of a randomized algorithm is quicksort.
4. It tries to achieve good performance in the average case.
5. A randomized algorithm is also called a probabilistic algorithm.
6. The flow of a randomized algorithm is shown in figure 2.

Example:
The problem of finding 'a' in an array of n elements:
In a given array of n elements, let half the elements be 'a' and the other half 'b'. One approach is to look at each element of the array in order; this is expensive and can take n/2 operations if the array happens to be ordered with all the 'b's first, followed by the 'a's. With this approach we cannot guarantee that the algorithm will complete quickly.
Using a randomized algorithm, if we probe positions at random, we find an 'a' quickly with high probability.
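The random-probing idea above can be sketched in Python (a minimal illustration; the helper name `find_a` and the sample array are assumptions, not from the original):

```python
import random

def find_a(arr):
    # Keep probing random positions until an 'a' is found.
    # With half the elements equal to 'a', each probe succeeds
    # with probability 1/2, so the expected number of probes is 2.
    while True:
        i = random.randrange(len(arr))
        if arr[i] == 'a':
            return i

arr = ['b'] * 8 + ['a'] * 8   # 'b's first: worst case for a left-to-right scan
idx = find_a(arr)
print(idx, arr[idx])
```

The deterministic scan needs n/2 probes on this input, while the randomized probe needs 2 in expectation, independent of the input order.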
Advantage of Randomized Algorithm:

1. The algorithm is usually simple and easy to implement.


2. The algorithm is fast with very high probability, and/or it produces optimum output with very high probability.

Uses:
Randomized algorithms are particularly useful when faced with a malicious attacker or “adversary”.

What is a Randomized Algorithm?


An algorithm that uses random numbers to decide what to do next anywhere in its logic is called a randomized algorithm. For example, in Randomized Quick Sort, we use a random number to pick the next pivot (or we randomly shuffle the array). And in Karger's algorithm, we randomly pick an edge.
How to analyse Randomized Algorithms?
Some randomized algorithms have deterministic time complexity. For example, this implementation of Karger's algorithm has time complexity O(E). Such algorithms are called Monte Carlo algorithms and are easier to analyse for the worst case.
On the other hand, the time complexity of the other kind of randomized algorithm depends on the value of a random variable. Such randomized algorithms are called Las Vegas algorithms. These algorithms are typically analysed for the expected worst case. To compute the expected time taken in the worst case, all possible values of the random variable used must be considered for a worst-case input, and the time taken for each value evaluated. The probability-weighted average of these times is the expected worst-case time complexity.

For example consider below a randomized version of QuickSort.

A central pivot is a pivot that divides the array in such a way that each side has at least 1/4 of the elements.
// Sorts an array arr[low..high]
randQuickSort(arr[], low, high)

1. If low >= high, then EXIT.

2. While pivot 'x' is not a central pivot:

   (i)   Choose uniformly at random an index from [low..high].
         Let the randomly picked index be x.
   (ii)  Count elements in arr[low..high] that are smaller
         than arr[x]. Let this count be sc.
   (iii) Count elements in arr[low..high] that are greater
         than arr[x]. Let this count be gc.
   (iv)  Let n = (high-low+1). If sc >= n/4 and
         gc >= n/4, then x is a central pivot.

3. Partition arr[low..high] around the pivot arr[x].

4. // Recur for smaller elements
   randQuickSort(arr, low, low+sc-1)

5. // Recur for greater elements
   randQuickSort(arr, high-gc+1, high)
The important thing in our analysis is that the time taken by step 2 is O(n).
How many times does the while loop run before finding a central pivot?
The probability that a randomly chosen element is a central pivot is 1/2.
Therefore, the expected number of times the while loop runs is 2.
Thus, the expected time complexity of step 2 is O(n).
What is the overall time complexity in the worst case?
In the worst case, each partition divides the array so that one side has n/4 elements and the other side has 3n/4 elements. The worst-case height of the recursion tree is log_{4/3} n, which is O(Log n).

T(n) < T(n/4) + T(3n/4) + O(n)


T(n) < 2T(3n/4) + O(n)

The solution of the above recurrence is O(n Log n).


Note that the above randomized algorithm is not the best way to implement randomized Quick Sort; the idea here is to keep the analysis simple.
Typically, randomized Quick Sort is implemented by randomly picking a pivot (with no loop), or by shuffling the array elements. The expected worst-case time complexity of that algorithm is also O(n Log n), but the analysis is more complex.
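The typical random-pivot version can be sketched as follows (not in-place, for brevity; the function name `rand_quicksort` is mine):

```python
import random

def rand_quicksort(arr):
    # Pick the pivot uniformly at random -- no central-pivot loop.
    if len(arr) <= 1:
        return arr
    pivot = arr[random.randrange(len(arr))]
    smaller = [x for x in arr if x < pivot]
    equal   = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    # Expected O(n log n) time regardless of the input order.
    return rand_quicksort(smaller) + equal + rand_quicksort(greater)

print(rand_quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # → [1, 1, 2, 3, 4, 5, 6, 9]
```

Because the pivot choice is random, no fixed input (such as an already-sorted array) can force the quadratic worst case every time.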

………………………………………………………………………………………………………………………………………
A randomized algorithm is a technique that uses a source of randomness as part of its logic. It is typically used to reduce either the running time (time complexity) or the memory used (space complexity) of a standard algorithm. The algorithm works by generating a random number, r, within a specified range of numbers, and making decisions based on r's value.

A randomized algorithm could help in a situation of doubt by flipping a coin or drawing a card from a deck in order to make a decision. Similarly, this kind of algorithm could help speed up a brute-force process by randomly sampling the input in order to obtain a solution that may not be totally optimal, but will be good enough for the specified purposes.

Example

A superintendent is attempting to score a high school based on several metrics, and she wants to do so from information gathered by confidentially interviewing students. However, the superintendent has to do this for all the schools in the district, so interviewing every single student would take time she cannot afford. What should she do?

The superintendent should employ a randomized algorithm, where, without knowing any of the kids, she’d select a few at random and interview them,
hoping that she gets a wide variety of students. This technique is more commonly known as random sampling, which is a kind of randomized algorithm.
Of course, she knows that there are diminishing returns from each additional interview, and should stop when the quantity of data collected measures
what she was trying to measure to an acceptable degree of accuracy. The way that the superintendent is determining the score of the school can be
thought of as a randomized algorithm.

Definitions of a Randomized Algorithm


Randomized algorithms are used when presented with a time or memory constraint, and an average-case solution is an acceptable output. Because the algorithm's output can be erroneous, a technique known as amplification is used to boost the probability of correctness, at the cost of runtime. Amplification works by repeating the randomized algorithm several times with different random subsamples of the input and comparing their results. It is common for randomized algorithms to amplify just parts of the process, as too much amplification may increase the running time beyond the given constraints.

Randomized algorithms are usually designed in one of two common forms: as a Las Vegas or as a Monte Carlo algorithm. A Las Vegas algorithm runs
within a specified amount of time. If it finds a solution within that timeframe, the solution will be exactly correct; however, it is possible that it runs out of
time and does not find any solutions. On the other hand, a Monte Carlo algorithm is a probabilistic algorithm which, depending on the input, has a slight
probability of producing an incorrect result or failing to produce a result altogether.

Practically speaking, computers cannot generate completely random numbers, so randomized algorithms in computer science are approximated using a pseudorandom number generator in place of a true source of random numbers, such as the drawing of a card.

Monte Carlo Algorithms


Intuitive Explanation

The game Wheel of Fortune can be played using a Monte Carlo randomized algorithm. Instead of mindfully choosing letters, a player (or computer) randomly picks letters to obtain a solution, as shown in the image below. The more letters a player reveals, the more confident the player becomes in their solution. However, if a player does not guess quickly, the chance that other players will guess the solution also increases. Therefore, a Monte Carlo algorithm is given a deterministic amount of time, in which it must come up with a "guess" based on the information revealed; the best solution it can come up with. This allows for the possibility of being wrong, maybe even a large probability of being wrong if the Monte Carlo algorithm did not have sufficient time to reveal enough useful letters. But providing it with a time limit controls the amount of time the algorithm will take, thereby decreasing the risk of another player guessing and getting the prize. It is important to note, however, that this game differs from a standard Monte Carlo algorithm in that the game has one correct solution, whereas for a Monte Carlo algorithm a 'good enough' solution is an acceptable output.

Las Vegas Algorithms


Intuition

A Las Vegas algorithm is a randomized algorithm that always produces a correct result, or simply reports that it did not find one, yet it cannot guarantee a time constraint; the time complexity varies with the input. It does, however, guarantee an upper bound in the worst-case scenario.

Las Vegas algorithms occur almost every time people look something up. Think of a Las Vegas algorithm as a task where, once the solution is found, you have full confidence that it is the correct solution, yet the path to get there can be murky. As an extremely generalized and simplified example, a Las Vegas algorithm could be thought of as the strategy used by a user who searches for something online. Since searching every single website online is extremely inefficient, the user will generally use a search engine to get started. The user will then surf the web until a website is found which contains exactly what the user is looking for. Since clicking through links is a decently randomized process, assuming the user does not know exactly what's contained on the website at the other end of the link, the time complexity ranges from getting lucky and reaching the target website on the first link, to being unlucky and spending countless hours to no avail. What makes this a Las Vegas algorithm is that the user knows exactly what she is looking for, so once the website is found, there is no probability of error. Similarly, if the user's allotted time to surf the web is exceeded, she will terminate the process knowing that the solution was not found.

The term for this algorithm, Las Vegas, is attributed to mathematician Laszlo Babai, who coined it in 1979 simply as a parallel to the much older Monte
Carlo algorithm, as both are major world gambling centers. However, the gambling styles of the two have nothing to do with the styles of the
algorithms, as it cannot be said that gambling in Las Vegas always gives a correct, or even positive, turnout. [3]

Randomized Quicksort

A common Las Vegas randomized algorithm is quicksort, a sorting algorithm that sorts elements in place, using no extra memory. Since this is a comparison-based algorithm, the worst-case scenario occurs when every pairwise comparison must be performed, taking O(n^2) time, where the time taken grows as the square of the number of elements to be sorted. However, through randomization, the expected runtime of this algorithm can be reduced to O(n log(n)).

Quicksort applies a divide-and-conquer paradigm in order to sort an array of numbers, A. It works in three steps: it first picks a pivot element, A[q], using a random number generator (hence a randomized algorithm); it then rearranges the array into two subarrays A[p…q−1] and A[q+1…r], where the elements in the first and second subarrays are smaller and greater than A[q], respectively; finally, it recursively sorts the two subarrays.
Figure: Array A sorted and partitioned around pivot element q.

…………………………………………………………………………………………………………………………………………………………………………………………………………..

Classification
Randomized algorithms are classified into two categories.
Las Vegas: These algorithms always produce a correct or optimum result. Their time complexity is based on a random value and is evaluated as an expected value. For example, Randomized QuickSort always sorts an input array, and its expected worst-case time complexity is O(n Log n).
Monte Carlo: These produce a correct or optimum result with some probability. They have deterministic running time, and it is generally easier to find their worst-case time complexity. For example, this implementation of Karger's algorithm produces a minimum cut with probability greater than or equal to 1/n^2 (n is the number of vertices) and has worst-case time complexity O(E). Another example is the Fermat method for primality testing.
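The Fermat primality test just mentioned is a classic Monte Carlo algorithm: the running time is fixed by the number of trials k, but a composite can (rarely) be reported as prime. A minimal sketch (the function name and default k are mine):

```python
import random

def fermat_test(n, k=20):
    # Monte Carlo: k trials, deterministic O(k) modular exponentiations,
    # but a composite can slip through with small probability.
    if n < 4:
        return n in (2, 3)
    for _ in range(k):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:   # Fermat's little theorem violated
            return False            # definitely composite
    return True                     # probably prime
```

A `False` answer is always correct; a `True` answer is correct with high probability, which is exactly the Monte Carlo trade-off described above.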
Example to Understand the Classification:
Consider a binary array where exactly half the elements are 0 and half are 1. The task is to find the index of any 1.
A Las Vegas algorithm for this task is to keep picking a random element until we find a 1. A Monte Carlo algorithm for the same task is to keep picking a random element until we either find a 1 or we have tried the maximum allowed number of times, say k.
The Las Vegas algorithm always finds an index of a 1, but its time complexity is determined as an expected value. The expected number of trials before success is 2, therefore the expected time complexity is O(1).
The Monte Carlo algorithm finds a 1 with probability 1 − (1/2)^k. Its time complexity is O(k), which is deterministic.
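Both variants for the binary-array task can be sketched directly (the function names are mine):

```python
import random

def las_vegas_find_one(arr):
    # Always returns a correct index; the running time is a random
    # variable with expected value 2 trials when half the elements are 1.
    while True:
        i = random.randrange(len(arr))
        if arr[i] == 1:
            return i

def monte_carlo_find_one(arr, k):
    # At most k trials (deterministic O(k) time); fails and returns
    # None with probability (1/2)^k.
    for _ in range(k):
        i = random.randrange(len(arr))
        if arr[i] == 1:
            return i
    return None
```

The Las Vegas version trades a bounded running time for guaranteed correctness; the Monte Carlo version trades guaranteed correctness for a bounded running time.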
Applications and Scope:
 Consider a tool that basically does sorting. Let the tool be used by many users, and suppose a few users always use the tool on already-sorted arrays. If the tool uses simple (non-randomized) QuickSort, then those few users will always face the worst-case situation. On the other hand, if the tool uses Randomized QuickSort, then no user always gets the worst case; everybody gets expected O(n Log n) time.
 Randomized algorithms have huge applications in Cryptography.
 Load Balancing.
 Number-Theoretic Applications: Primality Testing
 Data Structures: Hashing, Sorting, Searching, Order Statistics and Computational Geometry.
 Algebraic identities: Polynomial and matrix identity verification. Interactive proof systems.
 Mathematical programming: Faster algorithms for linear programming, Rounding linear program solutions to integer program
solutions
 Graph algorithms: Minimum spanning trees, shortest paths, minimum cuts.
 Counting and enumeration: Matrix permanent, counting combinatorial structures.
 Parallel and distributed computing: Deadlock avoidance, distributed consensus.
 Probabilistic existence proofs: Show that a combinatorial object arises with non-zero probability among objects drawn from a suitable probability space.
 Derandomization: First devise a randomized algorithm, then argue that it can be derandomized to yield a deterministic algorithm.
Need for RA
Above, finding a repeated value in an array is one example; sorting is another, where we use a random number to pick the pivot value.
Las Vegas: the expected running time is finite, and the result is always correct.
Monte Carlo: the result may be incorrect, but the running time is fixed.
If the loop runs x times before success, x can be as large as n.

String matching algo.

String Matching Introduction


A string matching algorithm is also called a "string searching algorithm." This is a vital class of string algorithms; the task is "to find a place where one or several strings (patterns) are found within a larger string."

Given a text array T[1..n] of n characters and a pattern array P[1..m] of m characters, the problem is to find an integer s, called a valid shift, where 0 ≤ s ≤ n−m and T[s+1..s+m] = P[1..m]. In other words, we want to find whether P occurs in T, i.e., whether P is a substring of T. The items of P and T are characters drawn from some finite alphabet such as {0, 1} or {A, B, ..., Z, a, b, ..., z}. Given a string T[1..n], the substrings are represented as T[i..j] for some 1 ≤ i ≤ j ≤ n: the string formed by the characters in T from index i to index j, inclusive. Under this definition, a string is a substring of itself (take i = 1 and j = n).

A proper substring of string T[1..n] is T[i..j] for some 1 ≤ i ≤ j ≤ n with (i, j) ≠ (1, n). That is, we must have either i > 1 or j < n.

Using these definitions, we can say that, given any string T[1..n], the substrings are

1. T[i..j] = T[i] T[i+1] T[i+2] ... T[j] for some 1 ≤ i ≤ j ≤ n.

And the proper substrings are

1. T[i..j] = T[i] T[i+1] T[i+2] ... T[j] for some 1 ≤ i ≤ j ≤ n with (i, j) ≠ (1, n).

Note: If i > j, then T[i..j] is the empty string (null), which has length zero.

Algorithms used for String Matching:


There are different methods used for finding a string:
1. The Naive String Matching Algorithm
2. The Rabin-Karp Algorithm
3. Finite Automata
4. The Knuth-Morris-Pratt Algorithm
5. The Boyer-Moore Algorithm

The Naive String Matching Algorithm


The naïve approach tests all the possible placements of pattern P[1..m] relative to text T[1..n]. We try shifts s = 0, 1, ..., n−m successively, and for each shift s we compare T[s+1..s+m] to P[1..m].

The naïve algorithm finds all valid shifts using a loop that checks the condition P[1..m] = T[s+1..s+m] for each of the n − m + 1 possible values of s.

NAIVE-STRING-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. for s ← 0 to n -m
4. do if P [1.....m] = T [s + 1....s + m]
5. then print "Pattern occurs with shift" s
Analysis: The for loop in lines 3 to 5 executes n − m + 1 times (we need at least m characters at the end), and in each iteration we do up to m comparisons. So the total complexity is O((n − m + 1)m).

Example:
1. Suppose T = 1011101110  
2.         P = 111  
3.        Find all the Valid Shift  

Solution:
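For the example above, a direct Python rendering of NAIVE-STRING-MATCHER finds the valid shifts (0-indexed; the function name is mine):

```python
def naive_string_matcher(T, P):
    # Try every shift s and compare the m-character window to P.
    n, m = len(T), len(P)
    return [s for s in range(n - m + 1) if T[s:s + m] == P]

print(naive_string_matcher("1011101110", "111"))  # → [2, 6]
```

So for T = 1011101110 and P = 111, the valid shifts are s = 2 and s = 6.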
The Rabin-Karp-Algorithm
The Rabin-Karp string matching algorithm calculates a hash value for the pattern, as well as for each m-character substring of the text to be compared. If the hash values are unequal, the algorithm computes the hash value for the next m-character substring. If the hash values are equal, the algorithm compares the pattern with the m-character substring. In this way, there is only one hash comparison per text substring, and character matching is only required when the hash values match.

RABIN-KARP-MATCHER (T, P, d, q)
1. n ← length [T]
2. m ← length [P]
3. h ← d^(m-1) mod q
4. p ← 0
5. t0 ← 0
6. for i ← 1 to m
7.     do p ← (d·p + P[i]) mod q
8.        t0 ← (d·t0 + T[i]) mod q
9. for s ← 0 to n-m
10.    do if p = ts
11.       then if P[1..m] = T[s+1..s+m]
12.            then print "Pattern occurs with shift" s
13.       if s < n-m
14.       then ts+1 ← (d·(ts − T[s+1]·h) + T[s+m+1]) mod q

Example: For string matching with working modulo q = 11, how many spurious hits does the Rabin-Karp matcher encounter in the text T = 31415926535?

1. T = 31415926535
2. P = 26
3. Here T.length = 11, and we take q = 11 and d = 10
4. p = P mod q = 26 mod 11 = 4
5. Now find every window whose hash equals p, and check which of those are exact matches of P.

Solution:
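A sketch of the matcher in Python, instrumented to count spurious hits for this example (returning matches together with a spurious-hit count is my addition; d = 10 because the text consists of decimal digits):

```python
def rabin_karp(T, P, d=10, q=11):
    n, m = len(T), len(P)
    h = pow(d, m - 1, q)              # d^(m-1) mod q
    p = t = 0
    for i in range(m):                # preprocessing: hash P and first window
        p = (d * p + int(P[i])) % q
        t = (d * t + int(T[i])) % q
    matches, spurious = [], 0
    for s in range(n - m + 1):
        if p == t:                    # hash hit: verify character by character
            if T[s:s + m] == P:
                matches.append(s)
            else:
                spurious += 1         # equal hashes but different strings
        if s < n - m:                 # roll the hash to the next window
            t = (d * (t - int(T[s]) * h) + int(T[s + m])) % q
    return matches, spurious

print(rabin_karp("31415926535", "26"))  # → ([6], 3)
```

The windows 15, 59 and 92 all hash to 4 mod 11, the same as P = 26, so there are 3 spurious hits, and the single true match occurs at shift s = 6.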

Complexity:
The running time of RABIN-KARP-MATCHER in the worst case is O((n − m + 1)m), but it has a good average-case running time. If the expected number of valid shifts is small, O(1), and the prime q is chosen to be quite large, then the Rabin-Karp algorithm can be expected to run in time O(n + m), plus the time required to process spurious hits.

String Matching with Finite Automata


The string-matching automaton is a very useful tool which is used in string matching algorithm. It examines every character in the text exactly once and
reports all the valid shifts in O (n) time. The goal of string matching is to find the location of specific text pattern within the larger body of text (a
sentence, a paragraph, a book, etc.)

Finite Automata:
A finite automaton M is a 5-tuple (Q, q0, A, ∑, δ), where

o Q is a finite set of states,


o q0 ∈ Q is the start state,
o A ⊆ Q is a distinguished set of accepting states,
o ∑ is a finite input alphabet,
o δ is a function from Q x ∑ into Q called the transition function of M.

The finite automaton starts in state q0 and reads the characters of its input string one at a time. If the automaton is in state q and reads input character
a, it moves from state q to state δ (q, a). Whenever its current state q is a member of A, the machine M has accepted the string read so far. An input
that is not allowed is rejected.

A finite automaton M induces a function ∅, called the final-state function, from ∑* to Q such that ∅(w) is the state M ends up in after scanning the string w. Thus, M accepts a string w if and only if ∅(w) ∈ A.

The function ∅ is defined recursively as

∅(ε) = q0
∅(wa) = δ(∅(w), a) for w ∈ ∑*, a ∈ ∑
FINITE-AUTOMATON-MATCHER (T, δ, m)
1. n ← length [T]
2. q ← 0
3. for i ← 1 to n
4.     do q ← δ(q, T[i])
5.        if q = m
6.        then s ← i - m
7.             print "Pattern occurs with shift" s

The primary loop structure of FINITE- AUTOMATON-MATCHER implies that its running time on a text string of length n is O (n).

Computing the Transition Function: The following procedure computes the transition function δ from given pattern P [1......m]

COMPUTE-TRANSITION-FUNCTION (P, ∑)
1. m ← length [P]
2. for q ← 0 to m
3.     do for each character a ∈ ∑
4.        do k ← min (m + 1, q + 2)
5.           repeat k ← k - 1
6.           until P[1..k] is a suffix of P[1..q]a
7.           δ(q, a) ← k
8. return δ
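Both procedures can be sketched in Python (the suffix test in the repeat-until loop is expressed with `str.endswith`; the function names are mine):

```python
def compute_transition_function(P, alphabet):
    # delta[q][a] = length of the longest prefix of P that is
    # a suffix of P[:q] + a
    m = len(P)
    delta = [{} for _ in range(m + 1)]
    for q in range(m + 1):
        for a in alphabet:
            k = min(m, q + 1)
            while k > 0 and not (P[:q] + a).endswith(P[:k]):
                k -= 1
            delta[q][a] = k
    return delta

def finite_automaton_matcher(T, delta, m):
    # One state transition per text character: O(n) matching time.
    q, shifts = 0, []
    for i, c in enumerate(T):
        q = delta[q].get(c, 0)
        if q == m:
            shifts.append(i - m + 1)
    return shifts

delta = compute_transition_function("aba", "ab")
print(finite_automaton_matcher("abababa", delta, 3))  # → [0, 2, 4]
```

All the pattern-dependent work happens in the preprocessing step; matching then reads each text character exactly once.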

Example: Suppose a finite automaton which accepts an even number of a's, where ∑ = {a, b, c}
Solution:

q0 is the initial state.


The Boyer-Moore Algorithm
Robert Boyer and J Strother Moore published it in 1977. The B-M string search algorithm is a particularly efficient algorithm and has served as a standard benchmark for string search algorithms ever since.

The B-M algorithm takes a 'backward' approach: the pattern string (P) is aligned with the start of the text string (T), and then the characters of the pattern are compared from right to left, beginning with the rightmost character.

If a mismatched text character does not occur in the pattern at all, no match can be found by examining any further characters at this position, so the pattern can be shifted entirely past the mismatching character.

For deciding the possible shifts, the B-M algorithm uses two preprocessing strategies simultaneously. Whenever a mismatch occurs, the algorithm calculates a shift using both approaches and selects the larger of the two, thus making use of the most effective strategy for each case.

The two strategies are called heuristics of B - M as they are used to reduce the search. They are:

1. Bad Character Heuristics


2. Good Suffix Heuristics

1. Bad Character Heuristics


This heuristic has two implications:

o Suppose there is a character in the text which does not occur in the pattern at all. When a mismatch happens at this character (called a bad character), the whole pattern can be shifted past it, and matching begins from the substring next to this 'bad character.'
o On the other hand, the bad character may be present in the pattern; in this case, align an occurrence of the bad character in the pattern with the bad character in the text.

Thus, in either case, the shift may be larger than one.

Example1: Let Text T = <nyoo nyoo> and pattern P = <noyo>


Example 2: If the bad character does not exist in the pattern, the pattern can be shifted entirely past it.
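A sketch of B-M using only the bad character heuristic (the good suffix heuristic is omitted for brevity; the function name is mine):

```python
def boyer_moore_bad_char(T, P):
    n, m = len(T), len(P)
    last = {c: i for i, c in enumerate(P)}   # rightmost index of each pattern char
    shifts, s = [], 0
    while s <= n - m:
        j = m - 1
        while j >= 0 and P[j] == T[s + j]:   # compare right to left
            j -= 1
        if j < 0:
            shifts.append(s)                 # full match at shift s
            s += 1
        else:
            # Align the bad character with its rightmost occurrence in P;
            # if it does not occur in P at all, shift entirely past it.
            s += max(1, j - last.get(T[s + j], -1))
    return shifts

print(boyer_moore_bad_char("ABAAABCD", "ABC"))  # → [4]
```

When the mismatched text character is absent from the pattern, `last.get` returns −1 and the pattern jumps past it in one step, which is what makes B-M sublinear on typical inputs.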
Complexity Classes
Definition of NP class problem: The set of all decision-based problems that are not known to be solvable in polynomial time, but whose solutions can be verified in polynomial time. The NP class contains the P class as a subset. NP problems may be hard to solve.

Note: The term "NP" does not mean "not polynomial." It originally meant "non-deterministic polynomial": a non-deterministic machine may guess a candidate solution, which is then verified in polynomial time.

Definition of P class problem: The set of decision-based problems that can be solved (an output produced) within polynomial time. P problems are easy to solve.

Definition of polynomial time: An algorithm runs in polynomial time if it produces an output from the given input within a number of steps bounded by a polynomial in the input size.

Definition of non-polynomial time: If an output is produced from the given input but with no polynomial bound on the time, this is known as non-polynomial time: the output will be produced, but the time is not polynomially bounded.

Definition of decision-based problem: A problem is called a decision problem if its output is a simple "yes" or "no" (you may also think of this as true/false, 0/1, accept/reject). We will phrase many optimization problems as decision problems. For example: given a graph G = (V, E), does there exist a Hamiltonian cycle?

Definition of NP-hard class: A problem must satisfy the following to be NP-hard:

1. If we can solve this problem in polynomial time, then we can solve all NP problems in polynomial time.
2. Every problem in NP can be reduced (converted from one form to another) to it in polynomial time.
Definition of NP-complete class: - A problem is in NP-complete, if

1. It is in NP
2. It is NP-hard

Pictorial representation of all NP classes which includes NP, NP-hard, and NP-complete
Or

A problem is in the class NPC if it is in NP and is as hard as any problem in NP. A problem is NP-hard if all problems in NP are polynomial time reducible to it, even though it may not be in NP itself.

If a polynomial time algorithm exists for any of these problems, all problems in NP
would be polynomial time solvable. These problems are called NP-complete. The
phenomenon of NP-completeness is important for both theoretical and practical
reasons.

Definition of NP-Completeness

A language B is NP-complete if it satisfies two conditions


 B is in NP

 Every A in NP is polynomial time reducible to B.

If a language satisfies the second property, but not necessarily the first one, the language B is known as NP-Hard. Informally, a search problem B is NP-Hard if there exists some NP-Complete problem A that Turing reduces to B.
A problem in NP-Hard cannot be solved in polynomial time unless P = NP. If a problem is proved to be NPC, there is no need to waste time trying to find an efficient exact algorithm for it. Instead, we can focus on designing approximation algorithms.

NP-Complete Problems

Following are some NP-Complete problems, for which no polynomial time algorithm is
known.
 Determining whether a graph has a Hamiltonian cycle
 Determining whether a Boolean formula is satisfiable, etc.

NP-Hard Problems

The following problems are NP-Hard


 The circuit-satisfiability problem
 Set Cover
 Vertex Cover
 Travelling Salesman Problem

In this context, we will now show that TSP is NP-Complete.


TSP is NP-Complete

The traveling salesman problem consists of a salesman and a set of cities. The
salesman has to visit each one of the cities starting from a certain one and returning to
the same city. The challenge of the problem is that the traveling salesman wants to
minimize the total length of the trip.

Proof

To prove TSP is NP-Complete, first we have to prove that TSP belongs to NP. In TSP (as a decision problem), we are given a tour and a cost bound; we check that the tour contains each vertex exactly once, compute the total cost of the edges of the tour, and check whether the cost is at most the bound. This can be completed in polynomial time. Thus TSP belongs to NP.
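The verification step can be sketched as a polynomial-time checker for the decision version (the graph encoding as a cost matrix and the `bound` parameter are assumptions for illustration):

```python
def verify_tsp_tour(n, cost, tour, bound):
    # Polynomial-time certificate check: the tour must visit every
    # vertex exactly once, and its total cost must be at most `bound`.
    if sorted(tour) != list(range(n)):
        return False
    total = sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= bound

cost = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
print(verify_tsp_tour(3, cost, [0, 1, 2], 4))  # → True
```

Note that the verifier only checks a given certificate against a bound; it does not (and need not) find the optimal tour itself.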
Secondly, we have to prove that TSP is NP-hard. To prove this, one way is to show that Hamiltonian cycle ≤p TSP (as we know that the Hamiltonian cycle problem is NP-complete).
Assume G = (V, E) to be an instance of Hamiltonian cycle.
Hence, an instance of TSP is constructed. We create the complete graph G' = (V, E'), where

E' = {(i, j) : i, j ∈ V and i ≠ j}

Thus, the cost function is defined as follows:

t(i, j) = 0 if (i, j) ∈ E, and 1 otherwise.

Now, suppose that a Hamiltonian cycle h exists in G. It is clear that the cost of each edge in h is 0 in G', as each edge belongs to E. Therefore, h has a cost of 0 in G'. Thus, if graph G has a Hamiltonian cycle, then graph G' has a tour of cost 0.

Conversely, we assume that G' has a tour h' of cost at most 0. The costs of edges in E' are 0 and 1 by definition. Hence, each edge must have a cost of 0, as the cost of h' is 0. We therefore conclude that h' contains only edges in E.

We have thus proven that G has a Hamiltonian cycle if and only if G' has a tour of cost at most 0. TSP is NP-complete.

P & NP Classes

In Computer Science, many problems are solved where the objective is to maximize or
minimize some values, whereas in other problems we try to find whether there is a
solution or not. Hence, the problems can be categorized as follows −
Optimization Problem

Optimization problems are those for which the objective is to maximize or minimize
some values. For example,
 Finding the minimum number of colors needed to color a given graph.

 Finding the shortest path between two vertices in a graph.

Decision Problem

There are many problems for which the answer is a Yes or a No. These types of
problems are known as decision problems. For example,
 Whether a given graph can be colored by only 4-colors.

 Finding a Hamiltonian cycle in a graph is not a decision problem, whereas checking whether a graph is Hamiltonian or not is a decision problem.

What is Language?

Every decision problem can have only two answers, yes or no. Hence, a decision
problem may belong to a language if it provides an answer ‘yes’ for a specific input. A
language is the totality of inputs for which the answer is Yes. Most of the algorithms
discussed in the previous chapters are polynomial time algorithms.
For input size n, if the worst-case time complexity of an algorithm is O(n^k), where k is a constant, the algorithm is a polynomial time algorithm.


Algorithms such as Matrix Chain Multiplication, Single Source Shortest Path, All Pairs Shortest Path, Minimum Spanning Tree, etc. run in polynomial time. However, there are many problems, such as traveling salesperson, optimal graph coloring, Hamiltonian cycles, finding the longest path in a graph, and satisfying a Boolean formula, for which no polynomial time algorithm is known. These problems belong to an interesting class of problems, called the NP-Complete problems, whose status is unknown.
In this context, we can categorize the problems as follows −

P-Class

The class P consists of those problems that are solvable in polynomial time, i.e. these
problems can be solved in time O(n^k) in the worst case, where k is a constant.
These problems are called tractable, while others are called intractable or
superpolynomial.

Formally, an algorithm is a polynomial time algorithm if there exists a
polynomial p(n) such that the algorithm can solve any instance of size n in
time O(p(n)).

Problems requiring Ω(n^50) time are essentially intractable for large n. Most
known polynomial time algorithms run in time O(n^k) for a fairly low value of k.

An advantage of considering the class of polynomial-time algorithms is that all
reasonable deterministic single-processor models of computation can be simulated
on each other with at most a polynomial slowdown.

NP-Class

The class NP consists of those problems that are verifiable in polynomial time. NP is
the class of decision problems for which it is easy to check the correctness of a
claimed answer, with the aid of a little extra information. Hence, we aren’t asking for a
way to find a solution, but only to verify that an alleged solution really is correct.
Every problem in this class can be solved in exponential time using exhaustive search.
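For instance, although no polynomial time algorithm is known for finding a Hamiltonian cycle, a claimed cycle (the "little extra information", or certificate) can be checked quickly. A minimal sketch, assuming vertices are numbered 0..n-1 and the graph is given as an edge list (the function name and representation are illustrative):

```python
def verify_hamiltonian_cycle(n, edges, cycle):
    """Verify in polynomial time that `cycle` (a list of vertices)
    is a Hamiltonian cycle of the n-vertex graph given by `edges`."""
    edge_set = {frozenset(e) for e in edges}
    # The certificate must visit every vertex exactly once.
    if len(cycle) != n or set(cycle) != set(range(n)):
        return False
    # Consecutive vertices (wrapping around) must be adjacent.
    return all(frozenset((cycle[i], cycle[(i + 1) % n])) in edge_set
               for i in range(n))

# 4-cycle graph: 0-1-2-3-0.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_hamiltonian_cycle(4, square, [0, 1, 2, 3]))  # True
print(verify_hamiltonian_cycle(4, square, [0, 2, 1, 3]))  # False
```

The verifier runs in O(n + |E|) time, while the only known general way to *find* such a cycle is essentially exhaustive search over orderings.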

P versus NP

Every decision problem that is solvable by a deterministic polynomial time algorithm is
also solvable by a polynomial time non-deterministic algorithm.
All problems in P can be solved with polynomial time algorithms, whereas all problems
in NP - P are intractable.
It is not known whether P = NP. However, many problems are known in NP with the
property that if they belong to P, then it can be proved that P = NP.
If P ≠ NP, there are problems in NP that are neither in P nor in NP-Complete.
A problem belongs to class P if it is easy to find a solution for the problem. A
problem belongs to NP if it is easy to check a candidate solution that may have been very
tedious to find.
Approximate Algorithms

Introduction:
An approximation algorithm is a way of dealing with NP-completeness for optimization problems. This technique does not guarantee the best solution.
The goal of an approximation algorithm is to come as close as possible to the optimal value in a reasonable amount of time, which is at most
polynomial time. Such algorithms are called approximation algorithms or heuristic algorithms.

o For the traveling salesperson problem, the optimization problem is to find the shortest cycle, and the approximation problem is to find a short
cycle.
o For the vertex cover problem, the optimization problem is to find the vertex cover with fewest vertices, and the approximation problem is to
find the vertex cover with few vertices.

Performance Ratios
Suppose we work on an optimization problem where every solution carries a cost. An Approximate Algorithm returns a legal solution, but the cost of that
legal solution may not be optimal.

      For example, suppose we are looking for a minimum-size vertex cover (VC). An approximation algorithm returns a VC for us, but the size (cost)
may not be minimum.

Another example: suppose we are looking for a maximum-size independent set (IS). An approximation algorithm returns an IS for us, but the size (cost)
may not be maximum. Let C be the cost of the solution returned by an approximation algorithm, and let C* be the cost of the optimal solution.

We say the approximation algorithm has an approximation ratio P(n) for an input of size n if the cost C of the solution it returns is within a factor P(n) of the cost C* of an optimal solution, i.e. max(C/C*, C*/C) ≤ P(n).
Intuitively, the approximation ratio measures how bad the approximate solution is compared with the optimal solution. A large (small) approximation
ratio means the solution is much worse than (more or less the same as) an optimal solution.

      Observe that P(n) is always ≥ 1; if the ratio does not depend on n, we may simply write P. Therefore, a 1-approximation algorithm gives an optimal
solution. Some problems have polynomial-time approximation algorithms with small constant approximation ratios, while others have best-known
polynomial-time approximation algorithms whose approximation ratios grow with n.
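As a concrete illustration of a constant approximation ratio, the classic maximal-matching heuristic for vertex cover always returns a cover of size at most twice the optimum, i.e. it is a 2-approximation. A minimal sketch (the function name and example graph are illustrative):

```python
def approx_vertex_cover(edges):
    """Maximal-matching heuristic: repeatedly take a still-uncovered
    edge and add both of its endpoints to the cover. Each picked edge
    forces at least one of its endpoints into any optimal cover, so
    the result is at most twice the optimal size."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# Path graph 0-1-2-3: an optimal cover is {1, 2}, of size 2.
path = [(0, 1), (1, 2), (2, 3)]
cover = approx_vertex_cover(path)
print(len(cover))  # 4, within a factor of 2 of the optimal size 2
print(all(u in cover or v in cover for u, v in path))  # True
```

Here C = 4 and C* = 2, so max(C/C*, C*/C) = 2 ≤ P(n) = 2, matching the constant approximation ratio claimed for this heuristic.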
