[SUMEDH.V]
[R17CS416]
DAA Assignment-1
Group:-1
1) PUBLIC-KEY CRYPTOGRAPHY
Public-key cryptography is also known as asymmetric-key
cryptography, to distinguish it from the symmetric-key
cryptography we have studied thus far.
Encryption and decryption are carried out using two different keys.
The two keys in such a key pair are referred to as the public key
and the private key.
With public key cryptography, all parties interested in secure
communications publish their public keys. [How that is done
depends on the protocol. In the SSH protocol, each server makes
available through its port 22 the public key it has stored for your
login id on the server. (See Section 12.10 for how an SSHD server
acquires the public key that the server would associate with your
login ID so that you can make a password-free connection with the
server. In the context of the security made possible by the SSH
protocol, the public key held by a server is commonly referred to as
the server’s host key.) When a client, such as your laptop, wants to
make a connection with an SSHD server, it sends a connection
request to port 22 of the server machine and the server makes its
host key available automatically. On the other hand, in the
SSL/TLS protocol, an HTTPS web server makes its public key
available through a certificate.]
2) RSA Algorithm for Public-Key Cryptography
The RSA algorithm is named after Ron Rivest, Adi Shamir, and
Leonard Adleman. The public-key cryptography that was made
possible by this algorithm was foundational to the e-commerce
revolution that followed.
• The starting point for learning the RSA algorithm is Euler's
Theorem, which states that for every positive integer n and every a
that is coprime to n, the following must be true: a^φ(n) ≡ 1 (mod n),
where φ(n) is the totient of n.
• An immediate consequence of this theorem is that, when a and n
are relatively prime, the exponents behave modulo the totient
φ(n) in exponentiated forms like a^k mod n.
• That is, if a and n are relatively prime and we write k = k1·φ(n) + k2
for some k1 and k2, the following must be true:
a^k ≡ a^(k1·φ(n)+k2) ≡ (a^φ(n))^k1 · a^k2 ≡ a^k2 (mod n).
• For example, consider a = 4 in arithmetic modulo 15. The totient
of 15 is 8. (Since 15 = 3 × 5, we have φ(15) = 2 × 4 = 8.) You can
easily verify the following:
4^7 · 4^4 mod 15 = 4^((7+4) mod 8) mod 15 = 4^3 mod 15 = 64 mod 15 = 4
(4^3)^5 mod 15 = 4^((3×5) mod 8) mod 15 = 4^7 mod 15 = 4
Note that in both cases the base of the exponent, 4, is coprime to the modulus 15.
• The relationship shown above has some incredible ramifications
that point to practical possibilities: To see what I mean, say that M
is an integer that represents a message (note that any bit string in
the memory of a computer represents some integer, no matter how
large). Let’s now conjure up two integers e and d that are each
other’s multiplicative inverses modulo the totient φ(n). Assume
again that M is coprime to the modulus n. Since the exponents of
M are going to behave modulo the totient φ(n), the following must
be true.
M^(e×d) ≡ M^(e×d mod φ(n)) ≡ M^1 ≡ M (mod n).
• The result shown above, which follows directly from Euler’s
theorem, requires that M and n be coprime. However, as will be
shown in Section 12.2.3, when n is a product of two primes p and
q, this result applies to all M, 0 ≤ M < n. In what follows, let’s now
see how this idea can be used for message encryption and
decryption.
• Considering arithmetic modulo n, let’s say that e is an integer
that is coprime to the totient φ(n) of n. Further, say that d is the
multiplicative inverse of e modulo φ(n). These definitions of the
various symbols are listed below for convenience:
n = a modulus for modular arithmetic
φ(n) = the totient of n
e = an integer that is relatively prime to φ(n) [this guarantees that
e will possess a multiplicative inverse modulo φ(n)]
d = an integer that is the multiplicative inverse of e modulo φ(n)
• Now suppose we are given an integer M, 0 ≤ M < n, that
represents our message, then we can transform M into another
integer C that will represent our ciphertext by the following
modular exponentiation: C = M^e mod n.
• We can recover M from C by the following modular
operation: M = C^d mod n, since (M^e)^d ≡ M^(e·d mod φ(n)) ≡ M (mod n).
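To make this concrete, here is a minimal C++ sketch (not part of the original notes). It uses a small square-and-multiply modpow helper, first checks the mod-15 exponent-reduction example from above, and then runs one encrypt/decrypt cycle with the well-known toy parameters p = 61 and q = 53, so that n = 3233, φ(n) = 3120, e = 17, and d = 2753. Real RSA uses very large primes and big-integer arithmetic.

```cpp
#include <cstdint>
#include <iostream>

// Modular exponentiation by repeated squaring: computes (base^exp) mod m.
// Adequate for the tiny demo values below, not for real key sizes.
uint64_t modpow(uint64_t base, uint64_t exp, uint64_t m) {
    uint64_t result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

int main() {
    // Euler's theorem in action: with a = 4, n = 15, phi(15) = 8,
    // exponents can be reduced modulo the totient. All four values printed are 4.
    std::cout << modpow(4, 7 + 4, 15) << " " << modpow(4, (7 + 4) % 8, 15) << "\n";
    std::cout << modpow(4, 3 * 5, 15) << " " << modpow(4, (3 * 5) % 8, 15) << "\n";

    // Toy RSA: p = 61, q = 53 -> n = 3233, phi(n) = 60 * 52 = 3120, e = 17, d = 2753.
    const uint64_t n = 3233, e = 17, d = 2753;
    uint64_t M = 65;               // message, with 0 <= M < n
    uint64_t C = modpow(M, e, n);  // encryption: C = M^e mod n
    uint64_t R = modpow(C, d, n);  // decryption: M = C^d mod n
    std::cout << "ciphertext: " << C << ", recovered: " << R << "\n";
    return 0;
}
```

With M = 65 this prints ciphertext 2790 and recovers 65, because e·d ≡ 1 (mod φ(n)), exactly as in the derivation above.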
3) Amortized efficiency :-
In computer science, amortized analysis is a method
for analysing a given algorithm's complexity, or how much of a
resource, especially time or memory, it takes to execute. The
motivation for amortized analysis is that looking at the worst-
case run time per operation, rather than per algorithm, can be
too pessimistic.
While certain operations for a given algorithm may have a
significant cost in resources, other operations may not be as
costly. Amortized analysis considers both the costly and less
costly operations together over the whole series of operations of
the algorithm. This may include accounting for different types
of input, length of the input, and other factors that affect its
performance.
Amortized analysis requires knowledge of which series of
operations are possible. This is most commonly the case with
data structures, which have state that persists between
operations. The basic idea is that a worst case operation can alter
the state in such a way that the worst case cannot occur again for
a long time, thus "amortizing" its cost.
There are generally three methods for performing amortized
analysis: the aggregate method, the accounting method, and
the potential method. All of these give correct answers; the
choice of which to use depends on which is most convenient for
a particular situation.
Aggregate analysis determines the upper bound T(n) on the total
cost of a sequence of n operations, then calculates the amortized
cost to be T(n) / n.
The accounting method is a form of aggregate analysis which
assigns to each operation an amortized cost which may differ from
its actual cost. Early operations have an amortized cost higher than
their actual cost, which accumulates a saved "credit" that pays for
later operations having an amortized cost lower than their actual cost.
Because the credit begins at zero, the actual cost of a
sequence of operations equals the amortized cost minus the accumulated
credit. Because the credit is required to be non-negative, the amortized cost
is an upper bound on the actual cost. Usually, many short-running operations
accumulate such a credit in small increments, while rare long-running
operations decrease it drastically.
The potential method is a form of the
accounting method where the saved credit is computed as a function (the
"potential") of the state of the data structure. The amortized cost is the
immediate cost plus the change in potential.
• For some algorithms, we can't get a complete understanding of their
efficiency by looking at just a single operation at a time.
• As an example, let's consider the difference between arrays in C++
and lists in Python...
Dynamic array:-
• A dynamic array is an array that will automatically grow as items
are added.
• It has both a capacity and a number of elements.
• If an item is added to a dynamic array with more capacity than
elements, the new item can be added directly.
• Otherwise, it needs to allocate new, larger storage, copy the existing
elements into the new array, and then add the item there, as sketched below.
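Below is a minimal C++ sketch of such a dynamic array with the usual doubling growth policy (an illustrative assumption; std::vector is the real-world equivalent, and its exact growth factor is implementation-defined).

```cpp
#include <algorithm>
#include <cstddef>

// Minimal dynamic array of ints with doubling growth (illustration only).
class DynArray {
    int*        data_     = nullptr;
    std::size_t size_     = 0;   // number of elements stored
    std::size_t capacity_ = 0;   // allocated slots

public:
    ~DynArray() { delete[] data_; }

    void push_back(int value) {
        if (size_ == capacity_) {
            // Full: allocate double the capacity and copy every element over.
            std::size_t new_cap = (capacity_ == 0) ? 1 : capacity_ * 2;
            int* new_data = new int[new_cap];
            std::copy(data_, data_ + size_, new_data);
            delete[] data_;
            data_ = new_data;
            capacity_ = new_cap;
        }
        data_[size_++] = value;  // cheap case: just write into spare capacity
    }

    std::size_t size() const { return size_; }
};
```

By the aggregate method, n calls to push_back perform at most n direct writes plus fewer than 2n element copies across all doublings, so T(n) < 3n and the amortized cost per append is O(1), even though a single append that triggers a reallocation costs Θ(n). Equivalently, the accounting method charges each append a constant credit (say 3) that pays for its own write and its future copies.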
5) Analysis of quicksort using best, average, and worst
cases:-
Suppose that we're really unlucky and the partition sizes are really
unbalanced. In particular, suppose that the pivot chosen by the
partition function is always either the smallest or the largest element
in the n-element subarray. Then one of the partitions will contain no
elements and the other partition will contain n−1 elements, all but the
pivot. So the recursive calls will be on subarrays of sizes 0 and n−1.
As in merge sort, the time for a given recursive call on an n-element
subarray is Θ(n). In merge sort, that was the time for
merging, but in quicksort it's the time for partitioning.
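For reference, here is a minimal C++ sketch of quicksort with Lomuto partitioning, which uses the last element as the pivot (an assumed choice, since the text does not fix a partition scheme). On an already-sorted array this scheme produces exactly the 0 and n−1 splits described above.

```cpp
#include <utility>
#include <vector>

// Lomuto partition: uses the last element as the pivot and returns its final
// index; everything to its left is <= pivot, everything to its right is larger.
int partition(std::vector<int>& a, int lo, int hi) {
    int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; ++j) {
        if (a[j] <= pivot) std::swap(a[i++], a[j]);
    }
    std::swap(a[i], a[hi]);
    return i;
}

// Call as: quicksort(v, 0, (int)v.size() - 1);
void quicksort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;          // subarrays of size 0 or 1 are already sorted
    int p = partition(a, lo, hi);  // Theta(hi - lo + 1) partitioning work
    quicksort(a, lo, p - 1);       // left part
    quicksort(a, p + 1, hi);       // right part
}
```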
Worst-case:-
When quicksort always has the most unbalanced partitions possible,
then the original call takes cn time for some constant c, the
recursive call on n−1 elements takes c(n−1) time, the call on n−2 elements
takes c(n−2) time, and so on. Here's a tree of the subproblem sizes with their
partitioning times:
[Diagram of worst-case performance for quicksort]
When we total up the partitioning times for each level, we get
cn + c(n−1) + c(n−2) + ⋯ + 2c = c(n + (n−1) + (n−2) + ⋯ + 2) = c((n+1)(n/2) − 1).
The last step holds because 1 + 2 + 3 + ⋯ + n is the arithmetic
series, as we saw when we analyzed selection sort. (We subtract 1 because
for quicksort, the summation starts at 2, not 1.) We have some low-order
terms and constant coefficients, but when we use big-Θ notation, we ignore
them. In big-Θ notation, quicksort's worst-case running time
is Θ(n²).
Best-case
Quicksort's best case occurs when the partitions are as evenly
balanced as possible: their sizes either are equal or are within 1 of
each other. The former case occurs if the sub array has an odd number
of elements and the pivot is right in the middle after partitioning, and
each partition has (n−1)/2 elements. The latter case occurs if the
sub array has an even number n of elements and one partition
has n/2 elements with the other having n/2-1. In either of these cases,
each partition has at most n/2 elements, and the tree of sub problem
sizes looks a lot like the tree of sub problem sizes for merge sort, with
the partitioning times looking like the merging times:
Using big-Θ notation, we get the same result as for merge
sort: Θ(n log₂ n).
Average-case
Showing that the average-case running time is also Θ(n log₂ n)
takes some pretty involved mathematics, and so we
won't go there. But we can gain some intuition by looking at a couple
of other cases to understand why it might be O(n log₂ n). (Once we have that,
the Θ(n log₂ n) bound follows because the average-case running time cannot be better
than the best-case running time.) First, let's imagine that we don't
always get evenly balanced partitions, but that we always get at worst
a 3-to-1 split. That is, imagine that each time we partition, one side
gets 3n/4 elements and the other side gets n/4. (To keep the math clean, let's
not worry about the pivot.) Then the tree of subproblem sizes and
partitioning times would look like this:
The left child of each node represents a subproblem size 1/4 as large,
and the right child represents a subproblem size 3/4 as large. Since
the smaller subproblems are on the left, by following a path of left
children, we get from the root down to a subproblem size of 1 faster
than along any other path, as the figure shows.
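To finish the intuition (an added sketch of the standard argument, not spelled out in the text above): with a 3-to-1 split, every level of the tree still does at most cn partitioning work in total, and the longest path, the one that always keeps the 3n/4 side, reaches a subproblem of size 1 after about log_{4/3} n levels. So the total work is at most cn · log_{4/3} n = cn · (log₂ n / log₂(4/3)) = O(n log₂ n).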
Comparing Growth Rate for merge sort
Given that the merge function runs in Θ(n) time when
merging n elements, how do we get to showing that merge sort runs
in Θ(n log₂ n) time? We start by thinking about the
three parts of divide-and-conquer and how to account for their
running times. We assume that we're sorting a total of n elements in
the entire array.
1. The divide step takes constant time, regardless of the sub array size.
After all, the divide step just computes the midpoint q of the
indices p and r. Recall that in big-Θ notation, we indicate constant
time by Θ(1).
2. The conquer step, where we recursively sort two subarrays of
approximately n/2 elements each, takes some amount of time, but we'll
account for that time when we consider the subproblems.
3. The combine step merges a total of n elements,
taking Θ(n) time.
If we think about the divide and combine steps together,
the Θ(1) running time for the divide step is a low-order term
when compared with the Θ(n) running time of the combine
step. So let's think of the divide and combine steps together as
taking Θ(n) time. To make things more concrete, let's say that
the divide and combine steps together take cn time for some
constant c.
To keep things reasonably simple, let's assume that if n > 1,
then n is always even, so that when we need to
think about n/2 it's an integer. (Accounting for the case in
which n is odd doesn't change the result in terms of big-Θ
notation.) So now we can think of the running time of merge
sort on an n-element subarray as being the sum of twice the
running time of merge sort on an (n/2)-element subarray (for the
conquer step) plus cn (for the divide and combine steps,
really for just the merging).
Now we have to figure out the running time of the two recursive calls
on n/2 elements. Each of these two recursive calls takes twice the
running time of merge sort on an (n/4)-element subarray
(because we have to halve n/2) plus cn/2 to merge. We have two
subproblems of size n/2, and each takes cn/2 time to merge, and so the
total time we spend merging for subproblems of size n/2 is
2 · cn/2 = cn.
Let's draw out the merging times in a "tree":
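Putting this together (an added summary consistent with the reasoning above): the running time of merge sort satisfies the recurrence T(n) = 2T(n/2) + cn, with T(1) constant. The tree of subproblems has log₂ n + 1 levels, and each level contributes a total of cn merging work, so T(n) = cn(log₂ n + 1) = Θ(n log₂ n).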
4)a) Longest Common Subsequence
If a set of sequences is given, the longest common subsequence
problem is to find a common subsequence of all the sequences that is of
maximal length.
The longest common subsequence problem is a classic computer science
problem, the basis of data comparison programs such as the diff-utility,
and has applications in bioinformatics. It is also widely used by revision
control systems, such as SVN and Git, for reconciling multiple changes
made to a revision-controlled collection of files.
The algorithm LCS-Length-Table-Formulation, followed by LCS-Print, prints the
longest common subsequence of X and Y; a sketch of this formulation appears below.
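Since the pseudocode itself is not reproduced in these notes, the following is a minimal C++ sketch of the standard dynamic-programming formulation. The length table c plays the role of table C; the traceback loop plays the role of table B and LCS-Print, preferring the diagonal, up, and left moves in that order.

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Standard LCS dynamic program: c[i][j] is the LCS length of X[0..i) and Y[0..j).
std::string lcs(const std::string& X, const std::string& Y) {
    std::size_t m = X.size(), n = Y.size();
    std::vector<std::vector<int>> c(m + 1, std::vector<int>(n + 1, 0));

    // Fill the length table: the outer loop runs m times, the inner loop n times -> O(mn).
    for (std::size_t i = 1; i <= m; ++i)
        for (std::size_t j = 1; j <= n; ++j)
            c[i][j] = (X[i - 1] == Y[j - 1])
                          ? c[i - 1][j - 1] + 1
                          : std::max(c[i - 1][j], c[i][j - 1]);

    // Walk back from c[m][n] to recover one LCS (this replaces table B / LCS-Print).
    std::string result;
    for (std::size_t i = m, j = n; i > 0 && j > 0; ) {
        if (X[i - 1] == Y[j - 1]) { result = X[i - 1] + result; --i; --j; }  // diagonal
        else if (c[i - 1][j] >= c[i][j - 1]) { --i; }                        // up
        else { --j; }                                                        // left
    }
    return result;
}

int main() {
    std::cout << lcs("BACDB", "BDCB") << "\n";  // prints BCB, matching the example below
    return 0;
}
```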
4)b) Analysis
To populate the table, the outer for loop iterates m times and the
inner for loop iterates n times. Hence, the complexity of the
algorithm is O(mn), where m and n are the lengths of the two strings.
Example
In this example, we have two strings X = BACDB and Y = BDCB to
find the longest common subsequence.
Following the algorithm LCS-Length-Table-Formulation (as stated
above), we have calculated table C (shown on the left hand side) and
table B (shown on the right hand side).
In table B, instead of ‘D’, ‘L’ and ‘U’, we are using the diagonal
arrow, left arrow and up arrow, respectively. After generating table
B, the LCS is determined by function LCS-Print. The result is BCB.