
Solutions and Hints for Selected Exercises

Matrix Algebra by James E. Gentle

Exercises Beginning on Page 63


2.1a. Assume {v1, v2, . . . , vk} is a linearly independent set of vectors. Now
consider a pair of vectors vi and vj from this set. If {vi, vj} is not linearly
independent, then there exist ai and aj, not both 0, such that ai vi + aj vj = 0.
In that case, however, we would have

a1 v1 + · · · + ai vi + · · · + aj vj + · · · + ak vk = 0,

in which at least one of ai, aj is nonzero (and all other coefficients are 0),
contradicting the condition that the full set is linearly independent.
2.1b. Let v1 = (1, 1, 0), v2 = (1, 0, 1), and v3 = (2, 1, 1). Now {v1 , v2 }, {v1 , v3 },
and {v2 , v3 } are all linearly independent sets. We can see this because
each pair has a vector with a zero that is unmatched by the other vector.
The full set, however, is not linearly independent because v1 +v2 −v3 = 0.
2.4b. Let one vector space consist of all vectors of the form (a, 0) and the
other consist of all vectors of the form (0, b). The vector (a, b) is not in
the union if a ≠ 0 and b ≠ 0.
2.9. Give a counterexample to the triangle inequality; for example, let x =
(9, 25) and y = (16, 144).
2.12a. We first observe that if $\|x\|_p = 0$ or $\|y\|_q = 0$, we have $x = 0$ or $y = 0$,
and so the inequality is satisfied because both sides are 0; hence, we need
only consider the case $\|x\|_p > 0$ and $\|y\|_q > 0$. We also observe that if
$p = 1$ or $q = 1$, we have the Manhattan and Chebyshev norms and the
inequality is satisfied; hence we need only consider the case $1 < p < \infty$.
Now, for $p$ and $q$ as given, for any numbers $a_i$ and $b_i$, there are numbers
$s_i$ and $t_i$ such that $|a_i| = e^{s_i/p}$ and $|b_i| = e^{t_i/q}$. Because $e^x$ is a convex
function, we have $e^{s_i/p + t_i/q} \le \frac{1}{p}e^{s_i} + \frac{1}{q}e^{t_i}$, or
$$a_i b_i \le |a_i||b_i| \le |a_i|^p/p + |b_i|^q/q.$$
Now let
$$a_i = \frac{x_i}{\|x\|_p} \quad\text{and}\quad b_i = \frac{y_i}{\|y\|_q},$$
and so
$$\frac{x_i y_i}{\|x\|_p \|y\|_q} \le \frac{1}{p}\frac{|x_i|^p}{\|x\|_p^p} + \frac{1}{q}\frac{|y_i|^q}{\|y\|_q^q}.$$
Now, summing these inequalities over $i$, we have
$$\frac{\langle x, y\rangle}{\|x\|_p \|y\|_q} \le \frac{1}{p}\frac{\|x\|_p^p}{\|x\|_p^p} + \frac{1}{q}\frac{\|y\|_q^q}{\|y\|_q^q} = 1.$$
Hence, we have the desired result.

As we see from this proof, the inequality is actually a little stronger
than stated. If we define $u$ and $v$ by $u_i = |x_i|$ and $v_i = |y_i|$, we have
$$\langle x, y\rangle \le \langle u, v\rangle \le \|x\|_p\|y\|_q.$$
We observe that equality occurs if and only if
$$\left(\frac{|x_i|}{\|x\|_p}\right)^{1/q} = \left(\frac{|y_i|}{\|y\|_q}\right)^{1/p}$$
and
$$\mathrm{sign}(x_i) = \mathrm{sign}(y_i)$$
for all $i$.
We note a special case by letting $y = 1$:
$$\bar{x} \le \|x\|_p,$$
and with $p = 2$, we have a special case of the Cauchy-Schwarz inequality,
$$n\bar{x}^2 \le \|x\|_2^2,$$
which guarantees that $\mathrm{V}(x) \ge 0$.
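
A quick numerical check of the inequality can be done in R; the vectors and the exponent p below are arbitrary choices, with q taken as the conjugate exponent.

x <- c(1, -2, 3)
y <- c(-1, 2, -3)
p <- 3
q <- p/(p-1)                          # conjugate exponent, 1/p + 1/q = 1
lhs <- abs(sum(x*y))                  # |<x, y>|
rhs <- sum(abs(x)^p)^(1/p) * sum(abs(y)^q)^(1/q)
lhs <= rhs
# should be TRUE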


2.12b. Using the triangle inequality for the absolute value, we have $|x_i + y_i| \le
|x_i| + |y_i|$. This yields the result for $p = 1$ and $p = \infty$ (in the limit).
Now assume $1 < p < \infty$. We have
$$\|x + y\|_p^p \le \sum_{i=1}^n |x_i + y_i|^{p-1}|x_i| + \sum_{i=1}^n |x_i + y_i|^{p-1}|y_i|.$$
Now, letting $q = p/(p-1)$, we apply Hölder's inequality to each of the
terms on the right:
$$\sum_{i=1}^n |x_i + y_i|^{p-1}|x_i| \le \left(\sum_{i=1}^n |x_i + y_i|^{(p-1)q}\right)^{1/q}\left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$$
and
$$\sum_{i=1}^n |x_i + y_i|^{p-1}|y_i| \le \left(\sum_{i=1}^n |x_i + y_i|^{(p-1)q}\right)^{1/q}\left(\sum_{i=1}^n |y_i|^p\right)^{1/p},$$
so
$$\sum_{i=1}^n |x_i + y_i|^p \le \left(\sum_{i=1}^n |x_i + y_i|^{(p-1)q}\right)^{1/q}\left[\left(\sum_{i=1}^n |x_i|^p\right)^{1/p} + \left(\sum_{i=1}^n |y_i|^p\right)^{1/p}\right]$$
or, because $(p-1)q = p$ and $1 - \frac{1}{q} = \frac{1}{p}$,
$$\left(\sum_{i=1}^n |x_i + y_i|^p\right)^{1/p} \le \left(\sum_{i=1}^n |x_i|^p\right)^{1/p} + \left(\sum_{i=1}^n |y_i|^p\right)^{1/p},$$
which is the same as
$$\|x + y\|_p \le \|x\|_p + \|y\|_p,$$
the triangle inequality.
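
The triangle inequality can likewise be checked numerically in R for a few arbitrary values of p (a sketch only):

normp <- function(x, p) sum(abs(x)^p)^(1/p)
x <- c(1, 2, 3)
y <- c(-1, 2, -3)
for (p in c(1, 1.5, 2, 3))
  print(normp(x+y, p) <= normp(x, p) + normp(y, p))
# each comparison should be TRUE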
2.21e. In $\mathbb{R}^3$,
$$\mathrm{angle}(x, y) = \sin^{-1}\left(\frac{\|x \times y\|}{\|x\|\,\|y\|}\right).$$
Because $x \times y = -y \times x$, this allows us to determine the angle from $x$
to $y$; that is, the direction within $(-\pi, \pi]$ in which $x$ would be rotated
to $y$.
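
In R, the magnitude of this angle can be computed directly from the formula above, reusing the cross product computation of Exercise 2.27 (a sketch; the vectors are arbitrary):

x <- c(1, 2, 3)
y <- c(-1, 2, -3)
cross <- c(x[2]*y[3]-x[3]*y[2],
           x[3]*y[1]-x[1]*y[3],
           x[1]*y[2]-x[2]*y[1])
asin(sqrt(sum(cross^2))/(sqrt(sum(x^2))*sqrt(sum(y^2))))   # angle in radians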
2.23. Just consider the orthogonal vectors $x = (1, 0)$ and $y = (0, 1)$. The
centered vectors are $x_c = (\frac{1}{2}, -\frac{1}{2})$ and $y_c = (-\frac{1}{2}, \frac{1}{2})$. The angle between
the uncentered vectors is $\pi/2$, while that between the centered vectors
is $\pi$.
2.26a.

x <- c(1,2,3)
y <- c(-1,2,-3)
sum(x*y)
# [1] -6

2.26b.

Dot <- function(x, y){


return(sum(x*y))
}
Dot(x, y)
# [1] -6

2.26c.

"%.%" <- function(x, y){


return(sum(x*y))
}
x %.% y
# [1] -6

2.27.

x <- c(1,2,3)
y <- c(-1,2,-3)
c(x[2]*y[3]-x[3]*y[2],
x[3]*y[1]-x[1]*y[3],
x[1]*y[2]-x[2]*y[1])
# [1] -12 0 4

2.28.

x <- c(1,2,3)
sum(abs(x))
# [1] 6
sqrt(sum(x*x))
# [1] 3.741657
max(abs(x))
# [1] 3

2.29.

x <- c(1,2,3)
y <- x/sum(x*x)
y
# [1] 0.07142857 0.14285714 0.21428571
sum(x*y)
# [1] 1

2.30a.

x <- c(1,2,3)
y <- c(-1,2,-3)
ypx <- sum(x*y)*x/sum(x*x)
ypx
# [1] -0.4285714 -0.8571429 -1.2857143

2.30b.

c(x[2]*ypx[3]-x[3]*ypx[2],
  x[3]*ypx[1]-x[1]*ypx[3],
  x[1]*ypx[2]-x[2]*ypx[1])
# [1] -4.440892e-16 2.220446e-16 0.000000e+00

The actual cross product is (0, 0, 0). In the computed solution above, two of the
zeros come out as nonzero values of order 10^-16, which is essentially an additive zero.
2.31a.

x <- c(1,2,3)
y <- c(-1,2,-3)
z <- c(1,0,-3)
u1 <- x; u2 <- y; u3 <- z
u1 <- u1/sqrt(sum(u1*u1))
u2 <- u2 - sum(u1*u2)*u1
u3 <- u3 - sum(u1*u3)*u1
u2 <- u2/sqrt(sum(u2*u2))
u3 <- u3 - sum(u2*u3)*u2
u3 <- u3/sqrt(sum(u3*u3))

2.31b.

w <- c(5,4,3)
c1 <-sum(w*u1)
c2 <-sum(w*u2)
c3 <-sum(w*u3)
c1*u1 + c2*u2 + c3*u3
# [1] 5 4 3

2.32.

n <- 100
set.seed(12345)
u <- rnorm(n)
z <- rnorm(n)
v <- u+z

2.32a.

mu <- mean(u)
vu <- var(u)
mv <- mean(v)
vv <- var(v)
round(c(mu,vu,mv,vv),2)
# [1] 0.25 1.24 0.29 2.50

2.32b.

covuv <- cov(u,v)


coruv <- covuv/sqrt(vu*vv)
round(c(covuv,coruv),2)
# [1] 1.36 0.77

Exercises Beginning on Page 203


3.4. This exercise occurs in various guises in many different places, and the
simple approach is to add and subtract $\bar{x}$:
$$\begin{aligned}
(x - a)^{\mathrm{T}}(x - a) &= (x - \bar{x} - (a - \bar{x}))^{\mathrm{T}}(x - \bar{x} - (a - \bar{x})) \\
&= (x_c - (a - \bar{x}))^{\mathrm{T}}(x_c - (a - \bar{x})) \\
&= x_c^{\mathrm{T}}x_c + n(a - \bar{x})^2 - 2(a - \bar{x})\,1^{\mathrm{T}}x_c \\
&= x_c^{\mathrm{T}}x_c + n(a - \bar{x})^2,
\end{aligned}$$
because $1^{\mathrm{T}}x_c = 0$.
Finally, we get the expressions in equation (3.99) by writing $x_c^{\mathrm{T}}x_c$ as
$\mathrm{tr}(x_c x_c^{\mathrm{T}})$.
3.16. Write the n × m matrix A as
$$A = (a_{ij}) = [a_{*1}, \ldots, a_{*m}].$$
If A is of rank one, the maximum number of linearly independent
columns is one; hence, for $k = 2, \ldots, m$, $a_{*k} = c_k a_{*1}$, for some $c_k$.
Now let $x = a_{*1}$, which is an n-vector, and let y be an m-vector whose
first element is 1 and whose $k = 2, \ldots, m$ elements are the $c_k$s. We see
that $A = xy^{\mathrm{T}}$ by direct multiplication.
This decomposition is not unique, of course.
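
As a small R illustration (the rank-one matrix here is constructed arbitrarily, and the first row is assumed to have a nonzero first element):

A <- matrix(c(1, 2, 3), ncol=1) %*% t(c(1, 2, -1, 4))  # a 3 x 4 rank-one matrix
x <- A[, 1]                       # x = a_{*1}
y <- A[1, ]/A[1, 1]               # y_1 = 1 and y_k = c_k, assuming A[1,1] != 0
all.equal(x %*% t(y), A)          # should be TRUE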
3.17. If the elements of the square matrix A are integers, then each cofactor
$a_{(ij)}$ is an integer, and hence the elements of adj(A) are integers. If
|A| = ±1, then $A^{-1} = \pm\mathrm{adj}(A)$, and so the elements of $A^{-1}$ are integers.
An easy way to form a matrix whose determinant is ±1 is to form an
upper triangular matrix with either 1 or −1 on the diagonal. A more
“interesting matrix” that has the same determinant can then be formed
by use of elementary operations. (People teaching matrix algebra find
this useful!)
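
For instance, the following R sketch builds such a matrix from a unit upper triangular integer matrix and one elementary axpy operation, and checks that the inverse is an integer matrix (the particular entries are arbitrary):

U <- matrix(c(1, 0, 0,
              2, 1, 0,
              3, 4, 1), nrow=3)   # unit upper triangular, det = 1
E <- diag(3)
E[3, 1] <- 2                      # elementary operation: add 2*(row 1) to row 3
A <- E %*% U                      # integer matrix with det(A) = 1
det(A)                            # 1 (up to rounding)
solve(A)                          # all entries are integers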
3.18. Because the inverse of a matrix is unique, we can verify each equation
by multiplication by the inverse at the appropriate place. We can often
reduce an expression to a desired form by multiplication by the
identity $MM^{-1}$, where $M$ is some appropriate matrix. For example,
equation (3.187) can be verified by the equations
$$\begin{aligned}
(A + B)(A^{-1} - A^{-1}(A^{-1} + B^{-1})^{-1}A^{-1})
&= I - (A^{-1} + B^{-1})^{-1}A^{-1} + BA^{-1} - BA^{-1}(A^{-1} + B^{-1})^{-1}A^{-1} \\
&= I + BA^{-1} - (I + BA^{-1})(A^{-1} + B^{-1})^{-1}A^{-1} \\
&= (I + BA^{-1})(I - (A^{-1} + B^{-1})^{-1}A^{-1}) \\
&= B(B^{-1} + A^{-1})(I - (A^{-1} + B^{-1})^{-1}A^{-1}) \\
&= B(B^{-1} + A^{-1}) - BA^{-1} \\
&= I.
\end{aligned}$$
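
The identity can also be spot-checked numerically; in the following R sketch, A and B are arbitrary nonsingular matrices.

set.seed(1)
A <- crossprod(matrix(rnorm(9), 3, 3)) + diag(3)   # nonsingular
B <- crossprod(matrix(rnorm(9), 3, 3)) + diag(3)   # nonsingular
lhs <- solve(A + B)
rhs <- solve(A) - solve(A) %*% solve(solve(A) + solve(B)) %*% solve(A)
all.equal(lhs, rhs)                # should be TRUE up to rounding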

3.21. Express the nonzero elements of row i in the n × n matrix A as ai bk


for k = i, . . . , n, and the nonzero elements of column j as ak bj for
k = j, . . . , n. Then obtain expressions for the elements of A−1 . Show,
for example, that the diagonal elements of A−1 are (ai bi )−1 .
3.25. For property 8, let c be a nonzero eigenvalue of AB. Then there exists
v (≠ 0) such that ABv = cv, that is, BABv = Bcv. But this means
BAw = cw, where w = Bv ≠ 0 (because ABv ≠ 0), and so c is an
eigenvalue of BA. We use the same argument starting with an eigenvalue
of BA. For square matrices, there are no other eigenvalues, so the set
of eigenvalues is the same.
For property 9, see the discussion of similarity transformations on
page 164.
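
For property 8, a quick R illustration with arbitrary rectangular A and B:

set.seed(1)
A <- matrix(rnorm(12), nrow=3)    # 3 x 4
B <- matrix(rnorm(12), nrow=4)    # 4 x 3
eigen(A %*% B)$values             # eigenvalues of the 3 x 3 product AB
eigen(B %*% A)$values             # the same nonzero values, plus an extra zero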
3.37. Let A and B be such that AB is defined.
$$\begin{aligned}
\|AB\|_{\mathrm{F}}^2 &= \sum_{ij}\left(\sum_k a_{ik}b_{kj}\right)^2 \\
&\le \sum_{ij}\left(\sum_k a_{ik}^2\right)\left(\sum_k b_{kj}^2\right) \quad\text{(Cauchy-Schwarz)} \\
&= \left(\sum_{i,k} a_{ik}^2\right)\left(\sum_{k,j} b_{kj}^2\right) \\
&= \|A\|_{\mathrm{F}}^2\|B\|_{\mathrm{F}}^2.
\end{aligned}$$
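
A numerical spot check in R (A and B are arbitrary conformable matrices):

frob <- function(X) sqrt(sum(X^2))
set.seed(1)
A <- matrix(rnorm(12), nrow=3)
B <- matrix(rnorm(20), nrow=4)
frob(A %*% B) <= frob(A)*frob(B)  # should be TRUE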

3.40. Hints.
For inequality (3.325), use the Cauchy-Schwarz inequality.
For inequality (3.327), use Hölder’s inequality.
3.45a.

Epqa <- function(n, p=0, q=0, a=0){
  # Function to create an elementary operator matrix.
  # The operator matrix may be E_pq, E_p(a), or E_pq(a).
  Epqa <- NA
  if (n<2){
    print('n must be at least 2')
    return()
  }
  perm <- FALSE
  mult <- FALSE
  axpy <- FALSE
  if (p>0&q>0&a==0) perm <- TRUE
  if (p>0&q==0) mult <- TRUE
  if (p>0&q>0&a!=0) axpy <- TRUE
  if (perm+mult+axpy==0){
    print('No valid elementary operator matrix')
    return()
  }
  # Set Epqa to the identity
  Epqa <- matrix(c(1,rep(c(rep(0,n),1),n-1)),ncol=n)
  if (perm){ # permute rows p and q
    Epqa[p,p] <- 0
    Epqa[p,q] <- 1
    Epqa[q,q] <- 0
    Epqa[q,p] <- 1
  } else
  if (mult){ # multiply row p by a
    Epqa[p,p] <- a
  } else
  if (axpy){ # axpy: add a times row q to row p
    Epqa[p,q] <- a
  }
  return(Epqa)
}

3.45b. Epq is its own inverse.

n <- 5
p <- 2
q <- 4
E <- Epqa(n,p,q)
E%*%E

3.45c. For a ≠ 0, E_p(1/a) E_p(a) = I.

n <- 5
p <- 2
q <- 0
a <- 3
E1 <- Epqa(n,p,q,a)
E2 <- Epqa(n,p,q,1/a)
E1%*%E2

3.45d. An axpy operation: E_pq(−a) E_pq(a) = I.

n <- 5
p <- 2
q <- 4
a <- 3
E1 <- Epqa(n,p,q,a)
E2 <- Epqa(n,p,q,-a)
E1%*%E2

Exercises Beginning on Page 247


4.4. Let A be an n × n matrix whose columns are the vectors $a_j$, and let
QR be the QR factorization of A. Because Q is orthogonal, $|\det(Q)| = 1$,
and so $|\det(R)| = |\det(A)|$. Hence, we have
$$\begin{aligned}
|\det(A)| &= |\det(R)| \\
&= \prod_{j=1}^n |r_{jj}| \\
&\le \prod_{j=1}^n \|r_j\|_2 \\
&= \prod_{j=1}^n \|a_j\|_2,
\end{aligned}$$
where $r_j$ is the vector whose elements are the same as the elements in
the $j^{\mathrm{th}}$ column of R.
If equality holds, then either some $a_j$ is zero, or else $|r_{jj}| = \|r_j\|_2$ for
$j = 1, \ldots, n$. In the latter case, R is diagonal, and hence $A^{\mathrm{T}}A$ is diagonal,
and so the columns of A are orthogonal.
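
The inequality is easy to check numerically with R's qr function; the matrix below is arbitrary.

set.seed(1)
A <- matrix(rnorm(16), nrow=4)
abs(det(A))                       # |det(A)|
prod(sqrt(colSums(A^2)))          # product of column norms; at least |det(A)|
R <- qr.R(qr(A))
abs(prod(diag(R)))                # |det(R)|, which equals |det(A)|
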
4.10. The R code that will produce the graph is

x<-c(0,1)
y<-c(0,1)
z<-matrix(c(0,0,1,1),nrow=2)
trans<-persp(x, y, z, theta = 45, phi = 30)


bottom<-c(.5,0,0,1)%*%trans
top<-c(.5,1,1,1)%*%trans
xends<-c(top[,1]/top[,4],bottom[,1]/bottom[,4])
yends<-c(top[,2]/top[,4],bottom[,2]/bottom[,4])
lines(xends,yends,lwd=2)

4.12a.

Gaus <- function(n, p, ap, tol=sqrt(.Machine$double.eps)){
  # Elementary operator matrix for one step of Gaussian elimination,
  # given the pivot column ap and pivot position p.
  if (tol*min(abs(ap[-p]))>abs(ap[p])) return(NA)
  Gaus <- Epqa(n,p,0,1/ap[p])
  if (p>1) {
    for (i in 1:(p-1)) Gaus <- Epqa(n,i,p,-ap[i])%*%Gaus
  }
  if (n>p) {
    for (i in (p+1):n) Gaus <- Epqa(n,i,p,-ap[i])%*%Gaus
  }
  return(Gaus)
}

4.12b.

n <- 3
A <- matrix(c(1,3,2,2,3,1,2,1,2,4,2,1),nrow=n)
p <- 2
Gaus(n,p,A[,p])%*%A

4.12c.

m <- dim(A)[2]
A12 <- cbind(A,matrix(rep(0,n*n),nrow=n))
for (i in 1:n) A12[i,m+i]<-1
A12 <- Gaus(3,1,A12[,1])%*%A12
A12 <- Gaus(3,2,A12[,2])%*%A12
A12 <- Gaus(3,3,A12[,3])%*%A12
A12 <- rbind(A12[,5:7],c(0,0,0))
A%*%A12%*%A
A
A12%*%A%*%A12
A12
A12%*%A
t(A12%*%A)
A%*%A12
t(A%*%A12)

It is a g_{12} inverse, but it is not a Moore-Penrose inverse, because neither
$A^{-}A$ nor $AA^{-}$ is symmetric.
4.13a.

A <- matrix(c(1,3,2,2,3,1,2,1,2,4,2,1),nrow=3)
SVDA <- svd(A)
SVDA$u%*%diag(SVDA$d)%*%t(SVDA$v)
A

4.13b.

Ainv <- t(SVDA$u%*%diag(1/SVDA$d)%*%t(SVDA$v))


A%*%Ainv%*%A
A
Ainv%*%A%*%Ainv
Ainv
Ainv%*%A
t(Ainv%*%A)
A%*%Ainv
t(A%*%Ainv)

It is not a Moore-Penrose inverse. It is a g3 generalized inverse.


4.14a. B is nonnegative definite.

> A <- matrix(c(1,3,2,2,3,1,2,1,2,4,2,1),nrow=3)


> B <- t(A)%*%A
> B
[,1] [,2] [,3] [,4]
[1,] 14 13 9 12
[2,] 13 14 9 15
[3,] 9 9 9 12
[4,] 12 15 12 21
> SVDB <- svd(B)
> SVDB$d
[1] 5.069044e+01 5.352440e+00 1.957116e+00 1.089186e-15
> sqrtB <- SVDB$u%*%sqrt(diag(SVDB$d))%*%t(SVDB$v)
> sqrtB%*%sqrtB
[,1] [,2] [,3] [,4]


[1,] 14 13 9 12
[2,] 13 14 9 15
[3,] 9 9 9 12
[4,] 12 15 12 21

4.14b. D is positive definite.

> C <- matrix(c(4,2,2,2,3,1,2,1,4),nrow=3)


> D <- C%*%C
> D
[,1] [,2] [,3]
[1,] 24 16 18
[2,] 16 14 11
[3,] 18 11 21
> SVDD <- svd(D)
> SVDD$d
[1] 50.680541 6.596539 1.722920
> sqrtD <- SVDD$u%*%sqrt(diag(SVDD$d))%*%t(SVDD$v)
> sqrtD%*%sqrtD
[,1] [,2] [,3]
[1,] 24 16 18
[2,] 16 14 11
[3,] 18 11 21
> sqrtD
[,1] [,2] [,3]
[1,] 4 2 2
[2,] 2 3 1
[3,] 2 1 4

C is the square root of D.


4.14c. F is positive definite.

> E <- matrix(c(4,2,2,2,-3,1,2,1,4),nrow=3)


> F <- E%*%E
> SVDF <- svd(F)
> SVDF$d
[1] 50.680541 6.596539 1.722920
> sqrtF <- SVDF$u%*%sqrt(diag(SVDF$d))%*%t(SVDF$v)
> sqrtF%*%sqrtF
[,1] [,2] [,3]
[1,] 24 16 18
[2,] 16 14 11
[3,] 18 11 21
> sqrtF
[,1] [,2] [,3]
[1,] 4 2 2
[2,] 2 3 1
[3,] 2 1 4

E is not the square root of F .

Exercises Beginning on Page 294


5.1. First, show that
$$\max_{x \ne 0} \frac{\|Ax\|}{\|x\|} = \left(\min_{x \ne 0} \frac{\|A^{-1}x\|}{\|x\|}\right)^{-1}$$
and
$$\max_{x \ne 0} \frac{\|A^{-1}x\|}{\|x\|} = \left(\min_{x \ne 0} \frac{\|Ax\|}{\|x\|}\right)^{-1}.$$

5.2a. The matrix prior to the first elimination is
$$\begin{pmatrix} 2 & 5 & 3 & 19 \\ 1 & 4 & 1 & 12 \\ 1 & 2 & 2 & 9 \end{pmatrix}.$$
The solution is (3, 2, 1).

5.2b. The matrix prior to the first elimination is
$$\begin{pmatrix} 5 & 2 & 3 & 19 \\ 4 & 1 & 1 & 12 \\ 2 & 1 & 2 & 9 \end{pmatrix},$$
and $x_1$ and $x_2$ have been interchanged.


5.2c.
$$D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 2 \end{pmatrix},\qquad
L = \begin{pmatrix} 0 & 0 & 0 \\ 2 & 0 & 0 \\ 1 & 2 & 0 \end{pmatrix},\qquad
U = \begin{pmatrix} 0 & 4 & 1 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{pmatrix},$$
$$\rho((D + L)^{-1}U) = 1.50.$$
5.2e. $\rho((\tilde{D} + \tilde{L})^{-1}\tilde{U}) = 0.9045.$
5.2g. Some R code for this is

tildeD <- matrix(c(2, 0, 0,
                   0, 4, 0,
                   0, 0, 2), nrow=3, byrow=T)
tildeL <- matrix(c(0, 0, 0,
                   1, 0, 0,
                   1, 2, 0), nrow=3, byrow=T)
tildeU <- matrix(c(0, 5, 3,
                   0, 0, 1,
                   0, 0, 0), nrow=3, byrow=T)
b <- c(12,19,9)

omega <- 0.1
tildeDUadj <- (1-omega)*tildeD - omega*tildeU
tildeAk <- tildeD+omega*tildeL
badj <- omega*b
xk <- c(1,1,1)
nstep <- 2
for (i in 1:nstep){
  bk <- tildeDUadj%*%xk + badj
  xkp1 <- solve(tildeAk,bk)
  dif <- sqrt(sum((xkp1-xk)^2))
  print(dif)
  xk <- xkp1
}

5.4a. nm(m + 1) − m(m + 1)/2. (Remember AT A is symmetric.)


5.4g. Using the normal equations with the Cholesky decomposition requires
only about half as many flops as the QR approach when n is much larger than
m. The QR method often yields better accuracy, however.
5.5a. 1. $X(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}X = X$.
2. $(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}X(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}} = (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}$.
3. $X(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}$ is symmetric (take its transpose).
4. $(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}X$ is symmetric.
Therefore, $(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}} = X^{+}$.
5.5b. We want to show $X^{\mathrm{T}}(y - XX^{+}y) = 0$. Using the properties of $X^{+}$, we
have
$$\begin{aligned}
X^{\mathrm{T}}(y - XX^{+}y) &= X^{\mathrm{T}}y - X^{\mathrm{T}}XX^{+}y \\
&= X^{\mathrm{T}}y - X^{\mathrm{T}}(XX^{+})^{\mathrm{T}}y &&\text{because of symmetry} \\
&= X^{\mathrm{T}}y - X^{\mathrm{T}}(X^{+})^{\mathrm{T}}X^{\mathrm{T}}y \\
&= X^{\mathrm{T}}y - X^{\mathrm{T}}(X^{\mathrm{T}})^{+}X^{\mathrm{T}}y &&\text{property of Moore-Penrose inverses and transposes} \\
&= X^{\mathrm{T}}y - X^{\mathrm{T}}y &&\text{property of Moore-Penrose inverses} \\
&= 0.
\end{aligned}$$
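
Both parts can be spot-checked numerically. The following R sketch uses an arbitrary full-column-rank X and verifies the four Moore-Penrose conditions for $(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}$ as well as the orthogonality of the residual to the columns of X:

set.seed(1)
X <- matrix(rnorm(20), nrow=5)             # 5 x 4, full column rank with probability 1
y <- rnorm(5)
Xplus <- solve(t(X) %*% X) %*% t(X)
all.equal(X %*% Xplus %*% X, X)            # condition 1
all.equal(Xplus %*% X %*% Xplus, Xplus)    # condition 2
all.equal(X %*% Xplus, t(X %*% Xplus))     # condition 3 (symmetry)
all.equal(Xplus %*% X, t(Xplus %*% X))     # condition 4 (symmetry)
crossprod(X, y - X %*% Xplus %*% y)        # approximately zero (Exercise 5.5b)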

Exercises Beginning on Page 314


6.1a. 1.
6.1b. 1.
6.1d. 1. (All that was left was to determine the probability that $c_n \ne 0$ and
$c_{n-1} \ne 0$.)
6.2a. 11.6315.
6.3.
$$\begin{pmatrix}
3.08 & -0.66 & 0 & 0 \\
-0.66 & 4.92 & -3.27 & 0 \\
0 & -3.27 & 7.00 & -3.74 \\
0 & 0 & -3.74 & 7.00
\end{pmatrix}.$$

Exercises Beginning on Page 353


7.7. This is because the subspace that generates a singular matrix is a lower-
dimensional space than the full sample space, and so its measure is 0.
7.9. $2^{dn/2}\,\Gamma_d(n/2)\,|\Sigma|^{n/2}$.
Make the change of variables $W = 2\Sigma^{1/2}X\Sigma^{1/2}$, determine the Jacobian,
and integrate.

Exercises Beginning on Page 425


8.3. 1. See Exercise 7.7, page 353.
8.10a.
$$p(c) = c^m - \alpha_1 c^{m-1} - \alpha_2\sigma_1 c^{m-2} - \alpha_3\sigma_1\sigma_2 c^{m-3} - \cdots - \alpha_m\sigma_1\sigma_2\cdots\sigma_{m-1}.$$
8.10b. Define
$$f(c) = 1 - \frac{p(c)}{c^m}.$$
This is a monotone decreasing continuous function in c, with $f(c) \to \infty$
as $c \to 0^{+}$ and $f(c) \to 0$ as $c \to \infty$. Therefore, there is a unique value
$c^{*}$ for which $f(c^{*}) = 1$. The uniqueness also follows from Descartes' rule
of signs, which states that the maximum number of positive roots of a
polynomial is the number of sign changes of the coefficients, and in the
case of the polynomial p(c), this is one.

8.18. This is a simple case of matrix multiplication.
To illustrate the use of R with complex matrices, I will show some code
that is relevant to this problem, for a given order, of course.

omegajn <- function(n){
  # Function to create the n^th roots of 1
  omegajn <- complex(n)
  omegajn[1] <- 1
  if (n>=2) omegajn[2] <-
    complex(re=cos(2*pi/n),im=sin(2*pi/n))
  if (n>=3) for (j in 3:n) omegajn[j] <-
    omegajn[2]*omegajn[j-1]
  return(omegajn)
}

Fn <- function(n){
  # Function to create a Fourier matrix
  rts <- omegajn(n)
  Fn <- matrix(c(rep(1,n),rts),nrow=n)
  if (n>=3) for (j in 3:n) Fn <- cbind(Fn,rts^(j-1))
  Fn <- Fn/sqrt(n)
  return(Fn)
}

# perform multiplications to get the elementary
# circulant matrix of order 5
rts5 <- omegajn(5)
F5 <- Fn(5)
round(Conj(t(F5))%*%diag(rts5)%*%F5)

8.19. $(-1)^{\lfloor n/2\rfloor}\, n^n$, where $\lfloor\cdot\rfloor$ is the floor function (the greatest integer func-
tion). For n = 1, 2, 3, 4, the determinants are 1, −4, −27, 256.

Exercises Beginning on Page 506


9.3d. Assuming W is positive definite, we have
$$\hat{\beta}_{W,C} = (X^{\mathrm{T}}WX)^{-1}X^{\mathrm{T}}Wy +
(X^{\mathrm{T}}WX)^{-1}L^{\mathrm{T}}\left(L(X^{\mathrm{T}}WX)^{+}L^{\mathrm{T}}\right)^{-1}\left(c - L(X^{\mathrm{T}}WX)^{+}X^{\mathrm{T}}Wy\right).$$

9.8. Let $X = [X_i \mid X_o]$ and $Z = X_o^{\mathrm{T}}X_o - X_o^{\mathrm{T}}X_i(X_i^{\mathrm{T}}X_i)^{-1}X_i^{\mathrm{T}}X_o$. Note that
$X_o^{\mathrm{T}}X_i = (X_i^{\mathrm{T}}X_o)^{\mathrm{T}}$. Using the expression for the inverse of a partitioned
matrix, we have
$$\begin{aligned}
X_i^{\mathrm{T}}X(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}
&= X_i^{\mathrm{T}}[X_i \mid X_o]
\begin{pmatrix} X_i^{\mathrm{T}}X_i & X_i^{\mathrm{T}}X_o \\ X_o^{\mathrm{T}}X_i & X_o^{\mathrm{T}}X_o \end{pmatrix}^{-1}
\begin{pmatrix} X_i^{\mathrm{T}} \\ X_o^{\mathrm{T}} \end{pmatrix} \\
&= [\,X_i^{\mathrm{T}}X_i \mid X_i^{\mathrm{T}}X_o\,]
\begin{pmatrix}
(X_i^{\mathrm{T}}X_i)^{-1} + (X_i^{\mathrm{T}}X_i)^{-1}X_i^{\mathrm{T}}X_o Z^{-1}X_o^{\mathrm{T}}X_i(X_i^{\mathrm{T}}X_i)^{-1} &
-(X_i^{\mathrm{T}}X_i)^{-1}X_i^{\mathrm{T}}X_o Z^{-1} \\
-Z^{-1}X_o^{\mathrm{T}}X_i(X_i^{\mathrm{T}}X_i)^{-1} & Z^{-1}
\end{pmatrix}
\begin{pmatrix} X_i^{\mathrm{T}} \\ X_o^{\mathrm{T}} \end{pmatrix} \\
&= [\,I + X_i^{\mathrm{T}}X_o Z^{-1}X_o^{\mathrm{T}}X_i(X_i^{\mathrm{T}}X_i)^{-1} - X_i^{\mathrm{T}}X_o Z^{-1}X_o^{\mathrm{T}}X_i(X_i^{\mathrm{T}}X_i)^{-1}
\mid -X_i^{\mathrm{T}}X_o Z^{-1} + X_i^{\mathrm{T}}X_o Z^{-1}\,]
\begin{pmatrix} X_i^{\mathrm{T}} \\ X_o^{\mathrm{T}} \end{pmatrix} \\
&= [\,I \mid 0\,]\begin{pmatrix} X_i^{\mathrm{T}} \\ X_o^{\mathrm{T}} \end{pmatrix} \\
&= X_i^{\mathrm{T}}.
\end{aligned}$$
9.9. One possibility is
$$\begin{pmatrix}
20 & 100 \\
5 & 25 \\
5 & 25 \\
10 & \mathrm{NA} \\
10 & \mathrm{NA} \\
10 & \mathrm{NA} \\
\mathrm{NA} & 10 \\
\mathrm{NA} & 10 \\
\mathrm{NA} & 10
\end{pmatrix}.$$
The variance-covariance matrix computed from all pairwise complete
observations is
$$\begin{pmatrix} 30 & 375 \\ 375 & 1230 \end{pmatrix},$$
while that computed only from complete cases is
$$\begin{pmatrix} 75 & 375 \\ 375 & 1875 \end{pmatrix}.$$
The correlation matrix computed from all pairwise complete observations is
$$\begin{pmatrix} 1.00 & 1.95 \\ 1.95 & 1.00 \end{pmatrix}.$$
Note that this example is not a pseudo-correlation matrix.
In the R software system, the cov and cor functions have an argument
called "use", which can take the values "all.obs", "complete.obs",
or "pairwise.complete.obs". The value "all.obs" yields an error if the
data matrix contains any missing values. In cov, the values "complete.obs"
and "pairwise.complete.obs" yield the variance-covariances
shown above. The function cor with use="pairwise.complete.obs"
yields
$$\begin{pmatrix} 1.00 & 1.00 \\ 1.00 & 1.00 \end{pmatrix}.$$
However, if cov is invoked with use="pairwise.complete.obs" and
the function cov2cor is applied to the result, the correlations are 1.95,
as in the first correlation matrix above.
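
The behavior described here can be reproduced with a short R sketch on the data matrix given above; it should yield the matrices displayed in this solution.

Z <- matrix(c(20, 100,
               5,  25,
               5,  25,
              10,  NA,
              10,  NA,
              10,  NA,
              NA,  10,
              NA,  10,
              NA,  10), ncol=2, byrow=TRUE)
cov(Z, use="pairwise.complete.obs")
cov(Z, use="complete.obs")
cor(Z, use="pairwise.complete.obs")
cov2cor(cov(Z, use="pairwise.complete.obs"))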
9.12b. The first step is to use the trick of equation (3.97), $x^{\mathrm{T}}Ax = \mathrm{tr}(Axx^{\mathrm{T}})$,
again to undo the earlier expression, and write the last term in equation
(9.120) as
$$-\frac{n}{2}\mathrm{tr}\left(\Sigma^{-1}(\bar{y} - \mu)(\bar{y} - \mu)^{\mathrm{T}}\right) = -\frac{n}{2}(\bar{y} - \mu)^{\mathrm{T}}\Sigma^{-1}(\bar{y} - \mu).$$
Now $\Sigma^{-1}$ is positive definite, so $(\bar{y} - \mu)^{\mathrm{T}}\Sigma^{-1}(\bar{y} - \mu) \ge 0$, and hence
this term is minimized for $\hat{\mu} = \bar{y}$. Decreasing this term increases the value of
$l(\mu, \Sigma; y)$, and so $l(\hat{\mu}, \Sigma; y) \ge l(\mu, \Sigma; y)$ for all positive definite $\Sigma^{-1}$.
Now, we consider the other term. Let $A = \sum_{i=1}^n (y_i - \bar{y})(y_i - \bar{y})^{\mathrm{T}}$. The
first question is whether A is positive definite. We will refer to a text
on multivariate statistics for the proof that A is positive definite with
probability 1 (see Muirhead, 1982, for example). We have
$$\begin{aligned}
l(\hat{\mu}, \Sigma; y) &= c - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\mathrm{tr}\left(\Sigma^{-1}A\right) \\
&= c - \frac{n}{2}\left(\log|\Sigma| + \mathrm{tr}\left(\Sigma^{-1}A/n\right)\right).
\end{aligned}$$
Because c is constant, the function is maximized at the minimum of the
latter term subject to $\Sigma$ being positive definite, which, as shown for
expression (9.137), occurs at $\hat{\Sigma} = A/n$.
9.14. We can develop a recursion for $p_{11}^{(t)}$ based on $p_{11}^{(t-1)}$ and $p_{12}^{(t-1)}$,
$$p_{11}^{(t)} = p_{11}^{(t-1)}(1 - \alpha) + p_{12}^{(t-1)}\beta,$$
and because $p_{11} + p_{12} = 1$, we have $p_{11}^{(t)} = p_{11}^{(t-1)}(1 - \alpha - \beta) + \beta$. Putting
this together, we have
$$\lim_{t\to\infty} P^t = \begin{pmatrix} \beta/(\alpha+\beta) & \alpha/(\alpha+\beta) \\ \beta/(\alpha+\beta) & \alpha/(\alpha+\beta) \end{pmatrix},$$
and so the limiting (and invariant) distribution is $\pi_s = (\beta/(\alpha+\beta),\ \alpha/(\alpha+\beta))$.
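
A numerical illustration in R for one choice of α and β (the values are arbitrary):

alpha <- 0.2
beta  <- 0.5
P <- matrix(c(1-alpha, alpha,
              beta,    1-beta), nrow=2, byrow=TRUE)
Pt <- diag(2)
for (i in 1:100) Pt <- Pt %*% P
Pt                                  # both rows approach (beta, alpha)/(alpha+beta)
c(beta, alpha)/(alpha+beta)         # the limiting (invariant) distribution
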
9.15c. From the exponential growth, we have $N(T) = N(0)\,\mathrm{e}^{rT}$; hence,
$$r = \frac{1}{T}\log\left(N(T)/N(0)\right) = \frac{1}{T}\log(r_0).$$

9.23. This is an open question. If you get a proof of convergence, submit it
for publication. You may wish to try several examples and observe the
performance of the intermediate steps. I know of no case in which the
method has not converged.
9.24b. Starting with the correlation matrix given above as a possible solution
for Exercise 9.9, four iterations of equation (9.81) using $\delta = 0.05$ and
$f(x) = \tanh(x)$ yield
$$\begin{pmatrix} 1.000 & 0.997 \\ 0.997 & 1.000 \end{pmatrix}.$$

Exercises Beginning on Page 573


10.1a. The computations do not overflow. The first floating-point number x
such that x + 1 = x is
$$0.10\cdots0 \times b^{p+1}.$$
Therefore, the series converges at the value of i such that $i(i + 1)/2 = x$.
Now solve for i.
10.2. The function is log(n), and Euler’s constant is 0.57721....
10.5. $2^{-56}$. (The standard has 53 bits normalized, so the last bit is $2^{-55}$, and
half of that is $2^{-56}$.)
10.6a. Normalized: $2b^{p-1}(b - 1)(e_{\max} - e_{\min} + 1) + 1$.
Nonnormalized: $2b^{p-1}(b - 1)(e_{\max} - e_{\min} + 1) + 1 + 2b^{p-1}$.
10.6b. Normalized: $b^{e_{\min}-1}$.
Nonnormalized: $b^{e_{\min}-p}$.
10.6c. $1 + b^{-p+1}$, or $1 + b^{-p}$ when b = 2 and the first bit is hidden.
10.6d. $b^p$.
10.6e. 22.
10.11. First of all, we recognize that the full sum in each case is 1. We therefore
accumulate the sum from the direction in which there are fewer terms.
After computing the first term from the appropriate direction, take a
logarithm to determine a scaling factor, say sk . (This term will be the
smallest in the sum.) Next, proceed to accumulate terms until the sum
is of a different order of magnitude than the next term. At that point,
perform a scale adjustment by dividing by s. Resume summing, making
similar scale adjustments as necessary, until the limit of the summation
is reached.
10.13. The result is close to 1.
What is relevant here is that numbers close to 1 have only a very few
digits of accuracy; therefore, it would be better to design this program so
that it returns 1−Pr(X ≤ x) (the “significance level”). The purpose and
the anticipated use of a program determine how it should be designed.
10.16a. 2.
10.16b. 0.
10.16c. No (because the operations in the “for” loop are not chained).
20

10.17c.
    a = x_1
    b = y_1
    s = 0
    for i = 2, n
    {
      d = (x_i - a)/i
      e = (y_i - b)/i
      a = d + a
      b = e + b
      s = i(i - 1)de + s
    }
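
A direct R transcription of this pseudocode, compared with a two-pass computation (a sketch only):

onepass <- function(x, y){
  a <- x[1]
  b <- y[1]
  s <- 0
  for (i in 2:length(x)){
    d <- (x[i] - a)/i
    e <- (y[i] - b)/i
    a <- d + a
    b <- e + b
    s <- i*(i-1)*d*e + s
  }
  s
}
x <- c(1, 2, 4, 7)
y <- c(2, 1, 3, 5)
onepass(x, y)
sum((x - mean(x))*(y - mean(y)))    # same value, computed in two passes
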
10.20. 1. No; 2. Yes; 3. No; 4. No.
10.21. A very simple example is
$$\begin{pmatrix} 1 & 1+\epsilon \\ 1 & 1 \end{pmatrix},$$
where $\epsilon < b^{-p}$, because in this case the matrix stored in the computer
would be singular. Another example is
$$\begin{pmatrix} 1 & a(1+\epsilon) \\ a(1+\epsilon) & a^2(1+2\epsilon) \end{pmatrix},$$
where $\epsilon$ is the machine epsilon.
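
The first example is easy to demonstrate in R, with .Machine$double.eps playing the role of the machine precision (a sketch):

eps <- .Machine$double.eps/4        # smaller than the rounding threshold for 1 + eps
A <- matrix(c(1, 1, 1 + eps, 1), nrow=2)
A[1, 2] == A[2, 2]                  # TRUE: 1 + eps is stored as 1
det(A)                              # 0: the stored matrix is singular
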


10.23a.

> 1/0
[1] Inf
> -5*(1/0)
[1] -Inf
> (1/0)*(1/0)
[1] Inf
> 1/0==5*(1/0)
[1] TRUE

10.23b.

> 0/0
[1] NaN
> -5*(0/0)
[1] NaN
> 0/0==NaN
[1] NA
> x <- NaN


> x==NaN
[1] NA
> is.nan(x)
[1] TRUE
> is.na(x)
[1] TRUE

10.23c.

> x<-NA
> x==NA
[1] NA
> is.nan(x)
[1] FALSE
> is.na(x)
[1] TRUE

10.26. By decomposing 24 and multiplying some factors, or by using Stirling's
approximation, we know that there are approximately 24 or 25 digits in
24!. By considering the number of times 10 occurs as a factor, we know
that the last four digits are 0s.

> library(Rmpfr)
> x24mpfr <- mpfr(24, prec=25*log2(10))
> x24mpfrfac <- factorial(x24mpfr)
> print(x24mpfrfac, digits=26)
1 'mpfr' number of precision 83 bits
[1] 620448401733239439360000

For comparison,

> x24 <- 24


> x24fac <- factorial(x24)
> print(x24fac, digits=22)
[1] 6.2044840173323941e+23

Exercises Beginning on Page 592


11.2a. O(nk).
11.2c. At each successive stage in the fan-in, the number of processors doing
the additions goes down by approximately one-half.
If p ≈ k, then O(n log k) (fan-in on one element of c at a time)


If p ≈ nk, then O(log k) (fan-in on all elements of c simultaneously)
If p is a fixed constant smaller than k, the order of time does not change;
only the multiplicative constant changes.
Notice the difference in the order of time and the order of the number of
computations. Often there is very little that can be done about the order
of computations.
11.2d. Because in a serial algorithm the magnitudes of the summands become
more and more different, whereas in the fan-in they are more likely to remain
relatively equal. Adding quantities of very different magnitudes results in
benign roundoff, but many benign roundoffs can accumulate into a bad
result. (This is not catastrophic cancellation.) Clearly, if all elements are
nonnegative, this argument holds. Even if the elements are randomly distributed,
there is likely to be a drift in the sum (this can be thought of as a
random walk). There is no difference in the number of computations.
11.2e. Case 1: p ≈ n. Give each ci a processor – do an outer loop on each.
This would likely be more efficient because all processors are active at
once.
Case 2: p ≈ nk. Give each aij bj a processor – fan-in for each. This would
be the same as the other.
If p is a fixed constant smaller than n, set it up as in Case 1, using n/p
groups of ci ’s.
11.2f. If p ≈ n, then O(k).
If p ≈ nk, then O(log k).
If p is some small fixed constant, the order of time does not change; only
the multiplicative constant changes.

Exercises Beginning on Page 633


12.2. Here is a recursive Matlab function for the Strassen algorithm due to
Coleman and Van Loan. When it uses the Strassen algorithm, it requires
the matrices to have even dimension.
function C = strass(A,B,nmin)
%
% Strassen matrix multiplication C=AB
% A, B must be square and of even dimension
% From Coleman and Van Loan
% If n <= nmin, the multiplication is done conventionally
%
[n n] = size(A);
if n <= nmin
   C = A * B; % n is small, get C conventionally
else
   m = n/2; u = 1:m; v = m+1:n;
   P1 = strass(A(u,u)+A(v,v), B(u,u)+B(v,v), nmin);
   P2 = strass(A(v,u)+A(v,v), B(u,u), nmin);
   P3 = strass(A(u,u), B(u,v)-B(v,v), nmin);
   P4 = strass(A(v,v), B(v,u)-B(u,u), nmin);
   P5 = strass(A(u,u)+A(u,v), B(v,v), nmin);
   P6 = strass(A(v,u)-A(u,u), B(u,u)+B(u,v), nmin);
   P7 = strass(A(u,v)-A(v,v), B(v,u)+B(v,v), nmin);
   C = [P1+P4-P5+P7 P3+P5; P2+P4 P1+P3-P2+P6];
end
12.5a.

real a(4,3)
data a/3.,6.,8.,2.,5.,1.,6.,3.,6.,2.,7.,1./
n = 4
m = 3
x1 = a(2,2) ! Temporary variables must be used because of
x2 = a(4,2) ! the side effects of srotg.
call srotg(x1, x2, c, s)
call srot(m, a(2,1), n, a(4,1), n, c, s)
print *, c, s
print *, a
end

This yields 0.3162278 and 0.9486833 for c and s. The transformed matrix
is
$$\begin{pmatrix}
3.000000 & 5.000000 & 6.000000 \\
3.794733 & 3.162278 & 1.581139 \\
8.000000 & 6.000000 & 7.000000 \\
-5.059644 & -0.00000002980232 & -1.581139
\end{pmatrix}.$$
12.7b. Using the Matrix package in R, after initializing rho and sig2, this is

library(Matrix)
Vinv <- sparseMatrix(i=c(1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8,9,
                         9,9,10,10),
                     j=c(1,2,1:3,2:4,3:5,4:6,5:7,6:8,7:9,8:10,9,10),
                     x=c(1,-rho,rep(c(-rho,1+rho^2,-rho),8),-rho,1))/
        (1-rho^2)*sig2

12.9. 10.7461941829033 and 10.7461941829034.


12.11.
$$\frac{1}{aei - afh - bdi + bfg + cdh - ceg}
\begin{pmatrix}
ei - fh & ch - bi & bf - ce \\
fg - di & ai - cg & cd - af \\
dh - eg & bg - ah & ae - bd
\end{pmatrix}.$$
