Applied Optics
Professor Akhilesh Kumar Mishra
Department of Physics
Indian Institute of Technology, Roorkee
Lecture 06
Matrix Method in Paraxial Optics - 1
Hello everyone, welcome again to my class. Today we will learn about a very important
concept, the matrix method.
(Refer Slide Time: 0:35)
Now, we will start with the matrix method. We were in module 1, where we talked about reflection
and refraction. While discussing refraction and reflection, we considered two types of
interfaces: plane (straight) interfaces and spherical interfaces. We not only considered
single spherical interfaces, but we also talked about double interfaces, wherein we covered
refraction through a thin lens. But whenever we treat refraction through a thin lens, or suppose we
have multiple lenses stacked together with certain spacings, we have to apply the lens
formula at each interface, which is quite a cumbersome process.
Therefore, whenever we have a series of lenses, we use the matrix method, which eases
the calculations. In this module we have the matrix method in paraxial optics; then we will
talk about thick and thin lenses, and then about systems of thin lenses, unit planes and
nodal planes. But to start with, we will talk about the matrix method in paraxial optics.
(Refer Slide Time: 2:10)
But before that, let us consider a spherical refracting surface, represented by SQS’. The ray
PQ is incident on this refracting surface and, after getting refracted, it goes in a certain
direction. This refracting surface separates two media, one of refractive index 𝑛1 and the
other of refractive index 𝑛2 , as shown here. The direction of the refracted ray is completely
determined by two conditions which we have imbibed very thoroughly. What are these two
conditions?
(Refer Slide Time: 2:56)
These two conditions are written here. The first is that the incident ray, the refracted ray and the
normal to the interface all lie in the same plane. The second condition is that the angle of
incidence 𝜃1 and the angle of refraction 𝜃2 should be such that
𝑠𝑖𝑛𝜃1 /𝑠𝑖𝑛𝜃2 = 𝑛2 /𝑛1 ; this is Snell’s law. If we
apply these two conditions, then we can easily determine the direction of the refracted ray.
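As a small illustration of the second condition, here is a minimal sketch in Python. The function name refraction_angle and the numerical values (a ray going from air, 𝑛1 = 1.0, into glass, 𝑛2 = 1.5, at 30 degrees) are assumptions chosen only for this example; they are not taken from the slide.

```python
import math

def refraction_angle(theta1, n1, n2):
    """Return the angle of refraction (radians) from Snell's law,
    sin(theta1) / sin(theta2) = n2 / n1."""
    s = n1 * math.sin(theta1) / n2
    if abs(s) > 1.0:
        # No real solution for theta2: the ray is totally internally reflected.
        raise ValueError("total internal reflection: no refracted ray")
    return math.asin(s)

# Illustrative (assumed) values: air to glass at 30 degrees incidence.
theta1 = math.radians(30.0)
theta2 = refraction_angle(theta1, n1=1.0, n2=1.5)
print(f"angle of refraction = {math.degrees(theta2):.2f} degrees")
```

For these values the refracted ray bends towards the normal, since 𝑛2 is larger than 𝑛1. Doing this by hand at every surface of a stack of lenses is exactly the tedious step that the matrix method is meant to avoid.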
Now, as I said earlier, we want to obtain the position of the final image formed by a complicated
optical system. What do I mean by a complicated optical system? If we stack many lenses
together and then shine a ray on them, it would be very difficult to calculate the
direction of the refracted ray which finally comes out of this complicated optical system,
this stack of lenses.
This is because at each interface we have to apply these two conditions: at each interface we
have to calculate what is 𝜃1 , what is 𝜃2 , what is 𝑛1 and what is 𝑛2 , and they may keep changing.
It is a time-consuming process: one has to calculate step by step the position of the
image due to each surface, and then this image acts as the object for the next surface.
This is time consuming. Therefore, to deal with such situations, people devised
a very quick method called the matrix method, which is used to trace paraxial rays with
ease. It can quickly tell us the direction
of the refracted ray irrespective of the complexity of the optical system involved. With this we
will move towards the matrix method in paraxial optics.
(Refer Slide Time: 5:28)
This is a method which is useful in dealing with optical systems consisting of several optical
elements, like many convex lenses, many concave lenses or a combination of the two. It is
restricted to paraxial rays; all the geometrical optics which we are studying here is
in the domain of paraxial optics. Paraxial optics means that we consider only
the rays which are close to the lens axis and which make a very small angle with the lens axis.
Whenever we deviate from paraxial optics, aberrations appear in the image, and
aberration is an entirely different topic which is not in the purview of this course.
Therefore, to avoid aberration, we will only consider rays in the paraxial approximation. The next
point is that a given optical system is represented by a single matrix. In this matrix method any
optical system, be it a concave lens, a convex lens or a single lens system, is represented
by a single matrix, and it is a 2 by 2 matrix. Not only the lenses but also a translation is
represented by a matrix: if there is some gap between two lenses in which the ray is travelling,
then this gap is also represented by a matrix. For each lens one matrix, for each gap one matrix.
And what would this single matrix be? It is a combination of the matrices that
represent the individual refractions, reflections and translations. What is being said here is that
we have a combination of lenses; each lens is represented by one matrix, each translation
is represented by another type of matrix, and the combined matrix is the
product of these matrices. Or let me rephrase it. We are given an optical system. In this
optical system each reflection is represented by one matrix, each refraction is
represented by another matrix, and each translation is represented by yet another matrix.
And if we have a combination of lenses, then we have a series of refractions and
translations, and the final resultant matrix is the product of all these matrices. Now,
suppose this is a ray which is travelling and it is inclined at an angle 𝛼 with
the horizontal; then we may also say that it is inclined at an angle 𝜓 with the vertical. Suppose
this is the point P at which this discussion is carried out. A ray at a point P is described in terms
of its height 𝑥1 and slope angle 𝛼1 . What do this height and slope angle mean? The ray
is travelling, so it must be travelling towards or away from some lens system.
Suppose that it is going towards some lens system and this horizontal axis represents the axis
of the lens. If the point P is at a height 𝑥1 from the axis of the lens, then the ray at P can be
designated by two independent variables. I repeat: there is a ray which is travelling in some
medium of refractive index, say, n, and this horizontal line represents the axis of some lens
system; we are measuring things with respect to this horizontal line. This ray is inclined at
an angle 𝛼 with this horizontal line. Let us pick a point P on this ray.
Now, to designate this point P on the ray, we require two parameters. One is the inclination of
the ray with the horizontal and the second is the distance of point P from the horizontal line.
This distance is 𝑥1 , the angle from the horizontal is 𝛼1 and the angle from the
vertical is 𝜓1 , say. Now, instead of the pair 𝑥1 and 𝛼1 , one can specify a
ray using its height together with the optical direction cosine; that is, instead of specifying
the angle, we use the optical direction cosine. What is the optical direction cosine and how is it
defined? It is defined through this relation: 𝜆 = 𝑛𝑐𝑜𝑠𝜓 = 𝑛𝑠𝑖𝑛𝛼.
ψ is the angle measured from the vertical and 𝛼 is the angle measured from the horizontal.
I repeat: suppose we have a ray
which is travelling in a certain direction. To specify the ray at a point P on it,
we just need to know the inclination of the ray at P from the horizontal and the distance of
point P from the axis of the lens system or optical system. These two parameters are
enough to specify the ray at point P.
Now, instead of using the angle, one can also use the optical direction cosine. The optical direction
cosine is denoted by 𝜆 and it is given as 𝜆 = 𝑛𝑐𝑜𝑠𝜓, where n is the refractive index of the medium and
ψ is the angle from the vertical. Equivalently, it can also be written as 𝜆 = 𝑛𝑠𝑖𝑛𝛼, where n is again
the refractive index of the medium and 𝛼 is the angle from the horizontal. This 𝜆 is called the
optical direction cosine.
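To see that the two forms of the definition agree, here is a minimal sketch in Python. It assumes only that the angle from the vertical and the angle from the horizontal add up to 90 degrees; the function names and the values n = 1.5 and 𝛼 = 5 degrees are illustrative assumptions.

```python
import math

def direction_cosine_from_alpha(n, alpha):
    """Optical direction cosine from the angle alpha (radians) measured from the horizontal."""
    return n * math.sin(alpha)

def direction_cosine_from_psi(n, psi):
    """Optical direction cosine from the angle psi (radians) measured from the vertical."""
    return n * math.cos(psi)

# Illustrative (assumed) values.
n = 1.5
alpha = math.radians(5.0)
psi = math.pi / 2 - alpha  # the two angles are complementary

lam_from_alpha = direction_cosine_from_alpha(n, alpha)
lam_from_psi = direction_cosine_from_psi(n, psi)
print(lam_from_alpha, lam_from_psi)
```

Both calls return the same number up to rounding, which is why either angle can be used to define 𝜆.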
(Refer Slide Time: 12:43)
Now, suppose we have this axis of symmetry, which is horizontal, and this is the direction of
the ray. Let us pick two points on the ray, P and M. Point P is at a distance 𝑥1 from the
horizontal and point M is at a distance 𝑥2 from
the horizontal; these are the heights of the points P and M. I repeat: 𝑥1 and 𝑥2 are the heights of
point P and point M respectively from this horizontal line, and the horizontal
separation between the points P and M is D, capital D. It is also assumed that the ray
at points P and M is inclined at angles 𝛼1 and 𝛼2 with
the horizontal respectively.
Now, under these conditions, we can define the coordinates of P and M. The coordinates of P
would be 𝑥1 and 𝛼1 , where 𝑥1 is the distance from the horizontal and 𝛼1 is the angle from the
horizontal; similarly, the coordinates of M would be 𝑥2 and 𝛼2 .
The ray is supposed to be travelling in a medium which is homogeneous. Since the
medium is homogeneous, the ray travels in a straight line, and therefore 𝛼1 is of course equal
to 𝛼2 . And what is the relation between 𝑥1 and 𝑥2 ? We can easily calculate it:
𝑥2 = 𝑥1 + 𝐷 𝑡𝑎𝑛𝛼1 . The tangent of 𝛼1 is the ratio of the vertical rise 𝑥2 − 𝑥1 to the
horizontal distance D, and this gives us the relation between 𝑥2 and 𝑥1 .
But we are in the paraxial domain; therefore, we can safely write
𝑡𝑎𝑛𝛼1 ≈ 𝛼1 . Likewise, for the direction cosine 𝜆 which we defined on our previous
slide, we can write 𝜆 = 𝑛𝛼 instead of 𝑛𝑠𝑖𝑛𝛼, because 𝑠𝑖𝑛𝛼
is approximately equal to 𝛼 in the paraxial domain.
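To get a feeling for how good the paraxial replacement of the tangent by the angle is, here is a minimal sketch in Python that compares the exact and the paraxial height at M for a few angles. The function names and the values of 𝑥1, D and the angles are assumptions made only for this illustration.

```python
import math

def height_exact(x1, D, alpha1):
    """Height at M from the exact relation x2 = x1 + D tan(alpha1)."""
    return x1 + D * math.tan(alpha1)

def height_paraxial(x1, D, alpha1):
    """Height at M with the paraxial approximation tan(alpha1) ~ alpha1."""
    return x1 + D * alpha1

# Illustrative (assumed) values; x1 and D are in the same length unit.
x1, D = 1.0, 100.0
for degrees in (1.0, 5.0, 20.0):
    alpha1 = math.radians(degrees)
    exact = height_exact(x1, D, alpha1)
    approx = height_paraxial(x1, D, alpha1)
    print(f"{degrees:5.1f} deg: exact = {exact:9.4f}, "
          f"paraxial = {approx:9.4f}, difference = {exact - approx:.4f}")
```

The difference is negligible at 1 degree and grows rapidly with the angle, which is why the method is restricted to rays making small angles with the axis.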
For point P we always use the subscript 1; therefore, for point
P the direction cosine is 𝜆1 = 𝑛1 𝛼1 . Similarly, for point M, the direction cosine is
𝜆2 = 𝑛2 𝛼2 . As I said before, the medium is homogeneous;
therefore, the refractive index at and around point P is the same as the refractive
index at point M, that is, 𝑛1 = 𝑛2 . With the two relations 𝑛1 = 𝑛2 and 𝛼1 = 𝛼2 ,
we can easily say that 𝜆2 = 𝜆1 .
We now know the relation between 𝜆2 and 𝜆1 and the relation between 𝑥2 and
𝑥1 . Under the paraxial approximation we replaced 𝑡𝑎𝑛𝛼1 with 𝛼1 , and from the relation
𝜆1 = 𝑛1 𝛼1 we have 𝛼1 = 𝜆1 /𝑛1 . Substituting this expression for 𝛼1 into the relation for 𝑥2 ,
we get 𝑥2 = 𝑥1 + (𝐷/𝑛1 ) 𝜆1 . These are the two relations which we have obtained, equation
number 1 and equation number 2.
Equation number 1 is 𝜆2 = 𝜆1 and equation number 2 is 𝑥2 = 𝑥1 + (𝐷/𝑛1 ) 𝜆1 .
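Equations 1 and 2 can be tried out directly. The following is a minimal sketch in Python; the function name translate and the numerical values (a ray in air with slope angle 0.02 radian, starting at height 1.0 and travelling a horizontal distance of 50.0) are assumptions chosen only for illustration.

```python
def translate(lam1, x1, D, n1):
    """Propagate a paraxial ray through a horizontal distance D in a
    homogeneous medium of refractive index n1.

    Equation 1: lam2 = lam1
    Equation 2: x2 = x1 + (D / n1) * lam1
    """
    lam2 = lam1
    x2 = x1 + (D / n1) * lam1
    return lam2, x2

# Illustrative (assumed) values.
n1 = 1.0
alpha1 = 0.02          # slope angle in radians, small enough to be paraxial
lam1 = n1 * alpha1     # paraxial optical direction cosine
x1 = 1.0               # height of the ray at P
D = 50.0               # horizontal separation between P and M

lam2, x2 = translate(lam1, x1, D, n1)
print(lam2, x2)
```

With these numbers the direction cosine is unchanged and the height rises from 1.0 at P to about 2.0 at M, as expected for a slope of 0.02 over a distance of 50.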
Now, you see that on the left-hand side of equations 1 and 2 we have 𝜆2 and 𝑥2 , while on the
right-hand side we have 𝜆1 and 𝑥1 . The ray is travelling from P to
M, so once we know the coordinates of P we can also find the coordinates of M. How?
Through equations 1 and 2. Now, we can write equations 1 and 2 in the form of a matrix; we can
replace 1 and 2 with a single matrix equation.
(Refer Slide Time: 17:52)
How to do this? These are linear equations, so we can write the column vector (𝜆2 , 𝑥2 ) on the
left-hand side as equal to some matrix of coefficients multiplying the column vector (𝜆1 , 𝑥1 ):
\[ \begin{pmatrix} \lambda_2 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ D/n_1 & 1 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ x_1 \end{pmatrix} \]
Here 𝜆1 and 𝑥1 are the input. Once you know the input you can calculate the
output. To calculate the output you have to multiply the input by some matrix. What is
this matrix? We can fill in this matrix once we know equations 1 and 2, and this is how
we have filled it. Now, if you know the coordinates at one point, then after translating a certain
distance D you can calculate the coordinates at the other point.
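The same calculation can be written as a matrix acting on the input column. Below is a minimal sketch in plain Python, using nested lists for the 2 by 2 matrix. The helper names translation_matrix and apply_matrix and the numbers used are assumptions for illustration; the matrix elements follow from equations 1 and 2, with the ray written as the column (𝜆, 𝑥).

```python
def translation_matrix(D, n):
    """2 by 2 translation matrix for a horizontal distance D in a medium of
    refractive index n, acting on the column vector (lambda, x)."""
    return [[1.0, 0.0],
            [D / n, 1.0]]

def apply_matrix(matrix, ray):
    """Multiply a 2 by 2 matrix by a ray given as the pair (lambda, x)."""
    lam, x = ray
    return (matrix[0][0] * lam + matrix[0][1] * x,
            matrix[1][0] * lam + matrix[1][1] * x)

# Illustrative (assumed) values: direction cosine 0.02 and height 1.0 at P,
# then a translation of 50.0 in a medium with n = 1.0.
T = translation_matrix(D=50.0, n=1.0)
ray_at_P = (0.02, 1.0)
ray_at_M = apply_matrix(T, ray_at_P)
print(ray_at_M)
```

The first row of the matrix reproduces equation 1 and the second row reproduces equation 2, so the output pair is the direction cosine and the height of the ray at M.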
The effect of translation through a distance D in a homogeneous medium is therefore given by
this 2 by 2 matrix; this is the translation matrix. Once we know the coordinates of
a ray at some point in a homogeneous medium, we can calculate its coordinates at a point
which is farther from this initial point, where the ray arrives after a certain distance. And how
do we calculate the coordinates at this farther point? By using the translation
matrix, which carries all the information about this
translation.
And what are the elements of this translation matrix? It is a 2 by 2 matrix: the first element is 1,
the second is 0, the third is D/𝑛1 and the fourth is 1, where D is the horizontal distance between
the two points on the ray and 𝑛1 is the
refractive index of the medium. A very important and notable property of this translation
matrix is that the determinant of T is equal to 1. So what we learn today
is that if a ray is travelling in a homogeneous medium of a certain refractive index, then this
travel can be represented by a 2 by 2 translation matrix. Once we know the
coordinates, then just by applying this translation matrix we can know the effect of
translation on the ray.
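The determinant property, and the idea that successive steps combine by matrix multiplication, can both be checked with a short sketch in plain Python. The helper names and the numbers (two successive gaps of 30.0 and 20.0 in the same medium of refractive index 1.5) are assumptions chosen only for illustration.

```python
def translation_matrix(D, n):
    """2 by 2 translation matrix acting on the column vector (lambda, x)."""
    return [[1.0, 0.0],
            [D / n, 1.0]]

def det2(m):
    """Determinant of a 2 by 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    """Product a * b of two 2 by 2 matrices."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

# Illustrative (assumed) values: two successive gaps in the same medium.
n = 1.5
T1 = translation_matrix(30.0, n)   # first gap
T2 = translation_matrix(20.0, n)   # second gap

# The matrix of the step met first by the ray stands on the right.
T_total = matmul2(T2, T1)

print(det2(T1), det2(T2), det2(T_total))
print(T_total[1][0], (30.0 + 20.0) / n)
```

Each determinant comes out as 1, and the product of the two translation matrices is again a translation matrix, for the total distance of 50.0 in the same medium.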
And using this translation matrix, we can trace the path of the ray. Today we learnt about the
translation matrix, which takes into account translation in a homogeneous medium;
if a ray is travelling in a homogeneous medium, then using this translation matrix
we can easily trace the ray path. The same thing can also be done for
refraction at an interface which separates two media of different refractive indices. We
will talk about the matrix for refraction in the next class. Thank you all.