Rohan Kumar Singh: A Summer Training Report
                 Submitted by
            Rohan Kumar Singh
        Enrolment Number: 40396402716
Date: 30.09.2019
Place: New Delhi
ACKNOWLEDGEMENT
First and foremost, I wish to express my profound gratitude to Mr. Sumeet Malik, Trainer, for
giving me the opportunity to carry out my summer training.
It gives me great pleasure to express my sincere thanks to Mr. Sumeet Malik for his invaluable
guidance, support, and useful suggestions at every stage of this training. It was under his guidance
that this training turned out the way it has. My heartfelt thanks to him for his immense help and
support, useful discussions, and valuable recommendations throughout the course of my training.
Last but not least, I thank the Almighty for enlightening me with his blessings.
ABSTRACT
This report surveys the field of Augmented Reality, in which 3-D virtual objects are integrated
into a 3-D real environment in real time. It describes the medical, manufacturing, visualization,
path planning, entertainment, and military applications that have been explored, as well as the
application of augmented reality in electronics. This report describes the characteristics of Augmented Reality
systems, including a detailed discussion of the tradeoffs between optical and video blending
approaches. Registration and sensing errors are two of the biggest problems in building effective
Augmented Reality systems, so this report summarizes current efforts to overcome these problems.
Future directions and areas requiring further research are discussed. This report provides a starting
point for anyone interested in researching or using Augmented Reality.
TABLE OF CONTENTS
List of Figures
1 Introduction
1.1 Goals
1.2 Definition
1.3 Motivation
2 Applications
2.1 Medical
2.2 Manufacturing and Repair
2.3 Annotation and Visualization
2.4 Robot Path Planning
2.5 Entertainment
2.6 Military Aircraft
3 Application in Electronics
4 Characteristics
4.1 Augmentation
4.2 Optical versus Video Blending
4.3 Focus and Contrast
4.4 Portability
4.5 Comparison Against Virtual Environments
5 Conclusion
6 References
LIST OF FIGURES
1.1 Real desk with virtual lamp and two virtual chairs
CHAPTER 1
                       INTRODUCTION
1.1 Goals
This report surveys the field of Augmented Reality and the design issues encountered when
building an Augmented Reality system. Chapter 2 describes the applications that have been
explored, and Chapter 3 discusses an application of AR in electronics. Chapter 4 examines the
characteristics of AR systems; currently, two of the biggest problems are registration and sensing.
Finally, Chapter 5 concludes and identifies some areas that require further work and research.
1.2 Definition
     Augmented Reality (AR) is a variation of Virtual Environments (VE), or Virtual Reality as it is
     more commonly called. VE technologies completely immerse a user inside a synthetic
     environment. While immersed, the user cannot see the real world around him. In contrast, AR
     allows the user to see the real world, with virtual objects superimposed upon or composited with
     the real world. Therefore, AR supplements reality, rather than completely replacing it. Ideally, it
     would appear to the user that the virtual and real objects coexisted in the same space, similar to the
     effects achieved in the film "Who Framed Roger Rabbit?" Figure 1 shows an example of what this
     might look like. It shows a real desk with a real phone. Inside this room are also a virtual lamp and
two virtual chairs. Note that the objects are combined in 3-D, so that the virtual lamp covers the
real desk, and the real desk covers parts of the two virtual chairs. AR can be thought of as the
     "middle ground" between VE (completely synthetic) and telepresence (completely real).
Figure 1.1 : Real desk with virtual lamp and two virtual chairs.
Some researchers define AR in a way that requires the use of Head-Mounted Displays (HMDs).
To avoid limiting AR to specific technologies, this survey defines AR as systems that have the
following three characteristics:
1. Combines real and virtual
2. Interactive in real time
3. Registered in 3-D
This definition allows other technologies besides HMDs while retaining the essential components of
AR. For example, it does not include film or 2-D overlays. Films like "Jurassic Park" feature
photorealistic virtual objects seamlessly blended with a real environment in 3-D, but they are not
interactive media. 2-D virtual overlays on top of live video can be done at interactive rates, but the
overlays are not combined with the real world in 3-D. However, this definition does allow monitor-
based interfaces, monocular systems, see-through HMDs, and various other combining
technologies. Potential system configurations are discussed further in Chapter 4.
1.3 Motivation
Why is Augmented Reality an interesting topic? Why is combining real and virtual objects in 3-D
useful? Augmented Reality enhances a user's perception of and interaction with the real world.
The virtual objects display information that the user cannot directly detect with his own senses.
The information conveyed by the virtual objects helps a user perform real-world tasks. AR is a
specific example of what Fred Brooks calls Intelligence Amplification (IA): using the computer as
a tool to make a task easier for a human to perform.
                                    CHAPTER 2
APPLICATIONS
2.1 Medical
Doctors could use Augmented Reality as a visualization and training aid for surgery. It may be
possible to collect 3-D datasets of a patient in real time, using non-invasive sensors like Magnetic
Resonance Imaging (MRI), Computed Tomography scans (CT), or ultrasound imaging. These
datasets could then be rendered and combined in real time with a view of the real patient. In effect,
this would give a doctor "X-ray vision" inside a patient. This would be very useful during
minimally-invasive surgery, which reduces the trauma of an operation by using small incisions or
no incisions at all. A problem with minimally-invasive techniques is that they reduce the doctor's
ability to see inside the patient, making surgery more difficult. AR technology could provide an
internal view without the need for larger incisions.
AR might also be helpful for general medical visualization tasks in the surgical room. Surgeons
can detect some features with the naked eye that they cannot see in MRI or CT scans, and vice-
versa. AR would give surgeons access to both types of data simultaneously. This might also guide
precision tasks, such as displaying where to drill a hole into the skull for brain surgery or where to
perform a needle biopsy of a tiny tumor. The information from the non-invasive sensors would be
directly displayed on the patient, showing exactly where to perform the operation.
AR might also be useful for training purposes. Virtual instructions could remind a novice surgeon
of the required steps, without the need to look away from a patient to consult a manual. Virtual
objects could also identify organs and specify locations to avoid disturbing.
Several projects are exploring this application area. At UNC Chapel Hill, a research group has
conducted trial runs of scanning the womb of a pregnant woman with an ultrasound sensor,
generating a 3-D representation of the fetus inside the womb and displaying that in a see-through
HMD (Figure 2.2). The goal is to endow the doctor with the ability to see the moving, kicking
fetus lying inside the womb, with the hope that this may one day become a "3-D stethoscope".
More recent efforts have focused on a needle biopsy of a breast tumor. Figure 2.3 shows a mockup
of a breast biopsy operation, where the virtual objects identify the location of the tumor and guide
the needle to its target. Other groups at the MIT AI Lab, General Electric, and elsewhere are
investigating displaying MRI or CT data, directly registered onto the patient.
2.2 Manufacturing and Repair
Another category of Augmented Reality applications is the assembly, maintenance, and repair of
complex machinery. Instructions might be easier to understand if they were available, not as
manuals with text and pictures, but rather as 3-D drawings superimposed upon the actual
equipment, showing step-by-step the tasks that need to be done and how to do them. These
superimposed 3-D drawings can be animated, making the directions even more explicit.
Several research projects have demonstrated prototypes in this area. Steve Feiner's group at
Columbia built a laser printer maintenance application, shown in Figures 2.4 and 2.5. Figure 2.4
shows an external view, and Figure 2.5 shows the user's view, where the computer-generated
wireframe is telling the user to remove the paper tray. A group at Boeing is developing AR
technology to guide a technician in building a wiring harness that forms part of an airplane's
electrical system. Storing these instructions in electronic form will save space and reduce costs.
Currently, technicians use large physical layout boards to construct such harnesses, and Boeing
requires several warehouses to store all these boards. Such space might be freed for other uses if
this application proves successful. Boeing is using a Technology Reinvestment Program (TRP)
grant to investigate putting this technology onto the factory floor. Figure 2.6 shows an external
view of Adam Janin using a prototype AR system to build a wire bundle. Eventually, AR might be
used for any complicated machinery, such as automobile engines.
Figure 2.4: External view of the Columbia printer maintenance application. Note that all objects
must be tracked.
Figure 2.5: Prototype laser printer maintenance application, displaying how to remove the paper
tray.
Figure 2.6: Adam Janin demonstrates Boeing's prototype wire bundle assembly application.
2.3 Annotation and Visualization
AR could be used to annotate objects and environments with public or private information.
Applications using public information assume the availability of public databases to draw upon.
For example, a hand-held display could provide information about the contents of library shelves
as the user walks around the library. At the European Computer-Industry Research Centre
(ECRC), a user can point at parts of an engine model and the AR system displays the name of the
part that is being pointed at. Figure 2.7 shows this, where the user points at the exhaust manifold
on an engine model and the label "exhaust manifold" appears.
Figure 2.7: Engine model part labels appear as the user points at them.
Alternately, these annotations might be private notes attached to specific objects. Researchers at
Columbia demonstrated this with the notion of attaching windows from a standard user interface
onto specific locations in the world, or attached to specific objects as reminders. Figure 2.8 shows
a window superimposed as a label upon a student. He wears a tracking device, so the computer
knows his location. As the student moves around, the label follows his location, providing the AR
user with a reminder of what he needs to talk to the student about.
AR might aid general visualization tasks as well. An architect with a see-through HMD might be
able to look out a window and see how a proposed new skyscraper would change her view. If a
database containing information about a building's structure was available, AR might give
architects "X-ray vision" inside a building, showing where the pipes, electric lines, and structural
supports are inside the walls. Researchers at the University of Toronto have built a system called
Augmented Reality through Graphic Overlays on Stereovideo (ARGOS), which among other
things is used to make images easier to understand during difficult viewing conditions. Figure 2.9
shows wireframe lines drawn on top of a space shuttle bay interior, while in orbit. The lines make
it easier to see the geometry of the shuttle bay. Similarly, virtual lines and objects could aid
navigation and scene understanding during poor visibility conditions, such as underwater or in fog.
Figure 2.9: Virtual lines help display the geometry of the shuttle bay, as seen in orbit.
2.4 Robot Path Planning
Teleoperation of a robot is often a difficult problem, especially when the robot is far away and
there are long delays in the communication link. Under these circumstances, instead of controlling
the robot directly, it may be preferable to control a virtual version of it. The user plans and
specifies the robot's actions by manipulating the local virtual version in real time, and the results
are displayed directly on the real world. Once the plan is tested and determined, the user then tells
the real robot to execute the specified plan. This avoids pilot-induced oscillations caused by the
lengthy delays. The virtual versions can also predict the effects of manipulating the environment,
thus serving as a planning and previewing tool to aid the user in performing the desired task. The
ARGOS system has demonstrated that stereoscopic AR is an easier and more accurate way of
doing robot path planning than traditional monoscopic interfaces. Others have also used registered
overlays with telepresence systems. Figure 2.10 shows how a virtual outline can represent a future
location of a robot arm.
2.5 Entertainment
At SIGGRAPH '95, several exhibitors showed "Virtual Sets" that merge real actors with virtual
backgrounds, in real time and in 3-D. The actors stand in front of a large blue screen, while a
computer-controlled motion camera records the scene.
Since the camera's location is tracked, and the actor's motions are scripted, it is possible to digitally
composite the actor into a 3-D virtual background. For example, the actor might appear to stand inside
a large virtual spinning ring, where the front part of the ring covers the actor while the rear part of the
ring is covered by the actor. The entertainment industry sees this as a way to reduce production costs:
creating and storing sets virtually is potentially cheaper than constantly building new physical sets
from scratch. The ALIVE project from the MIT Media Lab goes one step further by populating
the environment with intelligent virtual creatures that respond to user actions.
2.6 Military Aircraft
For many years, military aircraft and helicopters have used Head-Up Displays (HUDs) and
Helmet-Mounted Sights (HMS) to superimpose vector graphics upon the pilot's view of the real
world. Besides providing basic navigation and flight information, these graphics are sometimes
registered with targets in the environment, providing a way to aim the aircraft's weapons. For
example, the chin turret in a helicopter gunship can be slaved to the pilot's HMS, so the pilot can
aim the chin turret simply by looking at the target. Future generations of combat aircraft will be
developed with an HMD built into the pilot's helmet.
CHAPTER 3
APPLICATION IN ELECTRONICS
Children with a keen interest in electronics stand to benefit greatly from augmented reality. The
technology can help them explore their creative side and experiment with building new electronic
devices without difficulty, rather than being limited by the components physically at hand. Nor is
this restricted to children: people of every age can learn about electronic parts and how they fit
together. AR therefore offers good educational scope for individuals of all ages.
Figure 3.2: An augmented reality based mobile app showing a circuit and its components.
CHAPTER 4
CHARACTERISTICS
This section discusses the characteristics of AR systems and design issues encountered when
building an AR system. Section 4.1 describes the basic characteristics of augmentation. There are
two ways to accomplish this augmentation: optical or video technologies. Section 4.2 discusses
their characteristics and relative strengths and weaknesses. Blending the real and virtual poses
problems with focus and contrast (Section 4.3), and some applications require portable AR
systems to be truly effective (Section 4.4). Finally, Section 4.5 summarizes the characteristics by
comparing the requirements of AR against those for Virtual Environments.
4.1 Augmentation
Besides adding objects to a real environment, Augmented Reality also has the potential to remove
them. Current work has focused on adding virtual objects to a real environment. However, graphic
overlays might also be used to remove or hide parts of the real environment from a user. For
example, to remove a desk in the real environment, draw a representation of the real walls and
floors behind the desk and "paint" that over the real desk, effectively removing it from the user's
sight. This has been done in feature films. Doing this interactively in an AR system will be much
harder, but this removal may not need to be photorealistic to be effective.
Augmented Reality might apply to all senses, not just sight. So far, researchers have focused on
blending real and virtual images and graphics. However, AR could be extended to include sound.
The user would wear headphones equipped with microphones on the outside. The headphones
would add synthetic, directional 3-D sound, while the external microphones would detect
incoming sounds from the environment. This would give the system a chance to mask or cover up
selected real sounds from the environment by generating a masking signal that exactly canceled
the incoming real sound. While this would not be easy to do, it might be possible.
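As a rough illustration of the masking idea (not part of the original survey), the Python sketch below generates an idealized cancellation signal by inverting the incoming waveform. The function name and gain parameter are hypothetical, and a real system would also have to estimate the acoustic path and act within a very tight latency budget.

    import numpy as np

    def masking_signal(incoming_samples, gain=1.0):
        # Idealized cancellation: play the incoming sound back inverted,
        # so the two waveforms sum to (near) silence at the ear. This
        # ignores propagation delay and room acoustics entirely.
        return -gain * np.asarray(incoming_samples, dtype=float)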
Another example is haptics. Gloves with devices that provide tactile feedback might augment real
forces in the environment. For example, a user might run his hand over the surface of a real desk.
Simulating such a hard surface virtually is fairly difficult, but it is easy to do in reality. The tactile
effectors in the glove can then augment the feel of the desk, perhaps making it feel rough in
certain spots. This capability might be useful in some applications, such as providing an additional
cue that a virtual object is at a particular location on a real desk.
4.2 Optical versus Video Blending
A basic design decision in building an AR system is how to accomplish the combining of real and
virtual. Two basic choices are available: optical and video technologies. Each has particular
advantages and disadvantages. This section compares the two and notes the tradeoffs.
A see-through HMD is one device used to combine real and virtual. Standard closed-view HMDs
do not allow any direct view of the real world. In contrast, a see-through HMD lets the user see the
real world, with virtual objects superimposed by optical or video technologies.
Optical see-through HMDs work by placing optical combiners in front of the user's eyes. These
combiners are partially transmissive, so that the user can look directly through them to see the real
world. The combiners are also partially reflective, so that the user sees virtual images bounced off
the combiners from head-mounted monitors. This approach is similar in nature to Head-Up
Displays (HUDs) commonly used in military aircraft, except that the combiners are attached to the
head. Thus, optical see-through HMDs have sometimes been described as a "HUD on a head".
Figure 4.1 shows a conceptual diagram of an optical see-through HMD. Figure 4.2 shows two
optical see-through HMDs made by Hughes Electronics.
The optical combiners usually reduce the amount of light that the user sees from the real world.
Since the combiners act like half-silvered mirrors, they only let in some of the light from the real
world, so that they can reflect some of the light from the monitors into the user's eyes. For
example, one optical see-through HMD described in the literature transmits about 30% of the incoming light from the real world.
Choosing the level of blending is a design problem. More sophisticated combiners might vary the
level of contributions based upon the wavelength of light. For example, such a combiner might be
set to reflect all light of a certain wavelength and none at any other wavelengths. This would be
ideal with a monochrome monitor. Virtually all the light from the monitor would be reflected into
the user's eyes, while almost all the light from the real world (except at the particular wavelength)
would reach the user's eyes. However, most existing optical see-through HMDs do reduce the
amount of light from the real world, so they act like a pair of sunglasses when the power is cut off.
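As a simple numerical sketch of this tradeoff (an illustration, not a description of any particular HMD), the light reaching the eye can be modeled as a weighted sum of the transmitted real scene and the reflected monitor image. The function and its default coefficients are assumptions:

    import numpy as np

    def optical_combiner(real_scene, monitor_image,
                         transmission=0.3, reflectance=0.7):
        # A half-silvered combiner passes a fraction of the real-world
        # light and reflects a fraction of the monitor's light; the two
        # fractions cannot sum to more than 1.
        assert transmission + reflectance <= 1.0
        return (transmission * np.asarray(real_scene, dtype=float)
                + reflectance * np.asarray(monitor_image, dtype=float))

With transmission set to 0.3, the model reproduces the roughly 30% figure quoted above: the real world reaches the eye dimmed, much as through sunglasses.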
In contrast, video see-through HMDs work by combining a closed-view HMD with one or two
head-mounted video cameras. The video cameras provide the user's view of the real world. Video
from these cameras is combined with the graphic images created by the scene generator, blending
the real and virtual. The result is sent to the monitors in front of the user's eyes in the closed-view
HMD. Figure 4.3 shows a conceptual diagram of a video see-through HMD. Figure 4.4 shows an
actual video see-through HMD, with two video cameras mounted on top of a Flight Helmet.
Video composition can be done in more than one way. A simple way is to use chroma-keying: a
technique used in many video special effects. The background of the computer graphic images is
set to a specific color, say green, which none of the virtual objects use. Then the combining step
replaces all green areas with the corresponding parts from the video of the real world. This has the
effect of superimposing the virtual objects over the real world. A more sophisticated composition
would use depth information. If the system had depth information at each pixel for the real world
images, it could combine the real and virtual images by a pixel-by-pixel depth comparison. This
would allow real objects to cover virtual objects and vice-versa.
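The two composition strategies just described can be sketched in a few lines of NumPy. These are hypothetical helper functions, assuming 8-bit RGB frames and, for the second strategy, per-pixel depth maps of matching size:

    import numpy as np

    def chroma_key(virtual_rgb, real_rgb, key=(0, 255, 0)):
        # Wherever the rendered frame shows the reserved key color,
        # substitute the corresponding pixel from the real-world video.
        mask = np.all(virtual_rgb == np.array(key, dtype=np.uint8), axis=-1)
        out = virtual_rgb.copy()
        out[mask] = real_rgb[mask]
        return out

    def depth_composite(virtual_rgb, virtual_depth, real_rgb, real_depth):
        # Per-pixel depth test: the nearer surface wins, so real objects
        # can occlude virtual ones and vice-versa.
        virtual_nearer = (virtual_depth < real_depth)[..., np.newaxis]
        return np.where(virtual_nearer, virtual_rgb, real_rgb)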
AR systems can also be built using monitor-based configurations, instead of see-through HMDs.
Figure 4.5 shows how a monitor-based system might be built. In this case, one or two video
cameras view the environment. The cameras may be static or mobile. In the mobile case, the
cameras might move around by being attached to a robot, with their locations tracked. The video
of the real world and the graphic images generated by a scene generator are combined, just as in
the video see-through HMD case, and displayed in a monitor in front of the user. The user does not
wear the display device. Optionally, the images may be displayed in stereo on the monitor, which
then requires the user to wear a pair of stereo glasses. Figure 4.6 shows an external view of the
ARGOS system, which uses a monitor-based configuration.
Figure 4.6: External view of the ARGOS system, an example of monitor-based AR.
Finally, a monitor-based optical configuration is also possible. This is similar to Figure 4.1 except
that the user does not wear the monitors or combiners on her head. Instead, the monitors and
combiners are fixed in space, and the user positions her head to look through the combiners. This
is typical of Head-Up Displays on military aircraft, and at least one such configuration has been
proposed for a medical application.
       The rest of this section compares the relative advantages and disadvantages of optical and
video approaches, starting with optical. An optical approach has the following advantages over a
video approach:
       1.         Simplicity: Optical blending is simpler and cheaper than video blending. Optical
approaches have only one "stream" of video to worry about: the graphic images. The real world is
seen directly through the combiners, and that time delay is generally a few nanoseconds. Video
blending, on the other hand, must deal with separate video streams for the real and virtual images.
Both streams have inherent
delays in the tens of milliseconds. Digitizing video images usually adds at least one frame time of
delay to the video stream, where a frame time is how long it takes to completely update an image.
A monitor that completely refreshes the screen at 60 Hz has a frame time of 16.67 ms. The two
streams of real and virtual images must be properly synchronized or temporal distortion results.
Also, optical see-through HMDs with narrow field-of-view combiners offer views of the real
world that have little distortion. Video cameras almost always have some amount of distortion that
must be compensated for, along with any distortion from the optics in front of the display devices.
Since video requires cameras and combiners that optical approaches do not need, video will
probably be more expensive and complicated to build than optical-based systems.
       2.         Resolution: Video blending limits the resolution of what the user sees, both real
and virtual, to the resolution of the display devices. With current displays, this resolution is far less
than the resolving power of the fovea. Optical see-through also shows the graphic images at the
resolution of the display device, but the user's view of the real world is not degraded. Thus, video
reduces the resolution of the real world, while optical see-through does not.
       3.         No eye offset: With video see-through, the user's view of the real world is
provided by the video cameras. In essence, this puts his "eyes" where the video cameras are. In
most configurations, the cameras are not located exactly where the user's eyes are, creating an
offset between the cameras and the real eyes. The distance separating the cameras may also not be
exactly the same as the user's interpupillary distance (IPD). This difference between camera
locations and eye locations introduces displacements from what the user sees compared to what he
expects to see. For example, if the cameras are above the user's eyes, he will see the world from a
vantage point slightly taller than he is used to. Video see-through can avoid the eye offset problem
through the use of mirrors to create another set of optical paths that mimic the paths directly into
the user's eyes. Using those paths, the cameras will see what the user's eyes would normally see
without the HMD. However, this adds complexity to the HMD design. Offset is generally not a
difficult design problem for optical see-through displays. While the user's eye can rotate with
respect to the position of the HMD, the resulting errors are tiny. Using the eye's center of rotation
as the viewpoint in the computer graphics model should eliminate any need for eye tracking in an
optical see-through HMD.
       In contrast, video blending offers the following advantages over optical blending:
       1.         Flexibility in composition strategies: A basic problem with optical see-through
is that the virtual objects do not completely obscure the real-world objects, because the optical
combiners allow light from both the virtual and real sources. Building an optical see-through
HMD that can selectively shut out light from the real world is difficult, because in a normal
optical system the objects are designed to be in
focus at only one point in the optical path: the user's eye. Any filter that would selectively block
out light must be placed in the optical path at a point where the image is in focus, which obviously
cannot be the user's eye. Therefore, the optical system must have two places where the image is in
focus: at the user's eye and the point of the hypothetical filter. This makes the optical design much
more difficult and complex. No existing optical see-through HMD blocks incoming light in this
fashion. Thus, the virtual objects appear ghost-like and semi-transparent. This damages the illusion
of reality because occlusion is one of the strongest depth cues. In contrast, video see-through is far
more flexible about how it merges the real and virtual images. Since both the real and virtual are
available in digital form, video see-through compositors can, on a pixel-by-pixel basis, take the
real, or the virtual, or some blend between the two to simulate transparency. Because of this
flexibility, video see-through may ultimately produce more compelling environments than optical
see-through approaches.
       2.         Wide field-of-view: Distortions in optical systems are a function of the radial
distance from the optical axis; the further one looks from the center of the view, the larger the
distortions become. A digitized image taken through a distorted optical system can, however, be
undistorted by applying image processing techniques to unwarp the image, provided that the
optical distortion is well characterized (a sketch of this follows the list). This requires significant
amounts of computation, but this constraint will be less important in the future as computers
become faster. It is harder to build wide
field-of-view displays with optical see-through techniques. Any distortions of the user's view of
the real world must be corrected optically, rather than digitally, because the system has no
digitized image of the real world to manipulate. Complex optics are expensive and add weight to
the HMD. Wide field-of-view systems are an exception to the general trend of optical approaches
being simpler and cheaper than video approaches.
       3.         Real and virtual view delays can be matched: Video offers an approach for
reducing or avoiding problems caused by temporal mismatches between the real and virtual
images. Optical see-through HMDs offer an almost instantaneous view of the real world but a
delayed view of the virtual. This temporal mismatch can cause problems. With video approaches,
it is possible to delay the video of the real world to match the delay from the virtual image stream
(a minimal sketch of this also follows the list).
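To illustrate the digital unwarping mentioned in the second advantage above, here is a minimal sketch using OpenCV's undistort call. The calibration values are placeholders, not parameters of any real camera:

    import cv2
    import numpy as np

    # Intrinsics and distortion coefficients would come from a prior
    # camera calibration; the numbers below are placeholders.
    camera_matrix = np.array([[800.0,   0.0, 320.0],
                              [  0.0, 800.0, 240.0],
                              [  0.0,   0.0,   1.0]])
    dist_coeffs = np.array([-0.25, 0.08, 0.0, 0.0, 0.0])  # k1 k2 p1 p2 k3

    frame = cv2.imread("camera_frame.png")
    unwarped = cv2.undistort(frame, camera_matrix, dist_coeffs)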
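And as a minimal sketch of the delay matching in the third advantage, assuming the rendering pipeline lags the camera by a known, fixed number of frames (the class and its interface are hypothetical):

    from collections import deque

    class FrameDelayLine:
        # Buffer real-world frames so they leave the queue in step
        # with the delayed virtual image stream.
        def __init__(self, delay_frames):
            self._buffer = deque(maxlen=delay_frames + 1)

        def push(self, frame):
            self._buffer.append(frame)
            # Return the oldest buffered frame; until the queue fills,
            # this simply repeats the earliest frame received.
            return self._buffer[0]

    # Example: a renderer running two frame times behind the camera.
    delay_line = FrameDelayLine(delay_frames=2)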
       Both optical and video technologies have their roles, and the choice of technology depends
on the application requirements. Many of the mechanical assembly and repair prototypes use
optical approaches, possibly because of the cost and safety issues. If successful, the equipment
would have to be replicated in large numbers to equip workers on a factory floor. In contrast, most
of the prototypes for
medical applications use video approaches, probably for the flexibility in blending real and virtual
and for the additional registration strategies offered.
4.3 Focus and Contrast
Focus can be a problem for both optical and video approaches. Ideally, the virtual should match
the real. In a video-based system, the combined virtual and real image will be projected at the
same distance by the monitor or HMD optics. However, depending on the video camera's depth-
of-field and focus settings, parts of the real world may not be in focus. In typical graphics
software, everything is rendered with a pinhole model, so all the graphic objects, regardless of
distance, are in focus. To overcome this, the graphics could be rendered to simulate a limited
depth-of-field, and the video camera might have an autofocus lens.
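One crude way to simulate such a limited depth-of-field is sketched below, under the assumption that a per-pixel depth map of the scene is available; the blending rule is an illustration, not any renderer's actual algorithm:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def fake_depth_of_field(rgb, depth, focus_distance):
        # Blend a sharp and a blurred copy of the frame: the weight of
        # the blurred copy grows with distance from the focal plane.
        blurred = gaussian_filter(rgb.astype(float), sigma=(3, 3, 0))
        weight = np.clip(np.abs(depth - focus_distance) / focus_distance,
                         0.0, 1.0)[..., np.newaxis]
        out = (1.0 - weight) * rgb + weight * blurred
        return out.astype(rgb.dtype)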
In the optical case, the virtual image is projected at some distance away from the user. This
distance may be adjustable, although it is often fixed. Therefore, while the real objects are at
varying distances from the user, the virtual objects are all projected to the same distance. If the
virtual and real distances are not matched for the particular objects that the user is looking at, it
may not be possible to clearly view both simultaneously.
Contrast is another issue because of the large dynamic range in real environments and in what the
human eye can detect. Ideally, the brightness of the real and virtual objects should be appropriately
matched. Unfortunately, in the worst case scenario, this means the system must match a very large
range of brightness levels. The eye is a logarithmic detector, where the brightest light that it can
handle is about eleven orders of magnitude greater than the smallest, including both dark-adapted
and light-adapted eyes. In any one adaptation state, the eye can cover about six orders of
magnitude. Most display devices cannot come close to this level of contrast. This is a particular
problem with optical technologies, because the user has a direct view of the real world. If the real
environment is too bright, it will wash out the virtual image.
If the real environment is too dark, the virtual image will wash out the real world. Contrast
problems are not as severe with video, because the video cameras themselves have limited
dynamic response, and the view of both the real and virtual is generated by the monitor, so
everything must be clipped or compressed into the monitor's dynamic range.
4.4 Portability
In almost all Virtual Environment systems, the user is not encouraged to walk around much.
Instead, the user navigates by "flying" through the environment, walking on a treadmill, or driving
some mockup of a vehicle. Whatever the technology, the result is that the user stays in one place in
the real world.
Some AR applications, however, will need to support a user who will walk around a large
environment. AR requires that the user actually be at the place where the task is to take place.
"Flying," as performed in a VE system, is no longer an
option. If a mechanic needs to go to the other side of a jet engine, she must physically move
herself and the display devices she wears. Therefore, AR systems will place a premium on
portability, especially the ability to walk around outdoors, away from controlled environments.
The scene generator, the HMD, and the tracking system must all be self-contained and capable of
surviving exposure to the environment. If this capability is achieved, many more applications that
have not been tried will become available. For example, the ability to annotate the surrounding
environment could be useful to soldiers, hikers, or tourists in an unfamiliar location.
4.5 Comparison Against Virtual Environments
The overall requirements of AR can be summarized by comparing them against the requirements
for Virtual Environments, for the three basic subsystems that they require.
       1.         Scene generator: Rendering is not currently one of the major problems in AR. VE systems have
much higher requirements for realistic images because they completely replace the real world with
the virtual environment. In AR, the virtual images only supplement the real world. Therefore,
fewer virtual objects need to be drawn, and they do not necessarily have to be realistically
rendered in order to serve the purposes of the application. For example, in the annotation
applications, text and 3-D wireframe drawings might suffice. Ideally, photorealistic graphic
objects would be seamlessly merged with the real environment (see Section 7), but more basic
problems have to be solved first.
       2.         Display device: The display devices used in AR may have less stringent
requirements than VE systems demand, again because AR does not replace the real world. For
example, monochrome displays may be adequate for some AR applications, while virtually all VE
systems today use full color. Optical see-through HMDs with a small field-of-view may be
satisfactory because the user can still see the real world with his peripheral vision; the see-through
HMD does not shut off the user's normal field-of-view. Furthermore, the resolution of the monitor
in an optical see-through HMD might be lower than what a user would tolerate in a VE
application, since the optical see-through HMD does not reduce the resolution of the real
environment.
       3.         Tracking and sensing: While the previous two subsystems have lower
requirements in AR than in VE, tracking and sensing do not. Here the requirements for AR are
much stricter than for VE systems, because AR demands far more accurate registration of the real
and virtual.
CHAPTER 5
CONCLUSION
Augmented Reality is far behind Virtual Environments in maturity. Several commercial vendors
sell complete, turnkey Virtual Environment systems. However, no commercial vendor currently
sells an HMD-based Augmented Reality system. A few monitor-based "virtual set" systems are
available, but today AR systems are primarily found in academic and industrial research
laboratories.
        The first deployed HMD-based AR systems will probably be in the application of aircraft
manufacturing. Both Boeing and McDonnell Douglas are exploring this technology. The former
uses optical approaches, while the latter is pursuing video approaches. Boeing has performed trial
runs with workers using a prototype system but has not yet made any deployment decisions.
Annotation and visualization applications in restricted, limited-range environments are deployable
today, although much more work needs to be done to make them cost effective and flexible.
Applications in medical visualization will take longer. Prototype visualization aids have been used
on an experimental basis, but the stringent registration requirements and ramifications of mistakes
will postpone common usage for many years. AR will probably be used for medical training
before it is commonly used in surgery.
The next generation of combat aircraft will have Helmet-Mounted Sights with graphics registered
to targets in the environment. These displays, combined with short-range steerable missiles that
can shoot at targets off-boresight, give a tremendous combat advantage to pilots in dogfights.
Instead of having to be directly behind his target in order to shoot at it, a pilot can now shoot at
anything within a 60-90 degree cone of his aircraft's forward centerline. Russia and Israel currently
have systems with this capability, and the U.S. is expected to field the AIM-9X missile with its
associated Helmet-Mounted Sight in 2002. Registration errors due to delays are a major problem
in this application.
Augmented Reality is a relatively new field, where most of the research efforts have occurred in
the past four years, as shown by the references listed at the end of this report. The SIGGRAPH
"Rediscovering Our Fire" report identified Augmented Reality as one of four areas where
SIGGRAPH should encourage more submissions. Because of the numerous challenges and
unexplored avenues in this area, AR will remain a vibrant area of research for at least the next
several years.
One area where a breakthrough is required is tracking an HMD outdoors at the accuracy required
by AR. If this is accomplished, several interesting applications will become possible. Two
examples are described here: navigation maps and visualization of past and future environments.
The first application is a navigation aid to people walking outdoors. These individuals could be
soldiers advancing upon their objective, hikers lost in the woods, or tourists seeking directions to
their intended destination. Today, these individuals must pull out a physical map and associate
what they see in the real environment around them with the markings on the 2-D map. If
landmarks are not easily identifiable, this association can be difficult to perform, as anyone lost in
the woods can attest. An AR system makes navigation easier by performing the association step
automatically. If the user's position and orientation are known, and the AR system has access to a
digital map of the area, then the AR system can draw the map in 3-D directly upon the user's view.
The user looks at a nearby mountain and sees graphics directly overlaid on the real environment
explaining the mountain's name, how tall it is, how far away it is, and where the trail is that leads
to the top.
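To make the association step concrete, here is a hypothetical sketch of projecting one map landmark into the user's view with a pinhole camera model. It assumes the tracker supplies the user's position and a world-to-head rotation matrix, and that the display's focal length and principal point are known:

    import numpy as np

    def project_landmark(landmark_xyz, user_xyz, world_to_head,
                         focal_px, cx, cy):
        # Transform the map point into head coordinates, then apply a
        # pinhole projection to get display pixel coordinates.
        p = world_to_head @ (np.asarray(landmark_xyz, dtype=float)
                             - np.asarray(user_xyz, dtype=float))
        if p[2] <= 0.0:
            return None  # landmark is behind the viewer
        return (focal_px * p[0] / p[2] + cx,
                focal_px * p[1] / p[2] + cy)

The mountain's name and trail overlay would then be drawn at the returned pixel location; how well the graphics line up depends entirely on the accuracy of the tracked position and orientation.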
The second application is visualization of locations and events as they were in the past or as they
will be after future changes are performed. Tourists that visit historical sites, such as a Civil War
battlefield or the Acropolis in Athens, Greece, do not see these locations as they were in the past,
due to changes over time. It is often difficult for a modern visitor to imagine what these sites really
looked like in the past. To help, some historical sites stage "Living History" events where
volunteers wear ancient clothes and reenact historical events. A tourist equipped with an outdoors
AR system could see a computer-generated version of Living History. The HMD could cover up
modern buildings and monuments in the background and show, directly on the grounds at
Gettysburg, where the Union and Confederate troops were at the fateful moment of Pickett's
charge. The gutted interior of the modern Parthenon would be filled in by computer-generated
representations of what it looked like in 430 BC, including the long-vanished gold statue of
Athena in the middle. Tourists and students walking around the grounds with such AR displays
would gain a much better understanding of these historical sites and the important events that took
place there. Similarly, AR displays could show what proposed architectural changes would look
like before they are carried out. An urban designer could show clients and politicians what a new
stadium would look like as they walked around the adjoining neighborhood, to better understand
how the stadium project will affect nearby residents.
After the basic problems with AR are solved, the ultimate goal will be to generate virtual objects
that are so realistic that they are virtually indistinguishable from the real environment.
Photorealism has been demonstrated in feature films, but accomplishing this in an interactive
application will be much harder. Lighting conditions, surface reflections, and other properties must
be measured automatically, in real time. More sophisticated lighting, texturing, and shading
capabilities must run at interactive rates in future scene generators. Registration must be nearly
perfect, without manual intervention or adjustments. While these are difficult problems, they are
probably not insurmountable. It took about 25 years to progress from drawing stick figures on a
screen to the photorealistic dinosaurs in "Jurassic Park." Within another 25 years, we should be
able to wear a pair of AR glasses outdoors to see and interact with photorealistic dinosaurs eating a
tree in our backyard.
CHAPTER 6
REFERENCES
Note: some of these references are available electronically at the following sites on the World
Wide Web:
[1] https://en.wikipedia.org/wiki/Augmented_reality
[2] http://www.augmentedrealitytrends.com/augmented-reality/5-ways-to-use-augmented-reality-in-electronics.html
[3] https://www.electronicdesign.com/embedded/6-things-know-about-augmented-reality
[4] http://www.cs.columbia.edu/graphics/
[5] http://www.ecrc.de/
[6] http://www.ai.mit.edu/
[7] http://www.cs.unc.edu/
[8] http://vered.rose.utoronto.ca/etc-lab.html/