
HIGH-FIDELITY DATA EMBEDDING FOR IMAGE

ANNOTATION

Abstract
In this project, we propose a novel method for embedding data in an
image. High fidelity is a demanding requirement for data hiding,
especially for images with artistic or medical value. This
correspondence proposes a high-fidelity image watermarking for
annotation with robustness to moderate distortion. To achieve the high
fidelity of the embedded image, we introduce a visual perception
model that aims at quantifying the local tolerance to noise for arbitrary
imagery. Based on this model, we embed two kinds of watermarks: a
pilot watermark that indicates the existence of the watermark and an
information watermark that conveys a payload of several dozen bits.
The objective is to embed 32 bits of metadata into a single image in
such a way that it is robust to JPEG compression and cropping. We
demonstrate the effectiveness of the visual model and the application
of the proposed annotation technology using a database of challenging
photographs and medical images that contain a large amount of
smooth regions. The embedding and detection are implemented and
evaluated in MATLAB.

CHAPTER I
INTRODUCTION

The term digital image processing refers to the processing of a two-dimensional
picture by a digital computer. In a broader context, it implies digital processing of any
two-dimensional data. A digital image is an array of real or complex numbers represented
by a finite number of bits. An image given in the form of a transparency, slide, photograph
or an X-ray is first digitized and stored as a matrix of binary digits in computer memory.
This digitized image can then be processed and/or displayed on a high-resolution
television monitor. For display, the image is stored in a rapid-access buffer memory,
which refreshes the monitor at a rate of 25 frames per second to produce a visually
continuous display.

1.1 THE IMAGE PROCESSING SYSTEM


A typical digital image processing system is given in Fig. 1.1. It consists of a
digitizer, an image processor, a digital computer, a display, mass storage, an operator
console, and a hard copy device.

Fig 1.1 Block Diagram of a Typical Image Processing System

1.1.1 DIGITIZER
A digitizer converts an image into a numerical representation suitable for input
into a digital computer. Some common digitizers are
1. Microdensitometer
2. Flying spot scanner
3. Image dissector
4. Vidicon camera
5. Photosensitive solid-state arrays.

1.1.2 IMAGE PROCESSOR


An image processor performs the functions of image acquisition, storage,
preprocessing, segmentation, representation, recognition and interpretation, and finally
displays or records the resulting image. The following block diagram gives the
fundamental sequence involved in an image processing system.

The sequence runs from the problem domain through image acquisition,
preprocessing, segmentation, representation and description, and recognition and
interpretation to the result, with a knowledge base guiding every stage.

Fig 1.2 Block Diagram of the Fundamental Sequence Involved in an Image
Processing System

As detailed in the diagram, the first step in the process is image acquisition by an
imaging sensor in conjunction with a digitizer to digitize the image. The next step is the
preprocessing step, where the image is improved before being fed as input to the other
processes. Preprocessing typically deals with enhancing, removing noise, isolating
regions, etc. Segmentation partitions an image into its constituent parts or objects. The
output of segmentation is usually raw pixel data, which consists of either the boundary of
the region or the pixels in the region themselves. Representation is the process of
transforming the raw pixel data into a form useful for subsequent processing by the
computer. Description deals with extracting features that are basic in differentiating one
class of objects from another. Recognition assigns a label to an object based on the
information provided by its descriptors. Interpretation involves assigning meaning to an
ensemble of recognized objects. The knowledge about a problem domain is incorporated
into the knowledge base. The knowledge base guides the operation of each processing
module and also controls the interaction between the modules. Not all modules need be
necessarily present for a specific function. The composition of the image processing
system depends on its application. The frame rate of the image processor is normally
around 25 frames per second.

1.1.3 DIGITAL COMPUTER


Mathematical processing of the digitized image such as convolution, averaging,
addition, subtraction, etc. are done by the computer.

1.1.4 MASS STORAGE


The secondary storage devices normally used are floppy disks, CD ROMs etc.

1.1.5 HARD COPY DEVICE

The hard copy device is used to produce a permanent copy of the image and to
store the software involved.

1.1.6 OPERATOR CONSOLE


The operator console consists of equipment and arrangements for verification of
intermediate results and for alterations in the software as and when required. The operator
can also check for any resulting errors and enter requisite data.

CHAPTER II
IMAGE PROCESSING FUNDAMENTALS
2.1 INTRODUCTION
Digital image processing refers to processing of the image in digital form. Modern
cameras may directly capture the image in digital form, but generally images originate
in optical form. They are captured by video cameras and digitized. The digitization
process includes sampling and quantization. These images are then processed by at least
one of the five fundamental processes described below, not necessarily all of them.

2.2 IMAGE PROCESSING TECHNIQUES


This section outlines the principal image processing techniques: image
enhancement, image restoration, image analysis, image compression, and image
synthesis.

Fig 2.2.1 Image Processing Techniques

2.2.1 IMAGE ENHANCEMENT


Image enhancement operations improve the quality of an image, for example by
improving its contrast and brightness characteristics, reducing its noise content, or
sharpening its details. Enhancement only makes the same information more
understandable in the image; it does not add any information to it.

2.2.2 IMAGE RESTORATION


Image restoration, like enhancement, improves the quality of an image, but all its
operations are based on known, measured, or estimated degradations of the original image.
Image restoration is used to restore images with problems such as geometric distortion,
improper focus, repetitive noise, and camera motion. It is used to correct images for
known degradations.

2.2.3 IMAGE ANALYSIS


Image analysis operations produce numerical or graphical information based on
characteristics of the original image. They break the image into objects and then classify
them, relying on image statistics. Common operations are extraction and description of
scene and image features, automated measurements, and object classification. Image
analysis is mainly used in machine vision applications.

2.2.4 IMAGE COMPRESSION

Image compression and decompression reduce the data content necessary to
describe the image. Most images contain a lot of redundant information; compression
removes the redundancies. Because the size is reduced, the image can be stored or
transported efficiently. The compressed image is decompressed when displayed. Lossless
compression preserves the exact data of the original image, whereas lossy compression
does not represent the original image exactly but provides much stronger compression.

2.2.5 IMAGE SYNTHESIS


Image synthesis operations create images from other images or non-image data.
Image synthesis operations generally create images that are either physically impossible
or impractical to acquire.

2.3 APPLICATIONS OF DIGITAL IMAGE PROCESSING


Digital image processing has a broad spectrum of applications, such as remote
sensing via satellites and other spacecrafts, image transmission and storage for business
applications, medical processing, radar, sonar and acoustic image processing, robotics
and automated inspection of industrial parts.

2.3.1 MEDICAL APPLICATIONS


In medical applications, one is concerned with processing of chest X-rays,
cineangiograms, projection images of transaxial tomography and other medical images
that occur in radiology, nuclear magnetic resonance (NMR) and ultrasonic scanning.
These images may be used for patient screening and monitoring or for detection of
tumours or other diseases in patients.

2.3.2 SATELLITE IMAGING

Images acquired by satellites are useful in tracking of earth resources;
geographical mapping; prediction of agricultural crops, urban growth and weather; flood
and fire control; and many other environmental applications. Space image applications
include recognition and analysis of objects contained in images obtained from deep
space-probe missions.

2.3.3 COMMUNICATION
Image transmission and storage applications occur in broadcast television,
teleconferencing, and transmission of facsimile images for office automation,
communication of computer networks, closed-circuit television based security monitoring
systems and in military communications.

2.3.4 RADAR IMAGING SYSTEMS


Radar and sonar images are used for detection and recognition of various types of
targets or in guidance and manoeuvring of aircraft or missile systems.

2.3.5 DOCUMENT PROCESSING


It is used in scanning, and transmission for converting paper documents to a
digital image form, compressing the image, and storing it on magnetic tape. It is also used
in document reading for automatically detecting and recognizing printed characters.

2.3.6 DEFENSE/INTELLIGENCE
It is used in reconnaissance photo-interpretation for automatic interpretation of
earth satellite imagery to look for sensitive targets or military threats, and in target
acquisition and guidance for recognizing and tracking targets in real-time smart-bomb
and missile-guidance systems.

BLOCK DIAGRAM

The input image is passed through two standard-deviation (STD) filters of sizes
3x3 and 5x5, whose outputs are subtracted, and through an entropy filter; the two results
are combined by a multiplier with scaling and normalization, then refined by a spike
filter and a low-pass filter to produce the complexity map.

EXISTING SYSTEMS

A simplified frequency-scaling model was used in spread-spectrum watermarking,
described in "Realizing upper-resolution with superimposed projection". More explicit
masking models, including frequency masking and spatial masking, are discussed in "On
the resolution limits of superimposed projection".

DISADVANTAGES

There are limits due to the finite signal ranges of the subframes in practical cases.

The reconstruction filter and the sampling may independently cause the signal
limits to be exceeded, making it practically impossible to perfectly reproduce all input
signal frequencies and amplitudes.

PROPOSED METHOD

This correspondence proposes a high-fidelity image watermarking for annotation
with robustness to moderate distortion. To achieve the high fidelity of the embedded
image, we introduce a visual perception model that aims at quantifying the local
tolerance to noise for arbitrary imagery. Based on this model, we embed two kinds of
watermarks: a pilot watermark that indicates the existence of the watermark and an
information watermark that conveys a payload of several dozen bits. The objective is to
embed 32 bits of metadata into a single image in such a way that it is robust to JPEG
compression and cropping.

ADVANTAGES

The two outputs are mixed using a nonlinear function and a smoothing low-pass
filter in a post-processing step. As a result, image local regions with sharp transitions, as
well as uniformly or smoothly colored areas, are distinguished as highly sensitive to
noise, whereas areas with random texture are identified as tolerant to noise.

The results on a database of highly challenging photographic images and medical
images show the effectiveness of the proposed annotation technology.

DOMAIN: DIGITAL IMAGE PROCESSING


Digital image processing is the use of computer algorithms to perform image
processing on digital images. As a subfield of digital signal processing, digital image
processing has many advantages over analog image processing; it allows a much wider
range of algorithms to be applied to the input data, and can avoid problems such as the
build-up of noise and signal distortion during processing.

SOFTWARE REQUIREMENT
MATLAB 7.0 and above
MATLAB is a high-performance language for technical computing. It integrates
computation, visualization, and programming in an easy-to-use environment where
problems and solutions are expressed in familiar mathematical notation. Typical uses
include:

Math and computation

Algorithm development

Modeling, simulation, and prototyping

Data analysis, exploration, and visualization

Scientific and engineering graphics

Application development, including Graphical User Interface building

MATLAB is an interactive system whose basic data element is an array that does not
require dimensioning. This allows you to solve many technical computing problems,
especially those with matrix and vector formulations, in a fraction of the time it would
take to write a program in a scalar non-interactive language such as C or FORTRAN.

INTRODUCTION:
Data embedding provides a means to seamlessly associate
annotation data with host images. One of the key requirements of
data-embedding-based annotation is high fidelity, in order to preserve
the artistic or medical value of host images. A proper visual model is
important to control the distortion introduced by the embedded
information. A number of human visual models have been proposed
and employed in the watermarking literature; early spread-spectrum
watermarking work employed a simplified frequency-scaling model,
and more explicit masking models, including frequency masking and
spatial masking, were developed later. Building on these works, this
project proposes a refined visual model to reduce ringing artifacts
along edges. The stringent requirement in high-fidelity applications
calls for more research into further reducing perceptual artifacts in the
data embedding process.
To achieve such a goal, we introduce a pixel-domain visual model
which combines the outputs of two filters, an entropy filter and a
differential standard deviation filter, to estimate visual sensitivity to
noise. The two outputs are mixed using a nonlinear function and a
smoothing low-pass filter in a post-processing step. As a result, image
local regions with sharp transitions, as well as uniformly or smoothly
colored areas, are distinguished as highly sensitive to noise, whereas
areas with random texture are identified as tolerant to noise. Based on
the developed visual model, we design a data embedding system that
annotates images with 32 bits of metadata in a way that is robust to
JPEG compression and cropping. The 32-bit metadata is sufficient to
index a database of around 4 billion images.
To meet these common requirements in image annotation
applications, the proposed system embeds two watermarks: a pilot
watermark and an information watermark. The pilot watermark is a
strong direct-sequence spread-spectrum (SS) watermark that signals
the existence of the metadata, and it is tiled over the image in the
lapped biorthogonal transform (LBT) domain. On top of the pilot
watermark, we embed the information watermark denoting the
metadata bits using a regional statistical quantization method. The
quantization noise is optimized to ensure the detection of both pilot
and information watermarks, as well as to obey the constraints
imposed by the perceptual model. The detection is performed blindly,
without information from the host signal.

SCOPE OF THE PROJECT:


The main objective is to propose a high-fidelity image
watermarking for annotation with robustness to moderate distortion.
To achieve this, a visual perception model is introduced to quantify the
localized tolerance to noise for arbitrary imagery. The model is built by
mixing the outputs from an entropy filter and a differential localized
standard deviation filter. It can achieve high fidelity for natural images,
with PSNR around 50 dB and robustness to moderate JPEG
compression, and can meet the near-lossless and almost-lossless
requirements for medical images, also with robustness to moderate
JPEG compression.
STEGANOGRAPHY:

Steganography is the art and science of writing hidden messages in such a way
that no one, apart from the sender and intended recipient, suspects the existence of the
message; it is a form of security through obscurity. The word steganography is of Greek
origin and means "concealed writing", from the Greek words steganos, meaning covered
or protected, and graphein, meaning to write. The first recorded use of the term was in
1499 by Johannes Trithemius in his Steganographia, a treatise on cryptography and
steganography disguised as a book on magic. Generally, messages will appear to be
something else: images, articles, shopping lists, or some other covertext and, classically,
the hidden message may be in invisible ink between the visible lines of a private letter.
The advantage of steganography over cryptography alone is that messages do not
attract attention to themselves. Plainly visible encrypted messages, no matter how
unbreakable, will arouse suspicion, and may in themselves be incriminating in countries
where encryption is illegal. Therefore, whereas cryptography protects the contents of a
message, steganography can be said to protect both messages and communicating parties.
Steganography includes the concealment of information within computer files. In
digital steganography, electronic communications may include steganographic coding
inside of a transport layer, such as a document file, image file, program or protocol.
Media files are ideal for steganographic transmission because of their large size. As a
simple example, a sender might start with an innocuous image file and adjust the color of
every 100th pixel to correspond to a letter in the alphabet, a change so subtle that
someone not specifically looking for it is unlikely to notice it.
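The every-100th-pixel scheme mentioned above can be sketched as follows. This is an illustrative Python/NumPy toy (the project itself is implemented in MATLAB), and the 5-bit alphabet-index encoding in the pixel's low bits is an assumption made for the demo, not part of the original description.

```python
import numpy as np

def hide_message(img, message, step=100):
    # Toy scheme: encode one letter in every step-th pixel by replacing
    # the pixel's five low bits with the letter's alphabet index (0..25).
    flat = img.flatten().astype(np.uint8)
    for i, ch in enumerate(message.lower()):
        code = ord(ch) - ord('a')                      # fits in 5 bits
        pos = i * step
        flat[pos] = (flat[pos] & 0b11100000) | code    # keep 3 high bits
    return flat.reshape(img.shape)

def reveal_message(img, length, step=100):
    # Read the five low bits back out of every step-th pixel.
    flat = img.flatten()
    return ''.join(chr((flat[i * step] & 0b00011111) + ord('a'))
                   for i in range(length))
```

Because only the low bits of a sparse set of pixels change, the modification is visually negligible, which is exactly the point the text makes about steganographic subtlety.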

DIGITAL WATERMARKING:

Digital watermarking is the process of embedding information into a digital
signal in a way that is difficult to remove. The signal may be audio, pictures or video, for
example. If the signal is copied, then the information is also carried in the copy. A signal
may carry several different watermarks at the same time.
In visible watermarking, the information is visible in the picture or video.
Typically, the information is text or a logo which identifies the owner of the media.
When a television broadcaster adds its logo to the corner of transmitted video, this is a
visible watermark.
In invisible watermarking, information is added as digital data to audio, picture or
video, but it cannot be perceived as such (although it may be possible to detect that some
amount of information is hidden). The watermark may be intended for widespread use
and is thus made easy to retrieve or it may be a form of Steganography, where a party
communicates a secret message embedded in the digital signal. In either case, as in
visible watermarking, the objective is to attach ownership or other descriptive
information to the signal in a way that is difficult to remove. It is also possible to use
hidden embedded information as a means of covert communication between individuals.
One application of watermarking is in copyright protection systems, which are
intended to prevent or deter unauthorized copying of digital media. In this use a copy
device retrieves the watermark from the signal before making a copy; the device makes a
decision to copy or not depending on the contents of the watermark. Another application
is in source tracing. A watermark is embedded into a digital signal at each point of
distribution. If a copy of the work is found later, then the watermark can be retrieved
from the copy and the source of the distribution is known. This technique has been
reportedly used to detect the source of illegally copied movies.
Annotation of digital photographs with descriptive information is another
application of invisible watermarking.
While some file formats for digital media can contain additional information
called metadata, digital watermarking is distinct in that the data is carried in the signal
itself. The use of the word watermarking is derived from the much older notion of
placing a visible watermark on paper.
Digital Watermarking can be used for a wide range of applications such as:

Copyright protection

Source Tracking (Different recipients get differently watermarked content)

Broadcast Monitoring (Television news often contains watermarked video from
international agencies)

Covert Communication

IMAGE ANNOTATION:
Automatic image annotation (also known as automatic image tagging) is the
process by which a computer system automatically assigns metadata in the form of
captioning or keywords to a digital image. This application of computer vision techniques
is used in image retrieval systems to organize and locate images of interest from
a database.
This method can be regarded as a type of multi-class image classification with a
very large number of classes - as large as the vocabulary size. Typically, image analysis in
the form of extracted feature vectors and the training annotation words are used
by machine learning techniques to attempt to automatically apply annotations to new
images. The first methods learned the correlations between image features and training
annotations, then techniques were developed using machine translation to try and
translate the textual vocabulary with the 'visual vocabulary', or clustered regions known
as blobs. Work following these efforts have included classification approaches, relevance
models and so on.
The advantages of automatic image annotation versus content-based image
retrieval are that queries can be more naturally specified by the user. CBIR generally (at
present) requires users to search by image concepts such as color and texture, or finding
example queries. Certain image features in example images may override the concept that
the user is really focusing on. The traditional methods of image retrieval, such as those
used by libraries, have relied on manually annotated images, which is expensive and
time-consuming, especially given the large and constantly growing image databases in
existence.

MODULE SEPARATION:
MODULE 1: Introduction of pixel domain visual model
MODULE 2: Design of data embedded system
MODULE 3: Watermark detection

MODULE I:
Introduction of pixel domain visual model
1.1 Design of Noise Tolerance Model
The proposed visual perceptual model is evaluated in the pixel luminance domain,
and we use the combination of local statistics through two filters to quantify the noise
tolerance of each pixel. One filter estimates the entropy of a local region centered at the
pixel-of-interest, and the other computes the differential standard deviation.
The output of the entropy filter indicates the content complexity of the
neighborhood for a given pixel k and we call it entropy map E (k). The entropy map
identifies pixels that are perceptually less tolerant to noise, and usually works well for
pixels with low entropy, i.e., regions with smoothly changing luminance. It is important
to note that high value of entropy does not necessarily imply strong tolerance to noise.
Examples that do not conform to this rule are irregular edges, such as the one illustrated
in the block diagram.
The differential standard deviation filter is used to detect the effect of edges on
visual fidelity. The differential standard deviation is calculated as the difference of two
standard deviations centered at pixel k: D(k)=|S(k,r1)-S(k,r2)| with block size r1>r2. If
both S(k,r1) and S(k,r2) are low, then the r1-neighborhood centered around k is
considered not tolerant to noise similarly to the entropy filter. On the other hand, if both
S(k,r1) and S(k,r2) have high values, it is very likely that the visual content around is
noisy and that it is more noise tolerant. The interesting case occurs for disproportionate

S(k,r1) and S(k,r2); in most cases, this signals an edge in the neighborhood of k and, thus,
low tolerance to noise.
We then combine the output of the two filters by employing a mixing function
that resembles a smooth AND operator between E(k) and D(k) to arrive at the combined
signal m(D,E). We note that, in m(D,E), pixels around sharp edges have high values,
indicating high noise tolerance. However, experiments show that changes made near
sharp edge areas are easily visible, and we should avoid making changes around those
areas. To achieve this, we generate a scaling map m'(D,E), which has high values at sharp
edge areas and medium to low values in texture and smooth regions, and use it to scale
down the value of m(D,E) at edge areas. The final output, called the complexity map, is
generated as f(I)=m(D,E)/m'(D,E). As a result, image local regions with sharp transitions,
as well as uniformly or smoothly colored areas, are distinguished as highly sensitive to
noise, whereas areas with random texture are identified as tolerant to noise.
In this correspondence, we implement the proposed human visual model via the
following filters. Given an image I, for each of its pixels k, we examine its r-by-r
neighborhood centered at k and define the entropy map E(k,r) as

E(k,r) = - Sum_i p_i(k,r) log2 p_i(k,r),    (1)

where p_i(k,r) is the fraction of neighborhood pixels whose luminance falls in the i-th
histogram bin. The standard deviation S(k,r) is defined as

S(k,r) = sqrt( (1/r^2) Sum_{j in N(k,r)} (I(j) - u(k,r))^2 ),    (2)

where u(k,r) is the mean luminance of the neighborhood N(k,r). The employed mixing
function is nonlinear and has the shape of a 2-D Gaussian distribution:

m(D,E) = exp( -((D-1)^2 + (E-1)^2) / (2s^2) ),    (3)

where D and E are normalized to be within the range [0,1] and the parameter s adjusts the
shape of the function. Low values of s raise the complexity value for pixels with both
high E and high D while suppressing other pixels; a large s allows pixels with moderate E
and D to have a moderately high complexity value. We generate the scaling map m'(D,E)
by applying a 3-by-3 spike filter on D(k) followed by a low-pass filter. The spike filter is
defined as F1 = {-1/8, -1/8, -1/8; -1/8, 1, -1/8; -1/8, -1/8, -1/8}, with 1 in the center and
-1/8 in the remaining places. The final complexity map becomes f(I) = m(D,E)/m'(D,E).
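As a rough illustration of the filter pipeline described above, the following Python/NumPy sketch (the project itself runs in MATLAB) computes the entropy map, the differential standard deviation map, the Gaussian-shaped mixing function, and a spike-filtered scaling map. The window sizes, the histogram bin count, the value of s, and the regularizing constant in the final division are illustrative assumptions, not values taken from the text.

```python
import numpy as np
from scipy.ndimage import convolve, generic_filter, uniform_filter

def local_entropy(img, r=5, bins=16):
    # Entropy of the luminance histogram in each r-by-r neighborhood
    # (the entropy map E). Assumes the image is normalized to [0, 1].
    def ent(window):
        hist, _ = np.histogram(window, bins=bins, range=(0.0, 1.0))
        p = hist / hist.sum()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))
    return generic_filter(img, ent, size=r)

def local_std(img, r):
    # Local standard deviation S(k, r) via E[x^2] - E[x]^2.
    mean = uniform_filter(img, size=r)
    mean_sq = uniform_filter(img * img, size=r)
    return np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))

def complexity_map(img, r1=5, r2=3, s=0.5):
    E = local_entropy(img)
    D = np.abs(local_std(img, r1) - local_std(img, r2))  # differential std
    # Normalize both maps to [0, 1] before mixing.
    E = E / (E.max() + 1e-12)
    D = D / (D.max() + 1e-12)
    # Gaussian-shaped "smooth AND": large only where D and E are both large.
    m = np.exp(-((D - 1.0) ** 2 + (E - 1.0) ** 2) / (2.0 * s ** 2))
    # Scaling map: 3x3 spike filter on D followed by a low-pass filter,
    # large near sharp edges so those areas are scaled down.
    spike = np.full((3, 3), -1.0 / 8.0)
    spike[1, 1] = 1.0
    m_scale = uniform_filter(np.abs(convolve(D, spike)), size=3)
    # The text divides m by the scaling map; a constant is added here as
    # an assumption to keep the division well-behaved where it is near zero.
    return m / (1.0 + m_scale)
```

The resulting map is large in randomly textured regions and small both in smooth regions (low entropy) and near sharp edges (large spike-filter response), matching the behavior the text describes.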
The objective of the data embedding is to add 32 bits of metadata into an image so
that their detection is as robust as possible to JPEG compression and cropping. We apply
the developed perceptual model and build the system in two steps. We first embed a
spread-spectrum pilot watermark whose objective is to signal metadata presence. The
second layer of watermark would be the information watermark that encodes the
metadata.
Pilot Watermark
The pilot watermark (PW) serves two purposes: to detect the existence of the
metadata, and to enable image registration at the detector side due to potential cropping
or other type of misalignment. We design the pilot watermark to be a random ____
sequence taking values from ___ __ or drawn from a standard normal distribution ___
__. The PW is spread over a continuous image region of size _ _ __. We refer to this
region as the basic PW block. We create a full image watermark by tiling the same basic
PW block. Consequently, the smallest image that can be ugmented with metadata is the
size of the basic PW block. In order to reduce blocking effects, we embed the PW in the
Lapped Biorthogonal Transform (LBT) domain [6]. LBT reduces the blocking effect by
synthesizing a block overlapped with its neighboring blocks. The synthesis function
decays to zero at the block boundaries to facilitate the elimination of blocking effects. In
the LBT domain, we choose middle frequency components for embedding in order to
preserve high visual quality and ensure robustness to lossy compression. We
introduce a mask for each 4-by-4 block of the LBT image: _____
____________ _ ,

where ones indicate the LBT coefficients used for PW embedding.


Next, we use _ to adjust the energy of the PW according to the content of the
image. Since the complexity map is in the pixel domain, we take the inverse LBT
transform of the watermarked image and re-weight the watermark in the pixel domain
using _. With the above considerations, the PW embedding can be expressed as
_ _ __ _ ___ __ ____ _ __, where _ denotes the masked PW and _ represents the
watermarked image. The parameter _ adjusts the watermark energy to achieve the
desired trade-off between visual quality and robustness. In our experiments, we have
used _ __ _ in order to create images with 50 to 55 dB PSNR with respect to the
original. The PW alone is hardly visible at these noise levels.
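A minimal pixel-domain sketch of tiled spread-spectrum pilot embedding and folded-correlation detection is given below, in Python/NumPy. The actual system embeds in the LBT domain with a perceptual mask; the key, block size, and strength alpha here are hypothetical parameters chosen for illustration only.

```python
import numpy as np

def embed_pilot(img, key=1234, block=(64, 64), alpha=0.05):
    # Pseudorandom +/-1 pilot block, tiled over the whole image.
    rng = np.random.default_rng(key)
    pw = rng.choice([-1.0, 1.0], size=block)
    reps = (-(-img.shape[0] // block[0]), -(-img.shape[1] // block[1]))
    tiled = np.tile(pw, reps)[:img.shape[0], :img.shape[1]]
    return img + alpha * tiled, pw

def detect_pilot(img_wm, pw):
    # Fold all tiles onto one block by averaging; tiling makes the pilot
    # add coherently while host content averages out, then take the NCC.
    h, w = pw.shape
    H = (img_wm.shape[0] // h) * h
    W = (img_wm.shape[1] // w) * w
    folded = img_wm[:H, :W].reshape(H // h, h, W // w, w).mean(axis=(0, 2))
    folded = folded - folded.mean()
    return np.sum(folded * pw) / (np.linalg.norm(folded) *
                                  np.linalg.norm(pw) + 1e-12)
```

In this sketch the detection statistic is large on watermarked images and near zero on unmarked ones, which is the role the pilot watermark plays in signaling metadata presence.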
Information Watermark
Once the pilot watermark is added, we perform the embedding of the second
watermark layer, the information watermark (IW), which represents the metadata bits.
Since the detector has no access to the original image, the host signal acts as a strong
source of noise during detection. To reduce the interference from the host signal, we
perform quantization on the first-order statistics of the source to embed the IW.
Specifically, we take the image region where one basic PW block is embedded and
partition it into small building blocks. If the size of the basic PW block is _ _ __, the
number of pixels included in each building block equals _ _ _ _ ____ _, where _
denotes the scaling constant. Typically, we choose _ such that the size of the building
block is between 4x4 and 8x8 pixels. For example, for a basic PW block of size 640x480
and _ ___, we obtain _ _ _, corresponding to a building block of dimensions 8x6 pixels.
Then, we randomly assign exactly _ distinct building blocks to each metadata bit. We
denote the set of all pixels that belong to these building blocks as ___ ___ __.
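The regional statistical quantization can be illustrated by quantizing each building block's mean (its first-order statistic) onto one of two interleaved lattices, one per bit value. This Python sketch uses a hypothetical quantization step delta and is a generic quantization-index-modulation construction, not the paper's exact scheme.

```python
import numpy as np

def embed_bit(block, bit, delta=8.0):
    # Move the block mean to the nearest point of the bit's lattice:
    # multiples of delta for bit 0, offset by delta/2 for bit 1.
    mean = block.mean()
    q = np.round((mean - bit * delta / 2.0) / delta) * delta + bit * delta / 2.0
    return block + (q - mean)       # shift every pixel by the same offset

def decode_bit(block, delta=8.0):
    # Decide which lattice the observed mean is closest to.
    mean = block.mean()
    d0 = abs(mean - np.round(mean / delta) * delta)
    d1 = abs(mean - (np.round((mean - delta / 2.0) / delta) * delta
                     + delta / 2.0))
    return 0 if d0 <= d1 else 1
```

Because only the block mean is quantized, moderate per-pixel noise (such as compression error) averages out over the block, which is why quantizing first-order statistics resists the host-signal interference mentioned above.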
We compute the first-order statistic of the pixel values in each __ as _.

Watermark Detection:

The detection process consists of two steps. First, we determine whether the test
image contains a PW. If yes, we move on to extract the embedded metadata. To detect a
pilot watermark, we first transform the test image into the LBT domain to get _.
Because the received image may have been cropped, we align the test image by detecting
the PWs. This is done by sliding the basic PW block _ over _ and examining the
normalized cross-correlation (NCC) values.
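The sliding-NCC alignment can be sketched as an exhaustive search over the phase of the tiled pilot pattern. Because the pilot is tiled, a crop only shifts the pattern's phase, so searching one block's worth of offsets suffices. This is a pixel-domain Python illustration with an assumed folded-mean NCC statistic; the actual system works in the LBT domain.

```python
import numpy as np

def find_alignment(img, pw):
    # Exhaustively try every cyclic offset of the tiled pilot pattern and
    # return the offset with the highest normalized cross-correlation.
    h, w = pw.shape
    H = (img.shape[0] // h) * h
    W = (img.shape[1] // w) * w
    best, best_off = -1.0, (0, 0)
    for dy in range(h):
        for dx in range(w):
            shifted = np.roll(np.roll(img[:H, :W], -dy, axis=0), -dx, axis=1)
            # Fold all tiles onto one block so the pilot adds coherently.
            folded = shifted.reshape(H // h, h, W // w, w).mean(axis=(0, 2))
            folded = folded - folded.mean()
            ncc = np.sum(folded * pw) / (np.linalg.norm(folded) *
                                         np.linalg.norm(pw) + 1e-12)
            if ncc > best:
                best, best_off = ncc, (dy, dx)
    return best_off, best
```

Once the phase is recovered, the image can be realigned to the tiling grid and the information watermark decoded from the re-registered building blocks.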

CONCLUSION:
In this correspondence, we propose a high-fidelity image watermarking for
annotation with robustness to moderate distortion. To achieve the high fidelity of the
embedded image, a visual perception model is introduced to quantify the localized
tolerance to noise for arbitrary imagery. The model is built by mixing the outputs from an
entropy filter and a differential localized standard deviation filter. We then employ the
proposed visual model to embed 32 bits of metadata into a single image in a way that is
robust to JPEG compression and cropping while maintaining high fidelity. The results on
a database of highly challenging photographic images and medical images show the
effectiveness of the proposed annotation technology. It can achieve high fidelity for
natural images, with PSNR around 50 dB and robustness to moderate JPEG compression,
and can meet the near-lossless and almost-lossless requirements for medical images, also
with robustness to moderate JPEG compression.
APPENDIX I
OUTPUTS
