
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIP.2016.2577887, IEEE Transactions on Image Processing.

4018 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 9, SEPTEMBER 2016

Multichannel Decoded Local Binary Patterns for Content-Based Image Retrieval

Shiv Ram Dubey, Student Member, IEEE, Satish Kumar Singh, Senior Member, IEEE, and Rajat Kumar Singh, Senior Member, IEEE

Abstract— The local binary pattern (LBP) is widely adopted for efficient image feature description owing to its simplicity. To describe color images, the LBPs of the individual channels must be combined. The traditional way is to simply concatenate the LBPs from each channel, but this increases the dimensionality of the pattern. To cope with this problem, this paper proposes a novel method for image description with multichannel decoded LBPs. We introduce two schemas, one adder based and one decoder based, for combining the LBPs of more than one channel. Image retrieval experiments are performed to observe the effectiveness of the proposed approaches, and the results are compared with existing multichannel techniques. The experiments are performed over 12 benchmark natural scene and color texture image databases, such as Corel-1k, MIT-VisTex, USPTex, and Colored Brodatz. The introduced multichannel adder- and decoder-based LBPs significantly improve the retrieval performance over each database and outperform the other multichannel-based approaches in terms of the average retrieval precision and average retrieval rate.

Index Terms— Image retrieval, local patterns, multichannel, LBP, color, texture.

I. INTRODUCTION

Image indexing and retrieval is attracting more and more attention due to its rapid growth in many areas. Image retrieval has several applications, such as object recognition, biomedicine, and agriculture [1]. The aim of Content-Based Image Retrieval (CBIR) is to extract the images similar to a given query image from huge databases by matching the query image with the images of the database. Two images are matched by comparing their feature descriptors (i.e., image signatures), which means that the performance of any image retrieval system depends heavily upon the image feature descriptors being matched [2]. Color, texture, shape, gradient, etc., are the basic types of features used to describe an image [2]–[4]. Texture-based image feature description is very common in the research community. Recently, local pattern based descriptors have been used for image feature description. The local binary pattern (LBP) [5], [6] has gained wide popularity due to its simplicity and effectiveness in several applications [7]–[10]. Inspired by the recognition of LBP, several other LBP variants have been proposed in the literature [11]–[17], [36], [37]. These approaches were introduced basically for gray images, in other words for only one channel, and perform well; in most real cases, however, natural color images having multiple channels need to be characterized.

A performance evaluation of color descriptors, such as color SIFT (termed mSIFT in this paper) and Opponent SIFT, for object and scene recognition is reported in [39]. These descriptors first find regions in the image using region detectors, then compute the descriptor over each region, and finally form the descriptor using the bag-of-words (BoW) model. Researchers are also working to upgrade the BoW model [45]. Another interesting descriptor is GIST, which is basically a holistic representation of features and has gained wide publicity due to its high discriminative ability [40]–[42]. In order to encode the region-based descriptors into a single descriptor, the vector of locally aggregated descriptors (VLAD) has been proposed in the literature [43]. Recently, it has been used with deep networks for image retrieval [44]. Fisher kernels are also used with deep learning for classification [46], [47]. Very recently, a hybrid classification approach was designed by combining Fisher vectors with neural networks [49]. Some other recent developments are deep convolutional neural networks for ImageNet classification [48], super vector coding [50], discriminative sparse neighbor coding [51], fast coding with neighbor-to-neighbor search [52], projected transfer sparse coding [53], and implicitly transferred codebooks based visual representation [54]. These methods are generally better suited to the classification problem, whereas the descriptors in this paper are designed for image retrieval. Our methods do not require any training information in the descriptor construction process. Still, we compared the results with SIFT and GIST for image retrieval.

A recent trend in CBIR has been efficient search and retrieval over large-scale datasets using hashing and binary coding techniques. Various methods have been proposed recently for large-scale image hashing for efficient image search, such as Multiview Alignment Hashing (MAH) [61], Neighborhood Discriminant Hashing (NDH) [62], Evolutionary Compact Embedding (ECE) [63], and Unsupervised Bilinear Local Hashing (UBLH) [64]. These methods can be used with highly discriminative descriptors to improve the efficiency of image search.

Manuscript received November 7, 2015; revised February 13, 2016; accepted May 30, 2016. Date of publication June 7, 2016; date of current version July 1, 2016. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Ling Shao.
The authors are with the Indian Institute of Information Technology, Allahabad 211012, India (e-mail: shivram1987@gmail.com; sk.singh@iiita.ac.in; rajatsingh@iiita.ac.in).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2016.2577887
1057-7149 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


DUBEY et al.: MULTICHANNEL DECODED LBPs FOR CONTENT-BASED IMAGE RETRIEVAL 4019

Fig. 1. Illustration of four types of the multichannel feature extraction technique using two input channels. (a) Each channel is quantized and merged to form a single channel, and the descriptor is then computed over it. (b) Binary patterns extracted over each channel are concatenated to form a single binary pattern, and the histogram is then computed over it; obviously this mechanism results in a high-dimensional feature vector. (c) Histograms of the binary patterns extracted over each channel are concatenated to form the final feature vector; obviously the mutual information among the channels is not utilized. (d) Binary patterns extracted over each channel are converted into other binary patterns using some processing, and finally the histograms of the generated binary patterns are concatenated to form the final feature vector (generalized versions are proposed in this paper).

To describe color images using local patterns, several researchers have adopted multichannel feature extraction approaches. These techniques can be classified into four categories. The first category, as shown in Fig. 1(a), first quantizes each channel, then merges the quantized channels to form a single channel, and forms the feature vector over it. Typical examples of this category are the Local Color Occurrence Descriptor (LCOD) [18], the Rotation and Scale Invariant Hybrid Descriptor (RSHD) [35], the Color Difference Histogram (CDH) [38], and Color CENTRIST [19]. LCOD basically quantizes the Red, Green and Blue channels of the image, forms a single image by pooling the quantized images, and finally computes the occurrences of each quantized color locally to form the feature descriptor [18]. Similarly, RSHD computes the occurrences of textural patterns [35], and CDH uses color quantization in its construction process [38]. Chu et al. [19] quantized the H, S and V channels of the HSV color image into 2, 4 and 32 values, represented by 1, 2 and 5 binary bits respectively. They concatenated the 1, 2 and 5 binary bits of the quantized H, S and V channels and converted them back into decimal to obtain a single-channel image, over which the features are finally computed. The major drawback of this category is the loss of information in the process of quantization.

The second category simply concatenates the binary patterns of each channel into a single one, as depicted in Fig. 1(b). The dimension of the final descriptor is very high and not suited for real-time computer vision applications. In the third category (see Fig. 1(c)), the histograms are computed for each channel independently and are finally aggregated to form the feature descriptor, for example in [20]–[25]. Heng et al. [20] computed multiple types of LBP patterns over multiple channels of the image, such as the Cr, Cb, Gray, low-pass and high-pass channels, and concatenated the histograms of all LBPs to form a single feature descriptor. To reduce the dimension of the feature descriptor, they selected some features from the histograms of the LBPs using a shrink boost method. Choi et al. [21] computed the LBP histograms over each channel of a YIQ color image and finally concatenated them to form the final features. Zhu et al. [22] extracted multi-scale LBPs by varying the number of local neighbors and the radius of the local neighborhood over each channel of the image and concatenated all LBPs to construct a single descriptor. They also concatenated multiple LBPs extracted from each channel of the RGB color image [23]. The histograms of multi-scale LBPs are also aggregated in [24], but over each channel of multiple color spaces such as RGB, HSV, YCbCr, etc. To reduce the dimension of the descriptor, Principal Component Analysis is employed in [25]. A local color vector binary pattern is defined by Lee et al. for face recognition [25]. They computed the histogram of the color norm pattern (i.e., the LBP of color norm values) using the Y, I and Q channels as well as the histogram of the color angular pattern (i.e., the LBP of color angle values) using the Y and I channels, and finally concatenated these histograms to form the descriptor. The main problem with these approaches is that the discriminative ability is not much improved, because these methods do not utilize the inter-channel information of the images very efficiently. In order to overcome this drawback of the third category, the fourth category comes into the picture, where some bits of the binary patterns of two channels are transformed and the histogram computation and concatenation then take place over the transformed binary patterns, as portrayed in Fig. 1(d). mCENTRIST [26] is an example of this category, where Xiao et al. [26] used at most two channels at a time for the transformation. In this method, a problem arises when more than two channels are to be modeled: the authors suggested applying the same mechanism over each combination of two channels, which in turn increases the computational cost of the descriptor.

In furtherance of solving the above-mentioned problems of multichannel feature descriptors, we generalize the fourth category of multichannel descriptors so that any number of channels can be used simultaneously for the transformation. In this scheme, a transformation function is used to encode the relationship among the local binary patterns of the channels. We propose two new approaches of this category in this paper, where the transformation is done on the basis of adder and decoder concepts. The Local Binary Pattern [6] is used in conjunction with our methods as the feature description over each of the Red, Green and Blue channels of the image.
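As an illustrative sketch of the first category's quantization idea, the 1 + 2 + 5 bit packing attributed to Chu et al. [19] might look as follows. Note that the bit ordering (H in the most significant position) and the [0, 1) input scaling are our assumptions for illustration, not details taken from [19]:

```python
import numpy as np

def pack_hsv(h, s, v):
    """Quantize H, S and V into 2, 4 and 32 levels (1 + 2 + 5 bits)
    and pack them into a single 8-bit channel, in the spirit of [19]."""
    # h, s, v: float arrays scaled to [0, 1)
    qh = np.minimum((h * 2).astype(np.uint8), 1)    # 1 bit
    qs = np.minimum((s * 4).astype(np.uint8), 3)    # 2 bits
    qv = np.minimum((v * 32).astype(np.uint8), 31)  # 5 bits
    # assumed ordering: H bit highest, then S, then V
    return (qh << 7) | (qs << 5) | qv

h, s, v = np.array([[0.9]]), np.array([[0.6]]), np.array([[0.5]])
print(pack_hsv(h, s, v)[0, 0])  # 128 + 64 + 16 = 208
```

The descriptor of [19] is then computed over this single packed channel; the quantization step is exactly where the information loss discussed above occurs.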


Consider the case of color LBP, where the LBP histograms over each channel are simply concatenated: there is no cross-channel co-occurrence information; whereas, if we want to preserve the cross-channel co-occurrence information, the dimension of the final descriptor becomes too high. So, in order to capture the cross-channel co-occurrence information to some extent, we propose the adder- and decoder-based methods with lower dimensions. Moreover, the joint information of the channels is captured in each of the output channels of the adder and decoder before the computation of the histogram. We validate the proposed approach with image retrieval experiments over twelve benchmark databases including natural scenes and color textures.

The rest of the paper is organized in the following manner: Section II introduces the multichannel decoded Local Binary Patterns; Section III discusses the distance measures and evaluation criteria; image retrieval experiments using the proposed methods are performed in Section IV with a discussion of the results; and finally Section V concludes the paper.

II. MULTICHANNEL DECODED LOCAL BINARY PATTERNS

In this section, we propose two multichannel decoded local binary pattern approaches, namely the multichannel adder based local binary pattern (maLBP) and the multichannel decoder based local binary pattern (mdLBP), to utilize the local binary pattern information of multiple channels in an efficient manner. In total, c + 1 and 2^c output channels are generated by the multichannel adder and decoder respectively from c input channels, for c ≥ 2.

Fig. 2. The local neighbors I_t^n(x, y) of a center pixel I_t(x, y) in the t-th channel in the polar coordinate system, for n ∈ [1, N] and t ∈ [1, c].

Let I_t be the t-th channel of an image I of size u × v × c, where t ∈ [1, c] and c is the total number of channels. The N neighbors, equally spaced at radius R, of a pixel I_t(x, y) for x ∈ [1, u] and y ∈ [1, v] are denoted I_t^n(x, y), where n ∈ [1, N], as depicted in Fig. 2. Then, according to the definition of the Local Binary Pattern (LBP) [6], a local binary pattern LBP_t(x, y) for a particular pixel (x, y) in the t-th channel is generated from binary values LBP_t^n(x, y) as

    LBP_t(x, y) = \sum_{n=1}^{N} LBP_t^n(x, y) \times f_n, \quad \forall t \in [1, c]    (1)

where

    LBP_t^n(x, y) = \begin{cases} 1, & I_t^n(x, y) \ge I_t(x, y) \\ 0, & \text{otherwise} \end{cases}    (2)

and f_n is a weighting function defined by the following equation:

    f_n = 2^{(n-1)}, \quad \forall n \in [1, N]    (3)

TABLE I
TRUTH TABLE OF ADDER AND DECODER MAP WITH 3 INPUT CHANNELS

We thus have a set of N binary values LBP_t^n(x, y) for a particular pixel (x, y), corresponding to each neighbor I_t^n(x, y) of the t-th channel. Now we apply the proposed concepts of the multichannel LBP adder and multichannel LBP decoder by considering LBP_t^n(x, y) | ∀t ∈ [1, c] as the c input channels.

Let the multichannel adder based local binary patterns maLBP_{t1}^n(x, y) and the multichannel decoder based local binary patterns mdLBP_{t2}^n(x, y) be the outputs of the multichannel LBP adder and multichannel LBP decoder respectively, where t1 ∈ [1, c + 1] and t2 ∈ [1, 2^c]. Note that the values of LBP_t^n(x, y) are in binary form (i.e., either 0 or 1). Thus, the values of maLBP_{t1}^n(x, y) and mdLBP_{t2}^n(x, y) are also in binary form, generated from the multichannel adder map maM^n(x, y) and the multichannel decoder map mdM^n(x, y) respectively, corresponding to each neighbor n of pixel (x, y).

The truth tables of maM^n(x, y) and mdM^n(x, y) for c = 3 are shown in Table I; they have 4 and 8 distinct values respectively. Mathematically, maM^n(x, y) and mdM^n(x, y) are defined as

    maM^n(x, y) = \sum_{t=1}^{c} LBP_t^n(x, y)    (4)

    mdM^n(x, y) = \sum_{t=1}^{c} 2^{(c-t)} \times LBP_t^n(x, y)    (5)

We denote LBP_t^n(x, y) for ∀n ∈ [1, N] and ∀t ∈ [1, c] as the input patterns, maLBP_{t1}^n(x, y) for ∀n ∈ [1, N] and ∀t1 ∈ [1, c + 1] as the adder patterns, and mdLBP_{t2}^n(x, y) for ∀n ∈ [1, N] and ∀t2 ∈ [1, 2^c] as the decoder patterns.

The multichannel adder based local binary pattern maLBP_{t1}^n(x, y) for pixel (x, y) is obtained from the multichannel adder map maM^n(x, y) and t1 as

    maLBP_{t1}^n(x, y) = \begin{cases} 1, & \text{if } maM^n(x, y) = (t_1 - 1) \\ 0, & \text{otherwise} \end{cases}    (6)

for ∀t1 ∈ [1, c + 1] and ∀n ∈ [1, N].

Similarly, the multichannel decoder based local binary pattern mdLBP_{t2}^n(x, y) for pixel (x, y) is computed from the multichannel decoder map mdM^n(x, y) and t2 as

    mdLBP_{t2}^n(x, y) = \begin{cases} 1, & \text{if } mdM^n(x, y) = (t_2 - 1) \\ 0, & \text{otherwise} \end{cases}    (7)

for ∀t2 ∈ [1, 2^c] and ∀n ∈ [1, N].
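A minimal sketch (ours, not the authors' code) of Eqs. (2), (4) and (5): the per-neighbor binary values of one channel, and the adder and decoder maps built from the bits of all c channels. The nearest-pixel sampling of the circular neighbors is a simplification; LBP implementations often use bilinear interpolation:

```python
import numpy as np

def lbp_bits(channel, x, y, R=1, N=8):
    """Binary values LBP_t^n(x, y) of Eq. (2): bit n is 1 where the
    n-th neighbor at radius R is >= the center pixel."""
    center = channel[x, y]
    bits = np.zeros(N, dtype=np.uint8)
    for n in range(N):
        ang = 2 * np.pi * n / N
        xn = int(round(x + R * np.cos(ang)))  # nearest pixel on the circle
        yn = int(round(y - R * np.sin(ang)))
        bits[n] = 1 if channel[xn, yn] >= center else 0
    return bits

def adder_decoder_maps(bits_per_channel):
    """Eqs. (4)-(5). bits_per_channel is a c x N array of LBP bits.
    maM^n is the per-neighbor sum of the c bits; mdM^n reads the c bits
    as a binary number with channel t = 1 as the most significant bit."""
    c, _ = bits_per_channel.shape
    maM = bits_per_channel.sum(axis=0)                       # values in [0, c]
    weights = 2 ** np.arange(c - 1, -1, -1)                  # 2^(c - t)
    mdM = (weights[:, None] * bits_per_channel).sum(axis=0)  # values in [0, 2^c - 1]
    return maM, mdM

# c = 3 channels, N = 3 neighbors for brevity
bits = np.array([[1, 0, 1],
                 [1, 1, 0],
                 [0, 1, 1]])
maM, mdM = adder_decoder_maps(bits)
print(maM)  # [2 2 2]
print(mdM)  # [6 3 5]
```

In the second neighbor column above, for instance, the bits (0, 1, 1) sum to 2 for the adder and read as the binary number 011 = 3 for the decoder, matching the truth table of Table I.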


Fig. 3. An illustration of the computation of the adder/decoder binary patterns and adder/decoder decimal values for c = 3 and N = 8.

Fig. 4. (a) RGB image, (b) R channel, (c) G channel, (d) B channel, (e) LBP map over the R channel, (f) LBP map over the G channel, (g) LBP map over the B channel, (h-k) 4 output channels of the adder, and (l-s) 8 output channels of the decoder using the 3 input LBP maps of the R, G and B channels.

The computation of maLBP_{t1}(x, y) for ∀t1 ∈ [1, c + 1] and mdLBP_{t2}(x, y) for ∀t2 ∈ [1, 2^c] from the input LBP_t^n(x, y) is illustrated with an example in Fig. 3 for c = 3 and N = 8. The input patterns containing LBP_t^n(x, y) for ∀t ∈ [1, 3] are shown in Fig. 3(a). The multichannel adder map maM^n(x, y) and multichannel decoder map mdM^n(x, y) generated using the input patterns are depicted in Fig. 3(b). The value of c is 3 in the example of Fig. 3, so the ranges of t1 and t2 are [1, 4] and [1, 8] respectively. The adder patterns containing maLBP_{t1}^n(x, y) for t1 ∈ [1, 4] and n ∈ [1, 8], generated from maM^n(x, y) of Fig. 3(b), are depicted in Fig. 3(d), and the decoder patterns containing mdLBP_{t2}^n(x, y) for t2 ∈ [1, 8] and n ∈ [1, 8], generated from mdM^n(x, y) of Fig. 3(b), are illustrated in Fig. 3(f). Note that the value of maLBP_{t1}^n(x, y) is 1 only if the value of maM^n(x, y) is (t1 − 1), and the value of mdLBP_{t2}^n(x, y) is 1 only if the value of mdM^n(x, y) is (t2 − 1).

Consider an RGB image: if the value of mdLBP_{t2}^n is 1 for t2 = 1, it means that the n-th neighbors in the red, green and blue channels are all smaller than the center values in the respective channels. If the value of mdLBP_{t2}^n is 1 for t2 = 4, it means that the n-th neighbor in the red channel is smaller than the center value in that channel, while the n-th neighbors in the green and blue channels are greater than the center values in the respective channels. In other words, mdLBP_{t2}^n represents a unique combination of the red, green and blue channels, i.e., an encoding of the cross-channel information.

The multichannel adder based local binary patterns (maLBP_{t1}(x, y) | ∀t1 ∈ [1, c + 1]) for the center pixel (x, y) are computed from maLBP_{t1}^n(x, y) in the following manner:

    maLBP_{t1}(x, y) = \sum_{n=1}^{N} maLBP_{t1}^n(x, y) \times f_n    (8)


Similarly, the multichannel decoder based local binary patterns (mdLBP_{t2}(x, y) | ∀t2 ∈ [1, 2^c]) are computed for the center pixel (x, y) from mdLBP_{t2}^n(x, y) using the following equation:

    mdLBP_{t2}(x, y) = \sum_{n=1}^{N} mdLBP_{t2}^n(x, y) \times f_n    (9)

where f_n is the weighting function defined in (3).

The four multichannel adder local binary patterns (i.e., maLBP_{t1} for t1 ∈ [1, 4]) and the eight multichannel decoder local binary patterns (i.e., mdLBP_{t2} for t2 ∈ [1, 8]) of the example binary patterns (i.e., LBP_t^n(x, y) for t ∈ [1, 3] and n ∈ [1, 8]) of Fig. 3(a) are shown in Fig. 3(e) and Fig. 3(g) respectively. An illustration of the adder output channels and decoder output channels is presented in Fig. 4 for an example image. An input image in the RGB color space (i.e., c = 3) is shown in Fig. 4(a). The corresponding Red (R), Green (G) and Blue (B) channels are extracted in Fig. 4(b-d) respectively. The three LBP maps corresponding to Fig. 4(b-d) are portrayed in Fig. 4(e-g) for the R, G and B channels respectively. The four output channels of the adder and the eight output channels of the decoder are displayed in Fig. 4(h-k) and Fig. 4(l-s) respectively. It can be perceived from Fig. 4 that the decoder channels have better texture differentiation than the adder channels and the input channels, while the adder channels are better differentiated than the input channels. In other words, by applying the adder and decoder transformations, the inter-channel de-correlated information among the adder and decoder channels increases as compared to the same among the input channels.

The feature vector (i.e., histogram) of the t1-th output channel of the adder (i.e., maLBP_{t1}) is computed using the following equation:

    H_{maLBP_{t1}}(ζ) = \frac{1}{(u - 2R)(v - 2R)} \sum_{x=R+1}^{u-R} \sum_{y=R+1}^{v-R} ξ(maLBP_{t1}(x, y), ζ)    (10)

for ∀ζ ∈ [0, 2^N − 1] and ∀t1 ∈ [1, c + 1], where u × v is the dimension of the input image I (i.e., the total number of pixels) and ξ(δ_1, δ_2) is a function given as follows:

    ξ(δ_1, δ_2) = \begin{cases} 1, & \text{if } δ_1 = δ_2 \\ 0, & \text{otherwise} \end{cases}    (11)

Similarly, the feature vector of the t2-th output channel of the decoder (i.e., mdLBP_{t2}) is computed as follows:

    H_{mdLBP_{t2}}(ζ) = \frac{1}{(u - 2R)(v - 2R)} \sum_{x=R+1}^{u-R} \sum_{y=R+1}^{v-R} ξ(mdLBP_{t2}(x, y), ζ)    (12)

for ∀ζ ∈ [0, 2^N − 1] and ∀t2 ∈ [1, 2^c].

The final feature vectors of the multichannel adder based LBP and the multichannel decoder based LBP are given by concatenating the histograms of the maLBPs and mdLBPs over each output channel respectively:

    maLBP = \frac{1}{c + 1} [H^{maLBP_1}, H^{maLBP_2}, \ldots, H^{maLBP_{c+1}}]    (13)

    mdLBP = \frac{1}{2^c} [H^{mdLBP_1}, H^{mdLBP_2}, \ldots, H^{mdLBP_{2^c}}]    (14)

Fig. 5. The flowchart of the computation of the multichannel adder based local binary pattern feature vector (i.e., maLBP) and the multichannel decoder based local binary pattern feature vector (i.e., mdLBP) of an image from its Red (R), Green (G) and Blue (B) channels.

The process of computing the maLBP and mdLBP feature descriptors of an image is illustrated in Fig. 5 with the help of a schematic diagram. In this diagram, the Red, Green and Blue channels of the image are considered as the three input channels. Thus, four and eight output channels are produced by the adder and decoder respectively.

III. DISTANCE MEASURES AND EVALUATION CRITERIA

In this section, we discuss the distance measures and evaluation criteria used to confirm the improved performance of the proposed feature descriptors for image retrieval.

A. Distance Measures

The basic aim of a distance measure is to find the similarity between the feature vectors of two images. The six types of distances used in this paper [55], [56] are as follows: 1) Euclidean distance, 2) L1 or Manhattan distance, 3) Canberra distance, 4) Chi-square (Chisq) or χ2 distance, 5) Cosine distance, and 6) D1 distance.

B. Evaluation Criteria

In content based image retrieval, the main task is to find the most similar images to a query image in the whole database. We use each image of each database as a query image and retrieve the NR most similar images. We use Precision and Recall curves to represent the effectiveness of the proposed descriptors. For a particular database, we report the average retrieval precision (ARP) and average retrieval rate (ARR).


TABLE II
IMAGE DATABASES SUMMARY

Fig. 6. Example images from (a) the Corel-1k database [27] and (b) the FTVL database [31].

The ARP and ARR are given as follows:

    ARP = \frac{1}{C} \sum_{i=1}^{C} AP(i) \quad \text{and} \quad ARR = \frac{1}{C} \sum_{i=1}^{C} AR(i)    (15)

where C is the total number of categories in the database, and AP and AR are the average precision and average recall respectively for a particular category, defined as follows for the i-th category:

    AP(i) = \frac{1}{C_i} \sum_{j=1}^{C_i} Pr(j) \quad \text{and} \quad AR(i) = \frac{1}{C_i} \sum_{j=1}^{C_i} Re(j), \quad \forall i \in [1, C]    (16)

where C_i is the number of images in the i-th category of the database, and Pr and Re are the precision and recall for a query image, defined as follows for the j-th image of the i-th category:

    Pr(j) = \frac{NS}{NR} \quad \text{and} \quad Re(j) = \frac{NS}{ND}, \quad \forall j \in [1, C_i]    (17)

where NS is the number of retrieved similar images, NR is the number of retrieved images, and ND is the number of similar images in the whole database.

IV. EXPERIMENTS AND RESULTS

We conducted extensive CBIR experiments over twelve databases containing color images of natural scenes, textures, etc. The number of images, number of categories, number of images in each category, and image resolutions used in the experiments are listed in Table II. The images of a category of a particular database are semantically similar. For example, the Corel-1k database consists of 1000 images from 10 categories, namely 'Buildings', 'Buses', 'Dinosaurs', 'Elephants', 'Flowers', 'Food', 'Horses', 'Africans', 'Beaches' and 'Mountains', having 100 images each. The FTVL database has 2612 images of fruits and vegetables of 15 types, such as 'Agata Potato', 'Asterix Potato', 'Cashew', 'Diamond Peach', 'Fuji Apple', 'Granny Smith Apple', 'Honneydew Melon', 'Kiwi', 'Nectarine', 'Onion', 'Orange', 'Plum', 'Spanish Pear', 'Watermelon' and 'Taiti Lime'. Fig. 6(a-b) depicts some images of each category of the Corel-1k and FTVL databases respectively.

In the experiments, each image of the database in turn is used as the query image. For each query image, the system retrieves the top matching images from the database on the basis of the shortest similarity score, measured using different distances between the query image and the database images. If a returned image is from the category of the query image, then we say that the system has appropriately retrieved the target image; otherwise, the system has failed to retrieve the target image.

The performances of the different descriptors are investigated using average precision, average recall, ARP and ARR. To demonstrate the effectiveness of the proposed approach, we compared the results of the Multichannel Adder and Decoder Local Binary Patterns (i.e., maLBP and mdLBP) with existing methods such as the Local Binary Pattern (LBP) [6], Color Local Binary Pattern (cLBP) [21], Multi-Scale Color Local Binary Pattern (mscLBP) [22], and mCENTRIST [26] over each database. We also considered the uniform pattern (u2) and rotation invariant uniform pattern (riu2) [14] versions of each descriptor and compared their performances. We considered N = 8 and R = 1 in all experiments.

A. Experiments With Different Distance Measures

We performed an experiment to investigate which distance measure is better suited to the proposed scheme. We compared the performance of image retrieval in terms of the average retrieval precision (ARP), in percent, using the Euclidean, L1, Canberra, Chi-square (χ2), Cosine and D1 distance measures, and report the results in Tables III-V over the Corel-1k, MIT-VisTex and USPTex databases respectively for 10 retrieved images. In Tables III-V, all distance measures are evaluated for the introduced methods maLBP and mdLBP as well as the existing methods LBP, cLBP, mscLBP, and mCENTRIST. Note that the three color channels Red (R), Green (G) and Blue (B) are used to compute the proposed descriptors in this experiment. The ARP of each of LBP, cLBP, mscLBP, and mCENTRIST is best with the χ2 distance over two of the databases, whereas it is best over every database for maLBP and mdLBP. From the results of this experiment, it is drawn that the χ2 distance measure is better suited to the introduced multichannel decoded patterns for image retrieval.


TABLE III
ARP (%) U SING D IFFERENT D ISTANCE M EASURES
ON C OREL -1k D ATABASE

TABLE IV
ARP (%) U SING D IFFERENT D ISTANCE M EASURES
ON MIT-VisTex D ATABASE

Fig. 7. ARP (%) for different combinations of channels of RGB color space
using maLBP and mdLBP descriptors over (a) Corel-1k, (b) MIT-VisTex,
(c) Corel-Tex, and (d) ALOT database when 10 similar images are retrieved.

TABLE V
ARP (%) Using Different Distance Measures on USPTex Database

concept of multichannel decoded patterns for image retrieval. It is also noticed that this distance is better suited to the remaining descriptors in most of the cases. In the rest of the results of this paper, the χ2 distance measure will be used to compute the dissimilarity between two descriptors.

B. Experiments With Varying Number of Input Channels

The behavior of the proposed multichannel-based descriptors is also observed by conducting an experiment with a varying number of input channels (c). For c = 3, all three channels Red (R), Green (G) and Blue (B) are used, whereas for c = 2, three combinations are considered: 1) (R, G), 2) (R, B) and 3) (G, B). The image retrieval experiments are performed over the Corel-1k, MIT-VisTex, Corel-Tex and ALOT databases, as depicted in Fig. 7(a-d) respectively. In this experiment, 10 images are retrieved using maLBP and mdLBP with the (R, G), (R, B), (G, B) and (R, G, B) input channels, and ARP (%) is computed to demonstrate the performance. For c = 2, the ARP of maLBP is generally better than the ARP of mdLBP, except over the ALOT database. For c = 3, in contrast, the ARP of mdLBP is far better than the ARP of maLBP. One possible explanation of this behavior is that, in the case of the decoder, the c inputs are transformed into 2^c outputs (i.e., no information loss), whereas, in the case of the adder, the c inputs are transformed into c + 1 outputs (i.e., some information is lost). Moreover, the information loss grows with c (i.e., the loss is larger for c = 3 than for c = 2 because the number of outputs of the {adder, decoder} is {3, 4} and {4, 8} for c = 2 and c = 3 respectively). One more important observation from this experiment is that the performance of both maLBP and mdLBP is better for c = 3 (i.e., using three channels) than for c = 2 (i.e., using only two channels), except in the case of maLBP over the ALOT database. In order to find the reason for this, we computed the average standard deviation (ASD) for all the images of each database over each channel. The ASD over the {Red, Green, Blue} channels is {59.44, 54.87, 55.45}, {52.26, 48.25, 43.83}, {55.03, 53.78, 51.23}, and {36.98, 35.40, 34.89} for the Corel-1k, MIT-VisTex, Corel-Tex, and ALOT databases respectively. It can be observed that the ASD for the ALOT database is very low, which means that the intensity values are very close to each other. This explains why maLBP for c = 3 does not perform better than maLBP for c = 2 over the ALOT database: the number of actual input combinations is small. It is also pointed out that the degree of performance improvement for c = 3 is much larger for the mdLBP descriptor than for maLBP. From this experiment, it is determined that all three channels play a crucial role, and removing any one channel degrades the performance drastically.

In the rest of the paper, the value of c is taken as 3 unless otherwise stated. We also compared the ARP obtained using maLBP and mdLBP with mCENTRIST over four databases for the combinations of only two channels in Fig. 8 and found that the performance of the proposed descriptors is always better than that of mCENTRIST for c = 2.

C. Experiments With Different Color Spaces

In order to find the preferable color space for our schemes, we performed the image retrieval experiments over
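The adder/decoder transformation discussed above can be sketched at the level of per-neighbor LBP bits. This is a hedged reconstruction of the idea, not the authors' implementation: the array shapes and function names are our own, and we assume one 256-bin histogram per output map when counting dimensions.

```python
import numpy as np

def combine_lbp_bits(bits):
    """bits: array of shape (c, H, W, N) holding the N neighbor
    comparison bits of the LBP code for each of c channels.
    Returns (adder_out, decoder_out):
      adder_out:   (H, W, N) values in 0..c      -> c+1 possible maps
      decoder_out: (H, W, N) values in 0..2^c-1  -> 2^c possible maps
    """
    c = bits.shape[0]
    adder_out = bits.sum(axis=0)                       # c inputs -> c+1 levels
    weights = (1 << np.arange(c)).reshape(c, 1, 1, 1)  # 1, 2, 4, ...
    decoder_out = (bits * weights).sum(axis=0)         # c inputs -> 2^c codes
    return adder_out, decoder_out

def histogram_dimensions(c, bins=256):
    """Feature dimensions implied by the two schemes, assuming one
    `bins`-bin histogram per output map."""
    return {"maLBP": (c + 1) * bins, "mdLBP": (2 ** c) * bins}
```

For c = 3 this gives 4 adder maps (sums 0..3) against 8 decoder maps (codes 0..7), which is exactly the information-loss argument made above; the implied feature dimensions are 1024 and 2048.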


DUBEY et al.: MULTICHANNEL DECODED LBPs FOR CONTENT-BASED IMAGE RETRIEVAL 4025

Fig. 8. ARP (%) for different combinations of two channels of RGB color
space using mCENTRIST, maLBP and mdLBP descriptors over (a) Corel-1k,
(b) MIT-VisTex, (c) USPTex, and (d) ALOT database when 10 similar images
are retrieved.

Corel-1k and MIT-VisTex databases in four different color spaces, namely RGB, HSV, L∗a∗b∗ and YCbCr. The results of these experiments are presented in terms of the ARP in Fig. 9(a-f) using the maLBP, mdLBP, maLBPu2, mdLBPu2, maLBPriu2, and mdLBPriu2 descriptors respectively when 10 images are retrieved. The performance of each descriptor on each database is best in the RGB color space because the channels of the RGB color space are highly correlated as compared to the HSV, L∗a∗b∗ and YCbCr color spaces. The performance of each descriptor is poor in the L∗a∗b∗ color space over the natural scene database (i.e., Corel-1k) and also in most of the experiments over the color texture database (i.e., MIT-VisTex). The ARP of each descriptor in the YCbCr color space is better than that in the HSV color space over the Corel-1k database, whereas the opposite holds over the MIT-VisTex database, except when the mdLBPriu2 descriptor is used. In the rest of the paper, the RGB color space is used unless otherwise stated.

Fig. 9. Comparison of performance among the RGB, HSV, L∗a∗b∗ and YCbCr color spaces in terms of the ARP (%) using the (a) maLBP, (b) mdLBP, (c) maLBPu2, (d) mdLBPu2, (e) maLBPriu2, and (f) mdLBPriu2 descriptors over the Corel-1k and MIT-VisTex databases when the top 10 similar images are retrieved.

D. Comparison With Existing Approaches

We have performed extensive image retrieval experiments over ten databases with varying numbers of categories as well as varying numbers of images per category to report the improved performance of the proposed multichannel decoded local binary patterns. We report the results using average retrieval precision (ARP), average retrieval rate (ARR), average precision per category (AP) and average recall per category (AR) as functions of the number of retrieved images (NR). Fig. 10 shows the ARP vs. NR plots for the LBP, cLBP, mscLBP, mCENTRIST, maLBP and mdLBP descriptors over the Corel-1k, MIT-VisTex, STex, USPTex, FTVL, KTH-TIPS, KTH-TIPS2a, and Corel-Tex databases. The ARP values for each NR ∈ [1, 10] using the decoder-based mdLBP descriptor are higher than those of the other descriptors over each database (see Fig. 10). Moreover, the performance of mdLBP is much better over the MIT-VisTex, STex-512S, USPTex, and FTVL databases. The performance of the adder-based maLBP descriptor is also better than that of other descriptors such as LBP, cLBP, mscLBP and mCENTRIST over the Corel-1k, MIT-VisTex, and USPTex databases, as depicted in Fig. 10. The performance of each descriptor over each database is also compared using uniform (i.e., u2) and rotation-invariant uniform (i.e., riu2) patterns in Figs. 11-12 respectively, using ARP vs. NR curves. It can be observed that the ARP values using mdLBPu2 and mdLBPriu2 are far better than the ARP values using the remaining descriptors under both the u2 and riu2 transformations (see Figs. 11-12). As far as the adder-based maLBP is concerned, its performance improves more under u2 than under riu2. The performance of mCENTRIST is drastically degraded under the u2 and riu2 conditions. One possible explanation for this behavior of mCENTRIST is that it is computed by interchanging half of the binary pattern of two channels, which causes the loss of the rotation-invariance property. On the other hand, the performance of mscLBP improves under the u2 and riu2 conditions, which exhibits its rotation-invariant property. The image retrieval results over the remaining databases are demonstrated in terms of ARR vs. NR plots using each descriptor in Fig. 13. The performance of each descriptor is compared over the ALOT and ZuBuD databases without any transformation (see Fig. 13(a-b)), with the u2 transformation (see Fig. 13(c-d)) and with the riu2 transformation (see Fig. 13(e-f)). The ARR values using the proposed maLBP are higher than


Fig. 10. Performance comparison of the proposed maLBP and mdLBP descriptors with the existing approaches such as the LBP, cLBP, mscLBP, and mCENTRIST descriptors, using ARP vs. number of retrieved images over the Corel-1k, MIT-VisTex, STex, USPTex, FTVL, KTH-TIPS, KTH-TIPS2a, and Corel-Tex databases.

Fig. 11. Performance comparison of the proposed maLBP and mdLBP descriptors with the existing approaches such as the LBP, cLBP, mscLBP, and mCENTRIST descriptors under the uniform transformation (u2), using ARP vs. number of retrieved images plots over the Corel-1k, MIT-VisTex, STex, USPTex, FTVL, KTH-TIPS, KTH-TIPS2a, and Corel-Tex databases.

Fig. 12. Performance comparison of the proposed multichannel decoded local binary patterns with the existing approaches under the rotation-invariant uniform transformation (riu2), using ARP vs. number of retrieved images curves over the Corel-1k, MIT-VisTex, STex, USPTex, FTVL, KTH-TIPS, KTH-TIPS2a, and Corel-Tex databases.
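The uniform (u2) and rotation-invariant uniform (riu2) pattern mappings used in these comparisons follow the standard definitions of Ojala et al. [6]; a minimal sketch for P = 8 neighbors (the function names are our own):

```python
def transitions(code, p=8):
    """Number of 0/1 transitions in the circular p-bit LBP pattern."""
    bits = [(code >> i) & 1 for i in range(p)]
    return sum(bits[i] != bits[(i + 1) % p] for i in range(p))

def riu2_label(code, p=8):
    """Rotation-invariant uniform mapping: uniform patterns (at most
    two transitions) map to their number of set bits (0..p); all
    non-uniform patterns share the single label p + 1."""
    if transitions(code, p) <= 2:
        return bin(code).count("1")
    return p + 1
```

For P = 8, riu2 reduces a 256-bin LBP histogram to 10 bins, while u2 keeps each uniform pattern distinct (P(P−1)+2 = 58 labels) and lumps the rest into one miscellaneous bin, giving 59 bins; this dimensionality reduction is consistent with the faster retrieval reported later for the u2/riu2 variants.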

the existing approaches in most of the results of Fig. 13, while the ARR values using the proposed mdLBP are higher than those of the other approaches in every result of Fig. 13; moreover, the degree of improvement in the performance of mdLBP is higher over the ALOT database than over the ZuBuD database. We also explored the categorical performance


of the descriptors over the Corel-1k, MIT-VisTex, FTVL and STex-512S databases in Fig. 14. Average precision is measured over the Corel-1k (using descriptors without transformation) and MIT-VisTex (using descriptors with the u2 transformation) databases, while average recall is measured over the FTVL (using descriptors without transformation) and STex-512S (using descriptors with the riu2 transformation) databases.

Fig. 13. Performance evaluation of the proposed methods (a-b) without transformation, (c-d) with the u2 transformation, and (e-f) with the riu2 transformation, in terms of the ARR vs. number of retrieved images over the ALOT and ZuBuD databases.

Fig. 14. Experimental results using different descriptors in terms of the average precision (%) over (a) Corel-1k and (b) MIT-VisTex, and the average recall (%) over (c) FTVL and (d) STex-512S, for each category of the database.

It is noticed across the plots of Fig. 14 that the performance of mdLBP is better and consistent in most of the categories of each database, while the performance of maLBP is also better in most of the categories as compared to the existing multichannel-based approaches. We draw the following assertions from Figs. 10-14:
1. The proposed decoder-based mdLBP descriptor outperforms the other multichannel-based descriptors in terms of the ARP over the Corel-1k, MIT-VisTex, STex, USPTex, FTVL, KTH-TIPS, KTH-TIPS2a, and Corel-Tex databases.
2. The mdLBP descriptor also outperforms the other multichannel-based descriptors in terms of the ARR over the ALOT and ZuBuD databases.
3. The mdLBPu2 and mdLBPriu2 descriptors also outperform the remaining descriptors under the u2 and riu2 conditions respectively.
4. The performance of the proposed maLBP descriptor is not improved as much as that of the mdLBP descriptor in terms of the ARP and ARR values.
5. The categorical performance of the proposed methods is also better than that of the existing methods, including in the u2 and riu2 scenarios.

The top 10 similar images retrieved for 9 query images using each descriptor from the Corel-1k database are displayed in Fig. 15. The Corel-1k database has 10 categories with 100 images in each category. Note that the rows of each subfigure of Fig. 15 represent the images retrieved using the different descriptors in the following manner: 1st row using LBP, 2nd row using cLBP, 3rd row using mscLBP, 4th row using mCENTRIST, 5th row using maLBP and 6th row using mdLBP; the 10 columns in each subfigure correspond to the 10 retrieved images in decreasing order of similarity (i.e., the images in the 1st column are the top most similar images, which are also the query images). Fig. 15(a) shows the retrieved images for a query image from the ‘Building’ category. The {precision, recall} values obtained by LBP, cLBP, mscLBP, mCENTRIST, maLBP and mdLBP for this example are {30%, 3%}, {30%, 3%}, {30%, 3%}, {40%, 4%}, {60%, 6%}, and {60%, 6%} respectively. A query image from the ‘Bus’ category is considered in Fig. 15(b). In this example, only the proposed maLBP and mdLBP descriptors are able to retrieve all the images from the ‘Bus’ category (i.e., 100% precision). All the descriptors attain 100% precision for an example query image from the ‘Dinosaurs’ category, as depicted in Fig. 15(c), whereas the images retrieved using mscLBP and mdLBP are more semantically similar to the query image because the orientation of the ‘Dinosaur’ is the same in the 3rd and 6th rows of Fig. 15(c). The precision achieved using the LBP, cLBP, mscLBP, mCENTRIST, maLBP and mdLBP descriptors for an example query image of ‘Elephant’ is 60%, 60%, 70%, 60%, 60%, and 90% respectively (see Fig. 15(d)). In Fig. 15(e), more semantically similar images are retrieved by mdLBP for a query image from ‘Flower’, as all the retrieved images have a similar appearance, as illustrated in the 6th row of Fig. 15(e). The retrieval precision for a query image of type ‘Food’ is very high for the proposed approaches, whereas it is very low for the existing approaches, as portrayed


Fig. 15. Top 10 retrieved images using each descriptor from the Corel-1k database for query images from the (a) ‘Building’, (b) ‘Bus’, (c) ‘Dinosaurs’, (d) ‘Elephant’, (e) ‘Flower’, (f) ‘Food’, (g) ‘Horse’, (h) ‘Africans’, and (i) ‘Beaches’ categories. Note that the 6 rows in each subfigure correspond to the different descriptors, namely LBP (1st row), cLBP (2nd row), mscLBP (3rd row), mCENTRIST (4th row), maLBP (5th row) and mdLBP (6th row), and the 10 columns in each subfigure correspond to the 10 retrieved images in decreasing order of similarity; the images in the 1st column are the query images as well as the top most similar images.
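The per-query precision and recall figures quoted for Fig. 15, and the ARP used throughout, can be computed as in the following sketch (the function and variable names are our own; Corel-1k has 100 relevant images per category):

```python
def precision_recall(retrieved, query_category, relevant_total=100):
    """Precision and recall (%) of a single query, given the ranked
    category labels of the retrieved images and the query's category."""
    correct = sum(1 for label in retrieved if label == query_category)
    precision = 100.0 * correct / len(retrieved)
    recall = 100.0 * correct / relevant_total
    return precision, recall

def average_retrieval_precision(all_retrieved, all_categories,
                                relevant_total=100):
    """ARP (%): per-query precision averaged over every query."""
    scores = [precision_recall(r, c, relevant_total)[0]
              for r, c in zip(all_retrieved, all_categories)]
    return sum(scores) / len(scores)
```

Retrieving 10 images of which 3 belong to the query's category gives {30%, 3%}, matching the ‘Building’ example of Fig. 15(a); ARR is computed analogously from the per-query recalls.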

in Fig. 15(f). The numbers of correct images retrieved using the LBP, cLBP, mscLBP, mCENTRIST, maLBP and mdLBP descriptors for a query image from the ‘Horse’ category are 7, 7, 6, 6, 9, and 10 respectively (see Fig. 15(g)). The retrieval precision gained by the proposed descriptors is also high as compared to the existing descriptors for the query images from the ‘Africans’ and ‘Beaches’ categories, as demonstrated in Fig. 15(h-i) respectively.

It is deduced from the retrieval results that the precision and recall using the proposed multichannel-based maLBP and mdLBP descriptors are high as compared to those using LBP and the existing multichannel-based approaches such as the cLBP, mscLBP and mCENTRIST descriptors. It is also observed that the performance of mdLBP is better than that of maLBP. The experiments show that the proposed mdLBP method outperforms the other methods because mdLBP encodes each combination of the red, green and blue channels locally from its LBP binary values. The color in images is represented by three values, but most methods process these values separately, which loses the cross-channel information. In contrast, mdLBP considers all the combinations of the LBP binary values computed over each channel using a decoder-based methodology.

E. Analysis Over Feature Extraction and Retrieval Time

We analyzed the feature extraction time as well as the retrieval time for each descriptor over the Corel-1k and MIT-VisTex databases. Both the feature extraction and retrieval times are computed in seconds using a personal computer with an Intel(R) Core(TM) i5 CPU 650 @ 3.20 GHz processor, 4 GB RAM, and the 32-bit Windows 7 Ultimate operating system with 4 cores active. The feature dimension of each descriptor is mentioned in Table VI along with the feature extraction and retrieval times over the Corel-1k and MIT-VisTex databases. The feature extraction time of mdLBP is {2.55, 1.93}, {0.73, 0.96} and {1.78, 1.28} times slower than the feature extraction time of cLBP, mscLBP and mCENTRIST respectively over the {Corel-1k, MIT-VisTex} databases. On the other hand, the feature extraction time of maLBP is nearly {−13%, −30%}, {204%, 41%} and {25%, 5%} faster than the feature extraction time of cLBP, mscLBP and mCENTRIST respectively over the {Corel-1k, MIT-VisTex} databases. The retrieval time using mdLBP is {4, 2.31}, {0.85, 0.77}, {1.44, 1.16}, and {2.9, 1.78} times slower than the retrieval time using the cLBP, mscLBP, mCENTRIST, and maLBP descriptors over the {Corel-1k, MIT-VisTex} databases. The feature extraction time of each descriptor is nearly the same under the u2 and riu2 transformations as well. The retrieval times using mdLBPu2 and mdLBPriu2 are nearly 10 and 50 times better, respectively, than the retrieval time using mdLBP over the Corel-1k database. The feature extraction and retrieval times of each descriptor are higher over the Corel-1k database because this database has more images with larger resolution as compared to the MIT-VisTex database. From Table VI, it is observed that maLBP is more time-efficient, whereas mdLBP is less time-efficient, as the dimension of mdLBP is higher than that of the others except mscLBP.
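Wall-clock timings like those reported in Table VI can be gathered with a simple harness such as the sketch below; the harness and its names are our own illustration, not the authors' measurement code.

```python
import time

def time_feature_extraction(extract_fn, images, repeats=3):
    """Best wall-clock time (seconds) to extract features for all
    images, taking the minimum over `repeats` runs to reduce noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for image in images:
            extract_fn(image)
        best = min(best, time.perf_counter() - start)
    return best
```

Timings depend heavily on the hardware, so numbers comparable to the table require the same machine configuration.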


TABLE VI
Feature Dimensions vs Time Complexity in Seconds

TABLE VII
The ARP Values for NR = 10 Using Non-LBP and Proposed Descriptors

F. Comparison With Non-LBP Based Descriptors

In order to depict the suitability of the proposed descriptors, we also compared them with state-of-the-art non-LBP based descriptors such as multichannel SIFT [39], multichannel GIST [39], and the color difference histogram (CDH) [38]. We concatenated the SIFT and GIST descriptors computed over the Red, Green and Blue channels to form the mSIFT and mGIST descriptors. In order to compute the OpponentSIFT descriptor, we concatenated the SIFT descriptors computed over the opponent Red, opponent Green and opponent Blue channels. We used the online available code of SIFT [58] used by Wang et al. [59] and of GIST [60] released by Oliva and Torralba [40], whereas we implemented the CDH descriptor. The ARP values for 10 retrieved images using the mSIFT, OpponentSIFT, mGIST, CDH, maLBP and mdLBP descriptors are listed in Table VII. It is found that mdLBP outperforms the other descriptors over nearly every database except KTH-TIPS. The presence of images at different scales in the KTH-TIPS database is the reason behind this performance. It is our belief that the scaling problem in our descriptors can be overcome by adopting the multi-scale scenario of LBP in the proposed architecture. It is also observed that the performance of maLBP is better than that of the non-LBP based descriptors in most of the cases. A crucial observation from this result is that the performance of the mSIFT and OpponentSIFT descriptors drops drastically over the textural databases as compared to the natural databases. One possible reason is that the local regions are not detected very accurately over the textural databases.

G. Results Over Large Databases

We also tested the proposed descriptors over large color texture databases such as Colored Brodatz [57] and ALOT-Complete [33]. The Colored Brodatz database has 112 color texture images of dimension 640 × 640. Each image is partitioned into 256 non-overlapping images of dimension 40 × 40, which represent one category of the database. Thus, the Colored Brodatz database consists of 28672 images from 112 categories. The ALOT-Complete database has a total of 25000 images from 250 categories with 100 images per category. We used the first 10 images of each category as the queries for this experiment instead of random images to ensure the reproducibility of the results. The ARP and ARR values over these databases are depicted in Fig. 16 for the LBP, cLBP, mscLBP, mCENTRIST, mSIFT, mGIST, CDH, maLBP and mdLBP descriptors. As expected, mdLBP outperforms all the other descriptors over both color texture databases, whereas maLBP is not that good. This analysis shows that the decoder-based multichannel LBP descriptor can also be used for large-scale color texture image retrieval.

H. Analyzing the CLBP in the Proposed Architecture

In order to exhibit the generalized properties of the proposed idea, we computed the CLBP (i.e., both sign and magnitude) with the adder and decoder mechanisms under the u2 transformation, termed maCLBPu2 and mdCLBPu2 respectively. We also compared maCLBPu2 and mdCLBPu2 with maLBPu2 and mdLBPu2 in Table VIII over the Corel-1k, KTH-TIPS, ALOT, ZuBuD, Corel-Tex and USPTex databases. It is evident that the retrieval performance of the CLBP based descriptors is generally better, as they utilize both the sign and magnitude information of the local differences. This experiment also demonstrates the adaptable nature of the proposed adder and decoder for LBP based descriptors.

I. Analyzing the Robustness of the Proposed Descriptors

In order to examine the performance of the proposed descriptors under uniform illumination, rotation and image scale differences, we synthesized the Illumination, Rotation, and Scale databases. The Illumination database is obtained by adding −60, −30, 0, 30, and 60 to each channel (i.e., Red, Green and Blue) of the first 20 images of each category of the Corel-1k database. The Rotate database is obtained by rotating


the first 25 images of each category of Corel-1k with angles of 0, 90, 180, and 270 degrees. The Scale database is obtained by scaling the first 20 images of each category of Corel-1k at the scales of 0.5, 0.75, 1, 1.25, and 1.5. Thus, each database consists of 1000 images with 100 images per category. The retrieval results over these databases are displayed in Fig. 17 in terms of the ARP (%) when 20 images are retrieved using each descriptor. The mdLBP outperforms the remaining descriptors in the case of uniform intensity change. The performance of mdLBP is also better under the rotation and scaling conditions, except against CDH. The performance of CDH is very good under the rotation and scaling conditions because a) it considers the number of occurrences in the local neighborhood and b) it considers a larger neighborhood, respectively. The robustness of maLBP is also comparable with that of similar kinds of descriptors.

Fig. 16. Comparison of the proposed descriptors maLBP and mdLBP with LBP, cLBP, mscLBP, mCENTRIST, mSIFT, mGIST and CDH over large databases such as (a-b) Colored Brodatz and (c-d) ALOT-Complete in terms of the ARP and ARR.

TABLE VIII
Performance Analysis of the Proposed Idea With CLBP in Terms of the ARP When the Number of Retrieved Images Is 10

Fig. 17. Comparison of the proposed descriptors maLBP and mdLBP with LBP, cLBP, mscLBP, mCENTRIST, mSIFT, mGIST and CDH over the databases having uniform illumination, rotation and image scale changes.

V. CONCLUSION

In this paper, two multichannel decoded local binary patterns are introduced, namely the multichannel adder local binary pattern (maLBP) and the multichannel decoder local binary pattern (mdLBP). Both maLBP and mdLBP utilize the local information of multiple channels on the basis of the adder and decoder concepts. The proposed methods are evaluated using image retrieval experiments over ten databases of natural scene and color texture images. The results are computed in terms of the average retrieval precision and average retrieval rate, and improved performance is observed when compared with the results of the existing multichannel-based approaches over each database. From the experimental results, it is concluded that the maLBP descriptor does not show the best performance in most of the cases, while the mdLBP descriptor outperforms the existing state-of-the-art multichannel-based descriptors. It is also deduced that the Chi-square distance measure is better suited to the proposed image descriptors. The performance of the proposed descriptors is much improved for three input channels and also in the RGB color space. The performance of mdLBP is also superior to that of the non-LBP descriptors. It is also pointed out that mdLBP outperforms the state-of-the-art descriptors over large databases. The experiments also suggest that the introduced approach is generalized and can be applied over any LBP based descriptor. The increased dimension of the decoder-based descriptor slows down the retrieval time, which is a future direction of this research. Another future aspect of this research is to make the descriptors noise-robust, which can be achieved by using noise-robust binary patterns over each channel as the input to the adder/decoder.

REFERENCES

[1] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, “Content-based image retrieval at the end of the early years,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1349–1380, Dec. 2000.
[2] Y. Liu, D. Zhang, G. Lu, and W. Y. Ma, “A survey of content-based image retrieval with high-level semantics,” Pattern Recognit., vol. 40, no. 1, pp. 262–282, 2007.
[3] S. R. Dubey, S. K. Singh, and R. K. Singh, “Rotation and illumination invariant interleaved intensity order based local descriptor,” IEEE Trans. Image Process., vol. 23, no. 12, pp. 5323–5333, Dec. 2014.
[4] S. R. Dubey, S. K. Singh, and R. K. Singh, “A multi-channel based illumination compensation mechanism for brightness invariant image retrieval,” Multimedia Tools Appl., vol. 74, no. 24, pp. 11223–11253, 2015.
[5] T. Ojala, M. Pietikäinen, and D. Harwood, “A comparative study of texture measures with classification based on featured distributions,” Pattern Recognit., vol. 29, no. 1, pp. 51–59, 1996.
[6] T. Ojala, M. Pietikäinen, and T. Mäenpää, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 971–987, Jul. 2002.
[7] A. Hadid and G. Zhao, Computer Vision Using Local Binary Patterns, vol. 40. New York, NY, USA: Springer, 2011.
[8] T. Ahonen, A. Hadid, and M. Pietikäinen, “Face description with local binary patterns: Application to face recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 12, pp. 2037–2041, Dec. 2006.


[9] D. Huang, C. Shan, M. Ardabilian, Y. Wang, and L. Chen, “Local binary patterns and its application to facial image analysis: A survey,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 41, no. 6, pp. 765–781, Nov. 2011.
[10] C. Shan, S. Gong, and P. W. McOwan, “Facial expression recognition based on local binary patterns: A comprehensive study,” Image Vis. Comput., vol. 27, no. 6, pp. 803–816, 2009.
[11] S. R. Dubey, S. K. Singh, and R. K. Singh, “Local diagonal extrema pattern: A new and efficient feature descriptor for CT image retrieval,” IEEE Signal Process. Lett., vol. 22, no. 9, pp. 1215–1219, Sep. 2015.
[12] M. Heikkilä, M. Pietikäinen, and C. Schmid, “Description of interest regions with local binary patterns,” Pattern Recognit., vol. 42, no. 3, pp. 425–436, 2009.
[13] S. Liao, M. W. K. Law, and A. C. S. Chung, “Dominant local binary patterns for texture classification,” IEEE Trans. Image Process., vol. 18, no. 5, pp. 1107–1118, May 2009.
[14] Z. Guo, L. Zhang, and D. Zhang, “Rotation invariant texture classification using LBP variance (LBPV) with global matching,” Pattern Recognit., vol. 43, no. 3, pp. 706–719, 2010.
[15] Z. Guo and D. Zhang, “A completed modeling of local binary pattern operator for texture classification,” IEEE Trans. Image Process., vol. 19, no. 6, pp. 1657–1663, Jun. 2010.
[16] X. Tan and B. Triggs, “Enhanced local texture feature sets for face recognition under difficult lighting conditions,” IEEE Trans. Image Process., vol. 19, no. 6, pp. 1635–1650, Jun. 2010.
[17] B. Zhang, Y. Gao, S. Zhao, and J. Liu, “Local derivative pattern versus local binary pattern: Face recognition with high-order local pattern descriptor,” IEEE Trans. Image Process., vol. 19, no. 2, pp. 533–544, Feb. 2010.
[18] S. R. Dubey, S. K. Singh, and R. K. Singh, “Local neighborhood based robust colour occurrence descriptor for colour image retrieval,” IET Image Process., vol. 9, no. 7, pp. 578–586, Jul. 2015.
[19] W. T. Chu, C. H. Chen, and H. N. Hsu, “Color CENTRIST: Embedding color information in scene categorization,” J. Vis. Commun. Image Represent., vol. 25, no. 5, pp. 840–854, 2014.
[20] C. K. Heng, S. Yokomitsu, Y. Matsumoto, and H. Tamura, “Shrink boost for selecting multi-LBP histogram features in object detection,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit., Jun. 2012, pp. 3250–3257.
[34] Zurich Buildings Database (ZuBuD). [Online]. Available: http://www.vision.ee.ethz.ch/datasets/index.en.html
[35] S. R. Dubey, S. K. Singh, and R. K. Singh, “Rotation and scale invariant hybrid image descriptor and retrieval,” Comput. Electr. Eng., vol. 46, pp. 288–302, Sep. 2015.
[36] S. R. Dubey, S. K. Singh, and R. K. Singh, “Local bit-plane decoded pattern: A novel feature descriptor for biomedical image retrieval,” IEEE J. Biomed. Health Inf., in press.
[37] S. R. Dubey, S. K. Singh, and R. K. Singh, “Local wavelet pattern: A new feature descriptor for image retrieval in medical CT databases,” IEEE Trans. Image Process., vol. 24, no. 12, pp. 5892–5903, Dec. 2015.
[38] G. H. Liu and J. Y. Yang, “Content-based image retrieval using color difference histogram,” Pattern Recognit., vol. 46, no. 1, pp. 188–198, 2013.
[39] K. E. A. van de Sande, T. Gevers, and C. G. M. Snoek, “Evaluating color descriptors for object and scene recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 9, pp. 1582–1596, Sep. 2010.
[40] A. Oliva and A. Torralba, “Modeling the shape of the scene: A holistic representation of the spatial envelope,” Int. J. Comput. Vis., vol. 42, no. 3, pp. 145–175, 2001.
[41] M. Brown and S. Süsstrunk, “Multi-spectral SIFT for scene category recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2011, pp. 177–184.
[42] M. Douze, H. Jégou, H. Sandhawalia, L. Amsaleg, and C. Schmid, “Evaluation of GIST descriptors for Web-scale image search,” in Proc. ACM Int. Conf. Image Video Retr., Jul. 2009, pp. 19–27.
[43] H. Jégou, M. Douze, C. Schmid, and P. Pérez, “Aggregating local descriptors into a compact image representation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2010, pp. 3304–3311.
[44] J. Y.-H. Ng, F. Yang, and L. S. Davis, “Exploiting local features from deep networks for image retrieval,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit., DeepVis. Workshop (CVPRW), Apr. 2015, pp. 53–61.
[45] H. Jégou, M. Douze, and C. Schmid, “Improving bag-of-features for large scale image search,” Int. J. Comput. Vis., vol. 87, no. 3, pp. 316–336, May 2010.
[46] V. Sydorov, M. Sakurada, and C. H. Lampert, “Deep Fisher kernels—end to end learning of the Fisher kernel GMM parameters,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, pp. 1402–1409.
[21] J. Y. Choi, K. N. Plataniotis, and Y. M. Ro, “Using colour local binary
pattern features for face recognition,” in Proc. 17th IEEE Int. Conf. [47] K. Simonyan, A. Vedaldi, and A. Zisserman, “Deep Fisher networks
Image Process. (ICIP), Sep. 2010, pp. 4541–4544. for large-scale image classification,” in Proc. Adv. Neural Inf. Process.
[22] C. Zhu, C. E. Bichot, and L. Chen, “Multi-scale color local binary Syst., 2013, pp. 163–171.
patterns for visual object classes recognition,” in Proc. IEEE Int. Conf. [48] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification
Pattern Recognit., Aug. 2010, pp. 3065–3068. with deep convolutional neural networks,” in Proc. Adv. Neural Inf.
[23] C. Zhu, C.-E. Bichot, and L. Chen, “Image region description using Process. Syst., 2012, pp. 1097–1105.
orthogonal combination of local binary patterns enhanced with color [49] F. Perronnin and D. Larlus, “Fisher vectors meet neural networks:
information,” Pattern Recognit., vol. 46, no. 7, pp. 1949–1963, Jul. 2013. A hybrid classification architecture,” in Proc. IEEE Conf. Comput. Vis.
[24] S. Banerji, A. Verma, and C. Liu, “Novel color LBP descriptors for Pattern Recognit., Jun. 2015, pp. 3743–3752.
scene and image texture classification,” in Proc. 15th Int. Conf. Image [50] X. Zhou, K. Yu, T. Zhang, and T. S. Huang, “Image classification using
Process., Comput. Vis., Pattern Recognit., Las Vegas, NV, USA, 2011, super-vector coding of local image descriptors,” in Proc. Eur. Conf.
pp. 537–543. Comput. Vis., Sep. 2010, pp. 141–154.
[25] S. H. Lee, J. Y. Choi, Y. M. Ro, and K. N. Plataniotis, “Local color vector [51] X. Bai, C. Yan, P. Ren, L. Bai, and J. Zhou, “Discriminative sparse
binary patterns from multichannel face images for face recognition,” neighbor coding,” Multimedia Tools Appl., vol. 75, no. 7, pp. 4013–
IEEE Trans. Image Process., vol. 21, no. 4, pp. 2347–2353, Apr. 2012. 4037, 2016.
[26] Y. Xiao, J. Wu, and J. Yuan, “mCENTRIST: A multi-channel feature [52] N. Inoue and K. Shinoda, “Fast coding of feature vectors using neighbor-
generation mechanism for scene categorization,” IEEE Trans. Image to-neighbor search,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 38,
Process., vol. 23, no. 2, pp. 823–836, Feb. 2014. no. 6, pp. 1160–1184, 2016.
[27] Corel Photo Collection Color Image Database, accessed on Aug. 2014. [53] X. Li, M. Fang, and J. J. Zhang, “Projected transfer sparse coding for
[Online]. Available: http://wang.ist.psu.edu/docs/realted/ cross domain image representation,” J. Vis. Commun. Image Represent.,
[28] MIT Vision and Modeling Group, Cambridge, U.K. Vision vol. 33, pp. 265–272, Nov. 2015.
Texture, accessed on Aug. 2014. [Online]. Available: [54] C. Zhang, J. Cheng, J. Liu, J. Pang, Q. Huang, and Q. Tian, “Beyond
http://vismod.media.mit.edu/pub/ explicit codebook generation: Visual representation using implicitly
[29] Salzburg Texture Image Database, accessed on Aug. 2014. [Online]. transferred codebooks,” IEEE Trans. Image Process., vol. 24, no. 12,
Available: http://www.wavelab.at/sources/STex/ pp. 5777–5788, Dec. 2015.
[30] A. R. Backes, D. Casanova, and O. M. Bruno, “Color texture analy- [55] S. Murala and Q. J. Wu, “Local ternary co-occurrence patterns: A new
sis based on fractal descriptors,” Pattern Recognit., vol. 45, no. 5, feature descriptor for MRI and CT image retrieval,” Neurocomputing,
pp. 1984–1992, 2012. vol. 119, no. 7, pp. 399–412, 2013.
[31] FTVL Database, accessed on Aug. 2014. [Online]. Available: [56] Distance Measures MATLAB Code, accessed on Aug. 2014.
http://www.ic.unicamp.br/~rocha/pub/downloads/tropical-fruits-DB- [Online]. Available: http://www.cs.columbia.edu/~mmerler/project/code/
1024x768.tar.gz/ pdist2.m
[32] KTH-TIPS Texture Image Database. [Online]. Available: http://www. [57] Colored Brodatz Database, accessed on Aug. 2014. [Online]. Available:
nada.kth.se/cvap/databases/kth-tips/index.html http://multibandtexture.recherche.usherbrooke.ca/colored%20_brodatz.
[33] G. J. Burghouts and J.-M. Geusebroek, “Material-specific adaptation html
of color invariant features,” Pattern Recognit. Lett., vol. 30, no. 3, [58] LLC MATLAB Code, accessed on Aug. 2014. [Online]. Available:
pp. 306–313, 2009. http://www.ifp.illinois.edu/~jyang29/LLC.htm

1057-7149 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
[59] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong, “Locality-constrained linear coding for image classification,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2010, pp. 3360–3367.
[60] GIST MATLAB Code, accessed on Aug. 2014. [Online]. Available: http://people.csail.mit.edu/torralba/code/spatialenvelope/
[61] L. Liu, M. Yu, and L. Shao, “Multiview alignment hashing for efficient image search,” IEEE Trans. Image Process., vol. 24, no. 3, pp. 956–966, Mar. 2015.
[62] J. Tang, Z. Li, M. Wang, and R. Zhao, “Neighborhood discriminant hashing for large-scale image retrieval,” IEEE Trans. Image Process., vol. 24, no. 9, pp. 2827–2840, Sep. 2015.
[63] L. Liu and L. Shao, “Sequential compact code learning for unsupervised image hashing,” IEEE Trans. Neural Netw. Learn. Syst., in press.
[64] L. Liu, M. Yu, and L. Shao, “Unsupervised local feature hashing for image similarity search,” IEEE Trans. Cybern., in press.

Shiv Ram Dubey (S’14) received the B.Tech. degree in computer science and engineering from Gurukul Kangari Vishwavidyalaya, Haridwar, India, in 2010, and the M.Tech. degree in computer science and engineering from GLA University, Mathura, India, in 2012. He is a Ph.D. Research Scholar with the Indian Institute of Information Technology at Allahabad, India. His research interests are image processing, image feature description, and computer vision. He was a Project Officer with the Computer Vision Laboratory, IIT Madras, India.

Satish Kumar Singh (M’11–SM’14) received the B.Tech., M.Tech., and Ph.D. degrees in 2003, 2005, and 2010, respectively. He is an Assistant Professor with the Indian Institute of Information Technology at Allahabad, India. He has more than ten years of experience in academic and research institutions, and has several publications in international journals and conference proceedings of repute. He is a member of various professional societies, such as the IEEE and IETE, and was an Executive Committee Member of the IEEE Uttar Pradesh Section in 2014. He serves as an Editorial Board Member and a Reviewer for many international journals. His current research interests are in the areas of digital image processing, pattern recognition, multimedia data indexing and retrieval, watermarking, and biometrics.

Rajat Kumar Singh (M’05–SM’15) received the bachelor’s degree in electronics and instrumentation engineering from the Bundelkhand Institute of Engineering and Technology, Jhansi, India, in 1999, the master’s degree in communication engineering from the Birla Institute of Technology and Science, Pilani, India, in 2001, and the Ph.D. degree from IIT Kanpur, India, in 2007, with a focus on the architecture of optical packet switching incorporating various buffering techniques. He is currently an Assistant Professor with the Division of Electronics Engineering, Indian Institute of Information Technology at Allahabad, India. His current research interests are in the areas of optical networking and switching, wireless sensor networks, and image processing.
