Marina Ivašić-Kos, Mile Pavlić, Maja Matetić
University of Rijeka
            Department of Computer Science
                              marinai@uniri.hr
                             mile.pavlic@ris.hr
                                 maja@uniri.hr
                                 ITI 2010, Cavtat
Outline
 Introduction
 Automatic image annotation
 A Continuous Features Transformation
 A Formal Description of Domain Concepts
 Conclusion
 Introduction
 The main challenge of content-based image retrieval
  (CBIR) systems is to meet user needs for semantic
  image retrieval.
 User queries are usually formulated using semantic
  notions of a higher level than object labels.
    = the problem of complexity, subjectivity and
     ambiguity of human image interpretation [10]
 The problem of CBIR is closely related to that of
  automatic image annotation (AIA), which links numerical
  features automatically extracted from images to the
  corresponding concept keywords.
[10] Hare JS, et al. Mind the Gap: Another look at the problem of the semantic gap in image retrieval, In Proc.
of Multimedia Content Analysis, Management and Retrieval, San Jose, California, 2006; 6073: 1-12.
Automatic Image Annotation
 A popular AIA approach is to use a segmentation algorithm to divide
  images into a number of regions and to operate on their low-level
  features. By combining feature vectors, objects are recognized
  and named after the class to which they belong. Often, the labels of the
  concepts recognized in the image with the highest probability are
  chosen to annotate the image.
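The pipeline above can be sketched as follows. This is a minimal, hypothetical illustration (the class names, mean vectors and segment features below are invented), assigning each segment the label of the class whose mean feature vector is nearest:

```python
# Hypothetical sketch of the AIA step described above: each segment's
# feature vector is matched against per-class mean vectors, and the
# nearest class supplies the annotation label.

def nearest_class(features, class_means):
    """Return the class whose mean feature vector is closest (city-block)."""
    def dist(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))
    return min(class_means, key=lambda c: dist(features, class_means[c]))

class_means = {          # toy per-class mean feature vectors (invented)
    "sky":   [0.2, 0.9, 0.8],
    "grass": [0.6, 0.1, 0.3],
}
segment_features = [[0.25, 0.85, 0.75], [0.55, 0.2, 0.3]]
labels = [nearest_class(f, class_means) for f in segment_features]
print(labels)  # one label per segment, e.g. ['sky', 'grass']
```

A real system would replace the nearest-centroid rule with a trained classifier, but the feature-vector-to-label mapping is the same.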
 Since the early 1990s, numerous academic and industrial
  approaches have been proposed, but the problem of semantic
  interpretation still exists.
 For representing and analysing high-level semantics, ontologies
  or description logics are often recommended.
 For handling uncertain reasoning, fuzzy ontologies
  or ontologies with extensions of description logic are proposed.
A Continuous Features Transformation
 An image consists of pixels, which have no meaning on their
  own, so extracted features capture one of the visual properties
  of the image/segment.
   Visual image properties = the content of the image described using low-
    level features (colour, shape, texture).
   Data:
     400 outdoor images from Corel Photo Library
     images are segmented with Normalized cut (n-cut) algorithm and
      features of size, position, colour and shape are calculated
   each segment is manually associated with a class label
Transformation (cont.)
 For outdoor images the precise value of every feature does
  not play a crucial role in determining class affiliation.
 To simplify the model, features are approximated
  with discrete variables, based on feature value quantization.
 After quantization, every segment is described using an m-
  dimensional vector [D1 D2 ... Dm] of discrete values.
   where Di, i∊1...m corresponds to a descriptive variable as follows: size
    (D1), horizontal (D2) and vertical (D3) position, boundary-area ratio
    (D4), convexity (D5), luminance (D6), green-red (D7) or blue-yellow
    (D8) intensity, and their skew coefficients (D9).
 Transformation (cont.)
 To define the number of clusters and the value range which will be
  associated with every descriptive variable, the k-means and
  Expectation Maximization (EM) algorithms, computing a maximum
  log likelihood, are used.
 As the distance measure we chose the city-block distance to reduce
  the influence of data with extreme values:

      d(xr, xs) = Σj |xrj - xsj|

  Clusters of values per descriptor:

      Descriptor                      EM    K-means
      D1 - size                        7       7
      D2 - horizontal position (x)     9       9
      D3 - vertical position (y)       6       7
      D4 - boundary/area               7       7
      D5 - convexity                   3       3
      D6 - luminance (L)               5       4
      D7 - green-red (a)               5       5
      D8 - blue-yellow (b)             6       4
      D9 - skewness-Lab               10      10

  The results of quantization by the two methods almost match,
  which shows that the grouping is performed successfully.
  For example, variable ‘size’ has values {s1, s2, … s7} where each si is a
representative of a cluster of continuous features with the centre in: {0.03, 0.07,
0.11, 0.16, 0.23, 0.34, 0.51}.
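The quantization of a continuous value into its discrete representative si can be sketched as below, using the 'size' cluster centres quoted on this slide (the function name is ours, for illustration only):

```python
# Map a continuous feature value to the label s_i of the nearest cluster
# centre. The 'size' centres are the ones listed on the slide.

SIZE_CENTRES = [0.03, 0.07, 0.11, 0.16, 0.23, 0.34, 0.51]

def quantize(value, centres):
    """Return the 1-based index i of the nearest centre, i.e. the s_i label."""
    i = min(range(len(centres)), key=lambda k: abs(value - centres[k]))
    return i + 1

print(quantize(0.30, SIZE_CENTRES))  # nearest centre is 0.34 -> s6
```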
 Transformation (cont.)
 Using the analysis of segments belonging to a certain class,
  values of descriptive variables typical for that class have
  been chosen and associated with a degree of probability,
  based on Bayes' theorem:

      P(∪k Dk | Ci) = Σk P(Dk ∩ Ci) / P(Ci)

  where: ∀i Ci ∊ C (the set of classes); ∀k Dk ∊ D (the set of descriptors).
 Each attribute value is also associated with a degree of
  reliability, e.g. (s6, 0.58), (s2, 0.42), in order to model fuzzy
  facts correctly.
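Estimating these conditional degrees from labelled segments amounts to computing relative frequencies per class. A minimal sketch (the class names and (class, value) pairs below are invented toy data, not the Corel results):

```python
from collections import Counter

# Estimate P(D = value | class) from labelled segments as relative
# frequencies, following Bayes' theorem: P(D | C) = P(D and C) / P(C).

segments = [("sky", "s6"), ("sky", "s6"), ("sky", "s2"),
            ("grass", "s3"), ("grass", "s3")]   # invented toy data

def conditional_probs(segments, cls):
    """Return {value: P(value | cls)} over the labelled segments."""
    values = [v for c, v in segments if c == cls]
    counts = Counter(values)
    total = len(values)
    return {v: n / total for v, n in counts.items()}

probs = conditional_probs(segments, "sky")
print(probs)  # s6 occurs in 2 of 3 'sky' segments, s2 in 1 of 3
```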
A Formal Description of Concepts in
an Outdoor Image Domain
   The problem outlined in this paper is how to determine
    a precise model for recording knowledge by which an
    image can be described or interpreted.
   During model creation, classification and
    generalization principles of knowledge organization
    were used.
 The static view of the system (structural and hierarchical
  relationships among classes) is presented using the Class
  Diagram of the Unified Modeling Language (UML)
  formalism.
Structural relations among class and
its descriptors
 Classes are represented
  as nodes, and relations
  as arcs.
   Image is segmented into
    one or more segments.
   For each of the
    segments, features are
    extracted and descriptors
    defined.
 An image can have
  multiple descriptors,
  such as descriptors of
  size, position, shape and
  colour.
   The image and/or
    segment can be
    associated with a class
    label to which the
    segment and/or image
    belongs.
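These structural relations can be sketched in code. The following is a minimal illustration (class and field names are ours, not the paper's model) of an Image aggregating Segments, each carrying descriptors and an optional class label:

```python
from dataclasses import dataclass, field
from typing import Optional

# Sketch of the class-diagram relations described above: an Image is
# segmented into one or more Segments; each Segment has descriptors
# and may be associated with a class label.

@dataclass
class Segment:
    descriptors: dict            # e.g. {"size": "s6", "luminance": "l3"}
    label: Optional[str] = None  # class label, if assigned

@dataclass
class Image:
    segments: list = field(default_factory=list)
    label: Optional[str] = None  # the image itself may also be labelled

img = Image()
img.segments.append(Segment({"size": "s6"}, label="sky"))
print(len(img.segments), img.segments[0].label)
```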
Class hierarchy in outdoor domain
   Generalization relationship is defined according to expert knowledge
    on relations between concepts in the domain.
 To improve image annotation by expanding the relations among
  words, a lexical database such as WordNet can be used.
   WordNet is a lexical database of English words organised as a hierarchy of
    groups of synonymous words (synsets).
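The benefit of such a hierarchy can be sketched with a toy stand-in for WordNet (the hypernym table below is invented, not real WordNet data): expanding each annotation label with its ancestors lets a query for a broader term match more specific labels.

```python
# Toy stand-in for a WordNet-style hypernym hierarchy (table invented):
# a query for "vegetation" can then match images annotated with "grass".

HYPERNYMS = {"grass": "vegetation", "tree": "vegetation",
             "vegetation": "natural object"}

def expand(label):
    """Return the label together with all its hypernym ancestors."""
    out = [label]
    while label in HYPERNYMS:
        label = HYPERNYMS[label]
        out.append(label)
    return out

print(expand("grass"))  # ['grass', 'vegetation', 'natural object']
```

With real WordNet, the same idea uses synset hypernym chains instead of a hand-built table.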
Part of the Protégé knowledge model
 UML class models can be
  imported into the Protégé
  knowledge model.
 Protégé is an open-source
  ontology editor and knowledge-
  base framework.
 One can use the UML plug-in
  for Protégé that provides an
  import and export mechanism
  between the Protégé
  knowledge model and the UML
  modelling language.
                                   A class hierarchy implemented
                                   in Protégé framework.
Conclusion
   The problem of automatic semantic image interpretation is
    complex, even when it relates only to images of similar type
    and the context of a specific domain.
   The first step towards automatic image interpretation is the
    definition of a model which is able to show knowledge
    associated to the image domain.
 The paper briefly describes the quantization of descriptor
  values using the k-means and EM algorithms. Further
  research should examine the impact of transforming
  numerical into descriptive variables on similarities among
  objects in the knowledge base.
   An analysis should be conducted on how the adjustment of
    descriptor values affects the results of classification and
    image annotation.
Thank You!