Milestone 1

This document discusses automatic image annotation using multi-kernel learning for image patch clustering. Image patches are extracted from images using dense sampling at multiple scales with overlap. Features are extracted from each patch and multi-kernel learning is applied to cluster visually similar patches into groups within each category. Multi-kernel learning is also used to discover cross-category patch groups. The relevance of each group to its category tag is determined. A "cell graph" is constructed with categories and patch groups to represent associations. Knowledge graph construction and contextual relationship discovery are discussed to annotate images. The main ideas were extracted from two referenced papers on multi-kernel learning for tracking and knowledge graph-based image classification.

Title: Automatic Image Annotation

1.1 Image patch generation

Image patches can be extracted in several ways, including image segmentation, dense sampling, and
salient region detection. Dense sampling is the most widely used because of its simplicity: patches
are sampled uniformly across the image with a fixed step. Dense sampling on a regular grid gives good
coverage of entire objects or scenes and a constant number of features per unit of image area. As a
consequence, low-contrast regions contribute as much to the overall image representation as
high-contrast ones.

Scene images often contain many objects of interest against various backgrounds. We first divide
images into patches and then generate patch groups within each category. Although a rigid partition of
an image into a grid preserves certain spatial information, it often breaks an object into several
blocks or puts different objects into a single block. Thus, visual information about an object, which
could be beneficial to image categorization, may be destroyed by a rigid partition. We therefore
extract image patches by dense sampling at multiple scales, scanning each grid type densely over the
image with overlap.

In this experiment, the grid scales are set to 60 x 60, 120 x 120, and 180 x 180, with corresponding
overlaps of 15, 30, and 45, respectively. Partitions in the horizontal and vertical directions are
added to preserve consistent structure information (Xie et al., 2018). This process produces a highly
redundant image patch collection in each image category. Each group is defined as a collection of
image patches that are visually similar to one another.
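
The multi-scale scan described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the overlap values are in pixels, that the stride at each scale is the patch size minus the overlap, and that windows which would cross the image border are skipped. The names `GRID_SETTINGS`, `window_origins`, and `dense_patches` are hypothetical.

```python
from typing import Iterator, List, Tuple

# (patch size, overlap) pairs taken from the text; treating them as pixels is an assumption.
GRID_SETTINGS: List[Tuple[int, int]] = [(60, 15), (120, 30), (180, 45)]

# A patch is described by (top, left, size, scale_index).
Box = Tuple[int, int, int, int]


def window_origins(length: int, size: int, stride: int) -> List[int]:
    """Start offsets of all windows of `size` that fit along an axis of `length`."""
    if length < size:
        return []
    return list(range(0, length - size + 1, stride))


def dense_patches(height: int, width: int) -> Iterator[Box]:
    """Yield square windows for every scale, scanning rows first, then columns."""
    for scale_index, (size, overlap) in enumerate(GRID_SETTINGS):
        # Assumed stride rule: neighbouring windows share `overlap` pixels.
        stride = size - overlap
        for top in window_origins(height, size, stride):
            for left in window_origins(width, size, stride):
                yield (top, left, size, scale_index)


# Hypothetical 240 x 360 image (height x width).
boxes = list(dense_patches(240, 360))
print(len(boxes))
```

For this hypothetical 240 x 360 image the sketch yields 43 windows (35, 6, and 2 at the three scales). Any strip at the right or bottom border that is narrower than one stride is left uncovered; whether the original experiment pads or shifts the last window is not stated in the text.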

After patch extraction, we obtain a patch set for each category, $P_{cat} = \{p_1, p_2, \ldots, p_n\}$,
where $p_i$ is the $i$-th patch and $n$ is the number of patches in the category.

Figure 1: Illustration of the overlapping slide window patch extraction.


1.2 Patch grouping within each category

The goal of this step is to obtain several discriminative dense patch groups for each image
category. Each group contains visually similar image patches. Finding clusters in data is a challenging
task when the clusters differ widely in shape, size, and density. State-of-the-art methods find dense
subgraphs of the affinity graph and treat them as the dominant clusters. However, the time and space
complexity of those methods is dominated by the construction of the affinity graph, which is quadratic
in the number of data points and therefore impractical on large data sets.
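
To make the quadratic cost concrete, the sketch below builds a dense affinity matrix with a Gaussian (RBF) similarity and estimates its memory footprint. The choice of RBF similarity, the `gamma` parameter, and the function names are assumptions made for illustration; the text does not specify how affinities are computed.

```python
import math
from typing import List, Sequence


def rbf_affinity(points: Sequence[Sequence[float]], gamma: float = 1.0) -> List[List[float]]:
    """Dense n x n affinity matrix; both time and memory grow as n squared."""
    n = len(points)
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            squared_distance = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            # The matrix is symmetric, so each pair is computed once and mirrored.
            affinity[i][j] = affinity[j][i] = math.exp(-gamma * squared_distance)
    return affinity


def dense_affinity_bytes(n: int, bytes_per_entry: int = 8) -> int:
    """Memory needed to store all n * n entries at `bytes_per_entry` bytes each."""
    return n * n * bytes_per_entry


toy_affinity = rbf_affinity([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
print(dense_affinity_bytes(100_000))  # 80000000000
```

With 100,000 patches and 8-byte entries, the full matrix alone needs 80 GB, which is why a dense affinity graph becomes impractical for a highly redundant patch collection.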

We extract three kinds of features to describe the visual content of an image patch: the
128-dimensional SIFT feature, the 256-dimensional Local Binary Pattern (LBP), and the 128-dimensional
color histogram.
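
As a rough illustration of two of these descriptors, the sketch below computes a 256-bin LBP histogram and a 128-bin color histogram in plain Python. Several details are assumptions, since the text gives only the dimensionalities: the LBP variant is the basic 3 x 3, 8-neighbour form, and the 128 color bins are obtained by quantizing R into 8 levels and G and B into 4 levels each. SIFT is omitted because it would normally come from an external library. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Gray = Sequence[Sequence[int]]   # 2-D grid of intensities in 0..255
RGB = Tuple[int, int, int]       # one pixel, each channel in 0..255

# The 8 neighbours of a pixel, clockwise from the top-left corner.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(gray: Gray) -> List[float]:
    """Normalized 256-bin histogram of basic 3 x 3 LBP codes over interior pixels."""
    counts = [0] * 256
    rows, cols = len(gray), len(gray[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
                # Set the bit when the neighbour is at least as bright as the center.
                if gray[r + dr][c + dc] >= center:
                    code |= 1 << bit
            counts[code] += 1
    total = sum(counts)
    return [count / total for count in counts] if total else [0.0] * 256


def color_histogram(pixels: Sequence[RGB]) -> List[float]:
    """Normalized 128-bin histogram; the 8 x 4 x 4 quantization is an assumption."""
    counts = [0] * 128
    for r, g, b in pixels:
        counts[(r >> 5) * 16 + (g >> 6) * 4 + (b >> 6)] += 1
    total = len(pixels)
    return [count / total for count in counts] if total else [0.0] * 128


flat_patch = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
lbp = lbp_histogram(flat_patch)
colors = color_histogram([(255, 255, 255), (0, 0, 0)])
```

Keeping the descriptors separate, rather than concatenating them, matches the multi-kernel setting in which each feature type gets its own kernel.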

Rough idea

- Multi-kernel learning should now be applied to cluster patches into patch groups within the same
category. Follow the approach of Fan & Xiang (2015): first-stage multiple kernel learning, then
second-stage multiple kernel learning to assign different weights to the patches, followed by an
appropriate clustering technique. Image patches that are visually similar to each other form a
patch group, so each category will contain many patch groups of visually similar patches. The
number of groups generated can vary from category to category. We first collect the category
name as the tag.
- Next, discover sets of visually similar patch groups across categories by applying multi-kernel
learning, using the same technique that was used to cluster patches within a category.
- Then, find the relevance degree of each group to its category tag. It has been shown that the
object of interest often:
o is located near the center of the image
o is relatively large
- Construct the “cell graph” with the category as the center node and the image patch groups as the
side nodes. The association values in the “cell graph” are obtained from the relevance degree.
- Use WordNet to find semantic associations.
- Combine every related “cell graph” into a subgraph.
- Discover contextual relationships in the knowledge graph.
- Construct the knowledge graph as in Paper 2.
- Finally, annotate the image (refer to Paper 1).
- The main papers from which the ideas were extracted are highlighted in red under References.
Dataset: same as Paper 2
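
The kernel-combination and grouping step in the list above can be illustrated with a deliberately simplified sketch. It is not the two-stage multiple kernel learning of Fan & Xiang (2015): the kernel weights are fixed by hand rather than learned, each feature type gets an RBF kernel, and the clustering step is replaced by connected components over pairs whose combined similarity exceeds a threshold. All names, weights, the threshold, and the toy feature values are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Features = Dict[str, Sequence[float]]  # feature name -> descriptor of one patch


def rbf(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_kernel(patches: Sequence[Features], weights: Dict[str, float],
                    gamma: float = 1.0) -> List[List[float]]:
    """Weighted sum of one RBF kernel per feature type (weights fixed, not learned)."""
    n = len(patches)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = sum(w * rbf(patches[i][name], patches[j][name], gamma)
                        for name, w in weights.items())
            kernel[i][j] = kernel[j][i] = value
    return kernel


def group_by_threshold(kernel: List[List[float]], threshold: float) -> List[List[int]]:
    """Stand-in for clustering: connected components of the thresholded kernel."""
    n = len(kernel)
    seen = [False] * n
    groups: List[List[int]] = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if not seen[j] and kernel[i][j] >= threshold:
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(group))
    return groups


toy_patches = [
    {"lbp": [0.0, 0.0], "color": [1.0, 0.0]},
    {"lbp": [0.1, 0.0], "color": [0.9, 0.1]},
    {"lbp": [5.0, 5.0], "color": [0.0, 1.0]},
    {"lbp": [5.1, 5.0], "color": [0.1, 0.9]},
]
kernel = combined_kernel(toy_patches, {"lbp": 0.5, "color": 0.5})
groups = group_by_threshold(kernel, threshold=0.5)
print(groups)  # [[0, 1], [2, 3]]
```

In a learned variant, the per-kernel weights would come from the multiple kernel learning stage rather than being fixed at 0.5 each, and the same routine could be reused on group-level descriptors for the cross-category step.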

- Two contributions are expected from this research: one on multi-kernel learning for image patch
clustering, and one on knowledge graph construction using image patches.
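
The relevance-degree and “cell graph” steps above could look roughly like the following sketch. The scoring formula is an assumption made purely for illustration: the notes say only that objects of interest tend to be near the image center and relatively large, so the sketch averages a centrality term and a relative-size term with equal weights. The `Patch` class, function names, category name, and group names are hypothetical, and all images are assumed to share one size.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Patch:
    top: int
    left: int
    size: int


def patch_relevance(patch: Patch, height: int, width: int) -> float:
    """Assumed score in [0, 1]: equal mix of closeness to center and relative area."""
    center_y = patch.top + patch.size / 2
    center_x = patch.left + patch.size / 2
    half_diagonal = math.hypot(height / 2, width / 2)
    centrality = 1.0 - math.hypot(center_y - height / 2, center_x - width / 2) / half_diagonal
    relative_size = min(1.0, patch.size ** 2 / (height * width))
    return 0.5 * centrality + 0.5 * relative_size


def build_cell_graph(category: str, groups: Dict[str, List[Patch]],
                     height: int, width: int) -> Dict[str, Dict[str, float]]:
    """Center node -> {side node: association value}, using mean patch relevance."""
    edges: Dict[str, float] = {}
    for group_name, patches in groups.items():
        if patches:  # skip empty groups to avoid dividing by zero
            scores = [patch_relevance(p, height, width) for p in patches]
            edges[group_name] = sum(scores) / len(scores)
    return {category: edges}


cell_graph = build_cell_graph(
    "beach",
    {
        "center_group": [Patch(90, 90, 60)],
        "corner_group": [Patch(0, 0, 60)],
    },
    height=240,
    width=240,
)
```

Related cell graphs could then be merged by joining the per-category dictionaries, with semantic links between category nodes added from WordNet as the notes suggest.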
References

Fan, H., & Xiang, J. (2015). Patch-based visual tracking with two-stage multiple kernel learning. Lecture
Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture
Notes in Bioinformatics), 9219(August), 20–33. https://doi.org/10.1007/978-3-319-21969-1_3
Xie, L., Lee, F., Liu, L., Yin, Z., Yan, Y., Wang, W., Zhao, J., & Chen, Q. (2018). Improved spatial pyramid
matching for scene recognition. Pattern Recognition, 82, 118–129.
https://doi.org/10.1016/j.patcog.2018.04.025
(Paper 2) Zhang, D., Cui, M., Yang, Y., Yang, P., Xie, C., Liu, D., Yu, B., & Chen, Z. (2019). Knowledge
Graph-Based Image Classification Refinement. IEEE Access, 7, 57678–57690.
https://doi.org/10.1109/ACCESS.2019.2912627
(Paper 1) Zhang, S., Tian, Q., Hua, G., Huang, Q., & Gao, W. (2014). ObjectPatchNet: Towards scalable
and semantic image annotation and retrieval. Computer Vision and Image Understanding, 118, 16–
29. https://doi.org/10.1016/j.cviu.2013.03.008
