Review Article: Data Mining For The Internet of Things: Literature Review and Challenges
Copyright © 2015 Feng Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The massive data generated by the Internet of Things (IoT) are considered to be of high business value, and data mining algorithms can be applied to IoT to extract hidden information from these data. In this paper, we give a systematic review of data mining from the knowledge view, technique view, and application view, covering classification, clustering, association analysis, time series analysis, and outlier analysis. The latest application cases are also surveyed. As more and more devices are connected to IoT, large volumes of data must be analyzed, and the latest algorithms must be adapted to big data. We review these algorithms and discuss challenges and open research issues. Finally, a suggested big data mining system is proposed.
[Figure: the data mining process, with stages labeled Data source, Data, Target data, Preprocessed data, Patterns, and Knowledge.]
    (ii) Data mining: apply algorithms to the data to find patterns and evaluate the patterns of discovered knowledge.
    (iii) Data presentation: visualize the data and represent mined knowledge to the user.

We can view data mining in a multidimensional view.
    (i) In the knowledge view, or data mining functions view, it includes characterization, discrimination, classification, clustering, association analysis, time series analysis, and outlier analysis.
    (ii) In the utilized techniques view, it includes machine learning, statistics, pattern recognition, big data, support vector machine, rough set, neural networks, and evolutionary algorithms.
    (iii) In the application view, it includes industry, telecommunication, banking, fraud analysis, biodata mining, stock market analysis, text mining, web mining, social network, and e-commerce [3].

A variety of research focusing on the knowledge view, technique view, and application view can be found in the literature. However, no previous effort has been made to review the different views of data mining in a systematic way, especially now that big data [5–7], mobile internet, and the Internet of Things [8–10] are growing rapidly and some data mining researchers are shifting their attention from data mining to big data. Many kinds of data can be mined, for example, database data (relational database, NoSQL database), data warehouse, data stream, spatiotemporal, time series, sequence, text and web, multimedia [11], graphs, the World Wide Web, Internet of Things data [12–14], and legacy system logs. Motivated by this, in this paper, we attempt to make a comprehensive survey of the important recent developments of data mining research. This survey focuses on the knowledge view, utilized techniques view, and application view of data mining. Our main contribution in this paper is that we selected some well-known algorithms and studied their strengths and limitations.

The contribution of this paper has three parts. First, we propose a novel way to review data mining in the knowledge view, technique view, and application view. Second, we discuss the new characteristics of big data and analyze the challenges. Third, we propose a suggested big data mining system, which is valuable for readers who want to construct a big data mining system with open source technologies.

The rest of the paper is organized as follows. In Section 2 we survey the main data mining functions from the knowledge view and technology view, including classification, clustering, association analysis, and outlier analysis, and introduce which techniques can support these functions. In Section 3 we review the data mining applications in e-commerce, industry, health care, and public service and discuss which knowledge and technology can be applied to these applications. In Section 4, IoT and big data are discussed comprehensively, the new technologies to mine big data for IoT are surveyed, the challenges in the big data era are overviewed, and a new big data mining system architecture for IoT is proposed. In Section 5 we give a conclusion.

2. Data Mining Functionalities

Data mining functionalities include classification, clustering, association analysis, time series analysis, and outlier analysis.
    (i) Classification is the process of finding a set of models or functions that describe and distinguish data classes or concepts, for the purpose of predicting the class of objects whose class label is unknown.
    (ii) Clustering analyzes data objects without consulting a known class model.
    (iii) Association analysis is the discovery of association rules displaying attribute-value conditions that frequently occur together in a given set of data.
    (iv) Time series analysis comprises methods and techniques for analyzing time series data in order to extract meaningful statistics and other characteristics of the data.
    (v) Outlier analysis finds patterns in data that are very different from the rest of the data.

2.1. Classification. Classification is important for the management of decision making. Given an object, assigning it to one of the predefined target categories or classes is called classification. The goal of classification is to accurately predict the target class for each case in the data [15]. For example, a classification model could be used to identify loan applicants as low, medium, or high credit risks [16].

There are many methods to classify the data, including decision tree induction, frame-based or rule-based expert systems, hierarchical classification, neural networks, Bayesian network, and support vector machines (see Figure 2).
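As an illustration of decision tree induction applied to the loan-applicant example above, the following is a minimal sketch of how a tree learner might choose its first split by information gain (the criterion used by ID3-style induction). The applicant records, the attribute names `income` and `owns_home`, and the helper names are hypothetical and are not taken from the surveyed work.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on one categorical attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_split(rows, attributes, target):
    """Attribute with the highest information gain for the target label."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data: six loan applicants with a known credit-risk class.
applicants = [
    {"income": "high", "owns_home": "yes", "risk": "low"},
    {"income": "high", "owns_home": "no",  "risk": "low"},
    {"income": "mid",  "owns_home": "yes", "risk": "low"},
    {"income": "mid",  "owns_home": "no",  "risk": "medium"},
    {"income": "low",  "owns_home": "yes", "risk": "medium"},
    {"income": "low",  "owns_home": "no",  "risk": "high"},
]

root_attribute = best_split(applicants, ["income", "owns_home"], "risk")
```

On this toy data `income` separates the risk classes better than `owns_home`, so it would become the root test. A full tree learner would repeat the same choice recursively on each resulting subset.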
International Journal of Distributed Sensor Networks                                                                                 3
[Figure 2: classification methods. Labels recovered from the figure: Decision tree (C4.5, SLIQ, SPRINT, CHAID); KNN (ENNS, EENNS, EEENNS); Bayesian network (selective naïve Bayes, seminaïve Bayes, one-dependence Bayesian, k-dependence Bayesian, Bayesian multinets); SVM (FSVM, TWSVMs, VaR-SVM, RSVM).]
    (i) A decision tree is a flow-chart-like tree structure, where internal nodes are denoted by rectangles and leaf nodes by ovals. All internal nodes have two or more child nodes. All internal nodes contain splits, which test the value of an expression of the attributes. Arcs from an internal node to its children are labeled with distinct outcomes of the test. Each leaf node has a class label associated with it. Iterative Dichotomiser 3, or ID3, is a simple decision tree learning algorithm [17]. The C4.5 algorithm is an improved version of ID3; it uses gain ratio as the splitting criterion [18]. The main difference between them is that ID3 selects splits by information gain and handles only categorical attributes, whereas C4.5 uses gain ratio and also handles continuous attributes. SLIQ (Supervised Learning In Quest) is capable of handling large data sets with ease and lower time complexity [19, 20]. SPRINT (Scalable Parallelizable Induction of Decision Tree algorithm) is also fast and highly scalable, and there is no storage constraint on larger data sets in SPRINT [21]. Other improvements have also been studied [22, 23]. Classification and Regression Trees (CART) is a nonparametric decision tree algorithm. It produces either classification or regression trees, based on whether the response variable is categorical or continuous. CHAID (chi-squared automatic interaction detector) and its improvements [24] focus on dividing a data set into exclusive and exhaustive segments that differ with respect to the response variable.
    (ii) The KNN (K-Nearest Neighbor) algorithm is derived from the Nearest Neighbor algorithm, which is designed to find the point nearest to the observed object. The main idea of the KNN algorithm is to find the K nearest points [25]. There are many improvements on the traditional KNN algorithm, such as the Wavelet Based K-Nearest Neighbor Partial Distance Search (WKPDS) algorithm [26], the Equal-Average Nearest Neighbor Search (ENNS) algorithm [27], the Equal-Average Equal-Norm Nearest Neighbor code word Search (EENNS) algorithm [28], the Equal-Average Equal-Variance Equal-Norm Nearest Neighbor Search (EEENNS) algorithm [29], and other improvements [30].
    (iii) Bayesian networks are directed acyclic graphs whose nodes represent random variables in the Bayesian sense. Edges represent conditional dependencies; nodes which are not connected represent variables which are conditionally independent of each other. Classifiers based on Bayesian networks have many strengths, like model interpretability and accommodation to complex data and classification problem settings [31]. The research includes naïve Bayes [32, 33], selective naïve Bayes [34], seminaïve Bayes [35], one-dependence Bayesian classifiers [36, 37], K-dependence Bayesian classifiers [38], Bayesian network-augmented naïve Bayes [39], unrestricted Bayesian classifiers [40], and Bayesian multinets [41].
    (iv) The Support Vector Machine (SVM) is a supervised learning model, based on statistical learning theory, with associated learning algorithms that analyze data and recognize patterns. SVM produces a binary classifier, the so-called optimal separating hyperplane, through an extremely nonlinear mapping of the input vectors into the high-dimensional feature space [32]. SVM is widely used in text classification [33, 42], marketing, pattern recognition, and medical diagnosis [43]. Much further research has been done, including GSVM (granular support vector machines) [44–46], FSVM (fuzzy support vector machines) [47–49], TWSVMs (twin support vector machines) [50–52], VaR-SVM (value-at-risk support vector machines) [53], and RSVM (ranking support vector machines) [54].

2.2. Clustering. Clustering algorithms [55] divide data into meaningful groups (see Figure 3) so that patterns in the same group are similar in some sense and patterns in different groups are dissimilar in the same sense. Searching for clusters involves unsupervised learning [56]. In information retrieval, for example, the search engine clusters billions of web pages into different groups, such as news, reviews, videos, and audios. One straightforward example of a clustering problem is to divide points into different groups [16].

[Figure 3: clustering methods. Labels recovered from the figure: Hierarchical, Partitioning, Cooccurrence, Scalable, High-dimensional; DBSCAN, ENCLUS, BANG.]

    (i) Hierarchical clustering methods combine data objects into subgroups; those subgroups merge into larger, higher-level groups and so forth, forming a hierarchy tree. Hierarchical clustering methods fall into two classes, agglomerative (bottom-up) and divisive (top-down) approaches. Agglomerative clustering starts with one-point clusters and recursively merges two or more of the clusters. Divisive clustering, in contrast, is a top-down strategy; it starts with a single cluster containing all data points and recursively splits that cluster into appropriate subclusters [57, 58]. CURE (Clustering Using Representatives) [59, 60] and SVD (Singular Value Decomposition) [61] are typical research.
    (ii) Partitioning algorithms discover clusters either by iteratively relocating points between subsets or by identifying areas heavily populated with data. The related research includes SNOB [62], MCLUST [63], k-medoids, and k-means related research [64, 65]. Density-based partitioning methods attempt to discover densely connected regions in low-dimensional data, known as spatial data. The related research includes DBSCAN (Density Based Spatial Clustering of Applications with Noise) [66, 67]. Grid based partitioning algorithms use hierarchical agglomeration as one phase of processing, perform space segmentation, and then aggregate appropriate segments; research includes BANG [68].
    (iii) In order to handle categorical data, researchers change data clustering to preclustering of items or categorical attribute values; typical research includes ROCK [69].
    (iv) Scalable clustering research addresses scalability problems in computing time and memory requirements; examples include DIGNET [70] and BIRCH [71].
    (v) High dimensionality data clustering methods are designed to handle data with hundreds of attributes, including DFT [72] and MAFIA [73].

2.3. Association Analysis. Association rule mining [74] focuses on market basket analysis or transaction data analysis. It targets the discovery of rules showing attribute-value associations that occur frequently, and it also helps in the generation of more general and qualitative knowledge, which in turn helps in decision making [75]. The research structure of association analysis is shown in Figure 4.

[Figure 4: association analysis. Labels recovered from the figure: Horizontal database format algorithms, Vertical database format algorithms, Event-based algorithms, FP-Growth, Approximate, Genetic algorithm, Fuzzy set.]

    (i) For the first category of association analysis algorithms, the data are processed sequentially. The Apriori-based algorithms have been used to discover intra-transaction associations and then discover associations; there are many extension algorithms. According to the data record format, they fall into 2 types: Horizontal Database Format Algorithms and Vertical Database Format Algorithms; the typical algorithms include MSPS [76] and LAPIN-SPAM [77]. The pattern growth algorithm is more complex but can be faster to calculate given large volumes of data. The typical algorithm is the FP-Growth algorithm [78].
    (ii) In some areas, the data are a flow of events, and therefore the problem is to discover event patterns that occur frequently together. These algorithms divide into 2 parts: event-based algorithms and event-oriented algorithms; the typical algorithm is PROWL [79, 80].
    (iii) In order to take advantage of distributed parallel computer systems, some algorithms have been developed, for example, Par-CSP [81].

2.4. Time Series Analysis. A time series is a collection of temporal data objects; the characteristics of time series data include large data size, high dimensionality, and continuous updating. Commonly, a time series task relies on three components: representation, similarity measures, and indexing (see Figure 5) [82, 83].

[Figure 5: time series analysis. Labels recovered from the figure: Representation (Model based, Non-data-adaptive, Data adaptive), Similarity measure (Subsequence matching), Indexing (SAMs), Shapelets based.]

    (i) One of the major reasons for time series representation is to reduce the dimension, and it divides into three categories: model based representation, non-data-adaptive representation, and data adaptive representation. The model based representations seek the parameters of an underlying model for a representation. Important research works include ARMA [84] and the time series bitmaps research [85]. In non-data-adaptive representations, the parameters of the transformation remain the same for every time series regardless of its nature; related research includes DFT [86], wavelet functions related topics [87], and PAA [72]. In data adaptive representations, the parameters of a transformation change according to the data available; related work includes data adaptive versions of DFT [88] and PAA [89] and indexable PLA [90].
    (ii) The similarity measure of time series analysis is typically carried out in an approximate manner; the research directions include subsequence matching [91] and full sequence matching [92].
    (iii) The indexing of time series analysis is closely associated with the representation and similarity measure parts; the research topics include SAMs (Spatial Access Methods) and TS-Tree [93].

2.5. Other Analysis. Outlier detection refers to the problem of finding patterns in data that are very different from the rest of the data based on appropriate metrics. Such a pattern often contains useful information regarding abnormal behavior of the system described by the data. Distance-based algorithms calculate the distances among objects in the data with geometric interpretation. Density-based algorithms estimate the density distribution of the input space and then identify outliers as those lying in low-density regions. Rough sets based algorithms introduce rough sets or fuzzy rough sets to identify outliers [94].

3. Data Mining Applications

3.1. Data Mining in e-Commerce. Data mining enables businesses to understand the patterns hidden inside past purchase transactions, thus helping in planning and launching new marketing campaigns in a prompt and cost-effective way [95]. E-commerce is one of the most promising domains for data mining because data records, including customer data, product data, and users' action log data, are plentiful; IT teams have strong data mining skills; and return on investment can be measured. Researchers leverage association analysis and clustering to provide insight into what product combinations were purchased; this encourages customers to purchase related products that they may have missed or overlooked. Users' behaviors are monitored and analyzed to find similarities and patterns in Web surfing behavior so that the Web can be more successful in meeting user needs [96]. A complementary method of identifying potentially interesting content uses data on the preferences of a set of users, called collaborative filtering or recommender systems [97–99]; it leverages user correlation and other similarity metrics to identify and cluster similar user profiles for the purpose of recommending informational items to users. The recommender system also extends to social networks [100], education [101], academic libraries [102], and tourism [103].

3.2. Data Mining in Industry. Data mining can highly benefit industries such as retail, banking, and telecommunications; classification and clustering can be applied to this area [104].

One of the key success factors of insurance organizations and banks is the assessment of borrowers' creditworthiness in advance during the credit evaluation process. Credit scoring becomes more and more important, and several data mining methods are applied to the credit scoring problem [105–107].

Retailers collect customer information, related transaction information, and product information to significantly improve the accuracy of product demand forecasting, assortment optimization, product recommendation, and ranking across retailers and manufacturers [108, 109]. Researchers leverage SVM [110], support vector regression [111], or the Bass model [112] to forecast product demand.

3.3. Data Mining in Health Care. In health care, data mining is becoming increasingly popular, if not increasingly essential [113–118]. Heterogeneous medical data have been generated in various health care organizations, including payers, medicine providers, pharmaceuticals information, prescription information, doctor's notes, or clinical records produced day by day. These quantitative data can be used to do clinical text mining, predictive modeling [119], survival analysis, patient similarity analysis [120], and clustering, to improve care treatment [121] and reduce waste. In the health care area, association analysis, clustering, and outlier analysis can be applied [122, 123].

Treatment record data can be mined to explore ways to cut costs and deliver better medicine [124, 125]. Data mining can also be used to identify and understand high-cost patients [126] and applied to the mass of data generated by millions of prescriptions, operations, and treatment courses to identify unusual patterns and uncover fraud [127, 128].

3.4. Data Mining in City Governance. In the public service area, data mining can be used to discover public needs, improve service performance, and support decision making with automated systems to decrease risks; classification, clustering, and time series analysis can be developed to solve problems in this area.

E-government improves quality of government service, cost savings, wider political participation, and more effective policies and programs [129, 130], and it has also been proposed as a solution for increasing citizen communication with government agencies and, ultimately, political trust [131]. A city incident information management system can integrate data mining methods to provide a comprehensive assessment of the impact of natural disasters on agricultural production, rank disaster-affected areas objectively, and assist governments in disaster preparation and resource allocation [132].

By using data analytics, researchers can predict which residents are likely to move away from the city [133], which helps to infer which factors of city life and city services lead to a resident's decision to leave the city [134].

A major challenge for the government and law enforcement is how to quickly analyze the growing volumes of crime data [135]. Researchers introduce spatial data mining techniques to find the association rules between crime hot spots and the spatial landscape [136]; other researchers leverage an enhanced k-means clustering algorithm to discover crime patterns and use semisupervised learning techniques for knowledge discovery and to help increase the predictive accuracy [137]. Data mining can also be used to detect criminal identity deceptions by analyzing personal information such as name, address, date of birth, and social-security number [138] and to uncover previously unknown structural patterns from criminal networks [139].
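The k-means clustering mentioned above for crime pattern discovery can be sketched in its plain form (Lloyd's iteration). This is only an illustrative sketch under stated assumptions: it is the basic algorithm, not the enhanced variant of [137], and the incident coordinates, the starting centroids, and the function name `kmeans` are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's iteration: assign each point to its nearest centroid,
    then move each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its old centroid
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, assignment


# Hypothetical incident locations (x, y) forming two visible hot spots.
incidents = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
             (5.0, 5.2), (5.1, 4.9), (4.9, 5.0)]
centers, labels = kmeans(incidents, centroids=[incidents[0], incidents[3]])
```

Here the two starting centroids are chosen by hand so that the run is deterministic; in practice the result of k-means depends on initialization, which is one of the things enhanced variants typically try to address.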
Table 1: The data mining application and most popular data mining functionalities.

Application       Classification   Clustering   Association analysis   Time series analysis   Outlier analysis
e-commerce        -                ✓            ✓                      -                      -
Industry          ✓                ✓            ✓                      -                      -
Health care       -                ✓            ✓                      -                      ✓
City governance   ✓                ✓            ✓                      ✓                      -
    In transport systems, data mining can be used for map refinement according to GPS traces [140–142]; based on multiple users’ GPS trajectories, researchers also discover interesting locations and classical travel sequences for location recommendation and travel recommendation [143].

3.5. Summary. The data mining applications and their most popular data mining functionalities are summarized in Table 1.

4. Challenges and Open Research Issues in IoT and Big Data Era

With the rapid development of IoT, big data, and cloud computing, the most fundamental challenge is to explore the large volumes of data and extract useful information or knowledge for future actions [144]. The data in the IoT era can be regarded as big data; their key characteristics are as follows.

     (i) Large volumes of data to read and write: the amount of data can reach terabytes (TB) and even petabytes (PB) or zettabytes (ZB), so fast and effective mechanisms need to be explored.

    (ii) Heterogeneous data sources and data types to integrate: in the big data era, the data sources are diverse; for example, we need to integrate sensor data [145–147], camera data, social media data, and so on, and all these data differ in format (byte, binary, string, number, and so forth). We need to communicate with different types of devices and different systems and also need to extract data from web pages.

   (iii) Complex knowledge to extract: the knowledge is deeply hidden in large volumes of data and is not straightforward, so we need to analyze the properties of the data and find the associations between different data.

4.1. Challenges. IoT and big data bring many challenges: the quantity of data is large but the quality is low; the data come from different data sources and inherently possess a great many different types and representation forms; and the data are heterogeneous: structured, semistructured, and even entirely unstructured. We analyze the challenges in the areas of data extraction, data mining algorithms, and data mining systems. The challenges are summarized below.

    (i) The first challenge is to access and extract large-scale data from different data storage locations. We need to deal with the variety, heterogeneity, and noise of the data, and it is a big challenge to find the faults and even harder to correct the data. In the data mining algorithms area, how to adapt traditional algorithms to the big data environment is a big challenge.

   (ii) The second challenge is how to mine uncertain and incomplete data for big data applications. In the data mining system area, an effective and secure solution for sharing data between different applications and systems is one of the most important challenges, since sensitive information, such as banking transactions and medical records, is a matter of concern.

4.2. Open Research Issues. In the big data era, there are some open research issues, including data checking, the parallel programming model, and the big data mining framework.

     (i) There is a lot of research on finding errors hidden in data, such as [148]. Data cleaning, filtering, and reduction mechanisms have also been introduced.

    (ii) The parallel programming model has been introduced to data mining, and some algorithms have been adapted to it. Researchers have expanded existing data mining methods in many ways, including the efficiency improvement of single-source knowledge discovery methods, the design of data mining mechanisms from a multisource perspective, and the study of dynamic data mining methods and the analysis of stream data [149]. For example, parallel association rule mining [150, 151] and the parallel k-means algorithm based on the Hadoop platform are good practice. However, some algorithms have still not been adapted to parallel platforms, which constrains the application of data mining technology to big data platforms. This is a challenge for data mining researchers and also a promising direction.

   (iii) The most important work for a big data mining system is to develop an efficient framework to support big data mining. In the big data mining framework, we need to consider the security of data, privacy, the data sharing mechanism, the growth of data size, and so forth. A well-designed data mining framework for big data is a very important direction and a big challenge.

4.3. Recent Works of Big Data Mining System for IoT. In the data mining system area, many large companies such as Facebook, Yahoo, and Twitter benefit from and contribute works to open
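[Figure 6: an architecture for the support of social network and cloud computing in IoT. Labels recoverable from the figure: interpretation layer; social layer (S.N., nodes u1–u12); network layer (cloud, GDB).]
[Figure 7: suggested system architecture. Labels recoverable from the figure: data processing, with system (HDFS), (MapReduce/R), analysis (Storm/S4), batch analysis (Hadoop), and workflow (Oozie); security/privacy/standard.]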
source projects. Big data mining infrastructure includes the following.

     (i) The Apache Mahout project implements a wide range of machine learning and data mining algorithms [152].

    (ii) The R Project is a programming language and software environment designed for statistical computing and visualization [153].

   (iii) The MOA project performs data mining in real time [154], and the SAMOA [155] project integrates MOA with Storm and S4.

    (iv) Pegasus is a petascale graph mining library for the Hadoop platform [156].

    Some researchers from the IoT area have also proposed big data mining system architectures for IoT; these systems focus on the integration of devices and data mining technologies [157]. Figure 6 shows an architecture for the support of social network and cloud computing in IoT. They integrated big data and KDD into the extraction, management and mining, and interpretation layers. The extraction layer maps onto the perception layer. Different from traditional KDD, the extraction layer of the proposed framework also takes into consideration the behavior of agents for its devices [2].

4.4. Suggested System Architecture for IoT. Based on the survey of big data mining systems and IoT systems, we suggest a system architecture for IoT and big data mining. The system includes 5 layers, as shown in Figure 7.

     (i) Devices: many IoT devices, such as sensors, RFID, cameras, and other devices, can be integrated into this system to perceive the world and generate data continuously.

    (ii) Raw data: in the big data mining system, structured data, semistructured data, and unstructured data can be integrated.

   (iii) Data gather: real-time data and batch data can be supported, and all data can be parsed, analyzed, and merged.
    (iv) Data processing: many open source solutions are integrated, including Hadoop, HDFS, Storm, and Oozie.

     (v) Service: data mining functions will be provided as services.

    (vi) Security/privacy/standard: security, privacy, and standards are very important to a big data mining system. Security and privacy protect the data from unauthorized access and privacy disclosure. A big data mining system standard makes data integration, sharing, and mining more open to third-party developers.

5. Conclusions

The Internet of Things concept arises from the need to manage, automate, and explore all devices, instruments, and sensors in the world. In order to make wise decisions both for people and for the things in IoT, data mining technologies are integrated with IoT technologies for decision making support and system optimization. Data mining involves discovering novel, interesting, and potentially useful patterns from data and applying algorithms to the extraction of hidden information. In this paper, we survey data mining from 3 different views: knowledge view, technique view, and application view. In the knowledge view, we review classification, clustering, association analysis, time series analysis, and outlier analysis. In the application view, we review typical data mining applications, including e-commerce, industry, health care, and public service. The technique view is discussed together with the knowledge view and the application view. Nowadays, big data is a hot topic for data mining and IoT; we also discuss the new characteristics of big data and analyze the challenges in the areas of data extraction, data mining algorithms, and data mining systems. Based on the survey of the current research, a suggested big data mining system is proposed.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (Grant nos. 61100066, 61262013, 61472283, and 61103185), the Open Fund of Guangdong Province Key Laboratory of Precision Equipment and Manufacturing Technology (no. PEMT1303), the Fok Ying-Tong Education Foundation, China (Grant no. 142006), and the Fundamental Research Funds for the Central Universities (Grant no. 2013KJ034). This project is also sponsored by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.

References

[1] Q. Jing, A. V. Vasilakos, J. Wan, J. Lu, and D. Qiu, “Security of the internet of things: perspectives and challenges,” Wireless Networks, vol. 20, no. 8, pp. 2481–2501, 2014.
[2] C.-W. Tsai, C.-F. Lai, and A. V. Vasilakos, “Future internet of things: open issues and challenges,” Wireless Networks, vol. 20, no. 8, pp. 2201–2217, 2014.
[3] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2011.
[4] A. Mukhopadhyay, U. Maulik, S. Bandyopadhyay, and C. A. C. Coello, “A survey of multiobjective evolutionary algorithms for data mining: part I,” IEEE Transactions on Evolutionary Computation, vol. 18, no. 1, pp. 4–19, 2014.
[5] Y. Zhang, M. Chen, S. Mao, L. Hu, and V. Leung, “CAP: crowd activity prediction based on big data analysis,” IEEE Network, vol. 28, no. 4, pp. 52–57, 2014.
[6] M. Chen, S. Mao, and Y. Liu, “Big data: a survey,” Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.
[7] M. Chen, S. Mao, Y. Zhang, and V. Leung, Big Data: Related Technologies, Challenges and Future Prospects, SpringerBriefs in Computer Science, Springer, 2014.
[8] J. Wan, D. Zhang, Y. Sun, K. Lin, C. Zou, and H. Cai, “VCMIA: a novel architecture for integrating vehicular cyber-physical systems and mobile cloud computing,” Mobile Networks and Applications, vol. 19, no. 2, pp. 153–160, 2014.
[9] X. H. Rong, F. Chen, P. Deng, and S. L. Ma, “A large-scale device collaboration mechanism,” Journal of Computer Research and Development, vol. 48, no. 9, pp. 1589–1596, 2011.
[10] F. Chen, X.-H. Rong, P. Deng, and S.-L. Ma, “A survey of device collaboration technology and system software,” Acta Electronica Sinica, vol. 39, no. 2, pp. 440–447, 2011.
[11] L. Zhou, M. Chen, B. Zheng, and J. Cui, “Green multimedia communications over Internet of Things,” in Proceedings of the IEEE International Conference on Communications (ICC ’12), pp. 1948–1952, Ottawa, Canada, June 2012.
[12] P. Deng, J. W. Zhang, X. H. Rong, and F. Chen, “A model of large-scale Device Collaboration system based on PI-Calculus for green communication,” Telecommunication Systems, vol. 52, no. 2, pp. 1313–1326, 2013.
[13] P. Deng, J. W. Zhang, X. H. Rong, and F. Chen, “Modeling the large-scale device control system based on PI-Calculus,” Advanced Science Letters, vol. 4, no. 6-7, pp. 2374–2379, 2011.
[14] J. Zhang, P. Deng, J. Wan, B. Yan, X. Rong, and F. Chen, “A novel multimedia device ability matching technique for ubiquitous computing environments,” EURASIP Journal on Wireless Communications and Networking, vol. 2013, no. 1, article 181, 12 pages, 2013.
[15] G. Kesavaraj and S. Sukumaran, “A study on classification techniques in data mining,” in Proceedings of the 4th International Conference on Computing, Communications and Networking Technologies (ICCCNT ’13), pp. 1–7, July 2013.
[16] S. Song, Analysis and acceleration of data mining algorithms on high performance reconfigurable computing platforms [Ph.D. thesis], Iowa State University, 2011.
[17] J. R. Quinlan, “Induction of decision trees,” Machine Learning, vol. 1, no. 1, pp. 81–106, 1986.
[18] J. R. Quinlan, C4.5: Programs for Machine Learning, vol. 1, Morgan Kaufmann, 1993.
[19] M. Mehta, R. Agrawal, and J. Rissanen, SLIQ: A Fast Scalable Classifier for Data Mining, Springer, Berlin, Germany, 1996.
[20] B. Chandra and P. P. Varghese, “Fuzzy SLIQ decision tree algorithm,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 5, pp. 1294–1301, 2008.
[21] J. Shafer, R. Agrawal, and M. Mehta, “SPRINT: a scalable parallel classifier for data mining,” in Proceedings of 22nd International Conference on Very Large Data Bases, pp. 544–555, 1996.
[22] K. Polat and S. Güneş, “A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems,” Expert Systems with Applications, vol. 36, no. 2, pp. 1587–1592, 2009.
[23] S. Ranka and V. Singh, “CLOUDS: a decision tree classifier for large datasets,” in Proceedings of the 4th Knowledge Discovery and Data Mining Conference, pp. 2–8, 1998.
[24] M. van Diepen and P. H. Franses, “Evaluating chi-squared automatic interaction detection,” Information Systems, vol. 31, no. 8, pp. 814–831, 2006.
[25] D. T. Larose, “k-nearest neighbor algorithm,” in Discovering Knowledge in Data: An Introduction to Data Mining, pp. 90–106, John Wiley & Sons, 2005.
[26] W.-J. Hwang and K.-W. Wen, “Fast kNN classification algorithm based on partial distance search,” Electronics Letters, vol. 34, no. 21, pp. 2062–2063, 1998.
[27] P. Jeng-Shyang, Q. Yu-Long, and S. Sheng-He, “Fast k-nearest neighbors classification algorithm,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. 87, no. 4, pp. 961–963, 2004.
[28] J.-S. Pan, Z.-M. Lu, and S.-H. Sun, “An efficient encoding algorithm for vector quantization based on subvector technique,” IEEE Transactions on Image Processing, vol. 12, no. 3, pp. 265–270, 2003.
[29] Z.-M. Lu and S.-H. Sun, “Equal-average equal-variance equal-norm nearest neighbor search algorithm for vector quantization,” IEICE Transactions on Information and Systems, vol. 86, no. 3, pp. 660–663, 2003.
[30] L. L. Tang, J. S. Pan, X. Guo, S. C. Chu, and J. F. Roddick, “A novel approach on behavior of sleepy lizards based on K-nearest neighbor algorithm,” in Social Networks: A Framework of Computational Intelligence, vol. 526 of Studies in Computational Intelligence, pp. 287–311, Springer, Cham, Switzerland, 2014.
[31] C. Bielza and P. Larrañaga, “Discrete bayesian network classifiers: a survey,” ACM Computing Surveys, vol. 47, no. 1, article 5, 2014.
[32] M. E. Maron and J. L. Kuhns, “On relevance, probabilistic indexing and information retrieval,” Journal of the ACM, vol. 7, no. 3, pp. 216–244, 1960.
[33] M. Minsky, “Steps toward artificial intelligence,” Proceedings of the IRE, vol. 49, no. 1, pp. 8–30, 1961.
[34] P. Langley and S. Sage, “Induction of selective Bayesian classifiers,” in Proceedings of the 10th International Conference on Uncertainty in Artificial Intelligence, pp. 399–406, 1994.
[35] I. Kononenko, “Semi-naive Bayesian classifier,” in Machine Learning: EWSL-91, vol. 482 of Lecture Notes in Artificial Intelligence, pp. 206–219, Springer, Berlin, Germany, 1991.
[36] F. Zheng and G. I. Webb, Tree Augmented Naive Bayes, Springer, Berlin, Germany, 2010.
[37] L. Jiang, H. Zhang, Z. Cai, and J. Su, “Learning tree augmented naive bayes for ranking,” in Proceedings of the 10th International Conference on Database Systems for Advanced Applications (DASFAA ’05), pp. 688–698, 2005.
[38] M. Sahami, “Learning limited dependence Bayesian classifiers,” in Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, pp. 335–338, Portland, Ore, USA, August 1996.
[39] N. Friedman, “Learning belief networks in the presence of missing values and hidden variables,” in Proceedings of the 14th International Conference on Machine Learning, pp. 125–133, 1997.
[40] Y. Lei, X. Q. Ding, and S. J. Wang, “Visual tracker using sequential Bayesian learning: discriminative, generative, and hybrid,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 6, pp. 1578–1591, 2008.
[41] D. Geiger and D. Heckerman, “Knowledge representation and inference in similarity networks and Bayesian multinets,” Artificial Intelligence, vol. 82, no. 1-2, pp. 45–74, 1996.
[42] T. Joachims, “Text categorization with support vector machines: learning with many relevant features,” in Machine Learning: ECML-98, vol. 1398, pp. 137–142, Springer, Berlin, Germany, 1998.
[43] L. Yingxin and R. Xiaogang, “Feature selection for cancer classification based on support vector machine,” Journal of Computer Research and Development, vol. 42, no. 10, pp. 1796–1801, 2005.
[44] Y. Tang, B. Jin, Y. Sun, and Y.-Q. Zhang, “Granular support vector machines for medical binary classification problems,” in Proceedings of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB ’04), pp. 73–78, October 2004.
[45] H.-S. Guo, W.-J. Wang, and C.-Q. Men, “A novel learning model-kernel granular support vector machine,” in Proceedings of the International Conference on Machine Learning and Cybernetics, pp. 930–935, July 2009.
[46] K. Lian, J. Huang, H. Wang, and B. Long, “Study on a GA-based SVM decision-tree multi-classification strategy,” Acta Electronica Sinica, vol. 36, no. 8, pp. 1502–1507, 2008.
[47] C.-F. Lin and S.-D. Wang, “Fuzzy support vector machines,” IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 464–471, 2002.
[48] H.-P. Huang and Y.-H. Liu, “Fuzzy support vector machines for pattern recognition and data mining,” International Journal of Fuzzy Systems, vol. 4, no. 3, pp. 826–835, 2002.
[49] W.-Y. Yan and Q. He, “Multi-class fuzzy support vector machine based on dismissing margin,” in Proceedings of the International Conference on Machine Learning and Cybernetics, pp. 1139–1144, July 2009.
[50] Z. Qi, Y. Tian, and Y. Shi, “Robust twin support vector machine for pattern classification,” Pattern Recognition, vol. 46, no. 1, pp. 305–316, 2013.
[51] R. Khemchandani and S. Chandra, “Twin support vector machines for pattern classification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 5, pp. 905–910, 2007.
[52] Z. Qi, Y. Tian, and Y. Shi, “Structural twin support vector machine for classification,” Knowledge-Based Systems, vol. 43, pp. 74–81, 2013.
[53] P. Tsyurmasto, M. Zabarankin, and S. Uryasev, “Value-at-risk support vector machine: stability to outliers,” Journal of Combinatorial Optimization, vol. 28, no. 1, pp. 218–232, 2014.
[54] R. Herbrich, T. Graepel, and K. Obermayer, “Large margin rank boundaries for ordinal regression,” in Advances in Neural Information Processing Systems, pp. 115–132, MIT Press, 1999.
[55] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data, Prentice Hall, Englewood Cliffs, NJ, USA, 1988.
[56] S. Ansari, S. Chetlur, S. Prabhu, G. N. Kini, G. Hegde, and Y. Hyder, “An overview of clustering analysis techniques used in data mining,” International Journal of Emerging Technology and Advanced Engineering, vol. 3, no. 12, pp. 284–286, 2013.
[57] K. Srivastava, R. Shah, D. Valia, and H. Swaminarayan, “Data mining using hierarchical agglomerative clustering algorithm
       in distributed cloud computing environment,” International            [73] H. S. Nagesh, S. Goil, and A. N. Choudhary, “Adaptive grids
       Journal of Computer Theory and Engineering, vol. 5, no. 3, pp.             for clustering massive data sets,” in Proceedings of the 1st SIAM
       520–522, 2013.                                                             International Conference on Data Mining (SDM ’01), pp. 1–17,
[58]   P. Berkhin, “A survey of clustering data mining techniques,” in            Chicago, Ill, USA, April 2001.
       Grouping Multidimensional Data, pp. 25–71, Springer, Berlin,          [74] R. Agrawal, T. Imieliński, and A. Swami, “Mining association
       Germany, 2006.                                                             rules between sets of items in large databases,” in Proceedings of
[59]   S. Guha, R. Rastogi, and K. Shim, “CURE: an efficient clustering           the ACM SIGMOD International Conference on Management of
       algorithm for large databases,” ACM SIGMOD Record, vol. 27,                Data (SIGMOD ’93), pp. 207–216, 1993.
       no. 2, pp. 73–84, 1998.                                               [75] A. Gosain and M. Bhugra, “A comprehensive survey of associa-
[60]   S. Guha, R. Rastogi, and K. Shim, “CURE: an efficient clustering           tion rules on quantitative data in data mining,” in Proceedings
       algorithm for large databases,” Information Systems, vol. 26, no.          of the IEEE Conference on Information & Communication
       1, pp. 35–58, 2001.                                                        Technologies (ICT ’13), pp. 1003–1008, JeJu Island, Republic of
                                                                                  Korea, April 2013.
[61]   M. W. Berry and M. Browne, Understanding Search Engines:
       Mathematical Modeling and Text Retrieval, vol. 17, SIAM, 2005.        [76] C. Luo and S. M. Chung, “Efficient mining of maximal sequen-
                                                                                  tial patterns using multiple samples,” in Proceedings of the 5th
[62]   C. S. Wallace and D. L. Dowe, “Intrinsic classification by MML-
                                                                                  SIAM International Conference on Data Mining (SDM ’05), pp.
       the Snob program,” in Proceedings of the 7th Australian Joint
                                                                                  415–426, April 2005.
       Conference on Artificial Intelligence, pp. 37–44, World Scientific,
       1994.                                                                 [77] Z. Yang and M. Kitsuregawa, “LAPIN-SPAM: an improved
     algorithm for mining sequential pattern,” in Proceedings of the
     21st International Conference on Data Engineering Workshops, p.
     1222, April 2005.
[63] C. Fraley and A. E. Raftery, “MCLUST version 3: an R package
     for normal mixture modeling and model-based clustering,”
     DTIC Document, 2006.
[64] A. Broder, L. Garcia-Pueyo, V. Josifovski, S. Vassilvitskii, and
     S. Venkatesan, “Scalable K-Means by ranked retrieval,” in
     Proceedings of the 7th ACM International Conference on Web
     Search and Data Mining, pp. 233–242, February 2014.
[65] Q. Li, P. Wang, W. Wang, H. Hu, Z. Li, and J. Li, “An efficient
     K-means clustering algorithm on MapReduce,” in Proceedings
     of the 19th International Conference on Database Systems for
     Advanced Applications (DASFAA ’14), Bali, Indonesia, April
     2014, vol. 8421 of Lecture Notes in Computer Science, pp. 357–371,
     Springer International Publishing, 2014.
[66] J. Agrawal, S. Soni, S. Sharma, and S. Agrawal, “Modification
     of density based spatial clustering algorithm for large database
     using naive’s bayes’ theorem,” in Proceedings of the 4th International
     Conference on Communication Systems and Network
     Technologies (CSNT ’14), pp. 419–423, Bhopal, India, April 2014.
[67] M. Ester, H. Kriegel, J. Sander, and X. Xu, “A density-based
     algorithm for discovering clusters in large spatial databases with
     noise,” in Proceedings of the 2nd International Conference on
     Knowledge Discovery and Data Mining (KDD ’96), pp. 226–231,
     Portland, Ore, USA, 1996.
[68] E. Schikuta and M. Erhart, “The BANG-clustering system: grid-based
     data analysis,” in Advances in Intelligent Data Analysis
     Reasoning about Data, vol. 1280 of Lecture Notes in Computer
     Science, pp. 513–524, Springer, Berlin, Germany, 1997.
[69] S. Guha, R. Rastogi, and K. Shim, “ROCK: a robust clustering
     algorithm for categorical attributes,” in Proceedings of the 15th
     International Conference on Data Engineering (ICDE ’99), pp.
     512–521, March 1999.
[70] S. C. A. Thomopoulos, D. K. Bougoulias, and C.-D. Wann,
     “Dignet: an unsupervised-learning clustering algorithm for
     clustering and data fusion,” IEEE Transactions on Aerospace and
     Electronic Systems, vol. 31, no. 1, pp. 21–38, 1995.
[71] T. Zhang, R. Ramakrishnan, and M. Livny, “BIRCH: a new
     data clustering algorithm and its applications,” Data Mining and
     Knowledge Discovery, vol. 1, no. 2, pp. 141–182, 1997.
[72] E. Keogh, K. Chakrabarti, M. Pazzani, and S. Mehrotra,
     “Dimensionality reduction for fast similarity search in large
     time series databases,” Knowledge and Information Systems, vol.
     3, no. 3, pp. 263–286, 2001.
[78] J. Han and J. Pei, “Mining frequent patterns by pattern-growth:
     methodology and implications,” ACM SIGKDD Explorations
     Newsletter, vol. 2, no. 2, pp. 14–20, 2000.
[79] K. Huang, C. Chang, and K. Lin, “Prowl: an efficient frequent
     continuity mining algorithm on event sequences,” in Data
     Warehousing and Knowledge Discovery, vol. 3181 of Lecture Notes
     in Computer Science, pp. 351–360, Springer, Berlin, Germany,
     2004.
[80] K. Y. Huang and C. H. Chang, “Efficient mining of frequent
     episodes from complex sequences,” Information Systems, vol. 33,
     no. 1, pp. 96–114, 2008.
[81] S. Cong, J. Han, and D. Padua, “Parallel mining of closed
     sequential patterns,” in Proceedings of the 11th ACM SIGKDD
     International Conference on Knowledge Discovery and Data
     Mining (KDD ’05), pp. 562–567, August 2005.
[82] T.-C. Fu, “A review on time series data mining,” Engineering
     Applications of Artificial Intelligence, vol. 24, no. 1, pp. 164–181,
     2011.
[83] P. Esling and C. Agon, “Time-series data mining,” ACM Computing
     Surveys, vol. 45, no. 1, article 12, 34 pages, 2012.
[84] K. Kalpakis, D. Gada, and V. Puttagunta, “Distance measures for
     effective clustering of ARIMA time-series,” in Proceedings of the
     IEEE International Conference on Data Mining (ICDM ’01), pp.
     273–280, San Jose, Calif, USA, December 2001.
[85] N. Kumar, V. N. Lolla, E. Keogh, S. Lonardi, C. A. Ratanamahatana,
     and L. Wei, “Time-series bitmaps: a practical visualization
     tool for working with large time series databases,” in
     Proceedings of the 5th SIAM International Conference on Data
     Mining (SDM ’05), pp. 531–535, April 2005.
[86] F. K.-P. Chan, A. W.-C. Fu, and C. Yu, “Haar wavelets for
     efficient similarity search of time-series: with and without
     time warping,” IEEE Transactions on Knowledge and Data
     Engineering, vol. 15, no. 3, pp. 686–705, 2003.
[87] D. E. Shasha and Y. Zhu, High Performance Discovery in Time
     Series: Techniques and Case Studies, Springer, 2004.
[88] M. Vlachos, D. Gunopulos, and G. Das, “Indexing time-series
     under conditions of noise,” in Data Mining in Time Series
     Databases, vol. 57 of Series in Machine Perception and Artificial
     Intelligence, pp. 67–100, World Scientific, 2004.
 12                                                                                   International Journal of Distributed Sensor Networks
[89] V. Megalooikonomou, G. Li, and Q. Wang, “A dimensionality
     reduction technique for efficient similarity analysis of time
     series databases,” in Proceedings of the 13th ACM International
     Conference on Information and Knowledge Management (CIKM
     ’04), pp. 160–161, Washington, DC, USA, November 2004.
[90] Q. Chen, L. Chen, X. Lian, Y. Liu, and J. X. Yu, “Indexable
     PLA for efficient similarity search,” in Proceedings of the 33rd
     International Conference on Very Large Data Bases, pp. 435–446,
     Vienna, Austria, September 2007.
[91] X. L. Dong, C. K. Gu, and Z. O. Wang, “Research on shape-based
     time series similarity measure,” in Proceedings of the
     International Conference on Machine Learning and Cybernetics,
     pp. 1253–1258, August 2006.
[92] V. Megalooikonomou, Q. Wang, G. Li, and C. Faloutsos, “A
     multiresolution symbolic representation of time series,” in Proceedings
     of the 21st International Conference on Data Engineering
     (ICDE ’05), pp. 668–679, April 2005.
[93] I. Assent, R. Krieger, F. Afschari, and T. Seidl, “The TS-tree:
     efficient time series search and retrieval,” in Proceedings
     of the 11th International Conference on Extending Database
     Technology: Advances in Database Technology (EDBT ’08), pp.
     252–263, 2008.
[94] P. Gogoi, D. K. Bhattacharyya, B. Borah, and J. K. Kalita,
     “A survey of outlier detection methods in network anomaly
     identification,” The Computer Journal, vol. 54, no. 4, pp. 570–588,
     2011.
[95] P. Mishra, N. Padhy, and R. Panigrahi, “The survey of data mining
     applications and feature scope,” Asian Journal of Computer
     Science & Information Technology, vol. 2, article 4, 2013.
[96] J. Heer and E. H. Chi, “Identification of web user traffic
     composition using multi-modal clustering and information
     scent,” in Proceedings of the Workshop on Web Mining, SIAM
     Conference on Data Mining, pp. 51–58, 2001.
[97] P. Resnick and H. R. Varian, “Recommender systems,” Communications
     of the ACM, vol. 40, no. 3, pp. 56–58, 1997.
[98] J. S. Breese, D. Heckerman, and C. Kadie, “Empirical analysis of
     predictive algorithms for collaborative filtering,” in Proceedings
     of the 14th Conference on Uncertainty in Artificial Intelligence
     (UAI ’98), pp. 43–52, 1998.
[99] A. Nikolay, G. Anindya, and G. I. Panagiotis, “Deriving the
     pricing power of product features by mining consumer reviews,”
     Management Science, vol. 57, no. 8, pp. 1485–1509, 2011.
[100] I. Guy, “Tutorial on social recommender systems,” in Proceedings
      of the 23rd International World Wide Web Conference
      (WWW ’14), pp. 195–196, Seoul, Republic of Korea, 2014.
[101] J. A. Konstan, J. D. Walker, D. C. Brooks, K. Brown, and M.
      D. Ekstrand, “Teaching recommender systems at large scale:
      evaluation and lessons learned from a hybrid MOOC,” in
      Proceedings of the 1st ACM Conference on Learning @ Scale
      Conference (L@S ’14), pp. 61–70, March 2014.
[102] A. Tejeda-Lorente, J. Bernabé-Moreno, C. Porcel, and E.
      Herrera-Viedma, “Integrating quality criteria in a fuzzy linguistic
      recommender system for digital libraries,” Procedia
      Computer Science, vol. 31, pp. 1036–1043, 2014.
[103] D. Gavalas, C. Konstantopoulos, K. Mastakas, and G. Pantziou,
      “Mobile recommender systems in tourism,” Journal of Network
      and Computer Applications, vol. 39, no. 1, pp. 319–333, 2014.
[104] N. Elgendy and A. Elragal, “Big data analytics: a literature review
      paper,” in Advances in Data Mining. Applications and Theoretical
      Aspects, vol. 8557 of Lecture Notes in Computer Science, pp. 214–227,
      Springer, Cham, Switzerland, 2014.
[105] H. C. Koh, W. C. Tan, and C. P. Goh, “A two-step method to
      construct credit scoring models with data mining techniques,”
      International Journal of Business and Information, vol. 1, no. 1,
      pp. 96–118, 2006.
[106] N. C. Hsieh and L. P. Hung, “A data driven ensemble classifier
      for credit scoring analysis,” Expert Systems with Applications,
      vol. 37, no. 1, pp. 534–545, 2010.
[107] E. Kambal, I. Osman, M. Taha, N. Mohammed, and S.
      Mohammed, “Credit scoring using data mining techniques with
      particular reference to Sudanese banks,” in Proceedings of the
      1st IEEE International Conference on Computing, Electrical and
      Electronics Engineering (ICCEEE ’13), pp. 378–383, August 2013.
[108] Q. Liu, J. Wan, and K. Zhou, “Cloud manufacturing service
      system for industrial-cluster-oriented application,” Journal of
      Internet Technology, vol. 15, no. 3, pp. 373–380, 2014.
[109] D. Maaß, M. Spruit, and P. de Waal, “Improving short-term
      demand forecasting for short-lifecycle consumer products with
      data mining techniques,” Decision Analytics, vol. 1, no. 1, pp. 1–17,
      2014.
[110] X. F. Du, S. C. H. Leung, J. L. Zhang, and K. K. Lai, “Demand
      forecasting of perishable farm products using support vector
      machine,” International Journal of Systems Science, vol. 44, no.
      3, pp. 556–567, 2013.
[111] C.-J. Lu and Y.-W. Wang, “Combining independent component
      analysis and growing hierarchical self-organizing maps with
      support vector regression in product demand forecasting,”
      International Journal of Production Economics, vol. 128, no. 2,
      pp. 603–613, 2010.
[112] H. Lee, S. G. Kim, H.-W. Park, and P. Kang, “Pre-launch
      new product demand forecasting using the Bass model: a
      statistical and machine learning-based approach,” Technological
      Forecasting and Social Change, vol. 86, pp. 49–64, 2013.
[113] M. Chen, S. Gonzalez, V. Leung, Q. Zhang, and M. Li, “A 2G-RFID-based
      e-healthcare system,” IEEE Wireless Communications,
      vol. 17, no. 1, pp. 37–43, 2010.
[114] J. Liu, J. Wan, S. He, and Y. Zhang, “E-healthcare supported by
      big data,” ZTE Communications, vol. 12, no. 3, pp. 46–52, 2014.
[115] M. Chen, Y. Ma, J. Wang, D. O. Mau, and E. Song, “Enabling
      comfortable sports therapy for patient: a novel lightweight
      durable and portable ECG monitoring system,” in Proceedings of
      the IEEE 15th International Conference on e-Health Networking,
      Applications and Services (Healthcom ’13), pp. 271–273, IEEE,
      Lisbon, Portugal, October 2013.
[116] J. Liu, Q. Wang, J. Wan, J. Xiong, and B. Zeng, “Towards key
      issues of disaster aid based on Wireless Body Area Networks,”
      KSII Transactions on Internet and Information Systems, vol. 7,
      no. 5, pp. 1014–1035, 2013.
[117] M. Chen, “NDNC-BAN: supporting rich media healthcare
      services via named data networking in cloud-assisted wireless
      body area networks,” Information Sciences, vol. 284, no. 10, pp.
      142–156, 2014.
[118] M. Chen, D. O. Mau, X. Wang, and H. Wang, “The virtue
      of sharing: efficient content delivery in wireless body area
      networks for ubiquitous healthcare,” in Proceedings of the
      IEEE 15th International Conference on e-Health Networking,
      Applications & Services (Healthcom ’13), pp. 669–673, Lisbon,
      Portugal, October 2013.
[119] J. Wan, C. Zou, S. Ullah, C.-F. Lai, M. Zhou, and X. Wang,
      “Cloud-enabled wireless body area networks for pervasive
      healthcare,” IEEE Network, vol. 27, no. 5, pp. 56–61, 2013.
[120] L. Duan, W. N. Street, and E. Xu, “Healthcare information
      systems: data mining methods in the creation of a clinical
      recommender system,” Enterprise Information Systems, vol. 5,
      no. 2, pp. 169–181, 2011.
[121] B. K. Schuerenberg, “An information excavation. Las Vegas
      payer uses data mining software to improve HEDIS reporting
      and provider profiling,” Health Data Management, vol. 11, no. 6,
      pp. 80–82, 2003.
[122] J. Sun and C. K. Reddy, “Big data analytics for healthcare,” in
      Proceedings of the 19th ACM SIGKDD International Conference
      on Knowledge Discovery and Data Mining, p. 1525, Chicago, Ill,
      USA, August 2013.
[123] K. Kincade, “Data mining: digging for healthcare gold,” Insurance
      & Technology, vol. 23, no. 2, pp. 2–7, 1998.
[124] R. Bellazzi and B. Zupan, “Predictive data mining in clinical
      medicine: current issues and guidelines,” International Journal
      of Medical Informatics, vol. 77, no. 2, pp. 81–97, 2008.
[125] J. Liu, J. Pan, Y. Wang et al., “Component analysis of Chinese
      medicine and advances in fuming-washing therapy for knee
      osteoarthritis via unsupervised data mining methods,” Journal
      of Traditional Chinese Medicine, vol. 33, no. 5, pp. 686–691, 2013.
[126] M. Silver, T. Sakata, H. C. Su, C. Herman, S. B. Dolins, and M.
      J. O’Shea, “Case study: how to apply data mining techniques in
      a healthcare data warehouse,” Journal of Healthcare Information
      Management, vol. 15, no. 2, pp. 155–164, 2001.
[127] H. C. Koh and G. Tan, “Data mining applications in healthcare,”
      Journal of Healthcare Information Management, vol. 19, no. 2, p.
      65, 2011.
[128] D. Thornton, R. M. Mueller, P. Schoutsen, and J. van Hillegersberg,
      “Predicting healthcare fraud in medicaid: a multidimensional
      data model and analysis techniques for fraud detection,”
      Procedia Technology, vol. 9, pp. 1252–1264, 2013.
[129] N. Helbig, J. R. Gil-García, and E. Ferro, “Understanding the
      complexity of electronic government: implications from the
      digital divide literature,” Government Information Quarterly,
      vol. 26, no. 1, pp. 89–97, 2009.
[130] J. Wan, D. Li, C. Zou, and K. Zhou, “M2M communications
      for smart city: an event-based architecture,” in Proceedings of
      the IEEE 12th International Conference on Computer and Information
      Technology (CIT ’12), pp. 895–900, Chengdu, China,
      October 2012.
[131] A. Chadwick and C. May, “Interaction between states and
      citizens in the age of the internet: ‘e-government’ in the United
      States, Britain, and the European Union,” Governance, vol. 16,
      no. 2, pp. 271–300, 2003.
[132] Y. Peng, Y. Zhang, Y. Tang, and S. Li, “An incident information
      management framework based on data integration, data
      mining, and multi-criteria decision making,” Decision Support
      Systems, vol. 51, no. 2, pp. 316–327, 2011.
[133] B. Sullivan and S. Mitra, “Community issues in American
      metropolitan cities: a data mining case study,” Journal of Cases
      on Information Technology, vol. 16, no. 1, pp. 23–39, 2014.
[134] M. Chen, “Towards smart city: M2M communications with
      software agent intelligence,” Multimedia Tools and Applications,
      vol. 67, no. 1, pp. 167–178, 2013.
[135] H. Chen, W. Chung, J. J. Xu, G. Wang, Y. Qin, and M. Chau,
      “Crime data mining: a general framework and some examples,”
      Computer, vol. 37, no. 4, pp. 50–56, 2004.
[136] S. Huang, “A study of the application of data mining on
      the spatial landscape allocation of crime hot spots,” in Geo-Informatics
      in Resource Management and Sustainable Ecosystem,
      vol. 398 of Communications in Computer and Information
      Science, pp. 1274–286, Springer, Berlin, Germany, 2013.
[137] V. N. Shyam, “Crime pattern detection using data mining,”
      in Proceedings of the IEEE/WIC/ACM International Conference
      on Web Intelligence and Intelligent Agent Technology Workshops
      (WI-IAT ’06), pp. 41–44, Hong Kong, December 2006.
[138] G. Wang, H. Chen, and H. Atabakhsh, “Automatically detecting
      deceptive criminal identities,” Communications of the ACM, vol.
      47, no. 3, pp. 70–76, 2004.
[139] H. Chen, W. Chung, Y. Qin et al., “Crime data mining:
      an overview and case studies,” in Proceedings of the Annual
      National Conference on Digital Government Research, pp. 1–5,
      2003.
[140] X. Cao, G. Cong, and C. S. Jensen, “Mining significant semantic
      locations from GPS data,” Proceedings of the VLDB Endowment,
      vol. 3, no. 1-2, pp. 1009–1020, 2010.
[141] J. Wan, D. Zhang, S. Zhao, L. T. Yang, and J. Lloret, “Context-aware
      vehicular cyber-physical systems with cloud support:
      architecture, challenges, and solutions,” IEEE Communications
      Magazine, vol. 52, no. 8, pp. 106–113, 2014.
[142] S. Schroedl, K. Wagstaff, S. Rogers, P. Langley, and C. Wilson,
      “Mining GPS traces for map refinement,” Data Mining and
      Knowledge Discovery, vol. 9, no. 1, pp. 59–87, 2004.
[143] Y. Zheng, L. Zhang, X. Xie, and W. Ma, “Mining interesting
      locations and travel sequences from GPS trajectories,” in Proceedings
      of the 18th International Conference on World Wide Web,
      pp. 791–800, 2009.
[144] T. Hu, H. Chen, L. Huang, and X. Zhu, “A survey of mass
      data mining based on cloud-computing,” in Proceedings of the
      International Conference on Anti-Counterfeiting, Security and
      Identification (ASID ’12), pp. 1–4, August 2012.
[145] Y. Sun, J. Han, X. Yan, and P. S. Yu, “Mining knowledge from
      interconnected data: a heterogeneous information network
      analysis approach,” in Proceedings of the VLDB Endowment, pp.
      2022–2023, 2012.
[146] M. Chen, L. T. Yang, T. Kwon, L. Zhou, and M. Jo, “Itinerary
      planning for energy-efficient agent communications in wireless
      sensor networks,” IEEE Transactions on Vehicular Technology,
      vol. 60, no. 7, pp. 3290–3299, 2011.
[147] D. Zhang, J. Wan, Q. Liu, X. Guan, and X. Liang, “A taxonomy
      of agent technologies for ubiquitous computing environments,”
      KSII Transactions on Internet and Information Systems, vol. 6,
      no. 2, pp. 547–565, 2012.
[148] M. Chen, V. C. M. Leung, and S. Mao, “Directional controlled
      fusion in wireless sensor networks,” Mobile Networks and
      Applications, vol. 14, no. 2, pp. 220–229, 2009.
[149] X. Wu, X. Zhu, G.-Q. Wu, and W. Ding, “Data mining with big
      data,” IEEE Transactions on Knowledge and Data Engineering,
      vol. 26, no. 1, pp. 97–107, 2014.
[150] X. Wu and S. Zhang, “Synthesizing high-frequency rules from
      different data sources,” IEEE Transactions on Knowledge and
      Data Engineering, vol. 15, no. 2, pp. 353–367, 2003.
[151] K. Su, H. Huang, X. Wu, and S. Zhang, “A logical framework
      for identifying quality knowledge from different data sources,”
      Decision Support Systems, vol. 42, no. 3, pp. 1673–1683, 2006.
[152] S. Owen, R. Anil, T. Dunning, and E. Friedman, Mahout in
      Action, Manning, 2011.
[153] R Development Core Team, R: A Language and Environment for
      Statistical Computing, R Foundation for Statistical Computing,
      Vienna, Austria, 2012.
[154] A. Bifet, G. Holmes, R. Kirkby, and B. Pfahringer, “Moa: massive
      online analysis,” The Journal of Machine Learning Research, vol.
      11, pp. 1601–1604, 2010.