What is a Data Warehouse
“A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection
  of data in support of management’s decision-making process.”
                                 Data Warehouse Modeling
 1) Enterprise warehouse
      An enterprise warehouse collects all of the information about subjects spanning the
         entire organization.
      It provides corporate-wide data integration and typically contains detailed data as
         well as summarized data.
      It requires extensive business modeling and may take years to design and build.
 2) Data mart
      A data mart contains a subset of corporate-wide data that is of value to a specific
         group of users.
      The scope is confined to specific selected subjects.
      The data contained in data marts tend to be summarized. Data marts are usually
         implemented on servers that are Unix/Linux or Windows based.
      Depending on the source of data, data marts can be categorized as independent or
         dependent.
 3) Virtual warehouse
      A virtual warehouse is a set of views over operational databases.
      For efficient query processing, only some of the possible summary views may be
         materialized. A virtual warehouse is easy to build but requires excess capacity on
         operational database servers.
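
  The virtual warehouse idea can be sketched with Python's built-in sqlite3 module. This is
  only an illustration under stated assumptions: the orders table, its rows, and the names
  sales_by_region and sales_by_region_mat are hypothetical, and because SQLite has no
  materialized views, an ordinary summary table stands in for one.

```python
import sqlite3

# Hypothetical operational database; table name and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "north", 100.0), (2, "north", 50.0), (3, "south", 75.0)],
)

# Virtual warehouse: a view defined directly over the operational table.
# Each query against the view is executed on the operational database.
conn.execute(
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)

# Stand-in for a materialized summary view: the result is stored once.
conn.execute(
    "CREATE TABLE sales_by_region_mat AS SELECT region, total FROM sales_by_region"
)

# A new operational row arrives after the summary was stored.
conn.execute("INSERT INTO orders VALUES (4, 'south', 25.0)")

query = "SELECT region, total FROM {} ORDER BY region"
live_rows = conn.execute(query.format("sales_by_region")).fetchall()
stored_rows = conn.execute(query.format("sales_by_region_mat")).fetchall()

print(live_rows)    # [('north', 150.0), ('south', 100.0)]
print(stored_rows)  # [('north', 150.0), ('south', 75.0)]
```

  The view always reflects the current operational data but recomputes the aggregate on
  every query, which is the extra load on operational servers mentioned above. The stored
  summary answers quickly but has to be refreshed to stay current.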
                                    What is Data Mining
  Data mining is the process of automatically discovering useful information in large data
  repositories. It extracts usable information from a larger set of raw data by analysing
  patterns in large batches of data with one or more software tools.
                                      Data Mining Tasks
  Data mining tasks are generally divided into two major categories:
1) Predictive tasks:
       They use some variables to predict unknown or future values of other variables.
       The attribute to be predicted is called the target or dependent variable.
       The attributes used for making the prediction are called explanatory or independent
       variables.
2) Descriptive tasks:
       Here the objective is to derive patterns that summarize the relationships in data.
       They often require postprocessing of the results to validate and explain them.
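
A predictive task can be sketched as follows. This is a minimal illustration, not a
prescribed method: it assumes a single explanatory variable, uses ordinary least squares
as the model, and the names fit_line, spend, and sales as well as the data values are
hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the explanatory variable and its co-variation with the target.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: spend is the explanatory (independent) variable,
# sales is the target (dependent) variable. Here sales = 2 * spend + 1.
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, sales)

# Predict the unknown target value for a new explanatory value.
predicted_sales = slope * 5.0 + intercept
print(slope, intercept, predicted_sales)  # 2.0 1.0 11.0
```

The fitted line is the model; applying it to a spend value that was not in the data is
the prediction of an unknown value described above.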
Four of the core data mining tasks:
     Predictive Modeling: It refers to the task of building a model for the target variable as a
      function of the explanatory variables. There are two types of predictive modeling tasks:
            Classification: It is used for discrete target variables.
            Regression: It is used for continuous target variables.
     Association Analysis: It is used to find groups of data items that are strongly associated
      with one another. Its goal is to extract the most interesting patterns in an efficient manner.
     Cluster Analysis: It seeks to find groups of closely related observations. For example,
      clustering has been used to group sets of related customers.
     Anomaly Detection: It is the task of identifying observations whose characteristics are
      significantly different from the rest of the data. Such observations are known as anomalies or
      outliers.
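
Anomaly detection, the last task above, can be sketched with a simple z-score rule. This
is one common heuristic chosen for illustration, not the only approach: the function name
find_outliers, the threshold of 2.0 standard deviations, and the sample values are all
assumptions.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]


# Hypothetical measurements: most lie near 10, one is far away.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = find_outliers(readings)
print(outliers)  # [50]
```

A single extreme value inflates both the mean and the standard deviation, so on small
samples the threshold has to be chosen with care.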
                                        Data Mining Applications
    1) Financial Data Analysis
       The financial data in the banking and financial industry is generally reliable and of high
       quality, which facilitates systematic data analysis and data mining.
       For example,
              Loan payment prediction and customer credit policy analysis.
              Classification and clustering of customers for targeted marketing.
2) Retail Industry
    Data mining has great application in the retail industry because the industry collects large
    amounts of data on sales, customer purchasing history, goods transportation, consumption and
    services. The quantity of data collected will continue to expand rapidly because data
    collection keeps getting easier and more widely available.
    Data mining in the retail industry helps in identifying customer buying patterns and trends
    that lead to improved quality of customer service and good customer retention and satisfaction.
    For example:
           Multidimensional analysis of sales, customers, products, time and region.
           Customer Retention.
           Product recommendation and cross-referencing of items.
3) Telecommunication Industry
    Today the telecommunication industry is one of the fastest-emerging industries, providing
    various services such as fax, pager, cellular phone, internet messenger, images, e-mail, web
    data transmission, etc. Due to the development of new computer and communication
    technologies, the telecommunication industry is expanding rapidly.
    Data mining in the telecommunication industry helps in identifying telecommunication
    patterns, catching fraudulent activities and improving quality of service.
4) Biological Data Analysis
    In recent times, we have seen tremendous growth in fields of biology such as genomics,
    proteomics, functional genomics and biomedical research. Biological data mining is a very
    important part of bioinformatics.
    For example:
          Semantic integration of heterogeneous, distributed genomic and proteomic
           databases.
         Discovery of structural patterns and analysis of genetic networks and protein
           pathways.
         Visualization tools in genetic data analysis.
 5) Intrusion Detection
    Intrusion refers to any kind of action that threatens the integrity, confidentiality, or
    availability of network resources. In this world of connectivity, security has become a
    major issue. The increased usage of the internet and the availability of tools and tricks for
    attacking networks have prompted intrusion detection to become a critical component of network
    administration.
    For example:
      Development of data mining algorithms for intrusion detection.
      Analysis of stream data.
      Visualization and query tools.
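
    As an illustration of stream data analysis for intrusion detection, the sketch below scans
    a stream of login events and flags sources with too many failures inside a sliding time
    window. It is a simplified example under stated assumptions: the event format
    (timestamp, source, succeeded), the function name flag_sources, the 60-second window, and
    the limit of 3 failures are all hypothetical.

```python
from collections import defaultdict, deque


def flag_sources(events, window_seconds=60, max_failures=3):
    """Return sources with more than `max_failures` failed logins within one window.

    `events` is an iterable of (timestamp_seconds, source, succeeded) in time order.
    """
    recent = defaultdict(deque)  # source -> timestamps of its recent failures
    flagged = set()
    for timestamp, source, succeeded in events:
        if succeeded:
            continue
        window = recent[source]
        window.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_failures:
            flagged.add(source)
    return flagged


# Hypothetical event stream: one source fails four times within 15 seconds.
events = [
    (0, "10.0.0.5", False),
    (0, "10.0.0.9", False),
    (5, "10.0.0.5", False),
    (10, "10.0.0.5", False),
    (12, "10.0.0.7", True),
    (15, "10.0.0.5", False),
    (200, "10.0.0.9", False),
]
suspicious = flag_sources(events)
print(suspicious)  # {'10.0.0.5'}
```

    Each event is processed once as it arrives and only the recent failures per source are
    kept in memory, which is what makes this style of analysis suitable for streams.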