SERIAL NO   TITLE
            Abstract
1           Introduction
            1.1 Cost reduction
            1.2 Faster, better decision making
            1.3 New products and services
2           Literature Survey
3           System Specification
            3.1 Hardware Specification
            3.2 Software Specification
4           Project Description
            4.1 Introduction
            4.2 The data mining process
            4.3 The pillars of artificial intelligence
            4.4 Distributed SQL processing
5           Result and Discussion
            5.1 Sample Data
            5.2 Sample Screen Shot
6           Conclusion and Feature Enhancement
            6.1 Conclusion
            6.2 Feature Enhancement
7           References
ABSTRACT
Recent technology innovations, many of which are based on the capture and analysis
of big data, are transforming the automotive industry at a pace deemed inconceivable just a
short time ago. At the heart of this transformation is the new role of the car itself, and the
increasingly sophisticated abilities that “intelligent cars” possess to communicate with
individuals, enterprises, and devices around them. Company leaders in the automotive industry
clearly recognize that by embracing the concept of big data, they can access a mass of
opportunities for differentiation, growth, and innovation that revolutionize the very core of
existing business models. In order to unlock this potential, the key challenge is to develop and
implement a big data strategy, which is tailored to the capture, analysis, and interpretation of
the ever increasing quantities of structured and unstructured data which will be received from
drivers, vehicles, and other devices. Only those companies which incorporate a big data
strategy in their transformation agendas will be able to reap the rewards offered by the zettabyte
revolution.
1.INTRODUCTION
Big data analytics helps organizations harness their data and use it to identify new
opportunities. That, in turn, leads to smarter business moves, more efficient operations, higher
profits and happier customers. In his report Big Data in Big Companies, IIA Director of
Research Tom Davenport interviewed more than 50 businesses to understand how they used big
data. He found they got value in the following ways:
Cost reduction: Big data technologies such as Hadoop and cloud-based analytics bring
significant cost advantages when it comes to storing large amounts of data – plus they can
identify more efficient ways of doing business.
Faster, better decision making: With the speed of Hadoop and in-memory analytics, combined
with the ability to analyze new sources of data, businesses are able to analyze information
immediately – and make decisions based on what they’ve learned.
New products and services: With the ability to gauge customer needs and satisfaction through
analytics comes the power to give customers what they want. Davenport points out that with big
data analytics, more companies are creating new products to meet customers’ needs.
The automotive industry continues to face a dynamic set of challenges. For those with the
right ambition it represents an exciting time with opportunities to differentiate and stand out from
the crowd. One area that has the opportunity to deliver significant competitive advantages is
analytics.
The concept of big data has been around for years; most organizations now understand
that if they capture all the data that streams into their businesses, they can apply analytics and
obtain significant value from it. But even in the 1950s, decades before anyone uttered the term
“big data,” businesses were using basic analytics (essentially numbers in a spreadsheet that were
manually examined) to uncover insights and trends.
The new benefits that big data analytics brings to the table, however, are speed and
efficiency. Whereas a few years ago a business would have gathered information, run analytics
and unearthed information that could be used for future decisions, today that business can
identify insights for immediate decisions. The ability to work faster – and stay agile – gives
organizations a competitive edge they didn’t have before.
2.LITERATURE SURVEY
We examine whether firms that emphasize decision making based on data and business
analytics (“data driven decision making” or DDD) show higher performance. Using detailed
survey data on the business practices and information technology investments of 179 large
publicly traded firms, we find that firms that adopt DDD have output and productivity that is 5-
6% higher than what would be expected given their other investments and information
technology usage. Furthermore, the relationship between DDD and performance also appears in
other performance measures such as asset utilization, return on equity and market value. Using
instrumental variables methods, we find evidence that the effect of DDD on productivity
does not appear to be due to reverse causality. Our results provide some of the first large-scale data
on the direct connection between data-driven decision making and firm performance.
How do firms make better decisions? In more and more companies, managerial decisions
rely less on a leader’s “gut instinct” and more on data-based analytics. At the same time, we
have been witnessing a data revolution; firms gather extremely detailed data from and propagate
knowledge to their consumers, suppliers, alliance partners, and competitors. Part of this trend is
due to the widespread diffusion of enterprise information technology such as Enterprise
Resource Planning (ERP), Supply Chain Management (SCM), and Customer Relationship
Management (CRM) systems (Aral et al. 2006; McAfee 2002), which capture and process vast
quantities of data as part of their regular operations.
Increasingly these systems are imbued with analytical capabilities, and these capabilities
are further extended by Business Intelligence (BI) systems that enable a broader array of data
analytic tools to be applied to operational data. Moreover, the opportunities for data collection
outside of operational systems have increased substantially. Mobile phones, vehicles, factory
automation systems, and other devices are routinely instrumented to generate streams of data on
their activities, making possible an emerging field of “reality mining” (Pentland and Pentland
2008). Manufacturers and retailers use RFID tags to track individual items as they pass through
the supply chain, and they use the data they provide optimize and reinvent their business
processes. Similarly, click stream data and keyword searches collected from websites generate a
plethora of data, making customer behavior and customer-firm interactions visible without
having to resort to costly or ad-hoc focus groups or customer behavior studies.
Leading-edge firms have moved from passively collecting data to actively conducting
customer experiments to develop and test new products. For instance, Capital One Financial
pioneered a strategy of “test and learn” in the credit card industry, where large numbers of
potential card offers were field-tested using randomized trials to determine customer acceptance
and customer profitability (Clemons and Thatcher 1998). While these trials were quite
expensive, they were driven by the insight that existing data can have limited relevance for
understanding customer behavior in products that do not yet exist; some of the successful trials
led to products such as “balance transfer cards,” which revolutionized the credit card
industry.
Online firms such as Amazon, eBay, and Google also rely heavily on field experiments as
part of a system of rapid innovation, utilizing the high visibility and high volume of online
customer interaction to validate and improve new product or pricing strategies. Increasingly, the
culture of experimentation has diffused to other information-intensive industries such as retail
financial services (Toronto-Dominion Bank, Wells Fargo, PNC), retail (Food Lion,
Sears, Famous Footwear), and services (CKE Restaurants, Subway) (see Davenport
2009). Information theory (e.g., Blackwell 1953) and the information-processing view of
organizations (e.g., Galbraith 1974) suggest that more precise and accurate information should
facilitate greater use of information in decision making and therefore lead to higher firm
performance. However, there is little independent, large-sample empirical evidence on the
value or performance implications of adopting these technologies. In this paper, we develop a
measure of the use of “data-driven decision making” (DDD) that captures business practices
surrounding the collection and analysis of external and internal data.
Business intelligence and analytics (BI&A) has emerged as an important area of study for
both practitioners and researchers, reflecting the magnitude and impact of data-related problems
to be solved in contemporary business organizations. This introduction to the MIS Quarterly
Special Issue on Business Intelligence Research first provides a framework that identifies the
evolution, applications, and emerging research areas of BI&A. BI&A 1.0, BI&A 2.0, and BI&A
3.0 are defined and described in terms of their key characteristics and capabilities. Current
research in BI&A is analyzed and challenges and opportunities associated with BI&A research
and education are identified. We also report a bibliometric study of critical BI&A publications,
researchers, and research topics based on more than a decade of related academic and industry
publications. Finally, the six articles that comprise this special issue are introduced and
characterized in terms of the proposed BI&A research framework.
Business intelligence and analytics (BI&A) and the related field of big data analytics
have become increasingly important in both the academic and the business communities over the
past two decades. Industry studies have highlighted this significant development. For example,
based on a survey of over 4,000 information technology (IT) professionals from 93 countries and
25 industries, the IBM Tech Trends Report (2011) identified business analytics as one of the four
major technology trends in the 2010s.
Hal Varian, Chief Economist at Google and emeritus professor at the University of
California, Berkeley, has commented on the emerging opportunities for IT professionals and students
in data analysis. The opportunities associated with data and analysis in different
organizations have helped generate significant interest in BI&A, which is often referred to as the
techniques, technologies, systems, practices, methodologies, and applications that analyze
critical business data to help an enterprise better understand its business and market and make
timely business decisions. In addition to the underlying data processing and analytical
technologies, BI&A includes business-centric practices and methodologies that can be applied to
various high-impact applications such as e-commerce, market intelligence, e-government,
healthcare, and security.
This introduction to the MIS Quarterly Special Issue on Business Intelligence Research
provides an overview of this exciting and high-impact field, highlighting its many challenges
and opportunities. Figure 1 shows the key sections of this paper, including BI&A evolution,
applications, and emerging analytics research opportunities. We then report on a bibliometric
study of critical BI&A publications, researchers, and research topics based on more than a
decade of related BI&A academic and industry publications. Education and program
development opportunities in BI&A are presented, followed by a summary of the six articles that
appear in this special issue using our research framework. The final section discusses the
analytics and visualization technologies in this field that offer new directions for BI&A research.
This paper provides a summary of how big data can be useful for small and medium
enterprises in the 21st century. Organisations can draw meaningful information from the analysis
of the large amounts of data that are created on a daily basis, which can help them make informed
decisions. Case studies in which SMEs have successfully utilised the potential of big data are
cited in the paper. The paper shows the potential of big data for mobile application development
enterprises, which fall under the SME bracket, and how they can utilise the benefits of big data,
by presenting applications-based analytical results. The conclusion presents a syllogism whereby
a mobile app development company has succeeded in using big data for generating revenue
streams, and therefore provides an example of how big data could be utilised by other firms.
The rapid growth of the Internet, smartphones, social media, wireless technologies, etc. in
the digital world has led to an explosion of data. It is predicted that the next technological
revolution will be based upon the science of ‘big data’, through which large amounts of data can
be processed, captured, stored, shared and analysed. According to estimates by analysts, these
large quantities of data will increase by 40 times in the upcoming decade (Intuit 2020 Report
2012). Previously, the ability to gather and capitalise upon large amounts of information was
limited to large enterprises, since they possessed a pool of statisticians who could obtain
meaningful information from the data.
However, due to the democratisation of data brought about by big data, small and medium enterprises can
utilise the benefits of data analysis, which will help them to obtain meaningful insights into
competition, markets, bottom lines and top-line results, and enhance their decision-making
abilities (Brown and Duguid 2002, p. 29, 39). The ease of access to large amounts of data has
made data a vital resource, along with labour and capital, in any industry. As predicted, the key
driver for growth in the 21st century for any organisation, big or small, is data that encompasses
many societal aspects, such as health care, business, entertainment, government, finance, etc.
(Erbes et al. 2012, pp. 66–72).
Gartner, an IT and research firm, has analysed and graphically presented the results of
over 2000 technologies, grouped into 98 categories, that outline and depict the life cycle of
technologies which have a high potential (Ross 2011). Technology is expected to stabilise during
the later stages of its life cycle, since by that time society and industry will accept it as
indispensable, and will not create undue hype. In this research, high expectations for the future
are placed on such technologies and the benefits that will enable organisations to make more informed
decisions, improve communities and open new avenues for business opportunities.
Whether under Industry 4.0 or the industrial Internet, today's industrial manufacturing
enterprises should make full use of information and communication technology to deal with the
arrival of smart, large-scale data, combining products, machinery and human resources, and
responding to rapidly changing patterns of product sales, so that manufacturing enterprises can
pursue process innovation and reform. This paper takes the automobile manufacturing industry
as an example: based on the analysis of large volumes of car sales data and using data mining
technology, a web crawler program written in Java is used for data collection. The aim is to give
the automobile manufacturing industry suggestions for automobile production that reduce the
inventory of automobile enterprises and the waste of resources.
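The report states that a Java web crawler was used to collect the car sales data. As an illustration only, the sketch below shows the same idea in Python; the listing URL, the CSS classes, and the output file name are hypothetical assumptions, not the project's actual crawler.

```python
# Minimal sketch of a sales-data crawler (hypothetical listing page and CSS classes).
# The project itself used a Java crawler; this Python version only illustrates the idea.
import csv
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://example.com/car-sales?page={}"   # hypothetical listing URL

def crawl(pages: int, out_path: str = "car_sales.csv") -> None:
    rows = []
    for page in range(1, pages + 1):
        html = requests.get(BASE_URL.format(page), timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        # Hypothetical markup: one <div class="listing"> per car with model and monthly sales.
        for item in soup.select("div.listing"):
            model = item.select_one(".model").get_text(strip=True)
            sales = item.select_one(".sales").get_text(strip=True)
            rows.append((model, sales))
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["model", "monthly_sales"])
        writer.writerows(rows)

if __name__ == "__main__":
    crawl(pages=3)
```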
The manufacturing industry is the pillar industry of the national economy; however, there is
still a big gap in China with respect to independent intellectual property rights in innovative
design, advanced manufacturing technology and equipment, and modern design and management, so
China is not yet a manufacturing power. At the same time, with the formation of world economic
integration, foreign technology, capital and products have flowed into China in large numbers, and
Chinese enterprises, especially in the manufacturing industry, face unprecedentedly fierce
competition. For the manufacturing industry to achieve new development, it must seize the
opportunity, which requires enterprises to adopt advanced management philosophies such as data mining.
The increasing focus on big data, and its potential to influence almost every
industry, gives it an often deterministic presence that presents it as an implement-and-reap
solution for enterprises. The project is inspired by the limited focus on the potential for
SMEs, rather than multinational enterprises, to harness big data. As noted, it is mostly larger
enterprises that have launched initiatives to complement their analytical proficiencies, but as
technologies mature, more companies adopt frameworks for handling data, and organizations learn
how to work within this new framework, SMEs might find it easier to reap some of the benefits.
Also, helped by cheaper and more easily accessible servers and data centers delivered
through cloud vendors, SMEs now face less of a constraint on upfront investment; rather, the
challenges present themselves as organizational and strategic in nature. The right technologies
still need to be chosen, but with well-supported and well-documented open source data systems
available, it has increasingly become a question of choosing correctly and choosing a scalable option.
While manufacturers have been generating highly distributed data from various systems,
devices and applications, a number of challenges in both data management and data analysis
require new approaches to support the big data era. The main challenge for industrial big data
analytics is real-time analysis and decision-making based on massive heterogeneous data sources in the
manufacturing space. This survey presents new concepts, methodologies, and application
scenarios of industrial big data analytics, which can provide dramatic improvements in solving
velocity and veracity problems.
Manufacturers have now entered the age of big data, and data sizes can range from a
few dozen terabytes to many petabytes in a single data set. For example, a GE plant
that produces a personal care product generates 5,000 data samples every 33 milliseconds,
resulting in a very large continuous data stream. Big data analytics will be a vital
foundation for manufacturing forecasting, machine fleet management, and proactive maintenance. Compared
to big data in general, industrial big data has the potential to create value in different sections of the
manufacturing business chain. For example, valuable information regarding the hidden
degradation or inefficiency patterns within machines or manufacturing processes can lead to
informed and effective maintenance decisions which can avoid costly failures and unplanned
downtime. However, the ability to perform analysis on the data is constrained by the increasingly
distributed nature of industrial data sets.
Highly distributed data sources bring about challenges in industrial data access,
integration, and sharing. Furthermore, massive data produced by different sources are often
defined using different representation methods and structural specifications. Bringing those data
together becomes a challenge because the data are not properly prepared for data integration and
management, and the technical infrastructures lack the appropriate information infrastructure
services to support big data analytics if it remains distributed. Recently, industrial big data
analytics has attracted extensive research interests from both academia and industry.
According to a report from the McKinsey Global Institute, the effective use of industrial big data
has the potential to transform economies and deliver a new wave of productivity
growth. Taking advantage of valuable industrial big data analytics will become a basic
competitive factor for today's enterprises and will create new competitors who are able to attract
employees with the critical skills in industrial big data.
GE has published a white paper about an industrial big data platform. It
illustrates the industrial big data requirements that must be addressed in order for industrial operators
to exploit the many efficiency opportunities in a cost-effective manner. The industrial big data
software platform brings these capabilities together in a single technology infrastructure that
opens up the full set of capabilities to service providers. Brian Corporation describes industrial big data
analytics as focusing on high-performance operational data management systems, cloud-based data
storage, and hybrid service platforms.
ABB proposes turning industrial big data into decisions, so that
enterprises have additional context and insight to enable better decision making. In 2015,
industrial big data analytics was proposed for manufacturing maintenance and service
innovation, covering automated data processing, health assessment and prognostics in the
industrial big data environment.
3.SYSTEM SPECIFICATION
Language : HTML
Operating System : Windows 8.1
Front End : PHP
4.PROJECT DESCRIPTION
4.1.Introduction
Data science and machine learning are now key technologies in our everyday lives, as we
can see in a multitude of applications, such as voice recognition in vehicles and on cell phones,
automatic facial and traffic sign recognition, as well as chess and, more recently, Go machine
algorithms which humans can no longer beat. The analysis of large data volumes based on
search, pattern recognition, and learning algorithms provides insights into the behavior of
processes, systems, nature, and ultimately people, opening the door to a world of fundamentally
new possibilities. In fact, the now already implementable idea of autonomous driving is virtually
a tangible reality for many drivers today with the help of lane keeping assistance and adaptive
cruise control systems in the vehicle.
The fact that this is just the tip of the iceberg, even in the automotive industry, becomes
readily apparent when one considers that, at the end of 2015, Toyota and Tesla's founder, Elon
Musk, each announced investments amounting to one billion US dollars in artificial intelligence
research and development almost at the same time. The trend towards connected, autonomous,
and artificially intelligent systems that continuously learn from data and are able to make optimal
decisions is advancing in ways that are simply revolutionary, not to mention fundamentally
important to many industries.
This includes the automotive industry, one of the key industries in Germany, in which
international competitiveness will be influenced by a new factor in the near future – namely the
new technical and service offerings that can be provided with the help of data science and
machine learning. This article provides an overview of the corresponding methods and some
current application examples in the automotive industry. It also outlines the potential
applications to be expected in this industry very soon. Accordingly, sections 2 and 3 begin by
addressing the sub domains of data mining (also referred to as “big data analytics”) and artificial
intelligence, briefly summarizing the corresponding processes, methods, and areas of application
and presenting them in context.
Section 4 then uses examples to describe current applications along the automotive value chain,
from development and production through logistics to the end customer. Based on such an example, section 5 describes the vision
for future applications using three examples: one in which vehicles play the role of autonomous
agents that interact with each other in cities, one that covers integrated production optimization,
and one that describes companies themselves as autonomous agents. Whether these visions will
become a reality in this or any other way cannot be said with certainty at present – however, we
can safely predict that the rapid rate of development in this area will lead to the creation of
completely new products, processes, and services, many of which we can only imagine today.
This is one of the conclusions drawn in section 6, together with an outlook regarding the
potential future effects of the rapid rate of development in this area.
Gartner uses the term “prescriptive analytics” to describe the highest level of ability to
make business decisions on the basis of data-based analyses. This is illustrated by the question
“what should I do?” and prescriptive analytics supplies the required decision-making support, if
a person is still involved, or automation if this is no longer the case. The levels below this, in
ascending order in terms of the use and usefulness of AI and data science, are defined as follows:
descriptive analytics (“what has happened?”), diagnostic analytics (“why did it happen?”), and
predictive analytics (“what will happen?”) (see Figure 1).
The last two levels are based on data science technologies, including data mining and
statistics, while descriptive analytics essentially uses traditional business intelligence concepts
(data warehouse, OLAP). In this article, we seek to replace the term “prescriptive analytics”
with the term “optimizing analytics.” The reason for this is that a technology can “prescribe”
many things, while, in terms of implementation within a company, the goal is always to make
something “better” with regard to target criteria or quality criteria. This optimization can be
supported by search algorithms, such as evolutionary algorithms in nonlinear cases and operations
research (OR) methods in – much rarer – linear cases. It can also be supported by application
experts who take the results from the data mining process and use them to draw conclusions
regarding process improvement.
One good example is the decision trees learned from data, which application experts
can understand, reconcile with their own expert knowledge, and then implement in an
appropriate manner. Here too, the application is used for optimizing purposes, admittedly with
an intermediate human step.
Within this context, another important aspect is the fact that multiple criteria required for the
relevant application often need to be optimized at the same time, meaning that multi-criteria
optimization methods – or, more generally, multi-criteria decision-making support methods –
are necessary. These methods can then be used in order to find the best possible compromises
between conflicting goals. The examples mentioned include the frequently occurring conflicts
between cost and quality, risk and profit, and, in a more technical example, between the weight
and passive occupant safety of a body.
Figure 4.1: The four levels of data analysis usage within a company
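As a small illustration of the multi-criteria idea discussed above, the sketch below filters a set of candidate designs down to their Pareto-optimal compromises between two conflicting goals; the criteria labels (cost and weight) and the numbers are invented for illustration.

```python
# Pareto filter for two minimization criteria (illustrative values only).
from typing import List, Tuple

def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the points not dominated by any other point (both criteria minimized)."""
    front = []
    for p in points:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return front

# Hypothetical candidate designs: (cost, weight)
candidates = [(10.0, 8.0), (12.0, 5.0), (13.0, 8.5), (15.0, 4.0), (12.5, 6.0)]
print(pareto_front(candidates))   # the best available compromises between the two goals
```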
These four levels form a framework, within which it is possible to categorize data
analysis competence and potential benefits for a company in general. This framework is depicted
in Figure 1 and shows the four layers which build upon each other, together with the respective
technology category required for implementation. The traditional Cross-Industry Standard
Process for Data Mining (CRISP-DM) includes no optimization or decision-making support
whatsoever. Instead, based on the business understanding, data understanding, data preparation,
modeling, and evaluation sub-steps, CRISP proceeds directly to the deployment of results in
business processes. Here too, we propose an additional optimization step that in turn comprises
multi-criteria optimization and decision-making support. This approach is depicted
schematically in Figure 4.2.
It is important to note that the original CRISP model deals with a largely iterative
approach used by data scientists to analyze data manually, which is reflected in the iterations
between business understanding and data understanding as well as data preparation and
modeling. However, evaluating the modeling results with the relevant application experts in the
evaluation step can also result in having to start the process all over again from the business
understanding sub-step, making it necessary to go through all the sub-steps again partially or
completely (e.g., if additional data needs to be incorporated).
The manual, iterative procedure is also due to the fact that the basic idea behind this
approach – as up-to-date as it may be for the majority of applications – is now almost 20 years
old and certainly only partially compatible with a big data strategy.
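A compact way to read the extended process is as a pipeline whose final step optimizes over the trained model rather than deploying its raw predictions directly. The sketch below is purely schematic: the synthetic data, the random forest model, and the grid-search objective are our assumptions, not something prescribed by the text.

```python
# Schematic CRISP-DM pipeline with an added optimization step (assumed data and objective).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Data understanding / preparation: two process parameters and a measured quality value.
X = rng.uniform(0.0, 1.0, size=(500, 2))          # e.g. temperature and pressure (scaled)
y = (X[:, 0] - 0.3) ** 2 + (X[:, 1] - 0.7) ** 2    # synthetic "deviation from target"

# Modeling: learn a forecast model for the quality deviation.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Evaluation + optimization: search the parameter space for the setting the model
# predicts to be best, instead of deploying the model's raw predictions directly.
grid = np.array([(a, b) for a in np.linspace(0, 1, 21) for b in np.linspace(0, 1, 21)])
predicted = model.predict(grid)
best = grid[np.argmin(predicted)]
print("Proposed process setting:", best)           # decision-making support for the expert
```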
The fact is that, in addition to the use of nonlinear modeling methods (in contrast to the
usual generalized linear models derived from statistical modeling) and knowledge extraction
from data, data mining rests on the fundamental idea that models can be derived from data with
the help of algorithms and that this modeling process can run automatically for the most part –
because the algorithm “does the work.” In applications where a large number of models need to
be created, for example for use in making forecasts (e.g., sales forecasts for individual vehicle
models and markets based on historical data), automatic modeling plays an important role.
The same applies to the use of online data mining, in which, for example, forecast
models (e.g., for forecasting product quality) are not only constantly used for a production
process, but also adapted (i.e., retrained) continuously whenever individual process aspects
change (e.g., when a new raw material batch is used).
This type of application requires the technical ability to automatically generate data, and
integrate and process it in such a way that data mining algorithms can be applied to it. In
addition, automatic modeling and automatic optimization are necessary in order to update
models and use them as a basis for generating optimal proposed actions in online applications.
These actions can then be communicated to the process expert as a suggestion or – especially in
the case of continuous production processes – be used directly to control the respective process.
If sensor systems are also integrated directly into the production process – to collect data
in real time – this results in a self-learning cyber-physical system that facilitates
implementation of the Industry 4.0 vision in the field of production engineering.
This approach is depicted schematically in Figure 3. Data from the system is acquired
with the help of sensors and integrated into the data management system. Using this as a basis,
forecast models for the system's relevant outputs (quality, deviation from target value, process
variance, etc.) are used continuously in order to forecast the system's output.
Other machine learning options can be used within this context in order, for example, to
predict maintenance results (predictive maintenance) or to identify anomalies in the process. The
corresponding models are monitored continuously and, if necessary, automatically retrained if
any process drift is observed. Finally, the multi-criteria optimization uses the models to
continuously compute optimal proposed actions for controlling the process. A cyber-physical
system is a system “in which information and software components are connected
to mechanical and electronic components and in which data is transferred and exchanged, and
monitoring and control tasks are carried out, in real time, using infrastructures such as the
Internet” (translation of the corresponding entry in the Gabler Wirtschaftslexikon, Springer). Industry
4.0 is defined therein as “a marketing term that is also used in science communication and refers
to a ‘future project’ of the German federal government.”
In order to differentiate it from “traditional” data mining, the term “big data” is
frequently defined now with three (sometimes even four or five) essential characteristics:
volume, velocity, and variety, which refer to the large volume of data, the speed at which data is
generated, and the heterogeneity of the data to be analyzed, which can no longer be categorized
into the conventional relational database schema. Veracity, i.e., the fact that large uncertainties
may also be hidden in the data (e.g., measurement inaccuracies), and finally value, i.e., the value
that the data and its analysis represents for a company's business processes, are often cited as
additional characteristics.
So it is not just the pure data volume that distinguishes previous data analytics methods
from big data, but also other technical factors that require the use of new methods – such as
Hadoop and MapReduce – with appropriately adapted data analysis algorithms in order to allow
the data to be saved and processed. In addition, so-called “in-memory databases” now also make
it possible to apply traditional learning and modeling algorithms in main memory to large data
volumes.
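For readers unfamiliar with the MapReduce idea mentioned above, the pattern can be shown in a few lines of plain Python. The real Hadoop framework distributes the same three phases across a cluster; this single-machine toy (with invented input records) only illustrates the programming model.

```python
# Map/shuffle/reduce on one machine, illustrating the programming model only.
from collections import defaultdict

records = ["sedan", "suv", "sedan", "truck", "suv", "sedan"]   # assumed input records

# Map: emit (key, 1) pairs.
mapped = [(r, 1) for r in records]

# Shuffle: group values by key.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce: aggregate each group independently (this is the part that parallelizes well).
counts = {key: sum(values) for key, values in groups.items()}
print(counts)   # {'sedan': 3, 'suv': 2, 'truck': 1}
```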
This means that if one were to establish a hierarchy of data analysis and modeling
methods and techniques, then, in very simplistic terms, statistics would be a subset of data
mining, which in turn would be a subset of big data. Not every application requires the use of
data mining or big data technologies. However, a clear trend can be observed, which indicates
that the necessities and possibilities involved in the use of data mining and big data are growing
at a very rapid pace as increasingly large data volumes are being collected and linked across all
processes and departments of a company. Nevertheless, conventional hardware architecture with
additional main memory is often more than sufficient for analyzing large data volumes in the
gigabyte range.
An early definition of artificial intelligence from the IEEE Neural Networks Council was
“the study of how to make computers do things at which, at the moment, people are better.”
Although this still applies, current research is also focused on improving the way that software
does things at which computers have always been better, such as analyzing large amounts of
data. Data is also the basis for developing artificially intelligent software systems not only to
collect information, but also to:
• Learn
• Behave adaptively
• Plan
• Make inferences
• Solve problems
• Think abstractly
At the most general level, Machine Learning (ML) algorithms can be subdivided into two
categories: supervised and unsupervised, depending on whether or not the respective algorithm
requires a target variable to be specified.
Apart from the input variables (predictors), supervised learning algorithms also require
the known target values (labels) for a problem. In order to train an ML model to identify traffic
signs using cameras, images of traffic signs – preferably with a variety of configurations – are
required as input variables. In this case, light conditions, angles, soiling, etc. are compiled as
noise or blurring in the data; nonetheless, it must be possible to recognize a traffic sign in rainy
conditions with the same accuracy as when the sun is shining. The labels, i.e., the correct
designations, for such data are normally assigned manually. This correct set of input variables and their correct
classification constitute a training data set. Although we only have one image per training data
set in this case, we still speak of multiple input variables, since ML algorithms find relevant
features in training data and learn how these features and the class assignment for the
classification task indicated in the example are associated.
Supervised learning is used primarily to predict numerical values (regression) and for
classification purposes (predicting the appropriate class), and the corresponding data is not
limited to a specific format – ML algorithms are more than capable of processing images, audio
files, videos, numerical data, and text. Classification examples include object recognition (traffic
signs, objects in front of a vehicle, etc.), face recognition, credit risk assessment, voice
recognition, and customer churn, to name but a few.
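A minimal supervised-learning example in Python, using scikit-learn and its bundled digits data set as a stand-in for labelled images such as traffic signs; the data set and model choice are our own assumptions for illustration.

```python
# Supervised classification: learn from labelled examples, then predict labels for new data.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_digits(return_X_y=True)                 # images as feature vectors + known labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```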
Unsupervised learning algorithms do not focus on individual target variables, but instead
have the goal of characterizing a data set in general. Unsupervised ML algorithms are often used
to group (cluster) data sets, i.e., to identify relationships between individual data points (that can
consist of any number of attributes) and group them into clusters. In certain cases, the output
from unsupervised ML algorithms can in turn be used as an input for supervised methods.
Examples of unsupervised learning include forming customer groups based on their buying
behavior or demographic data, or clustering time series in order to group millions of time series
from sensors into groups that were previously not obvious.
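And the unsupervised counterpart: grouping unlabelled data points into clusters without a target variable. The synthetic "customer" attributes below are purely illustrative.

```python
# Unsupervised learning: k-means clustering of unlabelled data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic customer groups described by two attributes (e.g. age, annual spend).
data = np.vstack([
    rng.normal(loc=[30, 2000], scale=[5, 300], size=(100, 2)),
    rng.normal(loc=[55, 6000], scale=[5, 500], size=(100, 2)),
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print(kmeans.cluster_centers_)        # one centre per discovered group
```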
In other words, machine learning is the area of artificial intelligence (AI) that enables
computers to learn without being programmed explicitly. Machine learning focuses on
developing programs that grow and change by themselves as soon as new data is provided.
Accordingly, processes that can be represented in a flowchart are not suitable candidates for
machine learning – in contrast, everything that requires dynamic and changing solution strategies
and cannot be constrained to static rules is potentially suitable for solution with ML. ML is used,
for example, when no explicit rules can be written down in advance for a task.
Even though ML is used in certain data mining applications, and both look for patterns in
data, ML and data mining are not the same thing. Instead of extracting data that people can
understand, as is the case with data mining, ML methods are used by programs to improve their
own understanding of the data provided. Software that implements ML methods recognizes
patterns in data and can dynamically adjust the behavior based on them.
If, for example, a self-driving car (or the software that interprets the visual signal from
the corresponding camera) has been trained to initiate a braking maneuver if a pedestrian appears
in front of it, this must work with all pedestrians regardless of whether they are short, tall, fat, thin,
clothed, coming from the left, coming from the right, etc. In turn, the vehicle must not brake if
there is a stationary garbage bin on the side of the road.
The level of complexity in the real world is often greater than the level of complexity of
an ML model, which is why, in most cases, an attempt is made to subdivide problems into sub
problems and then apply ML models to these sub problems. The output from these models is
then integrated in order to permit complex tasks, such as autonomous vehicle operation, in
structured and unstructured environments.
4.5 Computer Vision
Computer vision (CV) is a very wide field of research that merges scientific theories from
various fields (as is often the case with AI), starting from biology, neuroscience, and psychology
and extending all the way to computer science, mathematics, and physics. First, it is important to
know how an image is produced physically. Before light hits sensors in a two-dimensional array,
it is refracted, absorbed, scattered, or reflected, and an image is produced by measuring the
intensity of the light beams through each element in the image (pixel). CV has three primary
focus areas, all of which overlap and influence each other. If, for example, the focus in an
application is on obstacle recognition in order to initiate an automated braking maneuver in the
event of a pedestrian appearing in front of the vehicle, the most important thing is to identify the
pedestrian as an obstacle. Interpreting the entire scene – e.g., understanding that the vehicle is
moving towards a family having a picnic in a field – is not necessary in this case.
Vision in biological organisms is regarded as an active process that includes controlling
the sensor and is tightly linked to successful performance of an action. Consequently, CV
systems are not passive either; they, too, must actively control what they sense and tie it to the
action to be performed.
Having said that, the goal of CV systems is not to understand scenes in images – first and
foremost, the systems must extract the relevant information for a specific task from the scene.
This means that they must identify a “region of interest” that will be used for processing.
Moreover, these systems must feature short response times, since it is probable that scenes will
change over time and that a heavily delayed action will not achieve the desired effect. Many
different methods have been proposed for object recognition purposes (“what” is located
“where” in a scene), including:
Object detectors, in which case a window moves over the image and a filter response is
determined for each position by comparing a template and the sub-image (window content), with
each new object parameterization requiring a separate scan. More sophisticated algorithms
simultaneously make calculations based on various scales and apply filters that have been
learned from a large number of images.
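The window-based detector described above can be illustrated with a tiny template-matching sketch on a NumPy "image". Real detectors use learned filters and multiple scales; this toy (with an invented 8x8 scene) only shows the sliding-window filter-response idea.

```python
# Sliding-window template matching (toy example of the object-detector idea).
import numpy as np

def best_match(image: np.ndarray, template: np.ndarray) -> tuple:
    """Slide the template over the image and return the position with the smallest difference."""
    th, tw = template.shape
    best_pos, best_score = None, float("inf")
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            window = image[r:r + th, c:c + tw]
            score = np.sum((window - template) ** 2)   # filter response: sum of squared differences
            if score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos

image = np.zeros((8, 8))
image[3:5, 4:6] = 1.0                  # a 2x2 "object" placed in the scene
template = np.ones((2, 2))
print(best_match(image, template))     # (3, 4)
```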
Alignment-based methods use parametric object models that are trained on data.
Algorithms search for parameters, such as scaling, translation, or rotation, that adapt a model
optimally to the corresponding features in the image, whereby an approximated solution can be
found by means of a reciprocal process, i.e., by features, such as contours, corners, or others,
“selecting” characteristic points in the image for parameter solutions that are compatible with the
found feature.
With object recognition, it is necessary to decide whether algorithms need to process 2-D
or 3-D representations of objects – 2-D representations are very frequently a good compromise
between accuracy and availability. Current research (deep learning) shows that even distances
between two points based on two 2-D images captured from different points can be accurately
determined as an input. In daylight conditions and with reasonably good visibility, this input can
be used in addition to data acquired with laser and radar equipment in order to increase accuracy
– moreover, a single camera is sufficient to generate the required data.
This allows a large variety of different basic shapes to be described with a small set of
parameters. If 3-D images are acquired using stereo cameras, statistical methods (such as
generating a stereo point cloud) are used instead of the aforementioned shape-based methods,
because the data quality achieved with stereo cameras is poorer than that achieved with laser
scans. Other research directions include tracking, contextual scene understanding, and
monitoring, although these aspects are currently of secondary importance to the automotive
industry.
Making inferences is the area of knowledge representation and reasoning (KRR) in which
data-based answers need to be found without human intervention or assistance, and for which
data is normally presented in a formal system with distinct and clear semantics. Since 1980, it
has been assumed that the data involved
is a mixture of simple and complex structures, with the former having a low degree of
computational complexity and forming the basis for research involving large databases.
The latter are presented in a language with more expressive power, which requires less
space for representation, and they correspond to generalizations and fine-grained information.
Mathematical logic is the formal basis for many applications in the real world, including
calculation theory, our legal system and corresponding arguments, and theoretical developments
and evidence in the field of research and development. The initial vision was to represent every
type of knowledge in the form of logic and use universal algorithms to make inferences from it,
but a number of challenges arose – for example, not all types of knowledge can be represented
simply.
Moreover, compiling the knowledge required for complex applications can become very
complex, and it is not easy to learn this type of knowledge in a logical, highly expressive
language. In addition, it is not easy to make inferences with the required highly expressive
language – in extreme cases, such scenarios cannot be implemented computationally, even if the
first two challenges are overcome. Currently, there are three ongoing debates on this subject,
with the first one focusing on the argument that logic is unable to represent many concepts, such
as space, analogy, shape, uncertainty, etc., and consequently cannot be included as an active part
in developing AI to a human level.
The counterargument states that logic is simply one of many tools. At present, the
combination of representative expressiveness, flexibility, and clarity cannot be achieved with any
other method or system. The second debate revolves around the argument that logic is too slow
for making inferences and will therefore never play a role in a productive system. The
counterargument here is that ways exist to approximate the inference process with logic, so
processing is drawing close to remaining within the required time limits, and progress is being
made with regard to logical inference.
Finally, the third debate revolves around the argument that it is extremely difficult, or
even impossible, to develop systems based on logical axioms into applications for the real world.
The counterarguments in this debate are primarily based on the research of individuals currently
researching techniques for learning logical axioms from natural-language texts.
In principle, a distinction is made between four different types of logic which are not
discussed any further in this article:
Propositional logic
First-order predicate logic
Modal logic
Non-monotonic logic
Several questions characterize a planning or decision-making domain:
• Is the domain dynamic to the extent that a sequence of decisions is required, or static in the sense that a single decision or multiple simultaneous decisions need to be made?
• Is the domain deterministic, non-deterministic, or stochastic?
• Is the objective to optimize benefits or to achieve a goal?
• Is the domain known to its full extent at all times, or is it only partially known?
In general, planning problems consist of an initial (known) situation, a defined goal, and
a set of permitted actions or transitions between steps. The result of a planning process is a
sequence or set of actions that, when executed correctly, change the executing entity from an
initial state to a state that meets the target conditions. Computationally speaking, planning is a
difficult problem, even if simple problem specification languages are used. Even when relatively
simple problems are involved, the search for a plan cannot run through all state-space
representations, as these are exponentially large in the number of states that define the domains.
Consequently, the aim is to develop efficient algorithms that represent sub-representations in
order to search through these with the hope of achieving the relevant goal.
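The description above (initial state, goal, permitted actions, search through a state space) maps directly onto a breadth-first search planner. The toy domain below, a counter that can be incremented or doubled, is our own illustrative assumption.

```python
# Breadth-first search planning: find a sequence of actions from an initial state to a goal.
from collections import deque

ACTIONS = {
    "increment": lambda s: s + 1,
    "double":    lambda s: s * 2,
}

def plan(initial: int, goal: int, limit: int = 100) -> list:
    """Return a list of action names transforming `initial` into `goal`, or [] if none is found."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for name, apply_action in ACTIONS.items():
            nxt = apply_action(state)
            if nxt not in visited and nxt <= limit:
                visited.add(nxt)
                frontier.append((nxt, actions + [name]))
    return []

print(plan(initial=1, goal=10))   # e.g. ['increment', 'double', 'increment', 'double']
```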
Current research is focused on developing new search methods and new representations
for actions and states, which will make planning easier. Particularly when one or more agents
acting against each other are taken into account, it is crucial to find a balance between learning
and decision-making – exploration for the sake of learning while decisions are being made can
lead to undesirable results.
Many problems in the real world are problems with dynamics of a stochastic nature.
One example of this is buying a vehicle with features that affect its value, of which we are
unaware. These dependencies influence the buying decision, so it is necessary to allow risks and
uncertainties to be considered. For all intents and purposes, stochastic domains are more
challenging when it comes to making decisions, but they are also more flexible than
deterministic domains with regard to approximations – in other words, simplifying practical
assumptions makes automated decision-making possible in practice. A great number of problem
formulations exist, which can be used to represent various aspects and decision-making
processes in stochastic domains, with the best-known being decision networks and Markov
decision processes.
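A Markov decision process can be solved by value iteration; the tiny two-state, two-action example below (transition probabilities, rewards, and discount factor are invented) shows the idea of repeatedly backing up state values until they stabilize.

```python
# Value iteration for a toy Markov decision process (invented transitions and rewards).
import numpy as np

# P[action][state] = list of (probability, next_state, reward)
P = {
    "wait": {0: [(1.0, 0, 0.0)],                 1: [(1.0, 1, 1.0)]},
    "move": {0: [(0.8, 1, 0.0), (0.2, 0, 0.0)],  1: [(0.9, 1, 1.0), (0.1, 0, 0.0)]},
}
GAMMA = 0.9
values = np.zeros(2)

for _ in range(100):                                  # iterate until values stabilize
    new_values = np.zeros_like(values)
    for s in range(2):
        q = {a: sum(p * (r + GAMMA * values[s2]) for p, s2, r in P[a][s]) for a in P}
        new_values[s] = max(q.values())               # act greedily with respect to current values
    values = new_values

print(values)    # long-run value of starting in each state under the optimal policy
```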
Natural language processing (NLP) covers a wide range of tasks, including:
• Part-of-speech tagging
• Automatic summarization
• Named-entity recognition
• Parsing
• Voice recognition
• Sentiment analysis
• Co-reference resolution
• Discourse analysis
• Machine translation
• Morphological segmentation
• Answers to questions
• Relationship extraction
• Sentence splitting
The core vision of AI says that a version of first-order predicate logic (“first-order
predicate calculus” or “FOPC”) supported by the necessary mechanisms for the respective
problem is sufficient for representing language and knowledge. This thesis says that logic can
and should supply the semantics underlying natural language.
Although attempts to use a form of logical semantics as the key to representing contents
have made progress in the field of AI and linguistics, they have had little success with regard to a
program that can translate English into formal logic.
To date, the field of psychology has also failed to provide proof that this type of
translation into logic corresponds to the way in which people store and manipulate “meaning.”
Consequently, the ability to translate a language into FOPC continues to be an elusive goal.
Without a doubt, there are NLP applications that need to establish logical inferences between
sentence representations, but if these are only one part of an application, it is not clear that they
have anything to do with the underlying meaning of the corresponding natural language (and
consequently with CL/NLP), since the original task for logical structures was inference. These
and other considerations have crystallized into three different positions:
• Position 1: Logical inferences are tightly linked to the meaning of sentences, because
knowing their meaning is equivalent to deriving inferences and logic is the best way to do this.
• Position 3: In general, the predicates of logic and formal systems only appear to be
different from human language, but their terms are in actuality the words as which they appear.
The introduction of statistical and AI methods into the field is the latest trend within this
context. The general strategy is to learn how language is processed – ideally in the way that
humans do this, although this is not a basic prerequisite. In terms of ML, this means learning
based on extremely large corpora that have been translated manually by humans. This often
means that it is necessary to learn (algorithmically) how annotations are assigned or how part-of-
speech categories (the classification of words and punctuation marks in a text into word types) or
semantic markers or primes are added to corpora, all based on corpora that have been prepared
by humans (and are therefore correct).
In the case of supervised learning, and with reference to ML, it is possible to learn
potential associations of part-of-speech tags with words that have been annotated by humans in
the text, so that the algorithms are also able to annotate new, previously unknown texts. This
works the same way for lightly supervised and unsupervised learning, such as when no
annotations have been made by humans and the only data presented is a text in a language with
texts with identical contents in other languages or when relevant clusters are found in thesaurus
data without there being a defined goal.
With regard to AI and language, information retrieval (IR) and information extraction
(IE) play a major role and correlate very strongly with each other. One of the main tasks of IR is
grouping texts based on their content, whereas IE extracts similarly factual elements from texts
or is used to be able to answer questions concerning text contents. These fields therefore
correlate very strongly with each other, since individual sentences (not only long texts) can also
be regarded as documents.
These methods are used, for example, in interactions between users and systems, such as
when a driver asks the on-board computer a question regarding the owner's manual during a
journey – once the language input has been converted into text, the question's semantic content is
used as the basis for finding the answer in the manual, and then for extracting the answer and
returning it to the driver.
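The manual-lookup scenario just described is, at its core, an information-retrieval step: score the manual's passages against the question and return the best match. The minimal TF-IDF sketch below uses invented example sentences as stand-ins for owner's-manual text.

```python
# TF-IDF retrieval: find the manual passage most similar to the driver's question.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

manual_passages = [                                   # invented stand-ins for owner's manual text
    "To change a flat tyre, use the jack stored under the boot floor.",
    "Tyre pressure should be checked monthly and before long journeys.",
    "The adaptive cruise control is activated with the steering wheel button.",
]
question = "How do I turn on adaptive cruise control?"

vectorizer = TfidfVectorizer()
passage_vectors = vectorizer.fit_transform(manual_passages)
question_vector = vectorizer.transform([question])

scores = cosine_similarity(question_vector, passage_vectors)[0]
print(manual_passages[scores.argmax()])               # passage returned to the driver
```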
In traditional AI, people focused primarily on individual, isolated software systems that
acted relatively inflexibly to predefined rules. However, new technologies and applications have
established a need for artificial entities that are more flexible, adaptive, and autonomous, and that
act as social units in multi-agent systems.
In traditional AI (see also the "physical symbol system hypothesis" that has been
embedded into so-called “deliberative” systems), an action theory that establishes how systems
make decisions and act is represented logically in individual systems that must execute
actions. Based on these rules, the system must prove a theorem – the prerequisite here being that
the system must receive a description of the world in which it currently finds itself, the desired
target state, and a set of actions, together with the prerequisites for executing these actions and a
list of the results for each action.
It turned out that the computational complexity involved rendered any system with time
limits useless even when dealing with simple problems, which had an enormous impact on
symbolic AI, resulting in the development of reactive architectures.
These architectures follow if-then rules that translate inputs directly into tasks. Such
systems are extremely simple, although they can solve very complex tasks. The problem is that
such systems learn procedures rather than declarative knowledge, i.e., they learn attributes that
cannot easily be generalized for similar situations.
Many attempts have been made to combine deliberative and reactive systems, but it
appears that it is necessary to focus either on impractical deliberative systems or on very loosely
developed reactive systems – focusing on both is not optimal.
• Autonomous behavior
“Autonomy” describes the ability of systems to make their own decisions and execute
tasks on behalf of the system designer. The goal is to allow systems to act autonomously in
scenarios where controlling them directly is difficult. Traditional software systems execute
methods after these methods have been called, i.e., they have no choice, whereas agents make
decisions based on their beliefs, desires, and intentions (BDI).
• Adaptive behavior
Since it is impossible to predict all the situations that agents will encounter, these agents
must be able to act flexibly. They must be able to learn from and about their environment and
adapt accordingly. This task is all the more difficult if not only nature is a source of uncertainty,
but the agent is also part of a multi-agent system. Only environments that are not static and self-
contained allow for an effective use of BDI agents – for example, reinforcement learning can be
used to compensate for a lack of knowledge of the world.
Within this context, agents are located in an environment that is described by a set of
possible states. Every time an agent executes an action, it is “rewarded” with a numerical value
that expresses how good or bad the action was. This results in a series of states, actions, and
rewards, and the agent is compelled to determine a course of action that entails maximization of
the reward.
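The reward-driven loop sketched in the preceding paragraph is what Q-learning formalizes: after each (state, action, reward, next state) step, the agent nudges its estimate of the action's value toward the observed outcome. The update rule below is standard; the two-action setup, learning rate, and discount factor are assumed for illustration.

```python
# One Q-learning update: move the value estimate of (state, action) toward reward + future value.
from collections import defaultdict

Q = defaultdict(float)        # value estimates for (state, action) pairs
ALPHA, GAMMA = 0.1, 0.9       # learning rate and discount factor (assumed)
ACTIONS = ["left", "right"]

def q_update(state, action, reward, next_state):
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# Example step: in state 0 the agent moved right, received reward 1, and ended up in state 1.
q_update(state=0, action="right", reward=1.0, next_state=1)
print(Q[(0, "right")])        # 0.1
```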
• Social behavior
In an environment where various entities act, it is necessary for agents to recognize their
adversaries and form groups if this is required by a common goal. Agent-oriented systems are
used for personalizing user interfaces, as middleware, and in competitions such as the Robo Cup.
In a scenario where there are only self-driving cars on roads, the individual agent’s autonomy is
not the only indispensable component – car2car communications, i.e., the exchange of
information between vehicles and acting as a group on this basis, are just as important.
Coordination between the agents results in an optimized flow of traffic, rendering traffic jams and accidents
virtually impossible (see also section 5.1, “Vehicles as autonomous, adaptive, and social agents
& cities as super-agents”). In summary, this agent-oriented approach is accepted within the AI
community as the direction of the future.
Multi-agent behavior
Various approaches are being pursued for implementing multi-agent behavior,
with the primary difference being in the degree of control that designers have over
individual agents. A distinction is made here between distributed problem solving (DPS)
systems and multi-agent systems (MAS):
DPS systems allow the designer to control each individual agent in the domain, with the
solution to the task being distributed among multiple agents. In contrast, MAS systems have
multiple designers, each of whom can only influence their own agents with no access to the
design of any other agent. In this case, the design of the interaction protocols is extremely
important. In DPS systems, agents jointly attempt to achieve a goal or solve a problem, whereas,
in MAS systems, each agent is individually motivated and wants to achieve its own goal and
maximize its own benefit.
The goal of DPS research is to find collaboration strategies for problem-solving, while
minimizing the level of communication required for this purpose. Meanwhile, MAS research is
looking at coordinated interaction, i.e., how autonomous agents can be brought to find a common
basis for communication and undertake consistent actions.25 Ideally, a world in which only self-
driving cars use the road would be a DPS world. However, the current competition between
OEMs means that a MAS world will come into being first. In other words, communication and
negotiation between agents will take center stage (see also Nash equilibrium).
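The negotiation aspect and the Nash equilibrium mentioned above can be made concrete with a
toy two-agent "merge" game; the payoff values below are hypothetical and chosen only to show
how an equilibrium is checked.

from itertools import product

# Payoffs for two self-interested vehicles approaching a merge: each either goes or yields.
payoffs = {
    ("go", "go"): (-10, -10),     # both insist: near-collision, heavy delay for both
    ("go", "yield"): (3, 1),
    ("yield", "go"): (1, 3),
    ("yield", "yield"): (0, 0),
}

def is_nash(a, b):
    # A pair of actions is a Nash equilibrium if neither agent gains by deviating alone.
    pa, pb = payoffs[(a, b)]
    best_a = all(payoffs[(alt, b)][0] <= pa for alt in ("go", "yield"))
    best_b = all(payoffs[(a, alt)][1] <= pb for alt in ("go", "yield"))
    return best_a and best_b

print([cell for cell in product(("go", "yield"), repeat=2) if is_nash(*cell)])
# prints [('go', 'yield'), ('yield', 'go')] - one vehicle yields in each stable outcome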
Multi-agent learning
Multi-agent learning (MAL) has only relatively recently begun to receive a significant degree
of attention.26,27,28,29 The key problems in this area include determining which techniques
should be used and what exactly “multi-agent learning” means. Current ML approaches were
developed in order to train individual agents, whereas MAL focuses first and foremost on
distributed learning. “Distributed” does not necessarily mean that a neural network is used, in
which many identical operations run during training and can accordingly be parallelized, but
instead that:
• A problem is split into subproblems and individual agents learn these
subproblems in order to solve the main problem using their combined knowledge, or
• Many agents try to solve the same problem independently of each other by
competing with one another (both patterns are sketched below).
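Both patterns can be sketched in a few lines of Python. The data, the placeholder "learning" step
(a simple mean estimate), and the selection criterion for the competing agents are all hypothetical
and serve only to illustrate the two arrangements.

import random

data = [random.gauss(50, 10) for _ in range(1000)]   # a shared, hypothetical learning problem

# Pattern 1: decomposition - each agent learns one sub-problem, the results are combined.
def learn_subproblem(chunk):
    return sum(chunk) / len(chunk)                   # placeholder "learning" step

chunks = [data[i::4] for i in range(4)]              # split the problem among four agents
combined = sum(learn_subproblem(c) for c in chunks) / len(chunks)
print("combined estimate:", round(combined, 2))

# Pattern 2: competition - several agents attack the same problem independently.
def competing_agent(seed):
    rng = random.Random(seed)
    return learn_subproblem(rng.sample(data, 100))   # each agent works from its own sample

estimates = [competing_agent(seed) for seed in range(8)]
best = min(estimates, key=lambda e: abs(e - 50))     # hypothetical selection criterion
print("best competing estimate:", round(best, 2))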
4.9 Distributed SQL Processing
SQL is particularly useful in handling structured data, i.e., data incorporating relations among
entities and variables. It offers two main advantages over older read/write APIs such as ISAM
or VSAM. First, it introduced the concept of accessing many records with one single command;
and second, it eliminates the need to specify how to reach a record, e.g., with or without an index.
Originally based upon relational algebra and tuple relational calculus, SQL consists of many
types of statements, which may be informally classed as sublanguages, commonly: a data query
language (DQL), a data definition language (DDL), a data control language (DCL), and a data
manipulation language (DML). The scope of SQL includes data query, data manipulation (insert,
update, and delete), data definition (schema creation and modification), and data access control.
Although SQL is often described as, and to a great extent is, a declarative language (4GL), it also
includes procedural elements.
SQL was one of the first commercial languages for Edgar F. Codd's relational model. The
model was described in his influential 1970 paper, "A Relational Model of Data for Large Shared
Data Banks". Despite not entirely adhering to the relational model as described by Codd, it
became the most widely used database language.
SQL became a standard of the American National Standards Institute (ANSI) in 1986, and
of the International Organization for Standardization (ISO) in 1987. Since then, the standard has
been revised to include a larger set of features. Despite the existence of such standards, most
SQL code is not completely portable among different database systems without adjustments.
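The sublanguages can be demonstrated with a minimal sketch using Python's built-in sqlite3
module; the cars table and its columns are invented for the example, and DCL statements such as
GRANT are omitted because SQLite does not support them.

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the schema.
cur.execute("CREATE TABLE cars (model TEXT, cc INTEGER, price INTEGER)")

# DML: insert and update rows.
cur.executemany("INSERT INTO cars VALUES (?, ?, ?)",
                [("XUV300", 1197, 790000), ("Scorpio", 2179, 1000000)])
cur.execute("UPDATE cars SET price = price - 50000 WHERE model = 'Scorpio'")

# DQL: one declarative statement fetches many records at once, without
# specifying how the rows are physically reached.
for row in cur.execute("SELECT model, price FROM cars WHERE cc > 1000 ORDER BY price"):
    print(row)

conn.close()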
History
SQL was initially developed at IBM by Donald D. Chamberlin and Raymond F. Boyce after
learning about the relational model from Ted Codd in the early 1970s. This version, initially
called SEQUEL (Structured English Query Language), was designed to manipulate and retrieve
data stored in IBM's original quasi-relational database management system, System R, which a
group at IBM San Jose Research Laboratory had developed during the 1970s.
Chamberlin and Boyce's first attempt at a relational database language was SQUARE, but
it was difficult to use due to subscript notation. After moving to the San Jose Research
Laboratory in 1973, they began work on SEQUEL. The acronym SEQUEL was later changed to
SQL because "SEQUEL" was a trademark of the UK-based Hawker Siddeley Dynamics
Engineering Limited company.
Design
SQL deviates in several ways from its theoretical foundation, the relational model and its tuple
calculus. In that model, a table is a set of tuples, while in SQL, tables and query results are lists
of rows: the same row may occur multiple times, and the order of rows can be employed in
queries (e.g., in the LIMIT clause).
Critics argue that SQL should be replaced with a language that returns strictly to the
original foundation: see, for example, The Third Manifesto. However, no known proof exists that
such uniqueness cannot be added to SQL itself, or at least to a variation of SQL. In other words,
it is quite possible that SQL can be "fixed" or at least improved in this regard, so that the industry
would not have to switch to a completely different query language to obtain uniqueness. The
debate on this remains open.
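The list-of-rows behaviour described above can be observed directly. The sketch below, again
using sqlite3 and an invented colours table, shows duplicates being preserved unless DISTINCT
is requested, and row order being exploited with ORDER BY and LIMIT.

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE colours (name TEXT)")
cur.executemany("INSERT INTO colours VALUES (?)", [("Red",), ("Red",), ("Blue",)])

print(cur.execute("SELECT name FROM colours").fetchall())           # duplicates are kept
print(cur.execute("SELECT DISTINCT name FROM colours").fetchall())  # set-like result
print(cur.execute("SELECT name FROM colours ORDER BY name LIMIT 1").fetchall())
conn.close()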
Interoperability and standardization
SQL implementations are incompatible between vendors and do not necessarily completely
follow standards. In particular, date and time syntax, string concatenation, NULLs, and
comparison case sensitivity vary from vendor to vendor. Particular exceptions are PostgreSQL
and Mimer SQL, which strive for standards compliance, though PostgreSQL does not adhere to
the standard in how the folding of unquoted names is done.
The folding of unquoted names to lower case in PostgreSQL is incompatible with the
standard, which says that unquoted names should be folded to upper case. Thus, according to the
standard, Foo should be equivalent to FOO, not foo.
Popular implementations of SQL commonly omit support for basic features of Standard
SQL, such as the DATE or TIME data types. The most obvious such examples, and incidentally
the most popular commercial and proprietary SQL DBMSs, are Oracle (whose DATE behaves as
DATETIME and lacks a TIME type) and MS SQL Server.
5.RESULT AND DISCUSSION
HONDA
HYUNDAI
MAHINDRA
S.No ModelName CC Kmpl Price FuelType Rank Color
1. Mahindra XUV300 1197 17 790000 Petrol,Diesel 3 Red
2. Mahindra XUV500 2179 13 127200 Petrol,Diesel 4 Red
3. Mahindra Scorpio 2179 11 100000 Diesel 5 White
4. Mahindra Thar 2498 16 672000 Diesel 4 Red
5. Mahindra Bolero 2523 15 769000 Diesel 5 Black
6. Mahindra Tuv300 1493 18 840000 Diesel 3 Red
7. Mahindra Marazzo 1497 17 999000 Diesel 4 Green
8. Mahindra KUV100NXT 1198 25 477000 Petrol,Diesel 4 Red
9. Mahindra Xylo 2489 14 942000 Diesel 5 White
10. Mahindra Tuv300plus 2179 18 978000 Diesel 3 Silver
MARUTI
S.No ModelName CC Kmpl Price FuelType Rank Color
4. Maruti Ertiga 1248 25 744000 Petrol,Diesel 3 Rose
5. Maruti Vitara Brezza 1248 24 767000 Diesel 4 Brown
6. Maruti Alto800 796 24 263000 Petrol,CNG 5 Green
7. Maruti Ciaz 1462 21 519000 Petrol,Diesel 4 Blue
8. Maruti Eeco 1196 15 337000 Petrol,CNG 4 Red
9. Maruti Alto k10 998 24 338000 Petrol,CNG 5 Black
10. Maruti Ignis 1197 20 479000 Petrol 5 Blue
TATA
S.No ModelName CC Kmpl Price FuelType Rank Color
5. Tata Zest 1193 17 564000 Petrol,Diesel 4 Blue
6. Tata TiagoNRG 1047 27 561000 Petrol,Diesel 5 Black
7. Tata Tiago JTP 1199 23 639000 Petrol 4 Red
8. Tata Tigor JTP 1199 20 749000 Petrol 5 White
9. Tata Bolt 1193 17 508000 Diesel,Petrol 3 Red
10. Tata Nexon 1497 21.5 653000 Diesel 4 Blue
TOYOTA
S.No ModelName CC Kmpl Price FuelType Rank Color
3. Toyota Platinum Etios 1496 16 690000 Diesel,Petrol 3 Red
4. Toyota Corolla Altis 1364 21 1645000 Diesel,Petrol 4 Black
5. Toyota Yaris 1496 17 929000 Petrol 5 Red
6. Toyota Camry 2487 19 3750000 Petrol 5 Black
7. Toyota Land Cruiser 4461 11 14700000 Diesel 5 White
8. Toyota Land Cruiser Prado 2982 11 9630000 Diesel 5 Black
9. Toyota Prius 1798 23 4509000 Petrol 4 Blue
10. Toyota Etios Liva 2982 15 778000 Petrol,Diesel 4 Red
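To connect the sample data above with the distributed SQL processing described in section 4.9,
the sketch below loads a handful of the rows shown above into an in-memory SQLite table and
ranks them with a plain SQL query; the column names roughly follow the table headers, with
Rank stored as rating to avoid the SQL keyword of the same name.

import sqlite3

# A few rows copied from the sample tables above.
rows = [
    ("Mahindra XUV300", 1197, 17, 790000, "Petrol,Diesel", 3, "Red"),
    ("Maruti Ertiga", 1248, 25, 744000, "Petrol,Diesel", 3, "Rose"),
    ("Tata Nexon", 1497, 21.5, 653000, "Diesel", 4, "Blue"),
    ("Toyota Yaris", 1496, 17, 929000, "Petrol", 5, "Red"),
]

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE cars
               (model TEXT, cc INTEGER, kmpl REAL, price INTEGER,
                fuel_type TEXT, rating INTEGER, color TEXT)""")
cur.executemany("INSERT INTO cars VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Example analysis: the most fuel-efficient models under a price threshold.
for model, kmpl, price in cur.execute(
        "SELECT model, kmpl, price FROM cars WHERE price < 800000 ORDER BY kmpl DESC"):
    print(f"{model}: {kmpl} km/l at Rs. {price}")
conn.close()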
6.CONCLUSION AND FUTURE ENHANCEMENT
6.1 Conclusion
There are a lot of different considerations for businesses wishing to take advantage of
data analytics, especially those with limited resources. A majority of SMEs have not yet
immersed themselves in the world of Big Data for various reasons, most of which stem from
not having the required knowledge or even the need for vast data collection. For SMEs which
are on the smallest end of the scale, using free web tools catered for less technical users means it
is not necessary for them to be specialised to begin with, as training staff in relevant courses
should be more affordable.
They also do not necessarily need to be proficient in IT and networking support as,
depending on the amount of Small Data being processed and stored, a single machine should be
sufficient. As there are many different types of data analysis to choose from, SMEs benefit much
more from Small Data as they can pick and choose which specific areas they are going to
analyse, as opposed to collecting a lot of information where not all of it is actionable or relevant
to their current needs/access to resources. However, if SMEs are trying to get the most reach
online, particularly in social media, they need to be prepared for the vast variety of data they will
be collecting, and be sufficiently trained or have the right software to analyze it.
This in itself can be a large restriction, but there are short and relatively cheap online
courses that are focused on specific areas which an SME may wish to invest in if they feel a
social media presence will have a great impact on their business.
Customer data integration, breaking down silos across the organization to create a single
integrated view of the customer, is the key to unlocking the richness and maturity of the dataset.
This will involve the aggregation of a range of internal and external sources including
CRM, dealer management systems, demographics, and sales and marketing databases, to name a
few. A critical next step in making customer data useful is to use it to create actionable and
meaningful customer segments that allow the development of a differentiated product offering and value
proposition for each segment at each stage of the lifecycle.
This could lead, for example, to the formulation of a new, innovative retail model such as
a pop-up store to drive awareness, and/or to more targeted campaigns of special service bundles
beyond the normal warranty to reduce customer leakage and aid retention.
REFERENCES
1. Bodkin, R. The big data Wild West: The good, the bad and the ugly. [Accessed
30.09.2013].
4. Brynjolfsson, E., Hitt, L., Kim, H. Strength in Numbers: How Does Data-Driven
Decision Making Affect Firm Performance? MIT Working Paper, April 2011.
5. Chen, H., Chiang, R. H., Storey, V. C. Business Intelligence and Analytics: From Big
Data to Big Impact. MIS Quarterly, 36 (4), 1165 – 1188.