BDA PST
Q. Define Big data and its characteristics?
Big data refers to extremely large and diverse collections of structured, semi-structured, and
unstructured data that continue to grow exponentially over time.
These datasets are so huge and complex in volume, velocity, and variety that traditional
data management systems cannot store, process, or analyze them.
Big data is used in machine learning, predictive modeling, and other advanced analytics
to solve business problems and make informed decisions.
The characteristics of big data are commonly explained by the five V's.
5 V's of Big Data
o Volume
o Velocity
o Variety
o Veracity (accuracy, reliability, completeness)
o Value (relevance, timeliness)
Volume
The name Big Data itself is related to enormous size. Big Data refers to the vast volumes of data
generated daily from many sources, such as business processes, machines, social media
platforms, networks, human interactions, and many more.
For example, Facebook generates approximately a billion messages each day, records around
4.5 billion clicks of the "Like" button, and receives more than 350 million new posts. Big data
technologies are designed to handle such large amounts of data.
Variety
Big Data can be structured, unstructured, or semi-structured, and is collected from many
different sources. In the past, data was collected only from databases and spreadsheets, but
these days data arrives in an array of forms: PDFs, emails, audio files, social media posts,
photos, videos, etc.
Veracity
Veracity means how reliable the data is. Because big data comes from many sources, it must
be filtered and validated in various ways before it can be trusted; being able to handle and
manage data quality efficiently is essential for business development.
For example, Facebook posts with hashtags can be noisy or misleading and need verification.
Value
Value is an essential characteristic of big data. What matters is not simply the data that we
process or store, but the valuable and reliable insights extracted from the data we store,
process, and analyze.
Velocity
Velocity plays an important role compared to the other characteristics. Velocity refers to the
speed at which data is created, often in real time. It covers the rate at which incoming data
sets arrive, the rate of change, and bursts of activity. A primary requirement of big data
systems is to handle this demanding flow of data rapidly.
Big data velocity deals with the speed at which data flows from sources such as application
logs, business processes, networks, social media sites, sensors, and mobile devices.
Variability
Refers to the inconsistency of the data, which can change over time.
This could involve variations in data formats, quality, or even in how data is collected
and interpreted.
Complexity
Refers to the complexity involved in managing, processing, and analyzing big data.
The interconnectedness and large-scale nature of data sources often require
sophisticated infrastructure and tools.
Q. Define big data analytics and explain the challenges and advantages of big data.
Big Data Analytics is all about crunching massive amounts of information to uncover
hidden trends, patterns, and relationships. It's like sifting through a giant mountain of
data to find the gold nuggets of insight.
Here's a breakdown of what it involves:
o Collecting Data: Data comes from various sources such as social
media, web traffic, sensors, and customer reviews.
o Cleaning the Data: Imagine having to assess a pile of rocks that includes some
gold pieces in it. You would have to clear away the dirt and the debris first. When
data is cleaned, mistakes are fixed, duplicates are removed, and
the data is formatted properly (a minimal sketch of this step follows this list).
o Analyzing the Data: It is here that the wizardry takes place. Data analysts
employ powerful tools and techniques to discover patterns and trends. It is the
same thing as looking for a specific pattern in all those rocks that you sorted
through.
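To make the cleaning step concrete, here is a minimal sketch using pandas; the file name
"reviews.csv" and its column names are invented for illustration:

```python
# A minimal data-cleaning sketch (hypothetical "reviews.csv" with
# made-up columns "product" and "rating").
import pandas as pd

df = pd.read_csv("reviews.csv")

# Remove exact duplicate rows.
df = df.drop_duplicates()

# Fix formatting: strip whitespace and normalise case in a text column.
df["product"] = df["product"].str.strip().str.lower()

# Fix mistakes: coerce the rating column to numbers, turning bad
# entries into NaN, then drop rows that could not be repaired.
df["rating"] = pd.to_numeric(df["rating"], errors="coerce")
df = df.dropna(subset=["rating"])

print(df.describe())
```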
For example, big data analytics is integral to the modern health care industry. As you can
imagine, healthcare systems must manage thousands of patient records, insurance plans,
prescriptions, and vaccine records.
Challenges of Big data analytics
While Big Data Analytics offers incredible benefits, it also comes with its own set of challenges:
Data Overload: Consider Twitter, where approximately 6,000 tweets are posted
every second. The challenge is sifting through this avalanche of data to find
valuable insights.
Data Quality: If the input data is inaccurate or incomplete, the insights generated
by Big Data Analytics can be flawed. For example, incorrect sensor readings could
lead to wrong conclusions in weather forecasting.
Privacy Concerns: With the vast amount of personal data used, like in Facebook's
ad targeting, there's a fine line between providing personalized experiences and
infringing on privacy.
Security Risks: With cyber threats increasing, safeguarding sensitive data becomes
crucial. For instance, banks use Big Data Analytics to detect fraudulent activities,
but they must also protect this information from breaches.
Costs: Implementing and maintaining Big Data Analytics systems can be
expensive. Airlines like Delta use analytics to optimize flight schedules, but they
need to ensure that the benefits outweigh the costs.
Benefits of Big Data Analytics
Big Data Analytics offers a host of real-world advantages; let's understand them with examples:
1. Informed Decisions: Imagine a store like Walmart. Big Data Analytics helps them
make smart choices about what products to stock. This not only reduces waste but
also keeps customers happy and profits high.
2. Enhanced Customer Experiences: Think about Amazon. Big Data Analytics is
what makes those product suggestions so accurate. It's like having a personal
shopper who knows your taste and helps you find what you want.
3. Fraud Detection: Credit card companies, like MasterCard, use Big Data Analytics
to catch and stop fraudulent transactions. It's like having a guardian that watches
over your money and keeps it safe.
4. Optimized Logistics: FedEx, for example, uses Big Data Analytics to deliver your
packages faster and with less impact on the environment. It's like taking the fastest
route to your destination while also being kind to the planet.
Q. Evolution of Big Data?
Looking back over the last few decades, we can see that Big Data technology has grown
tremendously. There are many milestones in the evolution of Big Data, described below:
1. Data Warehousing:
In the 1990s, data warehousing emerged as a solution to store and analyze large volumes
of structured data.
2. Hadoop:
Hadoop was introduced in 2006 by Doug Cutting and Mike Cafarella. It is an open-source
framework that provides distributed storage and large-scale data processing.
3. NoSQL Databases:
In 2009, NoSQL databases were introduced, which provide a flexible way to store and
retrieve unstructured data.
4. Cloud Computing:
Cloud Computing technology helps companies to store their important data in data
centers that are remote, and it saves their infrastructure cost and maintenance costs.
5. Machine Learning:
Machine Learning algorithms are those algorithms that work on large data, and analysis is
done on a huge amount of data to get meaningful insights from it. This has led to the
development of artificial intelligence (AI) applications.
6. Data Streaming:
Data Streaming technology has emerged as a solution to process large volumes of data in
real time.
7. Edge Computing:
Edge Computing is a distributed computing paradigm that allows data processing to
be done at the edge of the network, closer to the source of the data.
Q. Explain any one domain specific example of big data?
One domain-specific example of big data is healthcare analytics. In the healthcare industry,
large volumes of data are generated from various sources such as patient records, medical
devices, diagnostic equipment, wearable health trackers, and even social media.
Example: Predictive Healthcare Analytics
Hospitals and healthcare providers can use big data to predict patient outcomes, improve
treatment plans, and prevent diseases. For instance, by analyzing historical data from millions
of patients, healthcare professionals can develop predictive models to identify individuals at
high risk for certain diseases, like diabetes or heart disease, even before symptoms appear.
How It Works:
1. Data Sources: The data used includes electronic health records (EHR), lab test results,
medical imaging, and sensor data from wearable devices.
2. Processing: Advanced algorithms and machine learning models process these vast amounts
of data to find patterns or correlations that would be impossible for humans to identify
manually (a toy sketch of such a model follows this list).
3. Outcomes: With these insights, doctors can make more informed decisions, leading to
better care, more accurate diagnoses, and cost reductions by preventing hospital
readmissions or unnecessary treatments.
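As a toy illustration of step 2, here is a minimal sketch of a risk-prediction model; the patient
features, labels, and thresholds are entirely synthetic and invented for illustration, not a real
clinical model:

```python
# A toy predictive-risk sketch on synthetic "patient" data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 1000
# Hypothetical features: age, BMI, systolic blood pressure.
X = np.column_stack([
    rng.integers(20, 80, n),   # age in years
    rng.normal(27, 5, n),      # body-mass index
    rng.normal(120, 15, n),    # systolic blood pressure
])
# Synthetic label: "high risk" if older with elevated blood pressure.
y = ((X[:, 0] > 55) & (X[:, 2] > 130)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Predicted probability of being high risk for each held-out patient.
risk = model.predict_proba(X_test)[:, 1]
print("Test accuracy:", model.score(X_test, y_test))
```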
Benefits:
Personalized Medicine: Tailoring treatments based on individual patient data.
Early Detection: Identifying at-risk patients earlier than traditional methods.
Operational Efficiency: Optimizing hospital workflows by predicting peak times, resource
needs, and patient flow.
This use of big data in healthcare improves both patient outcomes and operational efficiency,
making it a critical example of how big data is transforming industries.
Q. Explain analytic flow of big data?
Analytics Flow for Big Data
The analytics flow for big data refers to the process of collecting, storing, processing, and
analyzing large and complex data sets to gain insights and make better decisions. It typically
includes the following steps:
1. Data collection: Data is collected from various sources such as social media, IoT devices, and
sensors. The data can be structured, semi-structured, or unstructured and may need to be cleaned
and transformed before it can be analyzed.
2. Data storage: The data is stored in a centralized repository such as a data lake, Hadoop
Distributed File System (HDFS), or NoSQL database.
3. Data processing: The data is processed using technologies such as Hadoop MapReduce,
stream processing, and machine learning to extract insights and prepare it for analysis (a tiny
word-count illustration of the MapReduce idea follows this list).
4. Data analysis: The data is analyzed using tools such as SQL, data visualization, and machine
learning algorithms to gain insights and make better decisions.
5. Data governance: Data governance policies and procedures are put in place to ensure data is
accurate, complete, consistent and compliant with regulations.
6. Data security: Security measures such as data encryption, access controls, and incident
response are implemented to protect sensitive information and prevent unauthorized access.
7. Data visualization: The data is transformed into interactive and easy-to-understand
visualizations using tools such as Tableau, QlikView, and Power BI.
8. Decision-making: Insights from the data are used to make better decisions and take action.
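To illustrate the MapReduce idea from step 3, here is a tiny pure-Python simulation of a word
count; real Hadoop jobs distribute the same map, shuffle, and reduce phases across a cluster:

```python
# A tiny single-process illustration of MapReduce (word count).
from collections import defaultdict

docs = ["big data is big", "data flows fast"]

# Map phase: emit (word, 1) pairs from each document.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle phase: group the emitted values by key.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: sum the counts for each word.
counts = {word: sum(values) for word, values in groups.items()}
print(counts)  # {'big': 2, 'data': 2, 'is': 1, 'flows': 1, 'fast': 1}
```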
Q. Classification of Big Data Analytics?
Types of Data Analytics
1. Predictive (forecasting)
2. Descriptive (business intelligence and data mining)
3. Prescriptive (optimization and simulation)
4. Diagnostic analytics
Predictive Analytics
Predictive analytics turns data into valuable, actionable information. Predictive
analytics uses data to determine the probable outcome of an event or the likelihood of
a situation occurring.
Predictive analytics draws on a variety of statistical techniques from modeling, machine
learning, data mining, and game theory that analyze current and historical facts to
make predictions about future events. Techniques used for predictive
analytics include (see the sketch after this list):
Linear Regression
Time Series Analysis and Forecasting
Data Mining
Basic Cornerstones of Predictive Analytics
Predictive modeling
Decision Analysis and optimization
Transaction profiling
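As a minimal sketch of the linear-regression technique above, here is a fit over invented
monthly sales figures used to forecast the next period:

```python
# A minimal predictive-analytics sketch: linear regression over
# made-up historical sales, forecasting the next month.
import numpy as np
from sklearn.linear_model import LinearRegression

months = np.arange(1, 13).reshape(-1, 1)          # month index 1..12
sales = np.array([110, 115, 123, 130, 136, 142,
                  151, 158, 165, 170, 179, 186])  # invented units sold

model = LinearRegression().fit(months, sales)

# Forecast month 13 from the fitted trend.
print("Forecast for month 13:", model.predict([[13]])[0])
```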
Descriptive Analytics
Descriptive analytics looks at data and analyzes past events for insight into how to
approach future events.
It looks at past performance and understands it by mining historical
data to find the causes of success or failure in the past.
Almost all management reporting, such as sales, marketing, operations, and finance,
uses this type of analysis.
The descriptive model quantifies relationships in data in a way that is often used to
classify customers or prospects into groups.
Unlike a predictive model that focuses on predicting the behavior of a single
customer, descriptive analytics identifies many different relationships between
customers and products.
Common examples of descriptive analytics are company reports that provide historic
reviews, such as (a short sketch follows this list):
Data Queries
Reports
Descriptive Statistics
Data dashboard
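Here is a short sketch of descriptive statistics and a report-style query using pandas; the
sales table is invented for illustration:

```python
# A short descriptive-analytics sketch over a made-up sales table.
import pandas as pd

sales = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "revenue": [120000, 95000, 134000, 99000],
})

# Descriptive statistics: mean, spread, and extremes of past revenue.
print(sales["revenue"].describe())

# A typical management-report query: total revenue by region.
print(sales.groupby("region")["revenue"].sum())
```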
Prescriptive Analytics
Prescriptive Analytics automatically synthesizes big data, mathematical sciences,
business rules, and machine learning to make a prediction and then suggests
decision options to take advantage of the prediction.
Prescriptive analytics goes beyond predicting future outcomes by also suggesting
actions that benefit from the predictions and showing the decision maker the
implications of each decision option.
Prescriptive Analytics not only anticipates what will happen and when it will happen but
also why it will happen.
Further, Prescriptive Analytics can suggest decision options on how to take
advantage of a future opportunity or mitigate a future risk, and illustrate the
implications of each decision option.
For example, Prescriptive Analytics can benefit healthcare strategic planning by
using analytics to leverage operational and usage data combined with data on
external factors such as economic data, population demographics, etc.
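Since prescriptive analytics leans on optimization, here is a toy linear-programming sketch
that picks a production mix; the profits and capacity constraints are invented for illustration:

```python
# A toy prescriptive-analytics sketch: choosing a production mix by
# linear programming (all numbers are made up).
from scipy.optimize import linprog

# Maximise profit 40*x1 + 30*x2, i.e. minimise its negative.
c = [-40, -30]
# Capacity constraints: machine hours x1 + 2*x2 <= 40,
# and labour hours 3*x1 + x2 <= 60.
A = [[1, 2], [3, 1]]
b = [40, 60]

res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
print("Optimal mix:", res.x, "profit:", -res.fun)
```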
Diagnostic Analytics
In this analysis, we generally use historical data rather than other data to answer a
question or solve a problem. We try to find dependencies and
patterns in the historical data of the particular problem.
For example, companies go for this analysis because it gives great insight into a
problem, and it also keeps detailed information at their disposal; otherwise, data
would have to be collected individually for every problem, which would be very time-
consuming. Common techniques used for Diagnostic Analytics are (see the sketch after this list):
Data discovery
Data mining
Correlations
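As a small example of the correlation technique above, here is a sketch that asks which
factor moves with sales over an invented marketing dataset:

```python
# A small diagnostic-analytics sketch: correlations over made-up data.
import pandas as pd

data = pd.DataFrame({
    "ad_spend":  [10, 12, 8, 6, 11, 5],
    "discounts": [2, 3, 1, 1, 3, 0],
    "sales":     [200, 230, 160, 130, 220, 110],
})

# The correlation matrix shows which factor co-varies with sales,
# a first hint at why sales rose or fell.
print(data.corr()["sales"])
```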