Evolution of Big Data

Uploaded by

akilankannan4

Evolution of Big Data in the USA

1
Outline
• Birth: 1880 US census
• Adolescence: Big Science
• Modern Era: Big Business
• Future Landscape
• Conclusion

2
The First Big Data Challenge
• 1880 census
• 50 million people
• Age, gender (sex), occupation, education level, no. of insane people in household

3
The First Big Data Solution
• Hollerith Tabulating System
• Punched cards – 80 variables
• Used for 1890 census
• 6 weeks instead of 7+ years
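
As a rough illustration of what tabulating punched cards amounts to, the sketch below treats each card as one person's record and counts how many cards carry each value of a chosen variable. The field names and sample records are hypothetical and are not taken from census data; this is not a model of Hollerith's actual machine.

```python
from collections import Counter

# Hypothetical sample "cards": one record per person, one field per variable.
cards = [
    {"sex": "F", "occupation": "farmer"},
    {"sex": "M", "occupation": "farmer"},
    {"sex": "F", "occupation": "teacher"},
]


def tabulate(cards, field):
    """Count how many cards carry each value of one field."""
    return Counter(card[field] for card in cards)


sex_counts = tabulate(cards, "sex")
occupation_counts = tabulate(cards, "occupation")
```

Each call makes one pass over the deck of cards for one variable, which is the kind of counting that clerks otherwise did by hand.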

4
Space Program (1960s)
• Began in late 1950s

• An active area for Big Data nowadays

5
Adolescence: Big Science

6
Big Science
• The International Geophysical Year
– An international scientific project
– Lasted from Jul. 1, 1957 to Dec. 31, 1958
• A synoptic collection of observational data on a global scale
• Implications
– Big budgets, Big staffs, Big machines, Big laboratories

7
Summary of Big Science
• Laid the foundation for ambitious projects
– International Biological Program
– Long Term Ecological Research Network
• The International Biological Program ended in 1974
• Many participants viewed it as a failure
• Nevertheless, it was a success
– Transformed the way data are processed
– Realized its original incentives
– Gave renewed legitimacy to synoptic data collection

8
Lessons from Big Science
• Spawned new Big Data projects
– Weather prediction
– Physics research (supercollider data analytics)
– Astronomy images (planet detection)
– Medical research (drug interaction)
–…
• Businesses latched onto its techniques, methodologies, and objectives

9
Modern Era: Big Business

10
Big Science vs. Big Business
• Common ground
– Both need technologies to work with data
– Both use algorithms to mine data
• Big Science
– Source: experiments and research conducted in controlled environments
– Goals: to answer questions or prove theories
• Big Business
– Source: transactional in nature, with little control
– Goals: to discover new opportunities, measure efficiencies, uncover relationships

11
Current Status
• IDC reports
– 2.7 billion terabytes (2.7 zettabytes) in 2012, up 48 percent from 2011
– 8 billion terabytes (8 zettabytes) expected in 2015
• Sources
– Structured corporate databases
– Unstructured data from webpages, blogs, social networking messages, …
– Countless digital sensors
• Business sectors
– Retailers: Walmart, Kohl's
– Logistics companies: UPS
– Telecommunication: AT&T, T-Mobile
– …
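
The IDC figures above can be put on a common scale with a little arithmetic. The sketch below assumes decimal SI units (1 zettabyte = 10^9 terabytes) and derives two numbers the slide does not state: the implied 2011 volume and the average annual growth rate implied between the 2012 and 2015 figures. Function and variable names are illustrative only.

```python
# Decimal SI units assumed: 1 zettabyte = 10**21 bytes = 10**9 terabytes.
TB_PER_ZB = 10**9


def terabytes_to_zettabytes(terabytes: float) -> float:
    """Convert a volume in terabytes to zettabytes."""
    return terabytes / TB_PER_ZB


def previous_year_volume(current: float, growth_pct: float) -> float:
    """Invert a year-over-year growth percentage to get the prior year's volume."""
    return current / (1 + growth_pct / 100)


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


volume_2012_zb = terabytes_to_zettabytes(2.7e9)             # figure from the slide
volume_2015_zb = terabytes_to_zettabytes(8e9)               # figure from the slide
volume_2011_zb = previous_year_volume(volume_2012_zb, 48)   # derived, not from IDC
growth_2012_2015 = implied_annual_growth(volume_2012_zb, volume_2015_zb, 3)

print(f"2011 (derived): {volume_2011_zb:.2f} ZB")
print(f"2012: {volume_2012_zb:.1f} ZB, 2015: {volume_2015_zb:.1f} ZB")
print(f"Implied annual growth 2012-2015: {growth_2012_2015:.1%}")
```

Under these assumptions the 2011 volume works out to roughly 1.8 zettabytes, and the 2012 and 2015 figures imply growth of a little over 40 percent per year.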

12
Understanding of Big Data (1)
• An avalanche of available data, increasing exponentially
• Google CEO Eric Schmidt said
“Every two days we create as much information as we did from the dawn of civilization up until 2003. That’s something like five exabytes of data.”
• Farnam Jahanian kicked off a May 1, 2012 briefing, calling data
“a transformative new currency for science, engineering, education, and commerce.”

13
Understanding of Big Data (2)
• Farnam Jahanian (NSF)
“Big Data is characterized not only by the enormous volume of data but also by the diversity and heterogeneity of the data and the velocity of its generation.”
• Nuala O’Connor Kelly (GE)
“it’s the volume and velocity and variety of data… to achieve new results for …”
• Nick Combs (EMC)
“It’s needle in a haystack or connecting the dots.”
• Arvind Krishna (IBM) added the fourth V:
– Veracity: data in doubt
– Describes 'contradictory data,' or noisy data

14
Implications
• Big Science?
– Big budgets, Big staffs, Big machines, Big laboratories
• Farnam Jahanian (NSF)
– To drive the creation of new IT products and services
– To accelerate the pace of discovery in almost every science and engineering (SE) discipline
– To solve the nation’s most pressing challenges
• Response: $200 million Big Data R&D initiative in 2012
– Advances in foundational techniques and technologies
– Cyberinfrastructure to manage, curate, and serve data to SE research and education communities
– New approaches to education and workforce development
– Nurturing of new types of collaborations

15
Databases’ View
• DB space 2000-2010: BI / reporting, OLTP / operational
• After: scalable nonrelational (“nosql”)

16
Big Medicine
• Information
– Related people: patients, service providers, nurses, physicians, hospital administrators, government, insurance agencies
– A mixture of structured and unstructured data
• Technologies
– Dashboard technologies and analytics, business intelligence, clinical intelligence, revenue cycle management intelligence
• Other factors
– Decision support, ease of information accessibility, quality of care, physician-patient relationship

17
Changes in Algorithms
• Efficiency vs. Effectiveness
• Flexible learning algorithms to remove bias
• Big Data is at an evolutionary juncture, poised to improve or replace human judgment
• Businesses see the value but are thwarted by the cost of storage, slower processing speeds, and the flood of the data themselves.

18
Big Data at NASA
• NASA Open Government Plan ver. 2
– Managing and processing
– Storage
– Archiving and Distribution
– Analysis
– Visualization
– Commercial cloud computing services
• Strategy: push from top down and bottom up

19
Conclusions
• The first challenge
• The first solution
• What is the adolescent age?
• What is the modern era?
• What are the characteristics?
• What is the future landscape?
• What does NASA do?
Census, Big Science, Big Business

20
References
• Frank J. Ohlhorst, Big Data Analytics: Turning Big Data into Big Money, Wiley, 2012.
• 1880 census: http://www.1880census.com/
• Herman Hollerith: http://en.wikipedia.org/wiki/Herman_Hollerith
• Manhattan Project: http://en.wikipedia.org/wiki/Manhattan_Project
• Space exploration: http://en.wikipedia.org/wiki/Space_exploration
• Big Science: http://en.wikipedia.org/wiki/Big_Science
• IBM Research: http://ibmresearchalmaden.blogspot.hk/2011/09/ibm-research-almaden-centennial.html
• NASA: http://open.nasa.gov/blog/2012/10/04/what-is-nasa-doing-with-big-data-today/

21
