Introduction to Big Data
www.serendio.com
Content
• What is Big Data?
• Big Data Application
• Introduction to NoSQL
• CAP Theorem
• Conclusion
What is Big Data?
Big data is about applying new tools to do more
analytics on more data for more people.
Big data is a term for data sets that are so large or complex that
traditional data processing applications are inadequate.
Why is Big Data Important?
Thomas H. Davenport
Jill Dyché
1. Cost reduction: Big data technologies such as Hadoop and cloud-based
analytics bring significant cost advantages when it comes to storing large
amounts of data – plus they can identify more efficient ways of doing
business.
2. Faster, better decision making: With the speed of Hadoop and
in-memory analytics, combined with the ability to analyze new sources of data,
businesses are able to analyze information immediately and make
decisions based on what they’ve learned.
3. New products and services: With the ability to gauge customer
needs and satisfaction through analytics comes the power to give customers
what they want. Davenport points out that with big data analytics, more
companies are creating new products to meet customers’ needs.
Big Data : General Characteristics
Volume: Scale of data
Velocity: Streaming data, data production rate
Variety: Different types of data
Veracity: Uncertainty of data, lack of confidence in data
Big Data : GE
"One sensor on a blade of a gas turbine engine generates
520GB per day, and you have 20 of them."
"The airline industry spends $200bn on fuel per year so a
2% saving is $4bn. GE provides software that enables
airline pilots to manage fuel efficiency."
"We invested $1.5bn over four years to develop
services and create new software. We are working on
making devices more intelligent using sensors; and
controllers that can be configured in real time."
Bill Ruh, Senior VP and Chief Digital Officer (CDO) for GE
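The arithmetic behind the GE figures quoted above can be checked with a short sketch. It reads "20 of them" as 20 such sensors and uses decimal units (1 TB = 1000 GB); both are assumptions, and the function names are purely illustrative.

```python
# Figures quoted on the slide. Decimal units (1 TB = 1000 GB) are an assumption.
GB_PER_SENSOR_PER_DAY = 520   # one blade sensor, per day
SENSORS = 20                  # "you have 20 of them", read here as 20 sensors
FUEL_SPEND_USD_BN = 200       # airline industry fuel spend per year, in $bn
SAVING_FRACTION = 0.02        # the 2% saving mentioned in the quote


def daily_volume_tb(gb_per_sensor: float, sensors: int) -> float:
    """Total sensor output per day, converted from GB to TB."""
    return gb_per_sensor * sensors / 1000


def fuel_saving_bn(spend_bn: float, fraction: float) -> float:
    """Yearly saving in $bn for a given fractional reduction in fuel spend."""
    return spend_bn * fraction


if __name__ == "__main__":
    print(f"Sensor data per day: {daily_volume_tb(GB_PER_SENSOR_PER_DAY, SENSORS):.1f} TB")
    print(f"Fuel saving per year: ${fuel_saving_bn(FUEL_SPEND_USD_BN, SAVING_FRACTION):.1f}bn")
```

Under these assumptions the sensors on one turbine produce roughly 10 TB per day, and the 2% saving works out to the $4bn stated in the quote.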
Big Data : Boeing 787 Dreamliner
David Bulman,
Director of Technology,
Virgin Atlantic Airways
Every flight a 787 takes, it can produce over 500GB of data. That may sound
like a lot, but when you consider every part of the aircraft is being monitored
and is Internet-connected, you can see how the gigabytes soon add up.
Big Data : Social Media
Big Data : Fin Tech
1. Traditional Data Warehouse to Big Data Warehouse
2. Achieving a 360-degree view of your customer
3. Credit Card Fraud Detection
4. Stock Market forecasting
5. Location Based Recommendation
6. Many more...
Big Data : CERN Particle Accelerator
Four Experiments:
1. ALICE: 4 GB/s
2. ATLAS: 800 MB/s – 1 GB/s
3. CMS: 600 MB/s
4. LHCb: 750 MB/s
The raw data per event is around one million bytes (1 MB),
produced at a rate of about 600 million events per second.
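To get a feel for these numbers, the sketch below multiplies the per-event size by the event rate and, separately, totals the per-experiment rates listed above. The slide does not say how the two sets of figures relate, so the code only puts them side by side; decimal units and all names are assumptions made for illustration.

```python
# Figures from the slide. Decimal units (1 MB = 10**6 bytes) are an assumption.
BYTES_PER_EVENT = 1_000_000        # about one million bytes per event
EVENTS_PER_SECOND = 600_000_000    # about 600 million events per second

# Per-experiment rates in MB/s as (low, high); only ATLAS is given as a range.
LISTED_RATES_MB_S = {
    "ALICE": (4000, 4000),
    "ATLAS": (800, 1000),
    "CMS": (600, 600),
    "LHCb": (750, 750),
}


def raw_rate_tb_per_s(bytes_per_event: int, events_per_second: int) -> float:
    """Raw data rate implied by event size times event rate, in TB/s."""
    return bytes_per_event * events_per_second / 1e12


def listed_total_gb_per_s(rates: dict) -> tuple:
    """Sum of the listed per-experiment rates as a (low, high) pair in GB/s."""
    low = sum(lo for lo, _ in rates.values()) / 1000
    high = sum(hi for _, hi in rates.values()) / 1000
    return low, high


if __name__ == "__main__":
    low, high = listed_total_gb_per_s(LISTED_RATES_MB_S)
    print(f"Raw rate: {raw_rate_tb_per_s(BYTES_PER_EVENT, EVENTS_PER_SECOND):.0f} TB/s")
    print(f"Listed experiment rates: {low:.2f} to {high:.2f} GB/s")
```

The raw figure comes to about 600 TB/s, while the four listed rates add up to roughly 6 GB/s, several orders of magnitude lower.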
Big Data : Journalism
Big Data : Politics & Governance
1. Sentiment & Predictive analysis of candidates
2. Big Data for Policy Making process
Big Data : Many More Applications
1. Healthcare
2. E-Commerce
3. Online Search Engine
4. Smart City
5. Online Recommendation Engines
6. City Traffic Prediction
7. Weather Information & Prediction
8. Space Science Data
9. Biological Data
10. Radar Information
11. RFID trackers
12. Etc.
Introduction to NoSQL
CAP Theorem
http://blog.nahurst.com/visual-guide-to-nosql-systems
ACID vs BASE
ACID = Atomicity, Consistency, Isolation and Durability
BASE = Basically Available, Soft State and Eventual Consistency
In CAP terms, ACID systems favor Consistency and Availability
In CAP terms, BASE systems favor Partition tolerance and Availability
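The "soft state" and "eventual consistency" parts of BASE can be illustrated with a toy model. The sketch below is not how any particular NoSQL system works: the Replica class, the version counter, and the sync function are hypothetical, and conflicts are resolved with a simple last-write-wins rule chosen only to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica storing key -> (version, value)."""
    name: str
    store: dict = field(default_factory=dict)

    def write(self, key, value, version):
        current = self.store.get(key)
        # Last-write-wins: keep whichever entry carries the higher version.
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(a: Replica, b: Replica) -> None:
    """Exchange entries in both directions so the two replicas converge."""
    for key, (version, value) in list(a.store.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.store.items()):
        a.write(key, value, version)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("balance", 100, version=1)
r2.write("balance", 100, version=1)

# A new write reaches r1 only; r2 stays available but serves the old value.
r1.write("balance", 80, version=2)
stale = r2.read("balance")

# After the replicas exchange state, both agree on the latest value.
sync(r1, r2)
fresh = r2.read("balance")

if __name__ == "__main__":
    print(f"Before sync r2 reads {stale}; after sync r2 reads {fresh}")
```

Both replicas keep answering reads throughout, but a reader of r2 sees the old value until the replicas synchronize. An ACID transaction would instead make the update visible atomically.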
Conclusion
• Big Data is changing the shape of business and technology. Big
Data brings a lot of value from existing data and helps businesses
reduce costs and optimize operations.
• Big Data tools have brought new possibilities and
opportunities, with the capability to perform analytics and
produce valuable insights.
nishant@serendio.com
Serendio provides Big Data Science Solutions &
Services for Data-Driven Enterprises.
Learn more at:
serendio.com/index.php/case-studies
Thank You!

Guest Lecture: Introduction to Big Data at Indian Institute of Technology
