Big Data
Hello!
I am Vikas Samant.
I work with Entrench Electronics and Pentaho as a Big Data and Data Science Engineer.
What will you learn:
• Big Data and Data Science
• What is Big Data?
• Characteristics of Big Data
• Big Data use cases
• Processing Big Data
What is Big Data?
• Big data is a term that describes the large volume of data (structured, semi-structured and unstructured) that can overwhelm a business on a day-to-day basis.
Big Data (contd.)
• Big data can be analyzed for insights that lead to better decisions and strategic business moves.
Big Data Characteristics
Big Data: the 3 V's
• Volume
• Velocity
• Variety
Some make it 4 V's by adding Veracity:
1. Volume
• Volume refers to the vast amounts of data generated every second. We are not talking terabytes but zettabytes or even brontobytes (an informal, non-SI unit name).
• By one widely repeated estimate, the amount of data generated in the world between the beginning of time and 2000 will soon be generated every minute.
2. Velocity
• Velocity is the rate at which incoming data arrives and needs to be processed. The flow of data is massive and continuous.
• Think about how many SMS messages, Facebook status updates, or credit card swipes are being sent on a particular telecom carrier every minute of every day, and you’ll have a good appreciation of velocity.
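To make velocity concrete, here is a minimal sketch of measuring an event rate over a sliding time window. The class name `SlidingWindowCounter`, the 60-second window and the sample timestamps are assumptions for illustration; they do not come from the slides, and a production stream processor would be considerably more involved.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen during the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp):
        """Register one event; timestamps must arrive in non-decreasing order."""
        self._timestamps.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        """Return how many recorded events are still inside the window at `now`."""
        self._evict(now)
        return len(self._timestamps)

    def _evict(self, now):
        # Drop events that are window_seconds old or older.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()


# Hypothetical arrival times (in seconds) of incoming messages.
counter = SlidingWindowCounter(window_seconds=60)
for t in [0, 10, 20, 59, 61, 75]:
    counter.record(t)

events_last_minute = counter.count(now=75)
print(events_last_minute)
```

Because old timestamps are evicted as new ones arrive, memory use is bounded by the number of events in one window rather than by the whole stream.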
3. Variety
• Variety refers to the different types of data we can now use. In the past we focused only on structured data that fitted neatly into tables or relational databases, such as financial data.
• An often-cited estimate is that 80% of the world’s data is unstructured (text, images, video, voice, etc.). With big data technology we can now analyse and bring together data of different types.
4. Veracity
• Veracity refers to the messiness or trustworthiness of the data. With many forms of big data, quality and accuracy are less controllable.
• Just think of Twitter posts with hashtags, abbreviations, typos and colloquial speech, and of the uncertain reliability and accuracy of their content. Technology now allows us to work with this type of data.
Big Data: Data Structure
• Structured
• Semi-structured
• “Quasi”-structured
• Unstructured
1. Structured Data
• Data with a defined data type, format and structure.
• Example: transaction data and data in databases.
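As an illustration of structured data, the sketch below stores a few transactions in a typed relational table and queries them with SQL. It uses Python's built-in `sqlite3` module with an in-memory database; the table name, columns and sample rows are invented for this example.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Every column has a declared name and type: this is what makes the data "structured".
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO transactions (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 35.5), ("alice", 4.5)],  # hypothetical rows
)

# Because the structure is known up front, aggregation is a one-line query.
totals = dict(
    conn.execute("SELECT customer, SUM(amount) FROM transactions GROUP BY customer")
)
conn.close()
print(totals)
```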
2. Semi-Structured Data
• Textual data files with a discernible pattern that enables parsing.
• Example: XML data files that are self-describing and defined by an XML schema.
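A short sketch of why semi-structured data is parseable: the tags describe the content, even though individual records may differ in which fields they carry. The XML document below is made up for illustration and is parsed with the standard-library `xml.etree.ElementTree` module; schema validation is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical self-describing XML; the second order has an extra <note> element.
xml_text = """
<orders>
  <order id="1001"><customer>alice</customer><total>24.50</total></order>
  <order id="1002"><customer>bob</customer><total>35.00</total><note>gift</note></order>
</orders>
""".strip()

root = ET.fromstring(xml_text)

orders = []
for order in root.findall("order"):
    orders.append(
        {
            "id": order.get("id"),
            "customer": order.findtext("customer"),
            "total": float(order.findtext("total")),
            # findtext returns None when the element is absent.
            "note": order.findtext("note"),
        }
    )

print(orders)
```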
3. Quasi-Structured Data
• Textual data with erratic formats that can be formatted with effort, tools and time.
• Example: web clickstream data that may contain some inconsistencies in data values and formats, such as this search URL:
http://www.google.com/#hl=en&sugexp=kjrmc&cp=8&gs_id=2m&xhr=t&q=data+scientist&pq=big+data&pf=p&sclient=psyb&source=hp&pbx=1&oq=data+sci&aq=0&aqi=g4&aql=f&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=d566e0fbd09c8604&biw=1382&bih=651
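The clickstream URL from this slide can be pulled apart with a little tooling, which is what "formatted with effort, tools and time" looks like in practice. The sketch below uses the standard-library `urllib.parse` module. The URL is rejoined from the slide's wrapped lines; note that its parameters sit after the `#`, so they are read from the fragment rather than the query string. The meaning of the individual parameters is not documented in the slides, so the code only extracts them.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "http://www.google.com/#hl=en&sugexp=kjrmc&cp=8&gs_id=2m&xhr=t&q=data+scientist"
    "&pq=big+data&pf=p&sclient=psyb&source=hp&pbx=1&oq=data+sci&aq=0&aqi=g4&aql=f"
    "&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=d566e0fbd09c8604&biw=1382&bih=651"
)

parts = urlsplit(url)

# keep_blank_values=True preserves empty parameters such as gs_sm and gs_upl.
params = parse_qs(parts.fragment, keep_blank_values=True)

search_query = params["q"][0]        # "+" is decoded to a space
previous_query = params["pq"][0]
typed_prefix = params["oq"][0]

print(search_query, "|", previous_query, "|", typed_prefix)
```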
4. Unstructured Data
• Data that has no inherent structure and is usually stored as different types of files.
• Example: text documents, PDFs, images and video.
Big Data and Data Science
How does Big Data relate to Data Science?
Data Science is the process of deriving insights from big data to inform decisions and support organizations.
Data Science: Python and R
Data Science Process
Big Data Use Cases:
1. Optimize Funnel Conversion
2. Behavioral Analytics
3. Customer Segmentation
4. Fraud Detection
1. Optimize Funnel Conversion
Big data analytics allows companies to track leads through the entire sales conversion process, from a click on an AdWords ad to the final transaction, in order to uncover insights on how the conversion process can be improved.
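Funnel analysis of this kind boils down to comparing how many leads survive each step. The sketch below computes step-by-step and overall conversion rates and picks out the weakest step. The stage names and counts are hypothetical, and `funnel_report` is an illustrative helper, not part of any analytics product.

```python
def funnel_report(stage_counts):
    """Return (stage, count, step_rate, overall_rate) for each funnel stage.

    step_rate is relative to the previous stage, overall_rate to the first stage.
    """
    report = []
    first = stage_counts[0][1]
    previous = first
    for stage, count in stage_counts:
        step_rate = count / previous if previous else 0.0
        overall_rate = count / first if first else 0.0
        report.append((stage, count, step_rate, overall_rate))
        previous = count
    return report


# Hypothetical counts, from ad click to final transaction.
stages = [
    ("ad_click", 10000),
    ("product_view", 4000),
    ("add_to_cart", 1000),
    ("purchase", 250),
]

report = funnel_report(stages)

# The step with the lowest step_rate is the first candidate for improvement
# (the first stage is skipped because it has no preceding step).
weakest_step = min(report[1:], key=lambda row: row[2])

for stage, count, step_rate, overall_rate in report:
    print(f"{stage:13s} {count:6d}  step {step_rate:6.1%}  overall {overall_rate:6.1%}")
```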
Company: T-Mobile
Industry: Communication
Employees: 38,000
Type: Optimize Funnel Conversion
Purpose: T-Mobile uses multiple indicators, such as billing and sentiment analysis, to identify customers who can be upgraded to higher-quality products, as well as those with a high customer lifetime value, so its team can focus on retaining those customers.
2. Behavioral Analytics
With access to data on consumer behavior, companies can learn what prompts a customer to stick around longer, as well as learn more about their customers’ characteristics and purchasing habits in order to improve marketing efforts and boost profits.
Company: Nestle
Industry: Food and Beverage
Employees: 38,000
Type: Behavioral Analytics
Purpose: Customer complaints and PR crises have become more difficult to handle because of social media. To keep better track of customer sentiment and of what is being said about the company online, Nestle created a 24/7 monitoring center to listen to all of the conversations about the company and its products on social media. The company actively engages with those who post about it online in order to mitigate damage and build customer loyalty.
3. Customer Segmentation
By accessing data about the consumer from multiple sources, such as social media data and transaction history, companies can better segment and target their customers and start to make personalized offers to those customers.
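A minimal, rule-based sketch of segmentation that combines transaction history with social media activity, as described above. The `Customer` fields, the thresholds and the segment names are all assumptions chosen for illustration; real segmentation often relies on clustering or scoring models rather than fixed rules.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    orders_last_year: int   # from transaction history
    total_spend: float      # from transaction history
    social_mentions: int    # from social media data


def segment(customer):
    """Assign one segment label using simple, hypothetical thresholds."""
    if customer.orders_last_year >= 10 and customer.total_spend >= 500:
        return "loyal_high_value"
    if customer.social_mentions >= 5:
        return "brand_advocate"
    if customer.orders_last_year == 0:
        return "lapsed"
    return "occasional"


customers = [
    Customer("alice", orders_last_year=14, total_spend=820.0, social_mentions=1),
    Customer("bob", orders_last_year=2, total_spend=60.0, social_mentions=9),
    Customer("carol", orders_last_year=0, total_spend=0.0, social_mentions=0),
    Customer("dave", orders_last_year=3, total_spend=120.0, social_mentions=0),
]

# Group customer names by segment so each group can receive its own offer.
segments = {}
for customer in customers:
    segments.setdefault(segment(customer), []).append(customer.name)

print(segments)
```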
Company: Heineken
Industry: Food and Beverage
Employees: 64,270
Type: Customer Segmentation
Purpose: Thanks to its partnerships with Google and Facebook, Heineken has access to vast amounts of data about its customers that it uses to create real-time, personalized marketing messages. One project provides real-time content to fans who happen to be watching a sponsored event.
4. Fraud Detection
Financial firms use big data to help them identify sophisticated fraud schemes by combining multiple data points.
Company: Discovery Health
Industry: Insurance
Employees: 5,000
Type: Fraud Detection
Purpose: Discovery Health uses big data analytics to identify fraudulent claims and possible fraudulent prescriptions. For example, it can identify if a healthcare provider is charging for a more expensive procedure than was actually performed.
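The "more expensive procedure than was actually performed" check can be sketched as a comparison between two data sources: what was billed and what the clinical record says was done. Everything below (the price list, the field names and `flag_upcoding`) is hypothetical and only illustrates the idea of combining data points; it is not a description of Discovery Health's actual system.

```python
# Hypothetical reference prices per procedure.
PRICE_LIST = {"consultation": 50.0, "x_ray": 120.0, "mri": 900.0}


def flag_upcoding(claims, price_list):
    """Return the IDs of claims billed for a pricier procedure than the one performed."""
    flagged = []
    for claim in claims:
        billed_price = price_list[claim["billed_procedure"]]
        performed_price = price_list[claim["performed_procedure"]]
        if billed_price > performed_price:
            flagged.append(claim["claim_id"])
    return flagged


# Each claim joins the billing record with the clinical record for the same visit.
claims = [
    {"claim_id": "C1", "billed_procedure": "x_ray", "performed_procedure": "x_ray"},
    {"claim_id": "C2", "billed_procedure": "mri", "performed_procedure": "x_ray"},
    {"claim_id": "C3", "billed_procedure": "consultation", "performed_procedure": "consultation"},
]

suspicious = flag_upcoding(claims, PRICE_LIST)
print(suspicious)
```

A flagged claim is a lead for a human investigator, not proof of fraud; a mismatch can also come from a data-entry error.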
Processing Big Data
Big Data Technologies
Big Data Vendors
What is the Hadoop Framework?
Hadoop is an open-source framework that supports the processing and storage of extremely large data sets in a distributed computing environment built on commodity hardware.
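Hadoop's classic processing model is MapReduce. The sketch below imitates its three phases (map, shuffle, reduce) with a word count in a single Python process. It does not use any Hadoop API: the function names and input lines are assumptions for illustration, and on a real cluster each phase would run in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


# Hypothetical input; on a cluster each line could live on a different node.
lines = ["big data is big", "data science uses big data"]

mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

print(word_counts)
```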
Why Hadoop?
Studies show that by 2020, 80% of all Fortune 500 companies will have adopted Hadoop.
A study at the McKinsey Global Institute predicted that by 2020, the annual GDP in manufacturing and retail industries will increase by up to $325 billion with the use of big data analytics.
Thanks!
Any questions?
