BIG DATA
ANALYTICS
CONTENTS
1. Big Data
2. Data vs Big Data
3. Examples
4. Challenges
5. Big Data Analytics
6. Traditional vs Big Data analytics
7. Hadoop
8. Application
WHAT IS BIG DATA
Big data is a collection of data sets that are
large and complex in nature.
They include both structured and unstructured
data and grow so large, so fast, that they are not
manageable by traditional relational database
systems or conventional statistical tools.
DATA VS BIG DATA
Big data is just data with:
• More volume
• Faster data generation (velocity)
• Multiple data formats (variety)
The world's data volume is projected to grow 40%
per year, and 50 times by 2020 [1]
Data comes from various human
and machine activities
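The two growth figures above can be related with a little compound-growth arithmetic. The sketch below is illustrative only: it takes the 40% and 50x figures from the slide and assumes steady compounding. The slide does not state a baseline year, so none is assumed here.

```python
import math

annual_growth = 0.40    # 40% per year, as quoted on the slide
target_multiple = 50    # "50 times", as quoted on the slide

# Solve (1 + g)^n = target for n, the number of years of steady growth.
years_needed = math.log(target_multiple) / math.log(1 + annual_growth)


def volume_multiple(years: float, growth: float = annual_growth) -> float:
    """Return how many times larger the data volume is after `years`."""
    return (1 + growth) ** years


print(f"Years of 40% growth needed for a 50x increase: {years_needed:.1f}")
```

Under these assumptions, a 50-fold increase corresponds to somewhat more than eleven years of 40% annual growth.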
BIG DATA ANALYTICS
IN PRACTICE
1. The New York Stock Exchange generates about
one terabyte of new trade data per day.
2. A single jet engine can generate 10+ terabytes of
data in 30 minutes of flight time. With many
thousands of flights per day, data generation
reaches many petabytes.
3. Statistics show that 500+ terabytes of new data
are ingested into the databases of the social media
site Facebook every day. This data is mainly
generated through photo and video uploads,
message exchanges, comments, etc.
CHALLENGES
More data = more storage space
• More storage = more money to spend (an RDBMS server needs
very costly storage)
Data arrives faster
• Speed up data processing or we'll have a backlog
Need to handle various data structures
• How do we put JSON data into a standard RDBMS?
• Hey, we also have XML from other sources
• Other systems give us compressed data in gzip format
Agile business requirements
• In the initial discussion, they needed only 10 pieces of information; now they
ask for 25. Can we do that? We only put those 10 in our
database
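The variety challenge above (JSON, XML, and gzip-compressed inputs) can be sketched as a small normalization step. This is a minimal illustration using only the Python standard library. The function names and sample payloads are hypothetical, and real feeds would need schema handling and error checking.

```python
import gzip
import json
import xml.etree.ElementTree as ET


def from_json(text: str) -> dict:
    """Parse a JSON object into a plain dict."""
    return json.loads(text)


def from_xml(text: str) -> dict:
    """Flatten a one-level XML element into a dict of tag -> text."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


def from_gzip_json(blob: bytes) -> dict:
    """Decompress gzip bytes, then parse the payload as JSON."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Hypothetical sample inputs, one per source format.
json_src = '{"user": "alice", "action": "login"}'
xml_src = "<event><user>bob</user><action>logout</action></event>"
gzip_src = gzip.compress(b'{"user": "carol", "action": "upload"}')

records = [from_json(json_src), from_xml(xml_src), from_gzip_json(gzip_src)]
for record in records:
    print(record)
```

Once every source is reduced to the same record shape, the downstream storage and processing steps no longer need to know which format each record arrived in.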
TYPES OF BIG DATA
• Structured data: any data that can be stored,
accessed and processed in a fixed format is
termed 'structured' data.
• Unstructured data: any data with an unknown form or
structure is classified as unstructured data.
• Semi-structured data: semi-structured data can
contain both forms of data.
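The three categories above might look like this in practice. The snippet is a hedged illustration: the CSV, JSON, and free-text samples are invented, and treating CSV as structured and JSON as semi-structured is a common convention rather than something the slide specifies.

```python
import csv
import io
import json

# Structured: every row has the same fixed columns.
structured_src = "id,name,amount\n1,Alice,30.5\n2,Bob,12.0\n"
rows = list(csv.DictReader(io.StringIO(structured_src)))

# Semi-structured: records are self-describing, but fields vary per record.
semi_structured_src = [
    '{"id": 1, "name": "Alice", "tags": ["vip"]}',
    '{"id": 2, "email": "bob@example.com"}',
]
documents = [json.loads(line) for line in semi_structured_src]

# Unstructured: free text with no predefined fields; here we only count words.
unstructured_src = "Great product, but delivery took two weeks."
word_count = len(unstructured_src.split())

print(sorted(rows[0].keys()))
print([sorted(doc.keys()) for doc in documents])
print(word_count)
```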
BENEFITS OF BIG
DATA PROCESSING
• Businesses can use outside intelligence when
making decisions: access to social data from search
engines and sites like Facebook and Twitter enables
organizations to fine-tune their business strategies.
• Improved customer service: traditional customer
feedback systems are being replaced by new
systems designed with 'Big Data' technologies. In
these new systems, Big Data and natural language
processing technologies are used to read and
evaluate consumer responses.
• Early identification of risks to products or services, if
any.
• Better operational efficiency: 'Big Data' technologies
can be used to create a staging area or landing zone
for new data before identifying what data should be
moved to the data warehouse. In addition, such
integration of 'Big Data' technologies and the data
warehouse helps organizations offload infrequently
accessed data.
BIG DATA ANALYTICS
Big data analytics is the process of examining large
and varied data sets -- i.e., big data -- to uncover
hidden patterns, unknown correlations, market trends,
customer preferences and other useful information
that can help organizations make more-informed
business decisions.
TRADITIONAL VS
BIG DATA ANALYTICS
Traditional analytics
• Analytics using known data that is well understood.
• Built on the relational database model.
Big data analytics
• The data format is not well understood, because the
data is largely unstructured and semi-structured.
• Big data comes in various forms and formats from
multiple disconnected systems. The data is almost flat,
with no relationships.
4 TYPES OF
ANALYTICS
1. Descriptive: what happened?
2. Diagnostic: why did it happen?
3. Predictive: what is likely to happen?
4. Prescriptive: what should I do about it?
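The four questions above can be illustrated on a toy data set. The sketch below is one possible reading, not a standard recipe: the figures are invented, correlation stands in for the diagnostic step (it shows association, not cause), a least-squares line stands in for the predictive model, and the helper names are hypothetical.

```python
from statistics import mean

# Hypothetical monthly figures, invented for illustration.
ad_spend = [10.0, 12.0, 15.0, 18.0, 20.0]
sales = [105.0, 118.0, 150.0, 181.0, 196.0]

# 1. Descriptive: what happened? Summarize the observed sales.
average_sales = mean(sales)

# 2. Diagnostic: why did it happen? Check how strongly sales move with spend.
mx, my = mean(ad_spend), mean(sales)
sxy = sum((x - mx) * (y - my) for x, y in zip(ad_spend, sales))
sxx = sum((x - mx) ** 2 for x in ad_spend)
syy = sum((y - my) ** 2 for y in sales)
correlation = sxy / (sxx * syy) ** 0.5

# 3. Predictive: what is likely to happen? Fit a least-squares line.
slope = sxy / sxx
intercept = my - slope * mx


def predict_sales(spend: float) -> float:
    """Predict sales for a given spend using the fitted line."""
    return intercept + slope * spend


# 4. Prescriptive: what should I do? Pick the smallest budget that
#    is predicted to reach the target, or None if no candidate does.
def recommend_spend(target: float, candidates: list[float]):
    for spend in sorted(candidates):
        if predict_sales(spend) >= target:
            return spend
    return None


print(average_sales, round(correlation, 3))
print(recommend_spend(220.0, [22.0, 25.0, 30.0]))
```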
APPROACH TO ANALYTICS
1. Identify the data sources.
2. Select the right tools and technology to collect,
store, and aggregate the data.
3. Understand the business domain.
4. Identify tools and technology to process the data.
5. Build mathematical models for the analytics.
6. Visualize.
7. Validate your results.
8. Learn, adapt, and rebuild your analytical model.
ANALYTICS TOOLS
The most widely used statistical programming tools are:
• IBM SPSS
• SAS
• R
• MATLAB
R and MATLAB have the most comprehensive
support of statistical functions.
HADOOP
Hadoop is a framework that allows for distributed
processing of large data sets across clusters of
commodity computers using a simple programming model.
• Software framework that supports distributed
applications, licensed under the Apache v2 license.
• Hadoop was derived from Google's MapReduce and
Google File System papers.
• Yahoo has been one of the largest contributors to the project.
• Written in the Java programming language.
HADOOP :
MAPREDUCE
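The MapReduce model named in this slide title can be sketched as a word count, the usual introductory example. The code below is a single-process illustration of the map, shuffle, and reduce phases. It does not use Hadoop's API, and the function names are hypothetical. On a real cluster, each phase would run in parallel across many nodes.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents: list[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence over all documents."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data is big", "data grows fast"])
print(counts)
```

Because each map call looks at one document and each reduce call looks at one key, the work can be split across machines without the pieces needing to coordinate, which is what makes the model suit commodity clusters.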
WHY USE HADOOP ?
• Need to compress data
• Nodes fail every day
• Common infrastructure
Efficient
Easy to use
Open Source
COMMON USES
• Searches
• Log processing
• Recommendation systems
• Analytics (Facebook, LinkedIn)
• Image and video processing (NASA)
• Data retention
TECHNOLOGIES AND
TOOLS
Unstructured and semi-structured data types typically
don't fit well in traditional data warehouses that are
based on relational databases oriented to structured
data sets.
As a result, many organizations that collect, process
and analyze big data turn to NoSQL databases as well
as Hadoop and its companion tools, including:
MapReduce: a software framework that allows
developers to write programs that process massive
amounts of unstructured data in parallel across a
distributed cluster of processors or stand-alone
computers.
YARN: a cluster management technology and one
of the key features in second-generation Hadoop.
Spark: an open-source parallel processing
framework that enables users to run large-scale
data analytics applications across clustered
systems.
HBase: a column-oriented key/value data store
built to run on top of the Hadoop Distributed File
System (HDFS).
Hive: an open-source data warehouse system for
querying and analyzing large datasets stored in
Hadoop files.
Kafka: a distributed publish-subscribe messaging
system designed to replace traditional message
brokers.
Pig: an open-source technology that offers a
high-level mechanism for the parallel
programming of MapReduce jobs to be executed
on Hadoop clusters.
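As a rough illustration of the publish-subscribe idea mentioned for Kafka above, the sketch below keeps an append-only log per topic and a read position per subscriber. It is an in-memory toy under stated assumptions: `TinyBroker` and its methods are hypothetical names and are not Kafka's API, and nothing here is distributed or persistent.

```python
from collections import defaultdict


class TinyBroker:
    """In-memory stand-in for a publish-subscribe broker (not Kafka's API)."""

    def __init__(self) -> None:
        self._topics = defaultdict(list)   # topic -> append-only message log
        self._offsets = defaultdict(int)   # (subscriber, topic) -> next index

    def publish(self, topic: str, message: str) -> None:
        """Append a message to the topic's log."""
        self._topics[topic].append(message)

    def poll(self, subscriber: str, topic: str) -> list:
        """Return messages this subscriber has not yet seen on the topic."""
        log = self._topics[topic]
        start = self._offsets[(subscriber, topic)]
        self._offsets[(subscriber, topic)] = len(log)
        return log[start:]


broker = TinyBroker()
broker.publish("clicks", "page=/home")
broker.publish("clicks", "page=/cart")

first_batch = broker.poll("analytics", "clicks")    # both messages so far
broker.publish("clicks", "page=/checkout")
second_batch = broker.poll("analytics", "clicks")   # only the new message
late_reader = broker.poll("audit", "clicks")        # independent position
print(first_batch, second_batch, late_reader)
```

Keeping a separate position per subscriber is what lets several consumers read the same stream at their own pace without interfering with one another.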
BIG DATA ANALYTICS
BENEFITS
• Driven by specialized analytics systems and
software, big data analytics can point the way to
various business benefits, including new revenue
opportunities, more effective marketing, better
customer service, improved operational efficiency
and competitive advantages over rivals.
• Big data analytics applications enable data
scientists, predictive modelers, statisticians and
other analytics professionals to analyze growing
volumes of structured transaction data, plus
other forms of data that are often left untapped by
conventional business intelligence (BI) and
analytics programs.
• On a broad scale, data analytics technologies and
techniques provide a means of analyzing data
sets and drawing conclusions about them to help
organizations make informed business decisions.
BIG DATA ANALYTICS
APPLICATION
• Government : The use and adoption of big data
within governmental processes allows efficiencies
in terms of cost, productivity, and innovation, but
does not come without its flaws.
• Manufacturing: According to the TCS 2013 Global Trend
Study, improvements in supply planning and
product quality provide the greatest benefit of big
data for manufacturing.
• Information Technology: Especially since 2015, big
data has come to prominence within business
operations as a tool to help employees work more
efficiently and streamline the collection and
distribution of information technology (IT).
• Education: A McKinsey Global Institute study found a
shortage of 1.5 million highly trained data
professionals and managers, and a number of
universities, including the University of Tennessee and UC
Berkeley, have created master's programs to meet this
demand.
THANK YOU

More Related Content

What's hot

Big Data & Business Analytics: Understanding the Marketspace
Big Data & Business Analytics: Understanding the MarketspaceBig Data & Business Analytics: Understanding the Marketspace
Big Data & Business Analytics: Understanding the MarketspaceBala Iyer
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research reportJULIO GONZALEZ SANZ
 
Four Pillars of Business Analytics by Actuate
Four Pillars of Business Analytics by ActuateFour Pillars of Business Analytics by Actuate
Four Pillars of Business Analytics by ActuateEdgar Alejandro Villegas
 
Supply chain management
Supply chain managementSupply chain management
Supply chain managementmuditawasthi
 
How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?Thanakrit Lersmethasakul
 
Business intelligence architectures.pdf
Business intelligence architectures.pdfBusiness intelligence architectures.pdf
Business intelligence architectures.pdfAnand572211
 
Societal Impact of Applied Data Science on the Big Data Stack
Societal Impact of Applied Data Science on the Big Data StackSocietal Impact of Applied Data Science on the Big Data Stack
Societal Impact of Applied Data Science on the Big Data StackStealth Project
 
Business case for Big Data Analytics
Business case for Big Data AnalyticsBusiness case for Big Data Analytics
Business case for Big Data AnalyticsVijay Rao
 
Importance of data analytics for business
Importance of data analytics for businessImportance of data analytics for business
Importance of data analytics for businessBranliticSocial
 

What's hot (20)

Big data Introduction
Big data IntroductionBig data Introduction
Big data Introduction
 
Big Data & Business Analytics: Understanding the Marketspace
Big Data & Business Analytics: Understanding the MarketspaceBig Data & Business Analytics: Understanding the Marketspace
Big Data & Business Analytics: Understanding the Marketspace
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
Big data
Big dataBig data
Big data
 
Sample
Sample Sample
Sample
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research report
 
Four Pillars of Business Analytics by Actuate
Four Pillars of Business Analytics by ActuateFour Pillars of Business Analytics by Actuate
Four Pillars of Business Analytics by Actuate
 
Supply chain management
Supply chain managementSupply chain management
Supply chain management
 
Data Analytics
Data AnalyticsData Analytics
Data Analytics
 
IoT and Big Data
IoT and Big DataIoT and Big Data
IoT and Big Data
 
How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?
 
Big Data Strategies
Big Data StrategiesBig Data Strategies
Big Data Strategies
 
uae views on big data
  uae views on  big data  uae views on  big data
uae views on big data
 
Business intelligence architectures.pdf
Business intelligence architectures.pdfBusiness intelligence architectures.pdf
Business intelligence architectures.pdf
 
13 pv-do es-18-bigdata-v3
13 pv-do es-18-bigdata-v313 pv-do es-18-bigdata-v3
13 pv-do es-18-bigdata-v3
 
Societal Impact of Applied Data Science on the Big Data Stack
Societal Impact of Applied Data Science on the Big Data StackSocietal Impact of Applied Data Science on the Big Data Stack
Societal Impact of Applied Data Science on the Big Data Stack
 
Business case for Big Data Analytics
Business case for Big Data AnalyticsBusiness case for Big Data Analytics
Business case for Big Data Analytics
 
Using Big Data Smarter Decision Making
Using Big Data Smarter Decision MakingUsing Big Data Smarter Decision Making
Using Big Data Smarter Decision Making
 
Importance of data analytics for business
Importance of data analytics for businessImportance of data analytics for business
Importance of data analytics for business
 
5 Big Data Use Cases for 2013
5 Big Data Use Cases for 20135 Big Data Use Cases for 2013
5 Big Data Use Cases for 2013
 

Similar to Big data analytics

March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...Experfy
 
Top Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesTop Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesSpringPeople
 
Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Prof.Balakrishnan S
 
Oh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG DataOh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG DataPrakalp Agarwal
 
final oracle presentation
final oracle presentationfinal oracle presentation
final oracle presentationPriyesh Patel
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadhMithlesh Sadh
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big dataRaul Chong
 
Building a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperBuilding a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperImpetus Technologies
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsFredReynolds2
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationDenodo
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolutionitnewsafrica
 
Big Data Analytics Research Report
Big Data Analytics Research ReportBig Data Analytics Research Report
Big Data Analytics Research ReportIla Group
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research reportJULIO GONZALEZ SANZ
 

Similar to Big data analytics (20)

March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
 
Top Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesTop Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practices
 
Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19
 
Oh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG DataOh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG Data
 
Intro big data analytics
Intro big data analyticsIntro big data analytics
Intro big data analytics
 
final oracle presentation
final oracle presentationfinal oracle presentation
final oracle presentation
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
 
Big data and analytics
Big data and analyticsBig data and analytics
Big data and analytics
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big data
 
Kartikey tripathi
Kartikey tripathiKartikey tripathi
Kartikey tripathi
 
Big data
Big dataBig data
Big data
 
Building a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperBuilding a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White Paper
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolution
 
Big Data Analytics Research Report
Big Data Analytics Research ReportBig Data Analytics Research Report
Big Data Analytics Research Report
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research report
 
Big data
Big dataBig data
Big data
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 

Recently uploaded

Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numberssuginr1
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubaikojalkojal131
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxronsairoathenadugay
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制vexqp
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...Bertram Ludäscher
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样wsppdmt
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...HyderabadDolls
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfSayantanBiswas37
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...gajnagarg
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabiaahmedjiabur940
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...kumargunjan9515
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...HyderabadDolls
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...gajnagarg
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...nirzagarg
 

Recently uploaded (20)

Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 

Big data analytics

  • 2. CONTENTS 1. Big Data 2. Data vs Big Data 3. Examples 4. Challenges 5. Big Data Analytics 6. Traditional vs Big Data analytics 7. Hadoop 8. Application
  • 3. WHAT IS BIG DATA Big data is a collection of data sets that are large and complex in nature. They grow both structured and unstructured data that grow large so fast that they are not manageable by traditional relational database systems or conventional statistical tools.
  • 4.
  • 5. DATA VS BIG DATA Big data is just data with: • More volume • Faster data generation (velocity) • Multiple data format (variety) World's data volume to grow 40% per year & 50 times by 2020 [1] Data coming from various human & machine activity
  • 6. BIG DATA ANALYTICS IN PRACTICE 1. The New York Stock Exchange generates about one terabyte of new trade data per day. 2. Single Jet engine can generate 10+terabytes of data in 30 minutes of a flight time. With many thousand flights per day, generation of data reaches up to many Petabytes. 3. Statistic shows that 500+terabytes of new data gets ingested into the databases of social media site Facebook, every day. This data is mainly generated in terms of photo and video uploads, message exchanges, putting comments etc.
  • 7. CHALLENGES More data = more storage space • More storage = more money to spend (RDBMS server needs very costly storage) Data coming faster • Speed up data processing or we’ll have backlog Needs to handle various data structure • How do we put JSON data format in standard RDBMS? • Hey, we also have XML format from other sources • Other system give us compressed data in gzip format Agile business requirement • On initial discussion, they only need 10 information, now they ask for 25? Can we do that? We only put that 10 in our database
  • 8. TYPES OF BIG DATA • Structured Data : Any data that can be stored, accessed and processed in the form of fixed format is termed as a 'structured' data. • Un-Structured Data : Any data with unknown form or the structure is classified as unstructured data. • Semi-structured Data : Semi-structured data can contain both the forms of data.
  • 9. BENEFITS OF BIG DATA PROCESSING • Businesses can utilize outside intelligence while taking decisions:- Access to social data from search engines and sites like facebook, twitter are enabling organizations to fine tune their business strategies. • Improved customer service :- Traditional customer feedback systems are getting replaced by new systems designed with ‘Big Data’ technologies. In these new systems, Big Data and natural language processing technologies are being used to read and evaluate consumer responses.
  • 10. • Early identification of risk to the product/ services, if any • Better operational efficiency:-'Big Data' technologies can be used for creating staging area or landing zone for new data before identifying what data should be moved to the data warehouse. In addition, such integration of 'Big Data' technologies and data warehouse helps organization to offload infrequently accessed data.
  • 11. BIG DATA ANALYTICS Big data analytics is the process of examining large and varied data sets -- i.e., big data -- to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful information that can help organizations make more-informed business decisions.
  • 12. TRADITIONAL VS BIG DATA ANALYTICS Traditional analytics Big Data Analytics Analytics using know data which is well understood. Not well understood data format for it largely being unstructured and semi structured. Build based on relational data base model. Big data comes in various forms and formats from multiple disconnected system. They are almost flat with no relationship.
  • 13. 4 TYPES OF ANALYTICS 1. Descriptive: what happened? 2. Diagnostic: why did it happen? 3. Predictive: what is likely to happen? 4. Prescriptive: what should I do about it?
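
The four questions can be sketched on a toy data set. Everything below is an assumption for illustration: the daily sales figures, the promotion flags, the straight-line forecast, and the final decision rule are not from the slides, and real analytics would use far richer models.

```python
from statistics import mean

# Hypothetical daily sales, with a flag for whether a promotion ran that day.
sales = [100, 120, 90, 150, 160]
promo = [False, True, False, True, True]

# 1. Descriptive: what happened? Summarize the observed sales.
average_sales = mean(sales)

# 2. Diagnostic: why did it happen? Compare promotion days with the rest.
promo_avg = mean(s for s, p in zip(sales, promo) if p)
plain_avg = mean(s for s, p in zip(sales, promo) if not p)

# 3. Predictive: what is likely to happen? Fit a least-squares line through
#    (day, sales) and extrapolate it one day ahead.
days = list(range(len(sales)))
x_bar, y_bar = mean(days), mean(sales)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(days, sales)) / sum(
    (x - x_bar) ** 2 for x in days
)
forecast = y_bar + slope * (len(sales) - x_bar)

# 4. Prescriptive: what should I do about it? A deliberately simple rule.
action = "run promotion" if promo_avg > plain_avg else "skip promotion"
```

Each step builds on the previous one: the summary feeds the comparison, the trend feeds the forecast, and the recommendation uses the diagnostic result.
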
  • 14. APPROACH TO ANALYTICS 1. Identify the data sources. 2. Select the right tools and technology to collect, store, and aggregate the data. 3. Understand the business domain. 4. Identify tools and technology to process the data. 5. Build mathematical models for the analytics. 6. Visualize. 7. Validate your results. 8. Learn, adapt, and rebuild your analytical model.
  • 15. ANALYTICS TOOLS The most widely used statistical programming tools are: • IBM SPSS • SAS • R • MATLAB. R and MATLAB have the most comprehensive support for statistical functions.
  • 16. HADOOP Hadoop is a framework that allows distributed processing of large data sets across clusters of commodity computers using a simple programming model. • Software framework that supports distributed applications, licensed under the Apache License 2.0. • Hadoop was derived from Google's MapReduce and Google File System papers. • Yahoo is the largest contributor to the project. • Written in the Java programming language.
  • 18. WHY USE HADOOP? • Need to compress data • Nodes fail every day • Common infrastructure: efficient, easy to use, open source
  • 19. COMMON USES • Searches • Log processing • Recommendation systems • Analytics (Facebook, LinkedIn) • Image and video processing (NASA) • Data retention
  • 20. TECHNOLOGIES AND TOOLS Unstructured and semi-structured data types typically don't fit well in traditional data warehouses that are based on relational databases oriented to structured data sets. As a result, many organizations that collect, process and analyze big data turn to NoSQL databases as well as Hadoop and its companion tools, including:
  • 21. • MapReduce: a software framework that allows developers to write programs that process massive amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone computers. • YARN: a cluster management technology and one of the key features in second-generation Hadoop. • Spark: an open-source parallel processing framework that enables users to run large-scale data analytics applications across clustered systems.
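
The MapReduce programming model can be sketched as a word count in plain Python. This is a single-process illustration of the map, shuffle, and reduce phases under the assumption that all data fits in memory; it does not use the Hadoop API, and the function names and sample documents are made up for the example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values seen for one key into a single result."""
    return key, sum(values)


# Hypothetical input; on a cluster each document could sit on a different node.
documents = ["big data is big", "data is everywhere"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Because each `map_phase` call looks at only one document and each `reduce_phase` call looks at only one key, both phases can run in parallel on separate machines, which is what the framework exploits.
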
  • 22. • HBase: a column-oriented key/value data store built to run on top of the Hadoop Distributed File System (HDFS). • Hive: an open-source data warehouse system for querying and analyzing large datasets stored in Hadoop files. • Kafka: a distributed publish-subscribe messaging system designed to replace traditional message brokers. • Pig: an open-source technology that offers a high-level mechanism for the parallel programming of MapReduce jobs to be executed on Hadoop clusters.
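
To show what "column-oriented key/value" means in practice, here is a tiny in-memory stand-in written in plain Python. It only mimics the general data model (row key, then a `family:qualifier` column name, then a value) and is not the HBase client API; the class name `TinyColumnStore`, its methods, and the sample rows are all hypothetical.

```python
from collections import defaultdict


class TinyColumnStore:
    """In-memory sketch: row key -> 'family:qualifier' column -> value."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        """Store one cell; rows need not share the same columns."""
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        """Fetch one cell, or the default if the row or column is absent."""
        return self._rows.get(row_key, {}).get(column, default)

    def scan(self, start, stop):
        """Yield (row key, columns) for keys in [start, stop), in sorted order."""
        for key in sorted(self._rows):
            if start <= key < stop:
                yield key, dict(self._rows[key])


store = TinyColumnStore()
store.put("user#001", "info:name", "Ann")
store.put("user#001", "stats:logins", 3)
store.put("user#002", "info:name", "Bo")  # no stats column: rows are sparse

first_user_only = list(store.scan("user#001", "user#002"))
```

The point of the sketch is the sparse layout: a row stores only the columns it actually has, and lookups go through the row key rather than through a fixed relational schema.
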
  • 23. BIG DATA ANALYTICS BENEFITS • Driven by specialized analytics systems and software, big data analytics can point the way to various business benefits, including new revenue opportunities, more effective marketing, better customer service, improved operational efficiency and competitive advantages over rivals.
  • 24. • Big data analytics applications enable data scientists, predictive modelers, statisticians and other analytics professionals to analyze growing volumes of structured transaction data, plus other forms of data that are often left untapped by conventional business intelligence (BI) and analytics programs. • On a broad scale, data analytics technologies and techniques provide a means of analyzing data sets and drawing conclusions about them to help organizations make informed business decisions.
  • 25. BIG DATA ANALYTICS APPLICATION • Government: The use and adoption of big data within governmental processes allows efficiencies in terms of cost, productivity, and innovation, but does not come without its flaws. • Manufacturing: According to the TCS 2013 Global Trend Study, improvements in supply planning and product quality provide the greatest benefit of big data for manufacturing.
  • 26. • Information Technology: Especially since 2015, big data has come to prominence within business operations as a tool to help employees work more efficiently and streamline the collection and distribution of information technology (IT). • Education: A McKinsey Global Institute study found a shortage of 1.5 million highly trained data professionals and managers, and a number of universities, including the University of Tennessee and UC Berkeley, have created master's programs to meet this demand.

Editor's Notes

  1. [1] http://e27.co/worlds-data-volume-to-grow-40-per-year-50-times-by-2020-aureus-20150115-2/