Rayat Shikshan Sanstha’s
Yashwantrao Chavan Institute of Science, Satara
Department of Statistics
2017-2018
Seminar on
“Applications of Statistics in Big Data”
Presented by:
Wagaj Rahul shamkarna
M.Sc.-1
Roll no. 119
Introduction
Every day, big data makes its influence felt in our lives; it is widely seen as the next big thing. Many of the most useful
innovations of the past 20 years have been made possible by massive data-gathering capabilities
combined with rapidly improving technology.
For example, when we need information we turn to the Google search engine, and for shopping we use online
platforms (Amazon, Flipkart) and startup companies. In online shopping, the seller is able to provide
product reviews and recommendations for future purchases. These recommendations are enabled by
applications of “big data”: sophisticated data analysis identifies items that tend to be
purchased by the same consumer. In addition to search, big data is making a major impact in a
surprising number of other areas that affect our daily lives.
What is big data
• When data gets bigger, it requires different approaches.
• The aim is to solve new problems, or to solve existing problems in a better way.
• Big data generates value from the storage and processing of very large quantities of digital
information that cannot be analyzed with traditional computing techniques.
• Walmart handles more than 1 million customer transactions every hour.
• Facebook handles 40 billion photos from its user base.
Characteristics of big data
1) Volume: Volume is easy to understand: there is a lot of data. A typical PC had 200
gigabytes of storage in 2000; today Facebook takes in 500 terabytes of new data every day.
e.g. YouTube, cellphones, WhatsApp, internet forums.
2) Velocity: Data comes in faster than ever and must be stored
faster than ever.
e.g. User activity generates millions of events per second; high-frequency stock trading algorithms reflect
market changes within microseconds.
3) Variety: Big data isn’t just numbers or strings. It includes 3D data, audio,
video, and unstructured text, including log files and social media.
Selecting big data stores
• Choosing the correct data stores based on your data characteristics.
• Moving code to data.
• Implementing polyglot data store solutions.
• Aligning business goals to the appropriate data store.
Why big data
• Increase in storage capacities.
• Increase in processing power.
• Availability of data.
• Every day we create 2.5 quintillion bytes of data; 90% of the data in the world
today has been created in the last two years alone.
e.g. Facebook generates a lot of data daily; according to IBM, 90% of the data stored today was
generated in just the last two years.
How is big data different?
• Automatically generated by machines
• Typically an entirely new source of data, e.g. use of the internet
• Not designed to be user-friendly
• May not have much value: need to focus on the important part
Statistical analysis of big data
Gathering and storing massive quantities of data is a major challenge, but
ultimately the biggest and most important challenge of big data is putting it to good
use.
Many statistical techniques can be used to analyze data and find
useful patterns in it:
e.g. 1) Probability distributions: you would use a distribution such as the Poisson to determine the
likelihood of a given number of events occurring over an interval of time.
2) Normal distribution, Student’s t-distribution, chi-square distribution, F-distribution,
regression analysis, time series analysis.
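The likelihood of a given number of events over an interval of time is commonly modelled with a Poisson distribution. The slide does not name the distribution, so treating it as Poisson is an assumption here, and the arrival rate of 3 orders per minute is a made-up figure for illustration only. A minimal sketch:

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k events when the expected count per interval is `rate`."""
    return math.exp(-rate) * rate**k / math.factorial(k)


# Hypothetical example: an online shop averages 3 orders per minute.
rate = 3.0
p_exactly_5 = poisson_pmf(5, rate)
# "At most 2" means 0, 1 or 2 events, so sum those three probabilities.
p_at_most_2 = sum(poisson_pmf(k, rate) for k in range(3))

print(f"P(exactly 5 orders in a minute) = {p_exactly_5:.4f}")
print(f"P(at most 2 orders in a minute) = {p_at_most_2:.4f}")
```

With real data, the rate would be estimated as the average number of events per interval.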
Regression analysis
Regression analysis is used to estimate the strength and direction of the relationship between
variables that are linearly related to each other. Two variables X and Y are said to be linearly related
if the relationship between them can be written in the form Y = aX + b, where a is the slope, or the
change in Y due to a given change in X, and b is the intercept, or the value of Y when X = 0.
e.g.: Suppose a corporation wants to determine whether its advertising expenditures are
actually increasing profits, and if so, by how much. The corporation gathers data on advertising
and profits for the past 20 years and uses this data to estimate the following equation: Y = 50 + 0.25X,
where Y represents the annual profit of the corporation (in $ millions) and X represents the annual advertising expenditure of
the corporation (in $ millions).
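The slides do not say how the equation was estimated. A common choice is ordinary least squares, sketched below under that assumption. The five data points are invented and constructed to lie exactly on Y = 50 + 0.25X, so the fit recovers the slide's coefficients; they are not the corporation's 20 years of data, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of X and Y divided by the variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustration data in $ millions, lying exactly on Y = 50 + 0.25X.
advertising = [2.0, 4.0, 6.0, 8.0, 10.0]
profit = [50 + 0.25 * x for x in advertising]

slope, intercept = fit_line(advertising, profit)
print(f"Y = {intercept:.2f} + {slope:.2f}X")
```

Real data would scatter around the line, and the fitted slope and intercept would then be estimates with some uncertainty.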
Here, slope = 0.25 and intercept = 50. Because the slope of the regression line is 0.25, this indicates
that, on average, for every $1 million increase in advertising expenditure, profits rise by $0.25
million. Because the intercept is 50, this indicates that with no advertising, profits would still
be $50 million.
This equation can be used to forecast future profits based on planned
advertising expenditure.
e.g. If the corporation plans on spending $10 million on advertising next year, its
expected profits will be as follows:
Y = 50 + 0.25X
Y = 50 + 0.25(10) = 52.5
Hence, with an advertising budget of $10 million next year, profits are expected to be
$52.5 million.
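The forecast can be written as a small function. This is a sketch using the coefficients from the slide (intercept 50, slope 0.25, both in $ millions). The function name `forecast_profit` and the extra budgets of 0 and 20 are illustrative assumptions.

```python
def forecast_profit(advertising_m: float,
                    intercept: float = 50.0,
                    slope: float = 0.25) -> float:
    """Expected annual profit ($ millions) for a given advertising budget ($ millions)."""
    return intercept + slope * advertising_m


# The $10 million budget is the slide's example; 0 and 20 are added for comparison.
for budget in (0, 10, 20):
    print(f"Advertising ${budget}M -> expected profit ${forecast_profit(budget):.1f}M")
```

Forecasts are most trustworthy for budgets within the range of advertising levels seen in the historical data.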
Applications of big data
• Smarter healthcare
• Telecom
• Traffic control
• Search quality
• Multi-channel sales
• Homeland security
• Trading analytics (e-commerce)
• Weather forecasting
• Insurance
Risks of big data
• Organizations can be overwhelmed by the data.
• Costs can escalate too fast.
• Many sources of big data raise privacy concerns.
Benefits of big data
• Real-time big data isn’t just a process for storing petabytes or exabytes of
data in a data warehouse; it’s about the ability to make better decisions and
take meaningful action at the right time.
• Recent research finds that organizations are using big data to target
customer-centric outcomes, tap into internal data, and build a better
information ecosystem.
• Big data is already an important part of the database and data analytics
market.
• It offers commercial opportunities of a scale comparable to enterprise
software 30 years ago.
Future of big data
• This is the hottest field in software, statistics, information technology, and computer science.
Billions of dollars have been spent on software specializing in data management and analytics alone.
• This industry on its own is worth more than $100 billion and is growing at almost 10%
a year, which is roughly twice as fast as the software business as a whole.
• In Feb 2016, the open-source research and analyst firm Wikibon released the first market
forecast for big data, listing $5.1B revenue in 2013 with growth to $53.1B in 2017.
• The McKinsey Global Institute estimates that data volume is growing 40% per year
and will grow 44 times between 2009 and 2020.
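The two growth figures in the last bullet can be checked against each other with compound-growth arithmetic. The sketch below assumes steady year-on-year growth over the 11 years from 2009 to 2020; the function names are illustrative.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """Total growth factor after compounding `annual_rate` for `years` years."""
    return (1 + annual_rate) ** years


def implied_annual_rate(multiple: float, years: int) -> float:
    """Constant annual growth rate that yields `multiple` after `years` years."""
    return multiple ** (1 / years) - 1


years = 2020 - 2009  # 11 compounding periods
multiple_at_40_percent = growth_multiple(0.40, years)
rate_for_44x = implied_annual_rate(44, years)

print(f"40% a year for {years} years gives about {multiple_at_40_percent:.1f}x")
print(f"44x over {years} years implies about {rate_for_44x:.1%} a year")
```

The two figures are roughly consistent: 40% a year compounds to a little over 40 times, and a 44-fold increase corresponds to about 41% a year.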
References
• Google.
• Book: Alan Anderson, Statistics for Big Data For Dummies.
Thank You…
