analyze(NoSQL,BigData);
/* history, hype, opportunities */

// By: Vishy Poosala
// Head of Bell Labs, India
// poosala@alcatel-lucent.com
// @vishyp

The dark ages of COBOL




..then Codd said
let there be tables

• Rows & Columns
• SQL
• Normal Forms
• ACID


www.data-for-humans.com

• WHAT COLUMNS?
• SET-VALUED ATTRIBUTES
• XML
• Schema Evolution




Billions of Keys & Values

• GFS
• Google Bigtable
• Hadoop
• Cassandra
• Dynamo


How would you build a super-fast,
 FB-scale chat service, in 2012?

          (for example)



I want my own DB!

Main Memory     • Memcached
                • redis
Distr. K-V      • MongoDB
Versions        • CouchDB
Social Graphs   • Neo4j


BIG         KB      GB          TB                PB
Data        FILES   TABLES      Semi-Structured   Variety, Dynamic
Analytics   STATS   OLAP Cube   Apps              Mahout
Language    COBOL   SQL         XML               NoSQL
Era         60’s    80-96       96-’07            ’07-

Following *AMAZING* Slides Courtesy: Gregory Piatetsky-Shapiro, kdnuggets.com

You can find all the slides from his talk at:

http://www.slideshare.net/gpiatetskyshapiro/analytics-and-data-mining-industry-overview

Data Tsunami
• In 2010 enterprises stored 7 exabytes (= 7,000,000,000 GB) of new data (McKinsey)
• 90 percent of the world's data has been generated in the past two years (IBM)

Image with apologies to KDD-2011

Pre-history

Statistics is the dominant term throughout the 20th century, but data mining and analytics appear only in the late 1990s.
From Google Ngram viewer – English language books
Note: Our analysis uses only English language data.
Other languages, especially Chinese, need to be considered for the full picture.

Recent History:
Analytics, Data Mining, Knowledge Discovery

Analytics has been in use since 1800, but started to rise in 2005.
Data Mining jumps around 1996 (soon after the first KDD conference) but declines after 2003 (TIA controversy, associated with gov. invasion of privacy).
Knowledge Discovery appears in 1989, jumps in 1996, and plateaus after 2000.

Google Trends:
After 2006, Data Mining < Analytics




Google Insights: searches for
data mining, analytics -google
are most popular in India, US




Analytics > Data Mining > Data Science




Data Science, Big Data




Data Types Analyzed/Mined




www.KDnuggets.com/polls/2011/data-types-analyzed-mined.html

Largest Dataset Analyzed?

2011 median dataset size ~10-20 GB, vs 8-10 GB in 2010.
Increase in 10 GB to 1 PB range.

www.KDnuggets.com/polls/2011/largest-dataset-analyzed-data-mined.html

Which methods/algorithms did you use for data analysis in 2011
(Bar chart: % analysts who used it, axis 0% to 70%; methods in chart order)

• Decision Trees
• Regression
• Clustering
• Statistics
• Visualization
• Time series/Sequence analysis
• Support Vector (SVM)
• Association rules
• Ensemble methods
• Text Mining
• Neural Nets
• Boosting
• Bayesian
• Bagging
• Factor Analysis
• Anomaly/Deviation detection
• Social Network Analysis
• Survival Analysis
• Genetic algorithms
• Uplift modeling

www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html

Cloud Analytics is not common (yet)

www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html

Shortage of Skills
• McKinsey: shortage by 2018 in the US of
  – 140,000-190,000 people with deep analytical skills
  – 1.5 M managers/analysts with the know-how to use the analysis of big data to make effective decisions.

  Source: www.mckinsey.com/mgi/publications/big_data/

Job data: Data Scientist




Jobs: Data Mining >> Data Scientist




“Ground” Analytics (LinkedIn Skills)

~ 75,000 with Data Mining skill
~ 7,000 with Predictive Modeling

Also ~ 20,000 with Predictive Analytics (not related to Predictive Modeling??)




Analytics LinkedIn Skills

(Chart panels: Predictive Analytics, Machine Learning, Text Mining, MapReduce)



Big Data Bubble?

(Figure: Gartner Hype Cycle, with Big Data marked)

NoSQL & Big Data Analytics: History, Hype, Opportunities

  • 1. analyze(NoSQL,BigData); /* history, hype, opportunities */ // By: Vishy Poosala // Head of Bell Labs, India // poosala@alcatel-lucent.com // @vishyp
  • 2. The dark ages of COBOL
  • 3. ..then Codd said let there be tables: Rows & Columns, SQL, Normal Forms, ACID
  • 4. www.data-for-humans.com: WHAT COLUMNS? SET-VALUED ATTRIBUTES, XML, Schema Evolution
  • 5. Billions of Keys & Values: GFS, Google Big Table, Hadoop, Cassandra, Dynamo
  • 6. How would you build a super-fast, FB-scale chat service, in 2012? (for example)
  • 7. I want my own DB! Main Memory: Memcached, redis; Distr. K-V: MongoDB; Versions: CouchDB; Social Graphs: Neo4j
  • 8. BIG: KB, GB, TB, PB; Data: FILES, TABLES, Semi-Structured, Variety Dynamic; Analytics: STATS, OLAP Cube, Apps, Mahout; Language: COBOL, SQL, XML, NoSQL (columns: 60’s, 80-96, 96-’07, ’07-)
  • 9. Following *AMAZING* Slides Courtesy: Gregory Piatetsky-Shapiro, kdnuggets.com. You can find all the slides from his talk at: http://www.slideshare.net/gpiatetskyshapiro/analytics-and-data-mining-industry-overview
  • 10. Data Tsunami • In 2010 enterprises stored 7 exabytes = 7,000,000,000 GB of new data (McKinsey) • 90 percent of the world's data has been generated in the past two years (IBM). Image with apologies to KDD-2011
  • 11. Pre-history: Statistics is the biggest term in the 20th century, but data mining and analytics appear in the late 1990s. From Google Ngram viewer (English language books). Note: Our analysis uses only English language data. Other languages, especially Chinese, need to be considered for the full picture
  • 12. Recent History: Analytics, Data Mining, Knowledge Discovery. Analytics has been used since 1800, but started to rise in 2005. Data Mining jumps around 1996 (soon after the first KDD conference) but declines after 2003 (TIA controversy, associated with gov. invasion of privacy). Knowledge Discovery appears in 1989, jumps in 1996, and plateaus after 2000
  • 13. Google Trends: After 2006, Data Mining < Analytics
  • 14. Google Insights: searches for data mining, analytics -google are most popular in India, US
  • 15. Analytics > Data Mining > Data Science
  • 16. Data Science, Big Data
  • 18. Largest Dataset Analyzed? 2011 median dataset size ~10-20 GB, vs 8-10 GB in 2010. Increase in 10 GB to 1 PB range. www.KDnuggets.com/polls/2011/largest-dataset-analyzed-data-mined.html
  • 19. Which methods/algorithms did you use for data analysis in 2011? (Bar chart: % analysts who used it, axis from 0% to 70%.) Methods listed: Decision Trees, Regression, Clustering, Statistics, Visualization, Time series/Sequence analysis, Support Vector (SVM), Association rules, Ensemble methods, Text Mining, Neural Nets, Boosting, Bayesian, Bagging, Factor Analysis, Anomaly/Deviation detection, Social Network Analysis, Survival Analysis, Genetic algorithms, Uplift modeling. www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html
  • 20. Cloud Analytics is not common (yet). www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html
  • 21. Shortage of Skills • McKinsey: shortage by 2018 in the US of – 140-190,000 people with deep analytical skills – 1.5 M managers/analysts with the know-how to use the analysis of big data to make effective decisions. Source: www.mckinsey.com/mgi/publications/big_data/
  • 22. Job data: Data Scientist
  • 23. Jobs: Data Mining >> Data Scientist
  • 24. “Ground” Analytics (LinkedIn Skills): ~75,000 with Data Mining skill; ~7,000 with Predictive Modeling. Also ~20,000 with Predictive Analytics (not related to Predictive Modeling??)
  • 25. Analytics LinkedIn Skills: Predictive Analytics, Machine Learning, Text Mining, MapReduce
  • 26. Big Data Bubble? Big Data Gartner Hype Cycle