Big Data Journeys
 A review of roadmaps taken by early
adopters to achieve their big data goals




                                           TDWI Big Data Solution Summit
                                           San Diego, CA // June 4-6, 2012




                                           Krishnan Parasuraman
                                           CTO, Digital Media
                                           Netezza & Big Data Solutions
Talking Points

• Journeys of 4 organizations
• Different drivers and considerations
• Different paths to big data realization
• Key learnings
Big Data Considerations

1   Leading Financial Services Solution Provider    Volume
2   Large Online Content Publisher                  Value
3   Global Telecommunications Major                 Velocity
4   Emerging Digital Media Marketer                 Variety
1   Leading Financial Services solution provider
    Provider of Financial Services, Products and
    Services to both businesses and consumers

Strategic shift to deliver goods and services via Digital Channels

• Provide personalized customer experience online
• Anticipate user behavior and guide them to specific functionality
• Maintain consistent experience across online, mobile and social channels
1   Leading Financial Services solution provider
    Provider of Financial Services, Products and
    Services to both businesses and consumers

Big Data Solution Considerations

Volume

• Large volumes of Data
• Data integration
• Deep Analytics
• Large number of attributes
1   Leading Financial Services solution provider
    2007: Before the Digital Shift

[Diagram: Internal Data Sources and Digital Data (Clickstream); EDW; Analytics; Biz. Users and Data Analysts; numbered callouts 1-4]

• Top 10 display advertiser in the US
• 25 billion impressions per quarter
• 1 billion clicks per day during peak usage
• Regression analysis for conversion tracking
1   Leading Financial Services solution provider
    2008: Roadmap 1.0

[Diagram: Internal Data Sources and Digital Data (Clickstream); EDW; Analytics; Biz. Users and Data Analysts]

Step 1: Move to Massively Parallel Data Warehousing Appliance –
Address volume, scale and performance considerations
1   Leading Financial Services solution provider
    2010: Roadmap 2.0

[Diagram: Internal Data Sources and Digital Data (Clickstream); EDW + Analytics with In-DB Analytics; Biz. Users and Data Analysts]

Step 2: Leverage In-Database Analytics –
Run analytics at scale, closer to data
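To illustrate what "closer to data" can mean in practice, here is a minimal sketch that computes the same conversion rate two ways: by pulling every row to the client, and by letting the database aggregate. It uses Python's built-in sqlite3 module purely as a stand-in for the warehouse; the `clicks` table, its columns and the sample rows are hypothetical and do not come from the case study.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory table of hypothetical clickstream rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE clicks (campaign TEXT, impressions INTEGER, conversions INTEGER)"
    )
    conn.executemany(
        "INSERT INTO clicks VALUES (?, ?, ?)",
        [("a", 1000, 10), ("a", 3000, 50), ("b", 2000, 20)],
    )
    return conn

def conversion_rate_client_side(conn):
    """Pull every row out of the database, then aggregate in Python."""
    totals = {}
    for campaign, impressions, conversions in conn.execute("SELECT * FROM clicks"):
        imp, conv = totals.get(campaign, (0, 0))
        totals[campaign] = (imp + impressions, conv + conversions)
    return {c: conv / imp for c, (imp, conv) in totals.items()}

def conversion_rate_in_database(conn):
    """Let the database aggregate; only one row per campaign leaves it."""
    query = """
        SELECT campaign, 1.0 * SUM(conversions) / SUM(impressions)
        FROM clicks
        GROUP BY campaign
    """
    return dict(conn.execute(query).fetchall())
```

Both functions return the same rates. The difference is how much data crosses the wire: the first moves every row, the second moves one row per campaign, which is the general argument for running analytics inside the database at scale.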
1   Leading Financial Services solution provider
    2010: Roadmap 3.0

[Diagram: Internal Data Sources and Digital Data (Clickstream); EDW + Analytics with In-DB Analytics; Biz. Users and Data Analysts]

Step 3: Offload data pre-processing, cleansing and normalization to
Hadoop – Elastic scalability + Analytics sandbox
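As a rough sketch of the kind of cleansing and normalization that Step 3 offloads, the function below processes clickstream lines one record at a time, in the style of a map-only job. The tab-separated layout (timestamp, user id, URL) and the normalization rules are assumptions made for illustration; the slides do not describe the actual record format or rules.

```python
def clean_record(line):
    """Return a normalized (timestamp, user_id, url) tuple, or None if malformed.

    Assumes a hypothetical tab-separated layout: timestamp, user id, URL.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        return None
    timestamp, user_id, url = (f.strip() for f in fields)
    if not timestamp or not user_id or not url:
        return None
    # Normalize: lowercase the URL and drop any query string
    url = url.lower().split("?", 1)[0]
    return timestamp, user_id, url

def run_mapper(lines, out):
    """Write cleaned records as tab-separated lines; return (kept, dropped)."""
    kept = dropped = 0
    for line in lines:
        record = clean_record(line)
        if record is None:
            dropped += 1
            continue
        out.write("\t".join(record) + "\n")
        kept += 1
    return kept, dropped
```

Because each record is handled independently, a job like this can be spread across as many nodes as the input requires. With Hadoop Streaming, for example, the same logic could be driven by calling `run_mapper(sys.stdin, sys.stdout)`, and only the cleaned output would then be loaded into the warehouse.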
2   Large Online Content Publisher
    One of the internet’s top destinations for
    specialized content

Value

Support business partners and affiliate marketers’ data needs

• Provide regular data feeds – no performance SLAs
• Manage cost of infrastructure
2   Large Online Content Publisher
    2010: Roadmap 1.0

[Diagram: Data Sources; ETL; EDW + Analytics; Biz. Users, Data Analysts, and Partners & Affiliates; numbered callouts 1-2]

• 15 million unique visitors
• 210 million page views
• 2TB of new data per day
• 1 million+ new content items per day
2   Large Online Content Publisher
    2011: Roadmap 2.0

[Diagram: Data Sources; ELT; EDW + Analytics; Biz. Users, Data Analysts, and Partners & Affiliates; numbered callouts 1-4]
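The one labeled change between the two publisher roadmaps is ETL becoming ELT. As a hedged illustration of what that usually means, the sketch below lands raw rows untouched in a staging table and then transforms them with SQL inside the database. sqlite3 stands in for the warehouse, and the table names, columns and sample rules are hypothetical; the slides do not show the publisher's actual pipeline.

```python
import sqlite3

def load_raw(conn, rows):
    """Load step: land source rows untouched in a staging table."""
    conn.execute(
        "CREATE TABLE staging_page_views (visitor_id TEXT, url TEXT, viewed_at TEXT)"
    )
    conn.executemany("INSERT INTO staging_page_views VALUES (?, ?, ?)", rows)

def transform_in_warehouse(conn):
    """Transform step: runs as SQL inside the database, after loading."""
    conn.execute("""
        CREATE TABLE daily_page_views AS
        SELECT substr(viewed_at, 1, 10)   AS view_date,
               lower(trim(url))           AS url,
               COUNT(*)                   AS views,
               COUNT(DISTINCT visitor_id) AS unique_visitors
        FROM staging_page_views
        WHERE visitor_id IS NOT NULL AND visitor_id <> ''
        GROUP BY substr(viewed_at, 1, 10), lower(trim(url))
    """)
    return conn.execute(
        "SELECT view_date, url, views, unique_visitors "
        "FROM daily_page_views ORDER BY view_date, url"
    ).fetchall()
```

In an ETL design the trimming, lowercasing and aggregation would run on a separate transformation server before loading. Here they run where the data already sits, so the transformation can use the warehouse's own processing capacity.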
3   Global Telecommunications Major
    Leading cell phone carrier networks in the world

Velocity

React to network disruptions immediately

• Predict outages and congestion before they appear
• Address disruptions in Real Time
3   Global Telecommunications Major
    Till 2007: World of Voice and limited Data

[Diagram: Data Sources; EDW; Analytics; Call Centers and Network engineers; numbered callout 1]

• Call Detail Records
• Network Transmissions logs
• Thousands of events per second

Response Latency = Hours or Days
3   Global Telecommunications Major
    2008 – 2011: Voice, Data, Smartphones and 3G

[Diagram: Data Sources; EDW + Analytics with in-database modeling and scoring; Call Centers and Network engineers]

• Call Detail Records
• Network Transmissions logs
• Millions of events per second

Response Latency = Minutes

Step 1: Adoption of Massively Parallel Data Warehousing
Appliance reduced overall latency from hours to minutes
3   Global Telecommunications Major
    2011+: Video Services, 4G LTE

[Diagram: Data Sources; Stream processing; EDW + Analytics with in-database modeling and scoring; Call Centers and Network engineers]

• Call Detail Records
• Network Transmissions logs
• Hundreds of millions of events per second

Response Latency = Seconds

Step 2: Stream processing provided Real Time analytics capability,
took processing workload off DW and was designed to scale
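A minimal sketch of the stream-processing idea in Step 2: events are evaluated as they arrive, against a small in-memory window, instead of being loaded into the warehouse and queried later. The class name, the per-source threshold rule and the assumption that timestamps arrive in order are illustrative choices, not details from the case study.

```python
from collections import defaultdict, deque

class SlidingWindowMonitor:
    """Counts events per source over the last `window_seconds` and flags bursts.

    Assumes timestamps (in seconds) arrive in non-decreasing order per source.
    """

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._events = defaultdict(deque)  # source -> timestamps inside the window

    def observe(self, source, timestamp):
        """Record one event; return True if the source exceeds the threshold."""
        window = self._events[source]
        window.append(timestamp)
        # Evict timestamps that have fallen out of the window
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window) > self.threshold
```

For example, `SlidingWindowMonitor(window_seconds=10, threshold=2)` returns True on the third event that one source produces within ten seconds. Only the state for the current window is held in memory, which is what keeps response latency low and takes this workload off the data warehouse.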
4   Digital Media Marketer
    Specializes in multi-channel marketing across
    online, offline, mobile and social channels

Variety

Monitor social media channels and engage with customers

• Manage large volumes of unstructured data
• Correlate structured and unstructured data to increase
  targeting and relevance
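One simple way to picture correlating structured and unstructured data is sketched below: keywords are extracted from free-text tweets and joined to structured customer records by social handle. The record fields, the keyword list and the matching rule are all hypothetical; the slides do not describe how the marketer performs the correlation.

```python
import re

def extract_mentions(tweets, keywords):
    """Return {handle: set of keywords mentioned} from raw tweet text."""
    mentions = {}
    for handle, text in tweets:
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        hits = words & keywords
        if hits:
            mentions.setdefault(handle.lower(), set()).update(hits)
    return mentions

def correlate(customers, tweets, keywords):
    """Join structured customer rows with keywords found in their tweets."""
    mentions = extract_mentions(tweets, set(keywords))
    results = []
    for customer in customers:
        hits = mentions.get(customer["handle"].lower())
        if hits:
            results.append({**customer, "mentioned": sorted(hits)})
    return results
```

The output keeps the structured attributes (such as a customer segment) next to what the customer is talking about, which is the kind of combined record that targeting and relevance decisions can draw on.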
4   Digital Media Marketer
    2010: Roadmap 1.0

• 100TB+ data under management
• “Listen” to 100 million+ Tweets per day
• Manage large volumes of unstructured data

[Diagram: Data Sources; EDW + Analytics; Biz. Users and Data Analysts; numbered callouts 1-3]

• 100 Node Hadoop Cluster
• Manage Structured and Unstructured Data
• Components included Hive, HBase, Mahout
4   Digital Media Marketer
    2012: Roadmap 2.0

• 100TB+ data under management
• “Listen” to 100 million+ Tweets per day
• Manage large volumes of unstructured data

[Diagram: Data Sources; EDW + Analytics; Biz. Users and Data Analysts]

Hadoop and Massively Parallel Data Warehouse co-existence EDW –
Manage unstructured and structured data at scale
Key Takeaways

1   Adopt a roadmap based on your big data consideration

2   Massively Parallel Data Warehouse Appliances and Hadoop
    are complementary technologies

3   Consider an evolutionary approach over a Big Bang approach

4   Your EDW will be non-monolithic – understand intra-product
    integration implications




                                           Bon Voyage!


                                            Krishnan Parasuraman
                                               @kparasuraman

 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 

Recently uploaded (20)

DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 

Big Data Journeys: Review of roadmaps taken by early adopters to achieve their big data goals

  • 1. Big Data Journeys: A review of roadmaps taken by early adopters to achieve their big data goals. TDWI Big Data Solution Summit, San Diego, CA // June 4-6, 2012. Krishnan Parasuraman, CTO, Digital Media, Netezza & Big Data Solutions
  • 2. Talking Points • Journeys of 4 organizations • Different drivers and considerations • Different paths to big data realization • Key learning
  • 3. Big Data Considerations: (1) Leading Financial Services Solution Provider: Volume; (2) Large Online Content Publisher: Value; (3) Global Telecommunications Major: Velocity; (4) Emerging Digital Media Marketer: Variety
  • 4. (1) Leading Financial Services Solution Provider: Provider of financial services, products and services to both businesses and consumers. Strategic shift to deliver goods and services via digital channels • Provide personalized customer experience online • Anticipate user behavior and guide them to specific functionality • Maintain consistent experience across online, mobile and social channels
  • 5. (1) Leading Financial Services Solution Provider: Big Data Solution Considerations (Volume) • Large volumes of data • Data integration • Deep analytics • Large number of attributes
  • 6. (1) Leading Financial Services Solution Provider, 2007: Before the Digital Shift. [Diagram components: Internal Data Sources; Digital Data (Clickstream); EDW; Analytics; Biz. Users; Data Analysts; callouts 1-4] • Top 10 display advertiser in the US • 25 billion impressions per quarter • 1 billion clicks per day during peak usage • Regression analysis for conversion tracking
  • 7. (1) Leading Financial Services Solution Provider, 2008: Roadmap 1.0. [Diagram components: Internal Data Sources; Digital Data (Clickstream); EDW; Analytics; Biz. Users; Data Analysts] Step 1: Move to a Massively Parallel Data Warehousing Appliance: address volume, scale and performance considerations
  • 8. (1) Leading Financial Services Solution Provider, 2010: Roadmap 2.0. [Diagram components: Internal Data Sources; Digital Data (Clickstream); EDW + Analytics; In-DB Analytics; Biz. Users; Data Analysts] Step 2: Leverage In-Database Analytics: run analytics at scale, closer to data
  • 9. (1) Leading Financial Services Solution Provider, 2010: Roadmap 3.0. [Diagram components: Internal Data Sources; Digital Data (Clickstream); EDW + Analytics; In-DB Analytics; Biz. Users; Data Analysts] Step 3: Offload data pre-processing, cleansing and normalization to Hadoop: elastic scalability + analytics sandbox
  • 10. (2) Large Online Content Publisher: One of the internet’s top destinations for specialized content. Support business partners' and affiliate marketers' data needs (Value) • Provide regular data feeds: no performance SLAs • Manage cost of infrastructure
  • 11. (2) Large Online Content Publisher, 2010: Roadmap 1.0. [Diagram components: Data Sources; ETL; EDW + Analytics; Biz. Users; Data Analysts; Partners & Affiliates; callouts 1-2] • 15 million unique visitors • 210 million page views • 2 TB of new data per day • 1 million+ new content items per day
  • 12. (2) Large Online Content Publisher, 2011: Roadmap 2.0. [Diagram components: Data Sources; ELT; EDW + Analytics; Biz. Users; Data Analysts; Partners & Affiliates; callouts 1-4]
  • 13. (3) Global Telecommunications Major: Leading cell phone carrier networks in the world. React to network disruptions immediately (Velocity) • Predict outages and congestion before they appear • Address disruptions in real time
  • 14. (3) Global Telecommunications Major, until 2007: World of Voice and Limited Data. [Diagram components: Data Sources; EDW; Analytics; Call Centers; Network Engineers; callout 1] • Call Detail Records • Network transmission logs • Thousands of events per second. Response latency = hours or days
  • 15. (3) Global Telecommunications Major, 2008-2011: Voice, Data, Smartphones and 3G. [Diagram components: Data Sources; EDW + Analytics; In-database modeling and scoring; Call Centers; Network Engineers] • Call Detail Records • Network transmission logs • Millions of events per second. Response latency = minutes. Step 1: Adoption of a Massively Parallel Data Warehousing Appliance reduced overall latency from hours to minutes
  • 16. (3) Global Telecommunications Major, 2011+: Video Services, 4G LTE. [Diagram components: Data Sources; Stream processing; EDW + Analytics; In-database modeling and scoring; Call Centers; Network Engineers] • Call Detail Records • Network transmission logs • Hundreds of millions of events per second. Response latency = seconds. Step 2: Stream processing provided real-time analytics capability, took processing workload off the DW and was designed to scale
  • 17. (4) Digital Media Marketer: Specializes in multi-channel marketing across online, offline, mobile and social channels. Monitor social media channels and engage with customers (Variety) • Manage large volumes of unstructured data • Correlate structured and unstructured data to increase targeting and relevance
  • 18. (4) Digital Media Marketer, 2010: Roadmap 1.0 • 100 TB+ data under management • “Listen” to 100 million+ Tweets per day • Manage large volumes of unstructured data. [Diagram components: Data Sources; EDW + Analytics; Biz. Users; Data Analysts; callouts 1-3] • 100-node Hadoop cluster • Manage structured and unstructured data • Components included Hive, HBase, Mahout
  • 19. (4) Digital Media Marketer, 2012: Roadmap 2.0 • 100 TB+ data under management • “Listen” to 100 million+ Tweets per day • Manage large volumes of unstructured data. [Diagram components: Data Sources; EDW + Analytics; EDW; Biz. Users; Data Analysts] Hadoop and Massively Parallel Data Warehouse co-existence: manage unstructured and structured data at scale
  • 20. Key Takeaways: (1) Adopt your roadmap based on the big data consideration (2) Massively Parallel Data Warehouse Appliances and Hadoop are complementary technologies (3) Consider an evolutionary approach over a Big Bang approach (4) Your EDW will be non-monolithic: understand intra-product integration implications
  • 21. Bon Voyage! Krishnan Parasuraman, @kparasuraman