Using Hadoop to
    Change Company Culture
                     Amy O’Connor
           Senior Director, Analytics



1
The Amazing Everyday




2
The Amazing Everyday




3
NOKIA’S HISTORY: 1865 TO NOW




4
Data
    is our newest
     raw material


5
TRAFFIC
 US Government data
 suggests worsening
  conditions in urban
   areas…this year
average commute time
has risen by 9 minutes.

     Federal Highway
     Commission, Urban
     Congestion Report
     January 2011 to
     March 2011

 6
TRAFFIC
Global car sales are
 growing: there will
 be 50% more cars
sold in 10 years than
  are sold today.

    IHS Automotive,
    October 2011




7
More data usually beats
       better algorithms
                    Anand Rajaraman




8
Devices in use around the world
Probe points collected monthly from Nokia alone

    •   Image Sensors
    •   Accelerometers
    •   Gyroscopes
    •   Compasses
    •   Pressure Sensors
    •   Microphones
    •   Light Sensors
    •   Assisted GPS




9
18M         24


           80M
10
Data Storage & Analysis Landscape




11
Data Silos



     •   Traffic Probes
     •   Search Logs
     •   Consumer Profile
     •   Ad Data
     •   Places Registry
     •   Device Data




12
Smart Data

       Combining
         sets of
      behavioral &
     contextual data
13
A Good Way to Change
     Corporate Culture




14
Getting Children
     to Eat Peas…
     Tell them you expect them to
     eat their peas.
     Reward them with ice cream
     if they do.
     Explain why it’s good for
     them to eat their peas.
     Eat your own peas as a good
     role model.
                 Leann Lipps Birch,
                 Head of Human Development &
                 Family Studies
                 Pennsylvania State University
15
Getting Children
     to Eat Peas…
     Put them with children who
     love peas.

     Change the stories
     they tell.



                 Leann Lipps Birch,
                 Head of Human Development &
                 Family Studies
                 Pennsylvania State University
16
Nokia Data Asset

     Identity:          Consumer Profile; SSO; Device Activation; Campaigns; Contacts
     Media:             Products, Transactions; Songs, Delivery
     Care & Marketing:  Device/User CRM; Activation Info; Net Promoter; PC Suite
     Advertising:       Ad Inventory; Campaign Promotions; Ad Canvas
     Location:          Premium Content; Favorite Routes; Log Files; Map Tiles; Imagery
     Navteq:            All Probes Data; 3D Imagery; Street View; Feature Recognition
     Device Programs:   NAC, IIA; Windows Phone; Panel data; Equipment Master; Factory Data
     Social:            Social UGC; Universal Share; Journeys Data; Event Info
     Search:            Log Files; Points of Interest
     Nokia IT:          Registrations; Device Updates




17
Users                                                                Analytics

     Decision-Makers           Domain Expertise                      Dashboards, Offline Analysis,
                                                                     Predictive Analytics

     Data Analysts             Domain Expertise,                     Teradata, Key Value Store
                               Statistical Skills

     Data Scientists           Domain Expertise,                     Oozie, MapReduce, Pig, Hive
                               Statistical Skills,
                               Computer Science

     Developers/Applications   Domain Expertise,                     Flume, Scribe, FTP, HDFS, HBase
                               Computer Science




18
Collaborative Working Model

     Present                  Analytics

     Analyze and Aggregate    Dashboards, Offline Analysis, Predictive Analytics

     Load                     Teradata, Key Value Store

     Transform                Oozie, MapReduce, Pig, Hive

     Extract                  Flume, Scribe, FTP, HDFS, HBase

     Platform




19
Collaborative Working Model

     Present (Analytics)
     •   To DataOS, Structured Data
     •   Customer Dashboard
     •   BI Tools: SPSS, Tableau, Cognos
     •   Ad Hoc Reporting

     Analyze and Aggregate (Dashboards, Offline Analysis, Predictive Analytics)
     •   Hive QL and Pig Queries
     •   MR Analysis Job
     •   Mahout Machine Learning
     •   Rec. Engines

     Load (Teradata, Key Value Store)
     •   Metadata Catalog
     •   User Metadata Interfaces
     •   Create Hive Schema
     •   MR Agg Job and Data to Oracle
     •   MR Agg Job and Data to Teradata

     Transform (Oozie, MapReduce, Pig, Hive)
     •   Create Library of MR Jobs
     •   Monitor and Manage Transform
     •   Data Model Definition
     •   Develop Cleanse Job
     •   Develop Validation Job
     •   Develop Partition Job

     Extract (Flume, Scribe, FTP, HDFS, HBase)
     •   Catalog Data Sources
     •   Define Standard ETL
     •   Monitor and Manage Data Feed
     •   Integrate Historical Data
     •   Integrate Streaming Data
     •   Define Custom ETL

     Platform
     •   Co-located developer clusters, pre-production cluster, product cluster

     Legend: Data OS; Product Teams and/or DataOS

20
Smart Data: Behavioral & Contextual

     Aggregation
     •   Update top searches table and geo activity table in Oracle

     Aggregation
     •   Merge multiple data sources
     •   Implement app logic (e.g., round up latitude/longitude to 3 decimals)

     Standard ETL
     •   Clean data, remove bad records, and run health checks
     •   Partition data by date, hour, type
     •   Archive raw data

     Sources: Ad Router, NAC, Local Search

     Stack: Analytics (Dashboards, Offline Analysis, Predictive Analytics);
            Teradata, Key Value Store; Oozie, MapReduce, Pig, Hive;
            Flume, Scribe, FTP; HDFS, HBase
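
The Standard ETL and app-logic steps named on this slide (clean records, round coordinates to 3 decimals, partition by date, hour, and type) can be sketched in a few lines of Python. This is only an illustration, not Nokia's pipeline: the record fields (`lat`, `lon`, `ts`, `type`), the function names, and the use of ordinary rounding rather than a strict "round up" are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone


def clean(record):
    """Return a normalized copy of a raw record, or None for a bad record.

    The field names (lat, lon, ts, type) are hypothetical.
    """
    try:
        lat = float(record["lat"])
        lon = float(record["lon"])
        ts = int(record["ts"])          # Unix seconds, assumed
        kind = str(record["type"])
    except (KeyError, TypeError, ValueError):
        return None
    # Health check: drop coordinates outside the valid range (NaN fails too).
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    # App logic from the slide: keep 3 decimals (ordinary rounding assumed).
    return {"lat": round(lat, 3), "lon": round(lon, 3), "ts": ts, "type": kind}


def partition_key(record):
    """Build the (date, hour, type) partition key in UTC."""
    t = datetime.fromtimestamp(record["ts"], tz=timezone.utc)
    return (t.strftime("%Y-%m-%d"), t.strftime("%H"), record["type"])


def run_etl(raw_records):
    """Clean every record and group the survivors by partition key."""
    partitions = defaultdict(list)
    rejected = 0
    for raw in raw_records:
        rec = clean(raw)
        if rec is None:
            rejected += 1
            continue
        partitions[partition_key(rec)].append(rec)
    return dict(partitions), rejected
```

Three decimal places of latitude correspond to roughly 100 m, so rounding in this way groups nearby events into the same cell before aggregation.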




21
Merchant Portal Heatmap




22
Mapping the World




     © 2011 Nokia Company Confidential
23
Mapping the World




                           Probe data indicates the location, speed,
                           heading, time etc. about a mobile device.
                           Billions of probe records per week.
                                              Covers almost the entire world.
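
A probe record of the kind described here could be modeled as follows. The slide names only the kinds of fields (location, speed, heading, time), so every field name, unit, and the grid-cell density helper below is a hypothetical illustration rather than Nokia's actual schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbePoint:
    """One probe record; all field names and units are assumptions."""
    lat: float          # latitude in degrees
    lon: float          # longitude in degrees
    speed_kmh: float    # speed; the unit is assumed
    heading_deg: float  # heading in degrees; the convention is assumed
    ts: int             # Unix timestamp in seconds


def density_by_cell(points, decimals=3):
    """Count probes per grid cell, where a cell is the rounded (lat, lon)."""
    return Counter(
        (round(p.lat, decimals), round(p.lon, decimals)) for p in points
    )
```

Counting probes per rounded cell is one simple way to get a density figure per location; a production system would more likely match probes to road segments.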
24
Probe Density: Urban and Arterial




                                    100% AGR




25
792527719 (Jackson Blvd/Financial Pl)


      From Ref, One lane




26
27
24 Hours in Our Analytics Ecosystem


           ~2TB ingested
     350M messages via scribe
           >3000 MR jobs
          10TB processed
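
To put these daily totals in per-second terms, a short calculation helps. It assumes the load is spread evenly over 24 hours, reads "~2TB" as 2e12 bytes, and treats ">3000" as a lower bound of exactly 3000; the function and key names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_rates(ingested_bytes, messages, mr_jobs):
    """Convert daily totals into average rates, assuming an even load."""
    return {
        "ingest_mb_per_second": ingested_bytes / SECONDS_PER_DAY / 1e6,
        "messages_per_second": messages / SECONDS_PER_DAY,
        "mr_jobs_per_hour": mr_jobs / 24,
    }


# Figures from the slide; 2e12 bytes and 3000 jobs are stated assumptions.
rates = daily_rates(ingested_bytes=2e12, messages=350e6, mr_jobs=3000)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

Under those assumptions the averages come to about 23 MB ingested per second, about 4,051 Scribe messages per second, and at least 125 MapReduce jobs per hour.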


28
Spreading the Word


         Data Asset Catalog

     Smart Data Newsletter Stories

       Realtime Dashboards



29
The Amazing Everyday

          Thanks!



30

Hadoop World 2011: Changing Company Culture with Hadoop - Amy O'Connor, Nokia
