THINKING BIG TOGETHER
Demonstrating the Future of Data Science
Mike Maxey
Office of Strategy, Greenplum, a Division of EMC



© Copyright 2012 EMC Corporation. All rights reserved.                               1
The New Normal
[Diagram with three labeled groups: Data Devices, Data Aggregators, Data Users/Buyers]
Sectors: Medical, Internet, Government, Retail, Phone/TV, Financial
Other labels: Individuals, Law Enforcement, Employers, Analytic Services, Advertising, Information Brokers, Marketers, Websites, Catalog Co-ops, Media, Media Archives, Credit Bureaus, List Brokers, Private Investigators/Lawyers, Delivery Services, Government, Banks




Through 2015, organizations integrating high-value, diverse, new information types and sources into a coherent information management infrastructure will outperform their industry peers financially by more than 20%.

Source: Gartner; Hype Cycle for Big Data, 2012; July 31, 2012



WHAT DOES IT TAKE?

1. New Applications




2. Data Science




data•science: art of mathematically sophisticated data engineers delivering insights from data into business decisions and systems




10 Years Of Patient History
Saving Lives and Money With Data Science


3. The Right Platform




Big Data Requires a Unified Platform

3. People: Collaboration & Productivity
2. Tools: Rich SQL & Application Support
1. Data: Structured, Unstructured


Big Data Requires a Unified Platform




1. Data: Structured, Unstructured


MPP Databases



10-100x better performance at 1/10th the EDW cost




"What used to take 24 hours on Oracle, I can do in less than 10 minutes on Greenplum."



Out-Of-The-Box Functionality




[Chart comparing: Enterprise Data Warehouse, MPP Database, Hadoop]



hadoop: programmatic batch processing at scale.




"We offloaded transformations to Hadoop and saved money on day one."
- Top Telecommunications Company




IT TAKES MORE THAN ONE TOOL

Greenplum UAP Unifies MPP and Hadoop
Access & Query: SQL, ODBC/JDBC, Java/Perl/Python, CLI, PigLatin, HQL, other
Greenplum Database (SQL) and Greenplum HD (HDFS), linked by parallel query integration and parallel import/export
Together: Greenplum UAP

Big Data Requires a Unified Platform



2. Tools: Rich SQL & Application Support
1. Data: Structured, Unstructured


Business Intelligence and Reporting

Answering and enabling new questions
Extending the reach of data and insights




Predictive Analytics

End-to-end analytics in a single view
Multiple levels of access, powerful and jargon-free




Powerful Partner Ecosystem
Partner categories: Analytics, Business Intelligence, Data Integration, Industry, Technology
[Partner logos; legible text: Discovix]




Big Data Requires a Unified Platform

3. People: Collaboration & Productivity
2. Tools: Rich SQL & Application Support
1. Data: Structured, Unstructured


High Cost of Knowledge Sharing

Process breaks when organization structure changes
Very difficult knowledge transfer
No "insurance policy" for intellectual assets


Big Data Productivity


Real-time collaboration for the entire team
Shared data, shared models, shared insights




DEMONSTRATION


GREENPLUM CHORUS

A Social Platform For Collaborative Data Science


Chorus Enables Collaborative Data Science
Quickly deliver value from your data
Share domain knowledge, content, and findings
Keep teams productive as organizations change


OPEN SOURCE NOW AVAILABLE

Availability of the OpenChorus Project

www.openchorus.org
Chorus open source available on October 23rd, 2012
Apache 2.0 license
Promotes an ecosystem of data sources, applications, and data science community



The largest provider of social media data for enterprise use.




GNIP Twitter Access
Access to historical Twitter feeds as a Chorus data source through GNIP APIs
Import Twitter data into Chorus as sandbox data




Tableau 8: Think with your Data
Visual Analytics
Business Integration
Fast
Any Data
Web & Mobile Authoring

Tableau Server Integration
Provision Tableau Workbooks from Chorus data sources
Link and co-author Tableau-hosted work files
Tag and annotate Tableau assets from within Chorus

Kaggle Top 27




Kaggle Data Scientist Resources
Solicit data scientist resources from the Chorus interface
    – Access Kaggle data scientist profiles
    – Package Chorus workspace assets in project proposals
    – Solicit collaboration opportunities



THINKING BIG TOGETHER
greenplum.com/communities
#greenplum




DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 

Demonstrating the Future of Data Science

  • 1. THINKING BIG TOGETHER Demonstrating the Future of Data Science Mike Maxey Office of Strategy — Greenplum, A Division of EMC © Copyright 2012 EMC Corporation. All rights reserved. 1
  • 2. The New Normal (diagram labels): Data Devices; Data Aggregators; Data Users/Buyers. Sectors: Medical, Internet, Government, Retail, Phone/TV, Financial. Parties: Individuals, Law Enforcement, Employers, Analytic Services, Advertising, Information Brokers, Marketers, Websites, Catalog Co-ops, Media, Credit Bureaus, Media Archives, List Brokers, Private Investigators/Lawyers, Delivery Services, Government, Banks
  • 3. “Through 2015, organizations integrating high-value, diverse, new information types and sources into a coherent information management infrastructure will outperform their industry peers financially by more than 20%.” Source: Gartner; Hype Cycle for Big Data, 2012; July 31, 2012
  • 4. WHAT DOES IT TAKE?
  • 5. 1. New Applications
  • 7. 2. Data Science
  • 8. data•science: art of mathematically sophisticated data engineers delivering insights from data into business decisions and systems
  • 9. 10 Years Of Patient History: Saving Lives and Money With Data Science
  • 10. 3. The Right Platform
  • 11. Big Data Requires a Unified Platform: 1. Data (structured, unstructured); 2. Tools (rich SQL & application support); 3. People (collaboration & productivity)
  • 12. Big Data Requires a Unified Platform: 1. Data (structured, unstructured)
  • 13. MPP Databases: 10-100x better performance @ 1/10th the EDW cost
  • 14. “What used to take 24 hours on Oracle, I can do in less than 10 minutes on Greenplum.”
  • 15. Out-Of-The-Box Functionality: Enterprise Data Warehouse, MPP Database, Hadoop
  • 16. hadoop: programmatic batch processing at scale.
  • 17. “We offloaded transformations to Hadoop and saved money on day one.” Source: Top Telecommunications Company
  • 18. IT TAKES MORE THAN ONE TOOL
  • 19. Greenplum UAP Unifies MPP and Hadoop. Access & Query: SQL, ODBC/JDBC, Java/Perl/Python, CLI, PigLatin, HQL, other. Integration: parallel SQL query, parallel HDFS import/export. Engines: Greenplum Database, Greenplum HD
  • 20. Big Data Requires a Unified Platform: 1. Data (structured, unstructured); 2. Tools (rich SQL & application support)
  • 21. Business Intelligence and Reporting: answering and enabling new questions; extending the reach of data and insights
  • 22. Predictive Analytics: end-to-end analytics in a single view; multiple levels of access, powerful and jargon-free
  • 23. Powerful Partner Ecosystem: Business Intelligence, Data Integration, Analytics, Industry, Technology (partner name legible in slide text: Discovix)
  • 24. Big Data Requires a Unified Platform: 1. Data (structured, unstructured); 2. Tools (rich SQL & application support); 3. People (collaboration & productivity)
  • 25. High Cost of Knowledge Sharing: process breaks when organization structure changes; very difficult knowledge transfer; no “insurance policy” for intellectual assets
  • 26. Big Data Productivity: real-time collaboration for the entire team; shared data, shared models, shared insights
  • 27. DEMONSTRATION
  • 28. GREENPLUM CHORUS: A Social Platform For Collaborative Data Science
  • 29. Chorus Enables Collaborative Data Science: quickly deliver value from your data; share domain knowledge, content, and findings; keep teams productive as organizations change
  • 30. OPEN SOURCE NOW AVAILABLE
  • 31. Availability of the OpenChorus Project: www.openchorus.org; Chorus open source available on October 23rd, 2012; Apache 2.0 license; promotes an ecosystem of data sources, applications, and data science community
  • 32. The largest provider of social media data for enterprise use.
  • 34. GNIP Twitter Access: access to historical Twitter feeds as Chorus data source through GNIP APIs; import Twitter into Chorus as sandbox data
  • 36. Tableau 8: Think with your Data. Visual Analytics; Business Integration; Fast; Any Data; Web & Mobile Authoring
  • 37. Tableau Server Integration: provision Tableau Workbooks from Chorus data sources; link and co-author Tableau hosted work files; tag and annotate on Tableau assets from within Chorus
  • 39. Kaggle Top 27
  • 40. Kaggle Data Scientist Resources: solicit for data scientist resources from Chorus interface; access Kaggle data scientist profiles; package Chorus workspace assets in project proposals; solicit for collaboration opportunities
  • 41. THINKING BIG TOGETHER. greenplum.com/communities #greenplum

Editor's Notes

  1. SCRIPT: “For many, the ability to move data between Hadoop and a SQL analytical database is the ultimate. Not at Greenplum. We’ve gone well beyond ‘connectors’ to allow our SQL database to access data wherever it lives.” gNet permits not only bulk movement but also direct query access, as we’ll see later. gNet does more than connect the engines: it extends the massively parallel engines with massively parallel communications between them, and builds the necessary software layers for rapid movement and direct query access across that high-performance integration. When deployed in Greenplum’s unique Modular DCA, performance of both bulk data movement and direct data access is further enhanced, because DCAs include carefully designed switching infrastructures that assure minimum switching latency as nodes in Greenplum Database communicate directly with nodes in Greenplum HD.
  2. Our expansive partner network ensures you protect your existing investments while having the opportunity to leverage the best available technology. Greenplum has deep partnerships with industry-leading organizations such as the SAS Institute, Informatica, and Alpine Data Labs. Finally, we are fortunate to work with a number of leading application providers, like Silver Spring Networks, who leverage Greenplum as a powerful backend technology. Greenplum is proud to work with this extraordinary partner ecosystem.
  3. What we are announcing
  4. So, we’ve solved for the platform, but remember you also need the Data Scientists. (BUILD: Add Chorus) We are also announcing that we’ve joined forces with Kaggle to solve for the supply of Data Scientists by integrating Kaggle’s data science community with Chorus and creating a whole new data science marketplace. (BUILD: Add + Kaggle) Kaggle, as many of you know, is the leading platform for predictive modeling competitions; has over 57K participants from over 100 countries and 200 universities; and offers companies a cost-effective way to harness the “cognitive surplus” of the world’s best data scientists. I’d like to now invite Anthony, CEO of Kaggle, to give his perspective on this exciting new integration.
  5. And this is what we mean by thinking big together: Chorus, the collaboration platform for Data Science, now open-sourced, and our partnership with Kaggle to deliver a new data science marketplace. This is big. This is solving the biggest problems facing Big Data and Data Science. This will enable organizations to reach their inner predictive enterprise. How do you learn more about this?