Demonstrating the Future of Data Science


Collaboration by individuals, organizations, and communities with the right tools and resources is essential in achieving success with data science. Join us for a live demonstration of how you can leverage a data science platform, an open-source model, internal and external data, analytics tools, and visualization using Hadoop. See how unprecedented access to data scientists can deliver entirely new levels of insight to push the boundaries of what’s possible. Find out what you can do NOW to move your data science efforts forward.

Published in: Technology

  • SCRIPT: “For many, the ability to move data between Hadoop and a SQL analytical database is the ultimate. Not at Greenplum. We’ve gone well beyond “connectors” to allow our SQL database to access data wherever it lives.” gNet permits not only bulk movement but also direct query access, as we’ll see later. It does more than connect the engines: it extends the massively parallel engines with massively parallel communications between them, and builds the software layers needed for rapid movement and direct query access across that high-performance integration. When deployed in Greenplum’s unique Modular DCA, performance of both bulk data movement and direct data access is further enhanced, because DCAs include carefully designed switching infrastructures that assure minimum switching latency as nodes in Greenplum Database communicate directly with nodes in Greenplum HD.
  • Our expansive partner network ensures you protect your existing investments while having the opportunity to leverage the best available technology. Greenplum has deep partnerships with industry-leading organizations such as the SAS Institute, Informatica, and Alpine Data Labs. Finally, we are fortunate to work with a number of leading application providers, like Silver Spring Networks, who leverage Greenplum as a powerful backend technology. Greenplum is proud to work with this extraordinary partner ecosystem.
  • What we are announcing
  • So, we’ve solved for the platform, but remember you also need the Data Scientists. (BUILD: Add Chorus) We are also announcing that we’ve joined forces with Kaggle to solve for the supply of Data Scientists, by integrating Kaggle’s data science community with Chorus and creating a whole new data science marketplace. (BUILD: Add + Kaggle) Kaggle, as many of you know, is the leading platform for predictive modeling competitions; has over 57K participants from over 100 countries and 200 universities; and offers companies a cost-effective way to harness the “cognitive surplus” of the world’s best data scientists. I’d like to now invite Anthony, CEO of Kaggle, to give his perspective on this exciting new integration.
  • And this is what we mean by thinking big together, with Chorus, the collaboration platform for Data Science, now open-sourced, and our partnership with Kaggle to deliver a new data science marketplace. This is big. This is solving the biggest problems facing Big Data and Data Science. This will enable organizations to reach their inner predictive enterprise. How do you learn more about this?
  • Demonstrating the Future of Data Science

    1. THINKING BIG TOGETHER: Demonstrating the Future of Data Science. Mike Maxey, Office of Strategy, Greenplum, A Division of EMC. © Copyright 2012 EMC Corporation. All rights reserved.
    2. The New Normal [diagram; section labels: DATA DEVICES, MEDICAL, INTERNET, GOVERNMENT, RETAIL, PHONE/TV, FINANCIAL, Data Aggregators, Data Users/Buyers]
    3. Through 2015, organizations integrating high-value, diverse, new information types and sources into a coherent information management infrastructure will outperform their industry peers financially by more than 20%. Source: Gartner, Hype Cycle for Big Data, 2012 (July 31, 2012)
    4. WHAT DOES IT TAKE?
    5. 1. New Applications
    6. (no text on slide)
    7. 2. Data Science
    8. data•science: art of mathematically sophisticated data engineers delivering insights from data into business decisions and systems
    9. 10 Years Of Patient History: Saving Lives and Money With Data Science
    10. 3. The Right Platform
    11. Big Data Requires a Unified Platform: (3) People: collaboration & productivity; (2) Tools: rich SQL & application support; (1) Data: structured and unstructured
    12. Big Data Requires a Unified Platform: (1) Data: structured and unstructured
    13. MPP Databases: 10-100x better performance @ 1/10th the EDW cost
    14. “What used to take 24 hours on Oracle, I can do in less than 10 minutes on Greenplum.”
    15. Out-Of-The-Box Functionality: Enterprise Data Warehouse, MPP Database, Hadoop
    16. hadoop: programmatic batch processing at scale.
    17. “We offloaded transformations to Hadoop and saved money on day one.” - Top Telecommunications Company
    18. IT TAKES MORE THAN ONE TOOL
    19. Greenplum UAP Unifies MPP and Hadoop [diagram; Access & Query: SQL, ODBC/JDBC, Java/Perl/Python, CLI, PigLatin, HQL, other; Integration labels: parallel query, SQL, parallel HDFS import/export; engines: Greenplum Database and Greenplum HD, together labeled Greenplum UAP]
    20. Big Data Requires a Unified Platform: (2) Tools: rich SQL & application support; (1) Data: structured and unstructured
    21. Business Intelligence and Reporting: answering and enabling new questions; extending the reach of data and insights
    22. Predictive Analytics: end-to-end analytics in a single view; multiple levels of access, powerful and jargon-free
    23. Powerful Partner Ecosystem [partner logos grouped under: Business Intelligence, Data Integration, Analytics, Industry, Technology; one legible name: Discovix]
    24. Big Data Requires a Unified Platform: (3) People: collaboration & productivity; (2) Tools: rich SQL & application support; (1) Data: structured and unstructured
    25. High Cost of Knowledge Sharing: process breaks when organization structure changes; very difficult knowledge transfer; no “insurance policy” for intellectual assets
    26. Big Data Productivity: real-time collaboration for the entire team; shared data, shared models, shared insights
    27. DEMONSTRATION
    28. GREENPLUM CHORUS: A Social Platform For Collaborative Data Science
    29. Chorus Enables Collaborative Data Science: quickly deliver value from your data; share domain knowledge, content, and findings; keep teams productive as organizations change
    30. OPEN SOURCE NOW AVAILABLE
    31. Availability of the OpenChorus Project (www.openchorus.org): Chorus open source available on October 23rd, 2012; Apache 2.0 license; promotes an ecosystem of data sources, applications, and data science community
    32. The largest provider of social media data for enterprise use.
    33. (no text on slide)
    34. GNIP Twitter Access: access to historical Twitter feeds as Chorus data source through GNIP APIs; import Twitter into Chorus as sandbox data
    35. (no text on slide)
    36. Tableau 8: Think with your Data. Visual Analytics; Business Integration; Fast; Any Data; Web & Mobile Authoring
    37. Tableau Server Integration: provision Tableau Workbooks from Chorus data sources; link and co-author Tableau hosted work files; tag and annotate on Tableau assets from within Chorus
    38. (no text on slide)
    39. Kaggle Top 27
    40. Kaggle Data Scientist Resources: solicit data scientist resources from the Chorus interface: access Kaggle data scientist profiles; package Chorus workspace assets in project proposals; solicit collaboration opportunities
    41. THINKING BIG TOGETHER. greenplum.com/communities #greenplum
