Demonstrating the Future of Data Science

Collaboration among individuals, organizations, and communities, backed by the right tools and resources, is essential to success with data science. Join us for a live demonstration of how you can leverage a data science platform, an open-source model, internal and external data, analytics tools, and visualization using Hadoop. See how unprecedented access to data scientists can deliver entirely new levels of insight and push the boundaries of what’s possible. Find out what you can do now to move your data science efforts forward.

Published in: Technology

  1. THINKING BIG TOGETHER: Demonstrating the Future of Data Science. Mike Maxey, Office of Strategy, Greenplum, A Division of EMC. © Copyright 2012 EMC Corporation. All rights reserved.
  2. The New Normal [Diagram. Category labels: Data Devices, Medical, Internet, Government, Retail, Phone/TV, Financial, Data Aggregators, Data Users/Buyers. Entity labels: Individuals, Law Enforcement, Employers, Analytic Services, Advertising, Information Brokers, Marketers, Websites, Catalog Co-ops, Media, Credit Bureaus, Media Archives, List Brokers, Private Investigators/Lawyers, Delivery Services, Government, Banks]
  3. Through 2015, organizations integrating high-value, diverse, new information types and sources into a coherent information management infrastructure will outperform their industry peers financially by more than 20%. Source: Gartner; Hype Cycle for Big Data, 2012; July 31, 2012
  4. WHAT DOES IT TAKE?
  5. 1. New Applications
  7. 2. Data Science
  8. data•science: art of mathematically sophisticated data engineers delivering insights from data into business decisions and systems
  9. 10 Years Of Patient History: Saving Lives and Money With Data Science
  10. 3. The Right Platform
  11. Big Data Requires a Unified Platform: (1) Data: Structured, Unstructured; (2) Tools: Rich SQL & Application Support; (3) People: Collaboration & Productivity
  12. Big Data Requires a Unified Platform: (1) Data: Structured, Unstructured
  13. MPP Databases: 10-100x better performance @ 1/10th the EDW cost
  14. "What used to take 24 hours on Oracle, I can do in less than 10 minutes on Greenplum."
  15. Out-Of-The-Box Functionality: Enterprise Data Warehouse, MPP Database, Hadoop
  16. hadoop: programmatic batch processing at scale.
  17. "We offloaded transformations to Hadoop and saved money on day one." (Top Telecommunications Company)
  18. IT TAKES MORE THAN ONE TOOL
  19. Greenplum UAP Unifies MPP and Hadoop [Diagram labels: Access & Query (SQL, ODBC/JDBC, Java/Perl/Python, CLI, PigLatin, HQL, Other); Parallel Query; Integration; SQL; Parallel HDFS Import/Export; Greenplum Database; Greenplum HD; Greenplum UAP]
  20. Big Data Requires a Unified Platform: (1) Data: Structured, Unstructured; (2) Tools: Rich SQL & Application Support
  21. Business Intelligence and Reporting: Answering and enabling new questions; extending the reach of data and insights
  22. Predictive Analytics: End-to-end analytics in a single view; multiple levels of access, powerful and jargon-free
  23. Powerful Partner Ecosystem [Diagram labels: Business Intelligence, Data Integration, Analytics, Industry, Technology; Discovix]
  24. Big Data Requires a Unified Platform: (1) Data: Structured, Unstructured; (2) Tools: Rich SQL & Application Support; (3) People: Collaboration & Productivity
  25. High Cost of Knowledge Sharing: Process breaks when organization structure changes; very difficult knowledge transfer; no "insurance policy" for intellectual assets
  26. Big Data Productivity: Real-time collaboration for the entire team; shared data, shared models, shared insights
  27. DEMONSTRATION
  28. GREENPLUM CHORUS: A Social Platform For Collaborative Data Science
  29. Chorus Enables Collaborative Data Science: Quickly deliver value from your data; share domain knowledge, content, and findings; keep teams productive as organizations change
  30. OPEN SOURCE NOW AVAILABLE
  31. Availability of the OpenChorus Project: www.openchorus.org; Chorus open source available on October 23rd, 2012; Apache 2.0 license; promotes an ecosystem of data sources, applications, and data science community
  32. The largest provider of social media data for enterprise use.
  34. GNIP Twitter Access: Access to historical Twitter feeds as Chorus data source through GNIP APIs; import Twitter into Chorus as sandbox data
  36. Tableau 8: Think with your Data. Visual Analytics; Business Integration; Fast; Any Data; Web & Mobile Authoring
  37. Tableau Server Integration: Provision Tableau Workbooks from Chorus data sources; link and co-author Tableau hosted work files; tag and annotate on Tableau assets from within Chorus
  39. Kaggle Top 27
  40. Kaggle Data Scientist Resources: Solicit for data scientist resources from Chorus interface: access Kaggle data scientist profiles; package Chorus workspace assets in project proposals; solicit for collaboration opportunities
  41. THINKING BIG TOGETHER: greenplum.com/communities #greenplum
