The new dominant companies are running on data

The cost of digital transformation is dropping rapidly. Technologies and methodologies are evolving, opening up new opportunities for both new and established corporations to drive business. We will examine specific examples of how and why a combination of robust infrastructure, a cloud-first strategy, and machine learning can take your company to the next level of value and efficiency.

Rich Dill, SnapLogic's enterprise solutions architect, at Big Data LDN 2017.

Published in: Data & Analytics

  1. The new dominant companies are running on data: Take your company to the next level of value and efficiency. Rich Dill, Enterprise Solutions Architect, rdill@snaplogic.com
  2. What problem do we want to solve? How do we get value from all this data? ©2017 SnapLogic, Inc. All Rights Reserved. Confidential Content.
  3. What is the solution? Sometimes it is not obvious to everyone involved
     Decisions made without facts are opinions
       ◦ “What are the facts? Again and again and again – what are the facts? Shun wishful thinking, ignore divine revelation, forget what “the stars foretell,” avoid opinion, care not what the neighbors think, never mind the unguessable “verdict of history” – what are the facts, and to how many decimal places? You pilot always into an unknown future; facts are your single clue. Get the facts!” RH
     Turn your latent assets into liquid ones to realize their value
         - No longer latent but now liquid
       ◦ Data has to be on the move
         - It must be leveraged by the masses
     The business goal
       ◦ Actually deliver on the promise of transforming data into actionable information
       ◦ Predictive analytics improve forecasting
       ◦ Prescriptive analytics can guide business behaviour
       ◦ Geolocation analytics can improve resource utilization and inventory turns
     What are the results?
         - Delivering insights to executives yields direction
         - Delivering insights to line workers yields results
  4. Not everyone has the same problem: use cases are variations on a common theme
  5. Sampling of Industry Focused Use Cases
     Use case: deployment pattern
       ◦ Fraud Detection: Data Refinery or Data Lake Population
       ◦ Upsell & Cross-sell: Hub-and-Spoke
       ◦ Customer360: Hub-and-Spoke
       ◦ Fault Prediction: Data Refinery
       ◦ Sentiment Analysis: Data Refinery or Data Lake Population
       ◦ Personalization: Data Refinery or Data Lake Population
       ◦ M & A: Common Data Modeling
       ◦ Management Consulting: Common Data Modeling
     Umbrella industry: use cases marked (X)
       ◦ Manufacturing: X X
       ◦ Retail: X X X X X X
       ◦ Healthcare: X X
       ◦ Financial Services: X X X X X
       ◦ Energy: X X
       ◦ Logistics & Transportation: X X X
       ◦ Services: X X
       ◦ CPG: X X X
       ◦ Computer Software: X X
       ◦ Telecom: X X X X X X X
  6. Data Lake Population [diagram]: Source Systems 1 to N (Streaming, Database, SaaS App, File); Ingestion (Pull, Push, Stream); Processing/Transformation; Data Lake (Storage: S3, HDFS)
  7. Data Refinery [diagram]: Source Systems 1 to N (Streaming, Database, SaaS App, File); Ingestion (Pull, Push, Stream); Processing/Transformation; Data Lake (Storage: S3, HDFS); OLAP (Push)
  8. Common Data Model [diagram]: three groups of Source Systems 1 to N (Streaming, Database, SaaS App, File); Ingestion (Pull, Push, Stream); Processing/Transformation; HDFS, S3, Blob Staging **; Data Lake (Storage: S3, HDFS); Downstream Apps (Push)
  9. Hub-and-Spoke [diagram]: Source Systems 1 to N (Streaming, Database, SaaS App, File); Ingestion (Pull, Push, Stream); Processing/Transformation; Data Lake (Storage: S3, HDFS); EDW (Push); three Data Marts; Data Science Workbench
  10. The first solution: custom built
  11. Michelangelo@Uber: Welcome my son to the machine…
      The problem
        ◦ “There were no systems in place to build reliable, uniform, and reproducible pipelines for creating and managing training and prediction data at scale.”
      The solution: Machine Learning as a Service
        ◦ ML-as-a-service platform that democratizes machine learning and makes scaling AI to meet the needs of business as easy as requesting a ride. Michelangelo consists of a mix of open source systems and components built in-house. The primary open sourced components used are HDFS, Spark, Samza, Cassandra, MLLib, XGBoost, and TensorFlow.
      Cost
        ◦ Two years
        ◦ $60 million
      Results
        ◦ A Wall Street Journal report claims SoftBank has been in touch with Uber with the apparent goal of buying a “multi-billion dollar stake” in the company. To date, Uber has raised close to $12 billion from investors, with its most recent valuation reportedly above $60 billion. July 25, 2017
  12. A model feature report
  13. Building on success: Both the systems and staff continue to learn and evolve
        ◦ “As the platform layers mature, we plan to invest in higher level tools and services to drive democratization of machine learning and better support the needs of our business”
      For more information
        ◦ https://eng.uber.com/michelangelo/
  14. The second solution: custom integration
  15. The five-year plan: Rome was not built in a day
      The problem
        ◦ A large multinational corporation grew in part by acquisition
        ◦ Technology stacks and silos as far as the eye can see
        ◦ They had one or more of every kind of technology
        ◦ They had hundreds of data warehouses and data marts
      The cost
        ◦ Implementing any new business process was blindingly expensive, took too long, and did not deliver what users expected or needed
      The solution
        ◦ Simplify, standardize, consolidate, and adopt a cloud strategy
        ◦ Insert a Data Lake into the data lifecycle
        ◦ Adopt a Citizen Integrator model wherever possible
      The business result
        ◦ The combination of migration from a perpetual software license model to SaaS and the reduced labor costs of the Citizen Integrator model resulted in savings in the millions
  16. The evolving data lifecycle [diagram]: Source Systems 1 to N; Data Lake; EDW; Data Marts; Data Science Workbench
        ◦ Two stages: OLTP to DW and data marts
        ◦ Three stages: OLTP to Data Lake, then to on shore data marts and DW
  17. Results: Happy productive business users
        ◦ Faster time to market for new programs with agility and LOB alignment
        ◦ Over 500 users from almost all business units
        ◦ Savings in the millions
        ◦ A more agile business environment
  18. The third is a solution
  19. The solution approach: Business goals drive the architectural requirements
      The problem/business goal
        ◦ Obtain a customer 360 view by removing the constraints of an on-premises environment and moving to a cloud-first environment where multiple departments/constituents can access data and obtain insights.
      Key characteristics of a cloud-first enterprise stack
        ◦ Scalable
        ◦ Collaborative
        ◦ Promotes easy data sharing
        ◦ Reduces on-premises maintenance overhead with auto updates
      The process
        ◦ Upgrade the cloud data warehouse
        ◦ Move legacy BI to a modern tool like Tableau or PowerBI, for greater data fluency
        ◦ Create a foundation for an AI/ML workbench for predictive analytics
        ◦ Use an ML framework such as TensorFlow from Google, which offers a Java API so that models can run wherever Java does
  20. Proposed Enterprise Stack [diagram]
      Sources & Targets
        ◦ Streaming: Kafka, JMS
        ◦ Database: HBase, Hive, Dynamo, Mongo, Redshift, SQLServer, AzureSQL, Aurora, MySQL
        ◦ Webservices: REST, SOAP
        ◦ File: Flat Files, XML, JSON, Excel, Word doc, PDF, S3, FTP/SFTP, ORC, Parquet
        ◦ Social Media: Facebook, LinkedIn, Twitter
      Stack components
        ◦ SnapLogic (AWS Deployed): Pull, Push, Stream
        ◦ Amazon S3
        ◦ Amazon EMR
        ◦ Tableau, SAS, Cognos Analytics
        ◦ Machine Learning Integration Point
  21. Key Benefits of Proposed Architecture
        ◦ Enables migration in phases rather than all at once
        ◦ Promotes data re-use and reduces time to insight across the organization
        ◦ Scalable and flexible to accommodate the company’s changing needs
        ◦ Reduced maintenance costs to enable IT to stay focused on enabling the business
        ◦ Complete view of the customer with real-time data updates
        ◦ Better focused marketing programs (less waste, higher performance)
        ◦ Greater customer loyalty due to more relevant customer engagement
  22. Observations from the field: Some observations and a few of Rich’s rules of technology
      Technology is a tool, use the right one for the job
        ◦ It amazes me how some engineers have almost religious beliefs in their favorite technology
          - If the only tool you have is a hammer…
      Software evolves like a funnel
        ◦ Early releases have limitations that are fixed with later releases
      We work in an industry where change is constant
        ◦ Absolute truths can change every 5-10 years
        ◦ The rate of change can make you old, or keep you young. As the Iron Giant said, choose!
      Different technologies require different approaches and techniques
        ◦ I don’t code Scala like C or COBOL
        ◦ “A mind is like a parachute: it only functions when it is open” Thomas Dewar
      The adoption curve entails risk… and costs
        ◦ There is a reason we call it the bleeding edge
      Open source is not free
        ◦ The money you save on license cost, you will spend on additional labor, plus 25%
  23. Q & A
  24. Thank You. San Mateo, CA; Boulder, CO; New York, NY; London, UK; Melbourne, AUS; Hyderabad, India. www.snaplogic.com. Rich Dill, Enterprise Solutions Architect, rdill@snaplogic.com
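
The use-case table on slide 5 pairs each use case with a deployment pattern. As a rough illustration only, that pairing can be held in a small lookup. The sketch below assumes the "Deployment Pattern" row lines up left to right with the eight use-case columns; the names DEPLOYMENT_PATTERNS, patterns_for, and use_cases_for are hypothetical and do not come from the deck.

```python
from typing import Dict, List

# Use case -> deployment pattern(s), transcribed from the "Deployment Pattern"
# row of the slide 5 table. ASSUMPTION: the row aligns left to right with the
# use-case columns. All names in this block are hypothetical.
DEPLOYMENT_PATTERNS: Dict[str, List[str]] = {
    "Fraud Detection": ["Data Refinery", "Data Lake Population"],
    "Upsell & Cross-sell": ["Hub-and-Spoke"],
    "Customer360": ["Hub-and-Spoke"],
    "Fault Prediction": ["Data Refinery"],
    "Sentiment Analysis": ["Data Refinery", "Data Lake Population"],
    "Personalization": ["Data Refinery", "Data Lake Population"],
    "M & A": ["Common Data Modeling"],
    "Management Consulting": ["Common Data Modeling"],
}


def patterns_for(use_case: str) -> List[str]:
    """Return the candidate deployment patterns for a use case (case-insensitive)."""
    wanted = use_case.strip().casefold()
    for name, patterns in DEPLOYMENT_PATTERNS.items():
        if name.casefold() == wanted:
            return list(patterns)
    raise KeyError(f"unknown use case: {use_case!r}")


def use_cases_for(pattern: str) -> List[str]:
    """Return every use case that lists the given deployment pattern."""
    wanted = pattern.strip().casefold()
    return [
        name
        for name, patterns in DEPLOYMENT_PATTERNS.items()
        if any(p.casefold() == wanted for p in patterns)
    ]
```

Looking up patterns_for("Customer360") returns the Hub-and-Spoke pattern, while use_cases_for("Data Refinery") lists the use cases for which a refinery is one of the suggested options.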

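The Data Lake Population diagram on slide 6 names three ingestion modes (pull, push, stream) feeding a processing/transformation step and lake storage. A minimal sketch of how those three modes differ is shown below. It is not SnapLogic's implementation: DataLake is an in-memory stand-in for storage such as S3 or HDFS, and every class, method, and field name here is a hypothetical chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


class DataLake:
    """In-memory stand-in for lake storage such as S3 or HDFS (hypothetical)."""

    def __init__(self) -> None:
        self.objects: Dict[str, List[Record]] = defaultdict(list)

    def write(self, source: str, records: List[Record]) -> None:
        self.objects[source].extend(records)


class IngestionLayer:
    """Applies a transformation to incoming records, then lands them in the lake."""

    def __init__(self, lake: DataLake, transform: Callable[[Record], Record]) -> None:
        self.lake = lake
        self.transform = transform

    def _land(self, source: str, records: Iterable[Record]) -> int:
        batch = [self.transform(dict(record)) for record in records]
        self.lake.write(source, batch)
        return len(batch)

    def pull(self, source: str, fetch: Callable[[], Iterable[Record]]) -> int:
        """Pull: the ingestion layer calls the source to fetch records."""
        return self._land(source, fetch())

    def push(self, source: str, records: Iterable[Record]) -> int:
        """Push: the source delivers a finished batch to the ingestion layer."""
        return self._land(source, records)

    def stream(self, source: str, events: Iterable[Record]) -> int:
        """Stream: events are landed one at a time as they arrive."""
        return sum(self._land(source, [event]) for event in events)


# Example wiring; the source names and the "ingested" flag are illustrative.
lake = DataLake()
ingest = IngestionLayer(lake, transform=lambda r: {**r, "ingested": True})
pulled = ingest.pull("database", lambda: [{"id": 1}, {"id": 2}])
pushed = ingest.push("file", [{"id": 3}])
streamed = ingest.stream("streaming", iter([{"id": 4}, {"id": 5}]))
```

The only difference between the modes in this sketch is who initiates the transfer and how many records move per call; the transformation and storage steps are shared, which mirrors the single ingestion and processing layer drawn on the slide.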