Google at the Big Data Russia conference

Presentation by Google Russia at the Big Data Russia conference (http://bigdatarussia.ru/).

Published in: Business

  1. Big Data with Google Cloud Platform: Focus on insight, not infrastructure
      Google confidential │ Do not distribute
      Daniel Bergqvist, Solution Engineer, Big Data Technologies
      Olga Strelova, Cloud Platform Sales, Tel: +7 495 734-71-41, olgastrelova@google.com
  2. Why Big Data?
  3. Big Data is driving Big Value
      Used data from telematic sensors in over 46K vehicles to:
      ● Reduce daily routes by 85 million miles
      ● Save 8.4 million gallons of fuel
      ● Save over $30 million in miles cut/driver/day
      Created the Snapshot device to collect data on driving habits and user behavior in real time; calculated the applicable discount to each driver's monthly premium based on individual behavior.
      Analyzed the activity of their entire customer base (over 7M customers and 19B images); uncovered trends that improved customer acquisition, retention and value through optimized marketing.
  4. Trends
      ● Increasing Digitization of Human & Economic Activity
      ● Falling Costs of Storage & Computing
      ● Increasing Pace of Innovation
  5. Opportunities with Big Data
      ● Recognize and seize market trends before your competitors
      ● Capture business value from information
      ● Create a smarter, learning organization
  6. Big Data remains inaccessible
      Big Data is Hard:
      ● Complex technical infrastructure to support distributed computing
      ● Requires specialized expertise
      ● Time consuming
      Big Data is Expensive:
      ● Storage costs scale with larger datasets
      ● Computing resources must be provisioned for peak loads
      ● Personnel are expensive
  7. Google is making Big Data accessible
      From "Big Data is Hard" and "Big Data is Expensive" to Easy and Affordable:
      ● No complex data architecture required
      ● Use the technical and product skillsets you already have
      ● Query within seconds and get real-time results
      ● Pay on-demand for only the resources you use
      ● Take advantage of falling prices & Moore's Law
      ● Reduce infrastructure management burden
  8. Where did these come from?
  9. To organize the world's information and make it universally accessible and useful
  11. Google Services in Numbers
      ● Search: 1B Searches/Month; >25% of F500 (GSA)
      ● Android: 1.5M+ activations per day; 900M+ devices
      ● YouTube: 100 hours of video uploaded per minute
      ● G+: 500M+ accounts; 135M+ active in stream
      ● Apps: 500M+ Gmail
      ● Chrome: 310M+ browser users
      ● Maps & Earth: 1B+ downloads; 200M+ mobile; 10M+ activations on iOS
      ● Cloud Platform: 4.75M+ apps; 250K+ developers
  12. Google is a pioneer in Big Data
      [Timeline figure. Technologies shown: GFS, MillWheel, MapReduce, Dremel, Spanner, Big Table, Colossus, Flume. Years on the axis: 2002, 2004, 2006, 2008, 2010, 2012, 2013.]
  13. We help you manage the entire lifecycle of Big Data
      [Lifecycle figure. Stages: Capture, Store, Process, Analyze. Products shown: Pub/Sub, Storage, SQL, Datastore, Dataflow, BigQuery, Open Source Tools.]
  14. Our Big Data products: Computing Patterns
      Cloud Pub/Sub
      • Event management system that simplifies analytics application architecture
      • Connect your services with reliable, many-to-many asynchronous messaging
      • Guarantees that messages will be delivered whether or not all consumers are online
      • Provides a single global ingestion point, not dependent on zone or regional availability
      • Scales to what you need with no wasted capacity
      Cloud Dataflow
      • Successor to MapReduce and based on Google technologies, including Flume and MillWheel
      • Fully managed service
      • Create data pipelines that ingest, transform and analyze in batch or streaming mode
      • Takes care of deploying, maintaining and scaling infrastructure
      BigQuery
      • Interactive analysis of large-scale datasets, providing real-time insights
      • Run fast SQL queries against virtually limitless datasets in seconds
      • Full visibility and control over pricing; pay only for querying and storage
      • No complex data architecture required
      Open Source Tools
      • Run Hadoop and other FOSS on Cloud Platform; take advantage of performance, ease of use and cost efficiency
      • Using cloud resources eliminates capital costs and reduces administration time
      • With a single command, start a cluster running Hadoop, Hive, Pig, Spark or Shark to get up and running quickly without worrying about configuration hassles
      • Using GCP storage products lets you access your data from any Hadoop deployment
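The many-to-many messaging and "delivered whether or not all consumers are online" points on the slide above can be illustrated with a small in-memory sketch. This is not the Cloud Pub/Sub API: the `InMemoryBroker` class, its method names, and the topic and subscription names are all hypothetical, chosen only to show the idea that each subscription keeps its own queue of undelivered messages.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy many-to-many broker for illustration; NOT the Cloud Pub/Sub API."""

    def __init__(self):
        # topic name -> {subscription name -> queue of undelivered messages}
        self._subscriptions = defaultdict(dict)

    def subscribe(self, topic, subscription):
        # Each subscription gets its own queue, so messages accumulate
        # even while that consumer is offline.
        self._subscriptions[topic].setdefault(subscription, deque())

    def publish(self, topic, message):
        # Fan out: every subscription on the topic receives its own copy.
        for queue in self._subscriptions[topic].values():
            queue.append(message)

    def pull(self, topic, subscription, max_messages=10):
        # Hand over up to max_messages and remove them from the queue.
        queue = self._subscriptions[topic][subscription]
        batch = []
        while queue and len(batch) < max_messages:
            batch.append(queue.popleft())
        return batch


broker = InMemoryBroker()
broker.subscribe("sensor-events", "dashboard")
broker.subscribe("sensor-events", "archiver")
broker.publish("sensor-events", {"unit": "A1", "temp_c": 21.5})
broker.publish("sensor-events", {"unit": "B7", "temp_c": 19.0})

# The archiver never pulled while messages were published,
# yet it still receives both of them when it comes online.
archived = broker.pull("sensor-events", "archiver")
```

A real managed service would add durability, acknowledgements and redelivery, none of which this sketch attempts.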
  15. Let's look at specific examples
  16. Example 1: Marketing Analytics
      Using Google Cloud Platform for marketing analytics enables a deeper understanding of how marketing investments are performing.
      What Cloud Platform offers:
      ● Easily micro-segment by looking for discrete patterns in large sets of customer data
      ● Measure campaigns by combining multiple datasets that can track campaigns across channels and users across stages of the buying funnel
      ● Market-mix modeling to optimize spend across channels
      ● Identify patterns and trends in real time to improve customer acquisition and ROI
      The Technology:
      ● Integration between Google Analytics Premium and BigQuery allows for data mashups, analysis of user interaction across multiple devices, and complex queries at lightning speed to gain deeper, broader insights
      ● Cloud Dataflow helps you ingest and analyze data from live campaigns, existing CRM tools, and any other data sources you need
      ● Open Source Tools and Connectors let you harness open-source tools such as Hadoop and Spark for flexibility when analyzing campaign data
      ● BigQuery enables interactive analysis of unlimited amounts of data, allowing you to seize opportunities and optimize in a timely manner, thereby increasing acquisition and ROI
  17. Boosting Sales While Improving Shopping Experience
      Home furnishing retailer Rooms To Go simplifies the consumer shopping experience by offering completely designed room packages.
  18. Example 2: Sensor Data & IoT
      Using Google Cloud Platform for sensor data & IoT enables use of diffuse data sources to optimize large-scale systems & improve production processes.
      What Cloud Platform offers:
      ● Scalable, reliable platform for capturing and managing IoT data
      ● Ability to run analytics (streaming and historical) over this data
      ● Improve customer experiences based on faster responses to events
      ● Cost-effective storage needed to process vast amounts of data
      The Technology:
      ● Google Cloud Storage, Cloud SQL, and Datastore provide scalable and secure ways to store data
      ● Pub/Sub provides a reliable system for event collection and management
      ● Dataflow lets you filter, aggregate and enrich data for both streaming and batch analysis under one API
      ● BigQuery allows for interactive analysis of unlimited data to uncover trends in large databases and across all customers in order to improve customer experience
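The filter, aggregate and enrich steps mentioned for sensor data can be sketched in plain Python. This is an illustration of the three steps only, not the Dataflow API; the readings, the `UNIT_REGION` lookup table and the `run_pipeline` function are hypothetical and invented for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sensor readings and unit metadata, invented for illustration.
READINGS = [
    {"unit": "A1", "temp_c": 21.5},
    {"unit": "A1", "temp_c": 22.5},
    {"unit": "B7", "temp_c": None},  # faulty reading, removed by the filter step
    {"unit": "B7", "temp_c": 19.0},
]
UNIT_REGION = {"A1": "north", "B7": "south"}


def run_pipeline(readings, unit_region):
    # Filter: drop readings that carry no temperature.
    valid = [r for r in readings if r["temp_c"] is not None]

    # Aggregate: collect temperatures per unit so they can be averaged.
    by_unit = defaultdict(list)
    for r in valid:
        by_unit[r["unit"]].append(r["temp_c"])

    # Enrich: attach the region looked up from the side table.
    return {
        unit: {
            "avg_temp_c": mean(temps),
            "region": unit_region.get(unit, "unknown"),
        }
        for unit, temps in by_unit.items()
    }


summary = run_pipeline(READINGS, UNIT_REGION)
```

Because `run_pipeline` accepts any iterable of readings, the same function could be applied to a stored batch or to one window of a stream, which is the spirit of the "one API" point, though a real streaming system also has to handle windowing and late data.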
  19. Connected Equipment/Devices
      Lennox International Inc. is an American company. Through its subsidiaries, it provides climate control products for the heating, ventilation, air conditioning, and refrigeration markets in housing and commercial sectors around the world.
      Goal: Capture detailed product performance data and ambient conditions from the installed units for better innovation and customer service
      ● Innovation: Finding areas for product improvements and new designs
      ● Customer Delight: Proactively providing energy settings advice to customers based on usage, weather conditions, etc.
      ● Customer Service: Predictive maintenance to avoid major breakdowns
      ● Cost Savings: Better understanding of failure points feeding back into better design, helping reduce warranty and replacement costs
  20. Example 3: Log Data
      Using Google Cloud Platform for Log Data enables easy management of massive log files constantly ingesting real-time data with much shorter response times.
      What Cloud Platform offers:
      ● Better management of massive log files
      ● An efficient platform for capturing, managing and analyzing IoT infrastructure
      ● The ability to continuously identify customer trends and take timely actions
      The Technology:
      ● BigQuery handles log files of massive volume, constantly ingesting real-time data with much shorter response times
      ● Pub/Sub provides a fully managed service for reliable event ingestion, distribution and notifications, which automatically scales to what you need with no wasted capacity
      ● Dataflow is a pipeline management system that allows you to examine a real-time stream of data as well as compare it to historical data in order to capture significant patterns and activities
      ● Apps running in Compute Engine and App Engine benefit from advanced log analytics based on data streaming with real-time alerts
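Comparing a real-time stream against historical data to raise alerts can be sketched as a sliding-window check. This is a minimal illustration, not how Dataflow or BigQuery implement it; the `detect_spikes` function, its `window` and `factor` parameters, and the sample numbers are all assumptions made for the example.

```python
from collections import deque


def detect_spikes(stream, baseline_rate, window=3, factor=2.0):
    """Yield (index, window_average) whenever the average of the last
    `window` values exceeds `factor` times the historical `baseline_rate`."""
    recent = deque(maxlen=window)
    for i, value in enumerate(stream):
        recent.append(value)
        # Only compare once a full window of recent values is available.
        if len(recent) == window:
            avg = sum(recent) / window
            if avg > factor * baseline_rate:
                yield i, avg


# Hypothetical errors-per-minute counts; the historical baseline is assumed to be 10.
errors_per_minute = [9, 11, 10, 30, 35, 40, 12]
alerts = list(detect_spikes(errors_per_minute, baseline_rate=10))
```

With these sample values the first alert fires at index 4, once the three-minute average has climbed above twice the baseline.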
  21. Motorola
      [Architecture diagram. Components shown: Phones, BigQuery Storage, BigQuery Workflows, BigQuery, Compute Engine, Hadoop MapReduce Workflows, App Engine, Cloud Storage. Consumers of the results: Business Analysts, Applications, Visualizations.]
  22. Example 4: SaaS
      Using Google Cloud Platform for SaaS enables ease of management for analytics.
      What Cloud Platform offers:
      ● Ease of integration with open source tools
      ● A platform to capture, process and analyze large-scale analytics without needing to worry about building a complex infrastructure
      ● Technology that scales and requires minimal administration
      ● The most cost-effective, fastest way to store and analyze data
      The Technology:
      ● Connectors and Tools for Hadoop data sources allow you to easily install different open source processing frameworks such as Spark, Shark, Hive and Pig to take advantage of interoperability and portability across these frameworks as well as other Google Cloud Platform products under one system
      ● Dataflow takes care of ingestion, transformation and analysis of data, providing real-time access to application and consumer data across a set of devices
      ● Compute Engine allows you to easily scale up and down depending on your workload. Also, per-minute billing lets you pay for exactly what you use, and sustained-use discounts automatically reward you for running steady-state workloads
      ● BigQuery provides a 99.9% uptime SLA, and you pay only for the storage you need and the queries you run, giving you full visibility and control
      ● Cloud Storage and BigQuery require no hardware or software, eliminating capital expenditure and the need to build complex infrastructure
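The per-minute billing point is simple arithmetic, sketched below against whole-hour billing. The rate and job length are made-up figures, not Google prices, and the sketch ignores any minimum charge and the sustained-use discounts mentioned on the slide.

```python
import math


def cost_per_minute_billing(minutes_used, hourly_rate):
    # Charge for each minute actually used.
    return minutes_used * hourly_rate / 60


def cost_per_hour_billing(minutes_used, hourly_rate):
    # Round usage up to whole hours, as an hourly-billed service would.
    return math.ceil(minutes_used / 60) * hourly_rate


# Hypothetical figures: a 75-minute job at a made-up rate of 0.60 per hour.
per_minute = cost_per_minute_billing(75, 0.60)
per_hour = cost_per_hour_billing(75, 0.60)
```

Under these assumed numbers the 75-minute job is billed as 75 minutes in the first case and as two full hours in the second.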
  23. Streak - CRM in email
      Managing millions of interactions and recommendations per day with Prediction API and BigQuery
  24. Example 5: Traditional Hadoop Workloads
      Using Google Cloud Platform for Hadoop Workloads enables an easy and effective way to unlock the power of the Apache Hadoop framework.
      What Cloud Platform offers:
      ● Quick startup times
      ● Unmatched value with per-minute billing to optimize for scale and speed
      ● Agility to mix and match data with multiple open source software and cloud services without worrying about configuration
      ● Greater stability for running Hadoop
      ● Flexibility and control of resizing your cluster depending on workload
      ● An easy way to leverage the Hadoop framework without investing in costly infrastructure and administration
      The Technology:
      ● Compute Engine virtual machines start in seconds
      ● bdutil allows you to easily deploy and use the best tools from the open-source ecosystem. With a single command, you can start a cluster running Hadoop, Hive, Pig, Spark or Shark to get up and running quickly without worrying about configuration hassles
      ● Cloud Storage frees you from the burden of investing in complex disks and machines and provides flexibility to scale up and down when needed
      ● Connectors provide access to Cloud Storage, BigQuery and Datastore, which lets you turn down your cluster without losing any of your data and access your data from any of your Hadoop deployments
  25. Cdiscount.com
      France's largest e-commerce site, Cdiscount.com, is using Compute Engine because it's 15x faster than their on-premises data warehouse.
  26. "Google probably processes more information than any company on the planet and tends to have to invent tools to cope with the data. As a result its technology runs a good five to 10 years ahead of the competition." (Bloomberg Businessweek, June 2014)