"Big Data & Machine Learning Innovation with Google Cloud Platform", Shira Kimchi Google Cloud Platform, Business Manager of MEA
Watch more from Data Natives Tel Aviv 2016 here: http://bit.ly/2hw1MY0
Visit the conference website to learn more: http://telaviv.datanatives.io/
About the Author:
Shira is the Business Manager for MEA at Google Cloud Platform, responsible for business strategy and growth of Google Cloud Platform solutions across the Middle East and Africa. Shira has a rich technical background across infrastructure and cloud services, working with startups and enterprise organizations. Prior to Google, Shira was an Account Executive at Microsoft, leading the relationship with key hi-tech enterprise accounts. In the last few years her business focus has been cloud, including SaaS, PaaS and IaaS solutions that answer the business needs of various customers.
4. Data & Analytics
3rd Gen Data Platforms Challenges
Data access to a variety of data sources.
Develop and build analytic models.
Data preparation, exploration and visualization.
Deploy models and integrate them into business processes and applications.
High performance and scalability for both development and deployment.
Perform platform, project and model management.
5.
"Google is living a few years in the future and sending the rest of us messages"
- Doug Cutting, Hadoop Co-Creator
6. Data & Analytics
Google Cloud Platform Vision
Single-node computing: "Some assembly required"
1st Wave: Colocation. Your kit, someone else's building. Yours to manage.
2nd Wave: Today's Cloud: Virtualized Data Centers. Standard virtual kit, for rent. Still yours to manage.
True, on-demand cloud
3rd Wave: An actual, global elastic cloud. Invest your energy in great apps.
8. Data & Analytics
Google Cloud Data Platform
Mobile apps, Sensors and devices, Web apps
Data Ingestion: Messaging, Logs
Databases / Storage: Relational, Key-value, Document, SQL, Wide Column, Object
Data Preparation & Processing: Stream processing, Batch processing, Data preparation
Analytics: Federated query, Data catalog
Exploration & Collaboration: Data exploration, Data visualization
Advanced Analytics & Intelligence: Development environment for Machine Learning, Pre-Trained Machine Learning models
Developers, Data scientists, Business analysts
9. Data & Analytics
Google Cloud Data Platform
Mobile apps, Sensors and devices, Web apps
Data Ingestion: Cloud Pub/Sub, App Engine
Databases/Storage: Cloud SQL, Cloud Bigtable, Cloud Datastore, Cloud Storage
Data Preparation & Processing: Cloud Dataflow, Cloud Dataproc
Analytics: Google BigQuery, Google Analytics 360, Cloud Dataproc, Google Drive
Exploration & Collaboration: Google BigQuery, Cloud Datalab, Google Analytics 360, Cloud Dataproc
Advanced Analytics & Intelligence: Cloud Machine Learning, Translate API, Vision API, Speech API
Developers, Data scientists, Business analysts
10. Data & Analytics
Managed Data Services - Focus on Insight vs Infrastructure
PB+ Scale, No-Ops, Batch & Streaming of Data
Chart 1: Insights/Programming, Resource Provisioning, Performance Tuning, Monitoring, Reliability, Deployment & Configuration, Handling Growing Scale, Utilization Improvements
Chart 2: Insights/Programming
11.
"We are very excited about the productivity benefits offered by Cloud Dataflow and Cloud Pub/Sub. It took half a day to rewrite something that had previously taken over six months to build using Spark"
- Paul Clarke, Director of Technology, Ocado
http://googlecloudplatform.blogspot.co.uk/2015/08/Announcing-General-Availability-of-Google-Cloud-Dataflow-and-Cloud-Pub-Sub.html
12. Hadoop + Local SSD
5X the IOPS at 0.5 the cost of AWS local SSD
Up to 1.5TB per instance
680,000 read IOPS and < 1ms latency
13. Data & Analytics
- Mattias P Johansson, Software Engineer, Spotify
"With Google Cloud Platform, we benefitted by having a virtual supercomputer on demand, without having to deal with all the usual space, power, cooling and networking issues. Just a few years ago, we would have needed to use the largest supercomputers on the planet to do what we're now able to do with Google" - Mark Johnson, CEO, Descartes Labs
"Right at the start of the partnership we were able to reduce time to insight from 96 hours to 30 minutes by using BigQuery." - Gary Sanders, Head of Digital Analytics, Lloyds Banking Group
"Everyone involved unanimously picked GCP. It came down to this: we believe the core technology is better." - Peter Bakkam, Platform Lead, Quizlet
Do you feel this way about your Data Warehouse?
14. Data & Analytics
Data Warehouse is the foundation of something bigger
Data Warehouses/Lakes: Cloud, On Premises
Machine Intelligence: Machine Learning APIs, Train your own Models
Predictive + Prescriptive analytics = Advanced analytics
15. Data & Analytics
Machine intelligence is already making a huge difference and there are many, many more opportunities
1. Identify categorizations that provide value, categories you're already evaluating for by hand today
2. Capture lots of examples of correct evaluations for that categorization, and use them to train an ML model
3. Evaluate the model by applying it against additional manually categorized data, correct and tune
4. Automatically categorize, and automatically extract value
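The four steps on this slide (pick a categorization, capture hand-labeled examples and train on them, evaluate against additional hand-labeled data, then categorize automatically) can be sketched in a few lines. This is a minimal illustration in plain Python, assuming a toy word-count classifier; the support-ticket texts, the "billing"/"technical" labels and the function names are all hypothetical and are not taken from the talk or from any Google product.

```python
from collections import Counter, defaultdict

def train(examples):
    """Count how often each word appears under each category label."""
    word_counts = defaultdict(Counter)
    for text, label in examples:
        word_counts[label].update(text.lower().split())
    return word_counts

def categorize(model, text):
    """Pick the category whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(counts[w] for w in words) for label, counts in model.items()}
    # sorted() makes ties deterministic (alphabetical order of labels).
    return max(sorted(scores), key=scores.get)

def evaluate(model, held_out):
    """Fraction of manually categorized held-out examples the model gets right."""
    correct = sum(categorize(model, text) == label for text, label in held_out)
    return correct / len(held_out)

# Steps 1-2: a categorization already done by hand, with labeled examples
# (hypothetical data).
labeled = [
    ("refund for my order", "billing"),
    ("charged twice on invoice", "billing"),
    ("app crashes on login", "technical"),
    ("error when uploading file", "technical"),
]
# Step 3: additional manually categorized data, kept out of training.
held_out = [
    ("invoice shows wrong refund", "billing"),
    ("login error after update", "technical"),
]

model = train(labeled)
accuracy = evaluate(model, held_out)

# Step 4: categorize new, unlabeled input automatically.
prediction = categorize(model, "crashes when uploading")
```

A real project would swap the word-count scorer for a trained ML model, but the shape of the loop (train, hold data back, measure, then automate) stays the same.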
16. THE LEADERS CIRCLE
Rapidly accelerating use of deep learning at Google
Used across areas: AlphaGo, Android, Apps, Gmail, Maps, Photos, Robotics, Speech, Search, Translation, YouTube and many others ...
Chart: number of directories containing model description files, 2012 to 2015 (axis from 0 to 1500)
18. Data & Analytics
"Machine learning will drive every successful huge IPO win in the next 5 years."
- Eric Schmidt, Executive Chairman, Alphabet Inc
ML + Google = :-)
Mission is to organize the world's information
Information = data, data = oxygen
Use of data can determine success
a lot of info in pictures
23 billion words in Wikipedia
40 billion textual lines in StreetView
Make the point that we didn't realize how valuable the pictures were originally, and later we revisited them and extracted all this additional value.
Willing to make a bet all the audience have similarly valuable data.
Data access to a variety of data sources. The Gartner advanced analytics customer reference survey indicates that while the majority of users are analyzing transactional data, new data sources (such as text, log and sensor data, and location data) are becoming increasingly common.
Data preparation, exploration and visualization is a key area of functionality as analysis is performed by users who may lack familiarity with the data and have increasingly high expectations of tools for automating data discovery, visualization and preparation.
The ability to develop and build analytic models, including clustering, classification and predictive models, forecasting models, simulation models and optimization models.
Ability to deploy models and integrate them into business processes and applications. Deployment is a significant pain point for many organizations, so allowing easy adoption of models as part of a business process or application, rather than them just being exported as code or a database score, improves project success rates.
Capabilities to perform platform, project and model management. Being able to validate the performance of models and track them once deployed is necessary; the ability to reuse models and audit their development and usage can be mandatory, rather than just desired, in certain more regulated industries and environments.
High performance and scalability for both development and deployment. The ability to perform at high levels of speed and accuracy with large volumes of data and streaming data is still critical for organizations, and with rising data volumes becomes even more of a differentiator.
Speaker:
Right now, Big Data = Big problems
1 - Removing the complexity of building and maintaining a Big Data system: Unlike other cloud services, Google provides the industry's only NoOps Big Data platform. NoOps means that application developers will never have to speak with an operations professional again. NoOps achieves this nirvana by using cloud infrastructure-as-a-service to give developers the resources they need when they need them.
2 - Capture and store all data for all business functions: Developers can capture data using Pub/Sub or by porting data from other Google services (e.g. Google Analytics). In addition, Google Cloud Storage offers developers durable and highly available object storage. Google created three simple storage product options to help developers improve the performance of their applications while keeping their costs low. These three product options use the same API, providing a simple and consistent method of access.
3 - Continuously accommodating greater data volumes and new data sources: Google understands that the amount of data companies have to store and analyze is growing exponentially. This is why we're constantly innovating in order to offer cheaper and faster storage services (Nearline), and also to make analysis tools such as BigQuery faster.
4 - Finding value in existing data very easily: Google BigQuery is designed to make it easy to analyze large amounts of data quickly. BigQuery enables analysts and developers to run fast SQL-like join and aggregate queries on datasets without the need for batch-based processing.
5 - Reducing the time from data collection to action: Google Cloud Platform for Big Data offers a proven and integrated end-to-end solution to make sense of large amounts of information in a very short amount of time. The end-to-end process of data management happens in the following stages:
Capture data using Pub/Sub, or by porting data from other Google services, e.g. Google Analytics
Process data using Dataflow or 3rd party offerings, e.g. Hadoop
Store data using Google Cloud Storage (Standard, DRA and Nearline), Bigtable or BigQuery
Analyze data using BigQuery or 3rd party offerings, e.g. Spark
6 - Removing the hurdles to innovate and iterate with Big Data: Google has led the industry with innovations in software infrastructure such as MapReduce, Bigtable and Dremel. Today, Google is pushing the next generation of innovation with products such as Spanner and Flume. When you build on Cloud Platform, you get access to Google's technology innovations faster.
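As a rough illustration of the "join and aggregate queries" mentioned in point 4, the sketch below runs one such query. It uses Python's built-in sqlite3 module as a local stand-in, not BigQuery itself; the `users` and `events` tables, their columns and the sample rows are hypothetical, and BigQuery's own SQL dialect and client libraries differ in detail.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (user_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE events (user_id INTEGER, amount REAL);
    INSERT INTO users  VALUES (1, 'IL'), (2, 'IL'), (3, 'ZA');
    INSERT INTO events VALUES (1, 10.0), (1, 5.0), (2, 20.0), (3, 7.5);
""")

# Join events to users, then aggregate event count and total amount per country.
query = """
    SELECT u.country, COUNT(*) AS n_events, SUM(e.amount) AS total
    FROM events AS e
    JOIN users AS u ON u.user_id = e.user_id
    GROUP BY u.country
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for country, n_events, total in rows:
    print(country, n_events, total)
```

The point of the example is the query shape: the analyst writes one declarative statement and gets the aggregate back directly, with no separate batch job to schedule.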
By 2020, predictive and prescriptive analytics will attract 40% of enterprises' net new investment in business intelligence and analytics.
By 2018, more than half of large organizations globally will compete using advanced analytics and proprietary algorithms, causing the disruption of entire industries.
Advanced analytics is the analysis of all kinds of data using sophisticated quantitative methods (such as statistics, descriptive and predictive data mining, machine learning, simulation and optimization) to produce insights that traditional approaches to business intelligence (BI), such as query and reporting, are unlikely to discover.
Understanding that the space is categorizable, testable. Categorically.
Do you have sources of data that you have already categorized correctly? Humans have interpreted the data and put it into categories. This might be a computationally/operationally expensive way to do it... how little data do I need?
Test and tune.
Process/app can be used to collect
Automatic clustering / k-means: are there things grouped in there, categories I can't see?
Some groupings will seem irrelevant artificial-est
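To make the clustering note concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It assumes 2-D points and seeds the centroids with the first k points, which is a simplification for reproducibility; the sample points are hypothetical, and a production setup would normally use a library implementation with randomized or k-means++ initialization.

```python
def kmeans(points, k, iterations=10):
    """Lloyd's algorithm on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid
        # (squared Euclidean distance).
        assignments = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments

# Unlabeled data (hypothetical): two groups nobody has named yet.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignments = kmeans(points, k=2)
```

The algorithm only returns group numbers; deciding whether a discovered group is meaningful or irrelevant remains a human judgment, as the note says.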
Google Products:
AlphaGo, Apps, Maps, Photos, Gmail, Speech, Android, YouTube, Translation, Robotics Research, Image Understanding, Natural Language Understanding, Drug Discovery
Outside of Google, the most popular use cases are:
To summarize,
Cloud Machine Learning provides the latest innovations in vision and speech from Google Research and services like Photos, Google app, Translate, and Inbox.
These ML driven capabilities are now simple APIs in Translate API, Vision API, and Speech API.
Translate and Vision are fully launched. Cloud Speech and NL entered Beta.
We're very excited to bring this innovative technology to you guys