© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
November 1, 2017 | 11:00 AM PT
Automating Big Data Technologies for Faster Time-to-Value
Today’s Presenters
David Potes, Solutions Architect, Amazon Web Services
Minesh Patel, Technical Director, Qubole
Seth Myers, Senior Data Scientist, Demandbase
Today’s Agenda
1. An overview of AWS and AWS Marketplace, with an emphasis on
AWS data lake solutions and Qubole
2. Overview of the Qubole solutions featured in our story
3. Challenges faced by Demandbase
4. The Demandbase success story with AWS and Qubole
5. Q&A/Discussion
Learning Objectives
1. How to dramatically reduce management complexities for analytics
operations
2. How to reduce the costs of processing and analyzing data in a data
lake on AWS
3. How to operate at the scale and efficiency of a large enterprise,
with a small data team
Introduction to Data Lake
Concepts
Unlocking Data
Most companies and organizations are embarking on
ambitious innovation initiatives to unlock their data.
The data already exists but goes unused or is locked away
from complementary data sets in isolated data silos.
Enter Data Lake Architectures
A data lake is a new and increasingly
popular architecture for storing and analyzing
massive volumes and heterogeneous
types of data.
Benefits of a Data Lake – All Data in One Place
Store and analyze all of your data,
from all of your sources, in one
centralized location.
“Why is the data distributed in
many locations? Where is the
single source of truth?”
Benefits of a Data Lake – Quick Ingest
Quickly ingest data
without needing to force it into a
pre-defined schema.
“How can I collect data quickly
from various sources and store
it efficiently?”
Benefits of a Data Lake – Storage vs Compute
Separating your storage and compute
allows you to scale each component as
required.
“How can I scale up with the
volume of data being generated?”
Benefits of a Data Lake – Schema on Read
“Is there a way I can apply multiple
analytics and processing frameworks
to the same data?”
A Data Lake enables ad-hoc
analysis by applying schemas
on read, not write.
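As a minimal sketch of schema on read, assume a handful of raw JSON event records stored as they arrived. The field names and the two reader functions below are hypothetical and not from the presentation. Each consumer applies its own schema when it reads the same stored lines:

```python
import json

# Raw records are stored exactly as they arrived; no schema was enforced on write.
# All field names here are hypothetical.
RAW_LINES = [
    '{"user": "a1", "page": "/pricing", "ms": 5300, "ts": "2017-11-01T11:00:00"}',
    '{"user": "b2", "page": "/docs", "ms": 1200, "ts": "2017-11-01T11:00:05"}',
    '{"user": "a1", "page": "/docs", "ts": "2017-11-01T11:01:10"}',
]


def read_page_views(lines):
    """Schema A: (user, page) pairs, e.g. for a page-view count."""
    for line in lines:
        record = json.loads(line)
        yield record["user"], record["page"]


def read_dwell_times(lines):
    """Schema B: (page, seconds) pairs; records without 'ms' are skipped."""
    for line in lines:
        record = json.loads(line)
        if "ms" in record:
            yield record["page"], record["ms"] / 1000.0


# Two different views over the same stored data.
page_views = list(read_page_views(RAW_LINES))
dwell_times = list(read_dwell_times(RAW_LINES))
```

Neither reader required the stored data to be reshaped; a third framework could add its own interpretation later without rewriting what is already in the lake.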
AWS Approach to Data Lake
Amazon S3 is the Data Lake
Designed Benefits of an Amazon S3 Data Lake
Fixed Cluster Data Lake
• Limited to the single tool contained on the cluster (e.g., Hadoop, a data warehouse, or Cassandra), while use cases and ecosystem tools change rapidly
• Expensive to add nodes to add storage capacity
• Expensive to replicate data against node loss
• Complexity in scaling local storage capacity
• Long refresh cycles to add additional storage equipment
Amazon S3 Data Lake
• Decouple storage and compute by making Amazon S3 object-based storage, not a fixed tool cluster, the data lake
• Flexibility to use any and all tools in the ecosystem: the right tool for the job
• Future-proof your architecture: as new use cases and new tools emerge, you can plug and play the current best of breed
Why Amazon S3 for Data Lake?
Durable
• Designed for 11 9s of durability
Available
• Designed for 99.99% availability
High performance
• Multiple upload
• Range GET
Scalable
• Store as much as you need
• Scale storage and compute independently
• No minimum usage commitments
Integrated
• Amazon EMR
• Amazon Redshift
• Amazon DynamoDB
Easy to use
• Simple REST API
• AWS SDKs
• Read-after-create consistency
• Event notification
• Lifecycle policies
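Range GET lets a client fetch part of an object, so a large object can be read in parallel chunks. Below is a minimal sketch of how the byte ranges might be computed, using the standard HTTP `Range: bytes=start-end` form with inclusive offsets. The function name and chunk size are illustrative assumptions, and no request is actually sent:

```python
def byte_ranges(object_size, chunk_size):
    """Return HTTP Range header values that together cover object_size bytes."""
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    start = 0
    while start < object_size:
        # HTTP byte ranges are inclusive at both ends.
        end = min(start + chunk_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


# A 10-byte object read in 4-byte chunks (toy numbers for illustration).
example_ranges = byte_ranges(object_size=10, chunk_size=4)
```

Each returned value could be sent as the `Range` header of a separate GET request, with the responses reassembled in order.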
Automating Complex Tasks
Qubole makes Big Data technologies swift and simple
About Qubole
One of the largest cloud-agnostic Big Data as a Service companies
Founded by the pioneers of “big data” at Facebook and the creators of Apache Hive
Poll Question #1
What is the status of your big data initiative?
The Vision
Qubole Data Service
Autonomous Data Management
Qubole Cloud Agents
Total Cost Savings Among Qubole Customers in 2016 and 2017
• Cluster Life Cycle Management savings: $150M
– Amount saved by automatically terminating a cluster when inactive
• Workload-aware Autoscaling savings: $121M
– Amount saved by predictively adjusting the number of nodes to meet demand
• Spot Shopper savings: $40M
– Amount saved by utilizing Spot instances
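The first two savings categories rest on two simple decisions: how many nodes the current workload needs, and whether an idle cluster should be shut down. The sketch below illustrates those decisions under stated assumptions. It is not Qubole's algorithm; the function names, the tasks-per-node capacity, and the 30-minute idle timeout are all hypothetical:

```python
import math


def target_nodes(pending_tasks, tasks_per_node, min_nodes, max_nodes):
    """Size the cluster to the pending workload, clamped to configured bounds."""
    needed = math.ceil(pending_tasks / tasks_per_node)
    return max(min_nodes, min(max_nodes, needed))


def should_terminate(idle_minutes, idle_timeout_minutes=30):
    """Terminate a cluster once it has been idle for the configured timeout."""
    return idle_minutes >= idle_timeout_minutes


# 250 pending tasks at an assumed 10 tasks per node, bounded to 2..60 nodes.
nodes_for_burst = target_nodes(250, tasks_per_node=10, min_nodes=2, max_nodes=60)
# A nearly empty queue falls back to the minimum cluster size.
nodes_when_quiet = target_nodes(5, tasks_per_node=10, min_nodes=2, max_nodes=60)
```

A predictive autoscaler would forecast `pending_tasks` rather than read it from the current queue; the clamping logic stays the same.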
Architectural Diagram
Poll Question #2
What big data technology are you using or evaluating?
Why Qubole?
Demandbase Automates With
Qubole
Demandbase provides more value for their B2B marketing customers
by automating Big Data and Machine Learning operations.
Who is Demandbase?
Demandbase is a B2B marketing automation company that leverages
artificial intelligence to automate all aspects of the advertising, selling,
and marketing process.
The Challenge
• Many factors determine which accounts a business should target
– Do they have a need/budget for the product?
– Are they currently in-market for the product?
– Do they have decision makers ready to buy?
• These insights must come from many different types of big datasets
• Demandbase’s previous account identification tool took multiple days to run
• Our clients could not iterate or modify their strategies with such slow turnaround
The Data Used to Identify Accounts
• To determine an account’s need for the product
– We have firmographic information on 14 million accounts
– We’ve built a knowledge graph of all accounts using NLP technology that crawls 350 TB of web pages a month
• To determine if an account is in-market
– We track 700 billion web interactions a year, each one mapped to employees across all accounts
• To identify decision makers
– We are currently tracking over 100 million employees across all accounts
[Figure: account scoring pipeline. All 14M accounts are scored, with the top 5K available to the user; keywords are extracted from 700B web interactions; buyers at each account are identified from 100M+ contacts.]
The Solution
• The user requests a new list of accounts with a button-press
• 60 EC2 servers are spun up
• A machine learning algorithm is built using Spark and MLlib
• For each of the 14 million accounts
– Information about relevant web interactions, buyers, online content, etc. is fed into the machine learning model
– The model scores each account
• The top 5K accounts are pushed to the web app, along with relevant info
• From button-press to new account list – 20 minutes
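The score-then-rank step can be sketched in plain Python. This is a stand-in for the Spark and MLlib job described above, not Demandbase's model: the feature names, weights, bias, and logistic scoring function are hypothetical, and the account names are made up.

```python
import heapq
import math

# Hypothetical model parameters; the real model is trained with Spark MLlib.
WEIGHTS = {"web_interactions": 0.8, "buyers": 0.5, "content_matches": 0.3}
BIAS = -2.0


def score_account(features):
    """Logistic score in (0, 1) from a dict of numeric features."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def top_accounts(accounts, k):
    """Return the k highest-scoring (score, account_id) pairs, best first."""
    scored = ((score_account(features), account_id) for account_id, features in accounts)
    return heapq.nlargest(k, scored)


ACCOUNTS = [
    ("acme", {"web_interactions": 3.0, "buyers": 2.0, "content_matches": 1.0}),
    ("globex", {"web_interactions": 0.5, "buyers": 0.0, "content_matches": 0.0}),
    ("initech", {"web_interactions": 2.0, "buyers": 1.0, "content_matches": 4.0}),
]
top_two = [account_id for _, account_id in top_accounts(ACCOUNTS, k=2)]
```

`heapq.nlargest` keeps only `k` candidates in memory at a time, which suits a "top 5K of 14 million" selection; on a cluster the same idea would run per partition before a final merge.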
Qubole Makes This Possible
• Qubole manages all of our EC2 instances
• So far, we’ve successfully tested 20 different concurrent models (20 × 60 EC2 servers)
• Qubole keeps our costs down through dynamic bidding and heterogeneous server clusters
• Our web app calls Qubole’s easy-to-implement Play API, which spins up the EC2 instances and deploys our Spark job
• With Qubole taking care of the infrastructure, we could focus on developing the machine learning
• Qubole allowed us to build a self-serve machine-learning-as-a-service solution
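To illustrate why a heterogeneous cluster can lower cost, the sketch below fills a core requirement from whichever instance types are currently cheapest per core. It is a simple greedy heuristic under stated assumptions, not Qubole's bidding logic; the instance type names, prices, and availability figures are invented for the example.

```python
def cheapest_mix(offers, cores_needed):
    """Greedily fill cores_needed from the offers with the lowest price per core.

    offers: list of dicts with 'type', 'cores', 'hourly_price', 'available'.
    Returns (plan, hourly_cost), where plan maps instance type to a count.
    """
    plan = {}
    cost = 0.0
    remaining = cores_needed
    for offer in sorted(offers, key=lambda o: o["hourly_price"] / o["cores"]):
        if remaining <= 0:
            break
        wanted = -(-remaining // offer["cores"])  # ceiling division
        count = min(wanted, offer["available"])
        if count > 0:
            plan[offer["type"]] = count
            cost += count * offer["hourly_price"]
            remaining -= count * offer["cores"]
    if remaining > 0:
        raise ValueError("not enough capacity in the offered instance types")
    return plan, cost


# Hypothetical offers; real Spot prices vary by type, region, and time.
OFFERS = [
    {"type": "type-a", "cores": 8, "hourly_price": 0.16, "available": 3},
    {"type": "type-b", "cores": 16, "hourly_price": 0.40, "available": 10},
    {"type": "type-c", "cores": 4, "hourly_price": 0.12, "available": 10},
]
plan, hourly_cost = cheapest_mix(OFFERS, cores_needed=64)
```

The greedy choice can over-provision slightly (here it buys 72 cores for a 64-core request), so it is a starting point rather than an optimal allocation.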
Next Steps and Further Information
• Try a pre-configured production-ready Qubole deployment on AWS Data Lake:
• https://aws.amazon.com/quickstart/architecture/qubole-on-data-lake-foundation/
• Buy on AWS Marketplace:
• https://aws.amazon.com/marketplace/pp/B06XX76R24
• Learn more about Qubole:
• https://www.qubole.com/products/qds-for-aws/
• Learn more about Demandbase:
• https://www.demandbase.com/technology/
• Try AWS:
• https://aws.amazon.com/
Q & A
Thank you!

Donnie Prakoso
 
How a Global Healthcare Company Built a Migration Factory to Quickly Move Tho...
How a Global Healthcare Company Built a Migration Factory to Quickly Move Tho...How a Global Healthcare Company Built a Migration Factory to Quickly Move Tho...
How a Global Healthcare Company Built a Migration Factory to Quickly Move Tho...
Amazon Web Services
 
STG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data OceansSTG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data Oceans
Amazon Web Services
 
How Amazon uses AWS Analytics
How Amazon uses AWS AnalyticsHow Amazon uses AWS Analytics
How Amazon uses AWS Analytics
Amazon Web Services
 
How Amazon.com Uses AWS Analytics: Data Analytics Week SF
How Amazon.com Uses AWS Analytics: Data Analytics Week SFHow Amazon.com Uses AWS Analytics: Data Analytics Week SF
How Amazon.com Uses AWS Analytics: Data Analytics Week SF
Amazon Web Services
 

Architecting an Open Data Lake for the Enterprise

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. November 1, 2017 | 11:00 AM PT. Automating Big Data Technologies for Faster Time-to-Value
  • 2. Today’s Presenters
      • David Potes, Solutions Architect, Amazon Web Services
      • Minesh Patel, Technical Director, Qubole
      • Seth Myers, Senior Data Scientist, Demandbase
  • 3. Today’s Agenda
      1. An overview of AWS and AWS Marketplace, with an emphasis on AWS data lake solutions and Qubole
      2. Overview of the Qubole solutions featured in our story
      3. Challenges faced by Demandbase
      4. The Demandbase success story with AWS and Qubole
      5. Q&A/Discussion
  • 4. Learning Objectives
      1. How to dramatically reduce management complexities for analytics operations
      2. How to reduce the costs of processing and analyzing data in a data lake on AWS
      3. How to operate at the scale and efficiency of a large enterprise, with a small data team
  • 5. Introduction to Data Lake Concepts
  • 6. Unlocking Data: Most companies and organizations are embarking on ambitious innovation initiatives to unlock their data. The data already exists but goes unused or is locked away from complementary data sets in isolated data silos.
  • 7. Enter Data Lake Architectures: A data lake is a new and increasingly popular architecture for storing and analyzing massive volumes and heterogeneous types of data.
  • 8. Benefits of a Data Lake – All Data in One Place: Store and analyze all of your data, from all of your sources, in one centralized location. “Why is the data distributed in many locations? Where is the single source of truth?”
  • 9. Benefits of a Data Lake – Quick Ingest: Quickly ingest data without needing to force it into a pre-defined schema. “How can I collect data quickly from various sources and store it efficiently?”
  • 10. Benefits of a Data Lake – Storage vs Compute: Separating your storage and compute allows you to scale each component as required. “How can I scale up with the volume of data being generated?”
  • 11. Benefits of a Data Lake – Schema on Read: “Is there a way I can apply multiple analytics and processing frameworks to the same data?” A data lake enables ad-hoc analysis by applying schemas on read, not on write.
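
To make the schema-on-read idea concrete, here is a minimal Python sketch. It is an illustration only: the event fields (`account`, `page`, `ms`, `referrer`) and the helper `read_with_schema` are hypothetical and do not come from the presentation. The raw records are stored once, untouched, and each consumer applies its own schema when it reads them.

```python
import json

# Raw events are stored as-is; no schema was enforced when they were written.
# (Hypothetical sample data, not from the presentation.)
RAW_EVENTS = "\n".join([
    '{"account": "acme", "page": "/pricing", "ms": "1200"}',
    '{"account": "globex", "page": "/docs", "ms": "300", "referrer": "search"}',
])


def read_with_schema(raw, schema):
    """Apply a schema (field name -> type converter) at read time."""
    rows = []
    for line in raw.splitlines():
        record = json.loads(line)
        # Fields missing from a record become None; fields outside the schema are ignored.
        rows.append({
            name: (cast(record[name]) if name in record else None)
            for name, cast in schema.items()
        })
    return rows


# Two consumers read the same stored bytes under different schemas.
latency_view = read_with_schema(RAW_EVENTS, {"account": str, "ms": int})
traffic_view = read_with_schema(RAW_EVENTS, {"page": str, "referrer": str})
```

Neither view required rewriting the stored data, which is the property the slide attributes to applying schemas on read.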
  • 12. AWS Approach to Data Lake
  • 13. Amazon S3 is the Data Lake
  • 14. Designed Benefits of an Amazon S3 Data Lake
      Fixed Cluster Data Lake
          • Limited to only the single tool contained on the cluster (e.g. Hadoop, a data warehouse, or Cassandra), while use cases and ecosystem tools change rapidly
          • Expensive to add nodes to add storage capacity
          • Expensive to replicate data against node loss
          • Complexity in scaling local storage capacity
          • Long refresh cycles to add additional storage equipment
      Amazon S3 Data Lake
          • Decouple storage and compute by making Amazon S3 object-based storage, not a fixed tool cluster, the data lake
          • Flexibility to use any and all tools in the ecosystem: the right tool for the job
          • Future-proof your architecture: as new use cases and new tools emerge, you can plug and play the current best of breed
  • 15. Why Amazon S3 for Data Lake?
      • Durable: designed for 11 9s of durability
      • Available: designed for 99.99% availability
      • High performance: multiple upload; Range GET
      • Scalable: store as much as you need; scale storage and compute independently; no minimum usage commitments
      • Integrated: Amazon EMR; Amazon Redshift; Amazon DynamoDB
      • Easy to use: simple REST API; AWS SDKs; read-after-create consistency; event notification; lifecycle policies
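
The Range GET item refers to fetching part of an object through the standard HTTP `Range` header, so that several readers can pull different slices of one object in parallel. The sketch below only computes the header values; it makes no AWS call, and the function name `byte_ranges` and the sizes used are hypothetical.

```python
def byte_ranges(object_size, part_size):
    """Return HTTP Range header values that together cover object_size bytes."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    while start < object_size:
        # HTTP byte ranges are inclusive at both ends.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


# A 10-byte object read in 4-byte slices needs three ranged requests.
example = byte_ranges(10, 4)
```

Each returned string could be sent as the `Range` header of a separate GET request, one per worker.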
  • 16. Automating Complex Tasks: Qubole makes Big Data technologies swift and simple
  • 17. About Qubole: One of the largest cloud-agnostic Big Data as a Service companies. Founded by the pioneers of “big data” @ Facebook and the creators of Apache Hive
  • 18. Poll Question #1: What is the status of your big data initiative?
  • 19. The Vision
  • 20. Qubole Data Service Amazon
  • 21. Autonomous Data Management
  • 22. Qubole Cloud Agents
  • 23. Total Cost Savings Among Qubole Customers in 2016 and 2017
      • Cluster Life Cycle Management: $150M, the amount saved by automatically terminating a cluster when inactive
      • Workload-aware Autoscaling: $121M, the amount saved by predictively adjusting the number of nodes to meet demand
      • Spot Shopper: $40M, the amount saved by utilizing Spot instances
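
As a quick check on the figures above, the following Python snippet adds up the three reported amounts and computes each one's share. The total and the percentages are derived here, not stated on the slide, and they assume the three categories do not overlap.

```python
# Reported savings in millions of US dollars (from the slide).
savings_musd = {
    "Cluster Life Cycle Management": 150,
    "Workload-aware Autoscaling": 121,
    "Spot Shopper": 40,
}

# Derived values: assumes the three categories are additive.
total_musd = sum(savings_musd.values())
share_pct = {
    name: round(100 * amount / total_musd, 1)
    for name, amount in savings_musd.items()
}
```

Under that assumption the combined figure is $311M, with cluster life cycle management accounting for roughly 48% of it.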
  • 24. Architectural Diagram
  • 25. Poll Question #2: What big data technology are you using or evaluating?
  • 26. Why Qubole?
  • 27. Demandbase Automates With Qubole: Demandbase provides more value for their B2B marketing customers by automating Big Data and Machine Learning operations.
  • 28. Who is Demandbase? Demandbase is a B2B marketing automation company that leverages artificial intelligence to automate all aspects of the advertising, selling, and marketing process.
  • 29. The Challenge
      • Many factors determine which accounts a business should target
          • Do they have a need/budget for the product?
          • Are they currently in-market for the product?
          • Do they have decision makers ready to buy?
      • These insights must come from many different types of big datasets
      • Demandbase’s previous account identification tool took multiple days to run
      • Our clients could not iterate or modify their strategies with such slow turn-around
  • 30. The Data Used to Identify Accounts
      • To determine an account’s need for the product
          • We have firmographic information on 14 Million accounts
          • We’ve built a knowledge graph of all accounts using NLP technology that crawls 350 TB of web pages a month
      • To determine if an account is in-market
          • We track 700 Billion web interactions a year, each one mapped to employees across all accounts
      • To identify decision makers
          • We are currently tracking over 100 Million employees across all accounts
  • 31. [Diagram: all 14M accounts are scored, top 5K available to user; keywords extracted from 700B web interactions; buyers at each account identified from 100M+ contacts]
  • 32. The Solution
      • The user requests a new list of accounts with a button-press
      • 60 EC2 servers are spun up
      • A machine learning algorithm is built using Spark and MLlib
      • For each of 14 Million accounts
          • Information about relevant web interactions, buyers, online content, etc. is fed into the machine learning model
          • The model scores each account
      • Top 5K accounts are pushed to the web app, along with relevant info
      • From button-press to new account list: 20 minutes
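
The score-then-shortlist step above can be sketched in a few lines of plain Python. This is not Demandbase's model or their Spark job: the linear `score_account` function, the feature names, the weights, and the three sample accounts are all hypothetical stand-ins, and the sketch keeps the top 2 of 3 accounts where the slide describes the top 5K of 14 million.

```python
import heapq


def score_account(features, weights):
    """Hypothetical linear score; stands in for the trained model."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def top_accounts(accounts, weights, k):
    """Score every account and return the k highest as (score, account_id) pairs."""
    scored = (
        (score_account(features, weights), account_id)
        for account_id, features in accounts.items()
    )
    # nlargest keeps only k candidates in memory while scanning all accounts.
    return heapq.nlargest(k, scored)


# Hypothetical sample data.
accounts = {
    "company-1": {"web_interactions": 40.0, "buyers": 3.0},
    "company-2": {"web_interactions": 5.0, "buyers": 1.0},
    "company-3": {"web_interactions": 25.0, "buyers": 8.0},
}
weights = {"web_interactions": 0.5, "buyers": 2.0}

shortlist = top_accounts(accounts, weights, k=2)
```

Scoring each account independently is what lets the work be spread across many servers; only the final top-k selection needs to see all of the scores.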
  • 33. Qubole Makes This Possible
      • Qubole manages all of our EC2 instances
      • So far, we’ve tested 20 different concurrent models (20 × 60 EC2 servers) successfully
      • Qubole keeps our costs down through dynamic bidding and heterogeneous server clusters
      • Our web app calls Qubole’s easy-to-implement Play API, which spins up the EC2 instances and deploys our Spark job
      • With Qubole taking care of the infrastructure, we could focus on developing the machine learning
      • Qubole allowed us to build a self-serve machine-learning-as-a-service solution
  • 34. Next Steps and Further Information
      • Try a pre-configured, production-ready Qubole deployment on AWS Data Lake: https://aws.amazon.com/quickstart/architecture/qubole-on-data-lake-foundation/
      • Buy on AWS Marketplace: https://aws.amazon.com/marketplace/pp/B06XX76R24
      • Learn more about Qubole: https://www.qubole.com/products/qds-for-aws/
      • Learn more about Demandbase: https://www.demandbase.com/technology/
      • Try AWS: https://aws.amazon.com/
  • 35. Q & A
  • 36. Thank you!