Data has gravity – The AWS Data Management approach
Adolfo Abreu
Business Development Latam Team
Database, Big Data, Analytics, AI, ML, IoT
What do you know about Amazon?
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Unprecedented SCALE
Hyper SPEED
Relentless INNOVATION
Customer OBSESSION
Amazon Web Services
Applying amazon.com concepts to infrastructure
Unprecedented SCALE
Hyper SPEED
Relentless INNOVATION
Customer OBSESSION
• Infinite IT resources available in minutes
• 18 regions, 55 availability zones
• New capabilities launched daily
• Working Backwards from the customer (PR FAQ)
Inventing entirely new customer experiences
• Personalized Recommendations
• Fulfillment automation & Inventory Management
• Drones
• Voice driven Interactions
At Amazon, we’ve been making investments in Machine Learning for the last 20 years
Our portfolio - purpose-built for builders
Analytics
• Redshift: Data warehousing
• EMR: Hadoop + Spark
• Athena: Interactive analytics
• Kinesis Data Analytics: Real time
• Elasticsearch Service: Operational Analytics
Databases
• RDS: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server
• Aurora: MySQL, PostgreSQL
• DynamoDB: Key value, Document
• ElastiCache: Redis, Memcached
• Neptune: Graph
• Timestream: Time Series
• QLDB: Ledger Database
• RDS on VMware
Blockchain
• Managed Blockchain
• Blockchain Templates
Business Intelligence & Machine Learning
• QuickSight
• SageMaker
Data Lake
• S3/Glacier
• Glue: ETL & Data Catalog
• Lake Formation: Data Lakes
Data Movement
• Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams
Three types of projects
• Quickly build new apps in the cloud
• Gain new insights
• “Lift and shift” existing apps to the cloud
Traditionally, analytics looked like this
Relational data
GBs-TBs scale [not designed for PB/EBs]
Expensive: Large initial capex + $10K-$50K/TB/year
90% of data was thrown away because of cost
OLTP ERP CRM LOB
Data Warehouse
Business Intelligence
Data lakes on AWS
[Diagram labels: Snowball, Snowmobile, Kinesis Data Firehose, Kinesis Data Streams, Kinesis Video Streams, S3, Redshift, EMR, Athena, Kinesis, Elasticsearch Service, AI Services, QuickSight]
Exabyte scale
Store and analyze relational and non-relational data
Purpose-built analytics tools
Cost effective
• Store at 2.3 cents per GB-month in Amazon S3
• Query with Amazon Athena at ½ cent per GB scanned
• DW with Amazon Redshift for $1,000/TB/year
Give access to everyone
• Amazon QuickSight: $0.30 for 30 minutes of use
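To put the quoted prices side by side with the $10K-$50K/TB/year figure from the traditional analytics slide, here is a minimal sketch of the arithmetic. The constants are the 2018 prices quoted on the slides; the 100 TB volume and the decimal GB-per-TB conversion are assumptions made for illustration only.

```python
# Prices as quoted on the slides (2018 figures); check current pricing before relying on them.
S3_USD_PER_GB_MONTH = 0.023                          # "2.3 cents per GB-month"
ATHENA_USD_PER_GB_SCANNED = 0.005                    # "1/2 cent per GB scanned"
REDSHIFT_USD_PER_TB_YEAR = 1_000                     # "$1,000/TB/year"
TRADITIONAL_DW_USD_PER_TB_YEAR = (10_000, 50_000)    # "$10K-$50K/TB/year"

GB_PER_TB = 1_000  # assumption: decimal units


def s3_storage_usd_per_year(terabytes):
    """Yearly cost of keeping `terabytes` of data in S3 at the quoted price."""
    return terabytes * GB_PER_TB * S3_USD_PER_GB_MONTH * 12


def athena_scan_usd(terabytes_scanned):
    """Cost of Athena queries that scan `terabytes_scanned` in total."""
    return terabytes_scanned * GB_PER_TB * ATHENA_USD_PER_GB_SCANNED


def redshift_usd_per_year(terabytes):
    """Yearly Redshift cost at the quoted per-TB price."""
    return terabytes * REDSHIFT_USD_PER_TB_YEAR


if __name__ == "__main__":
    tb = 100  # hypothetical data volume
    low, high = TRADITIONAL_DW_USD_PER_TB_YEAR
    print(f"Traditional DW:  ${tb * low:,.0f} to ${tb * high:,.0f} per year")
    print(f"S3 storage:      ${s3_storage_usd_per_year(tb):,.0f} per year")
    print(f"Redshift:        ${redshift_usd_per_year(tb):,.0f} per year")
    print(f"Athena full scan: ${athena_scan_usd(tb):,.0f} per pass over the data")
```

At these prices, storing 1 TB in S3 for a year comes to about $276, and scanning 1 TB with Athena costs about $5.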
How it works
[Diagram labels: S3, IAM, KMS; sources: OLTP, ERP, CRM, LOB, Devices, Web, Sensors, Social; Kinesis]
Build data lakes quickly
• Identify, crawl, and catalog sources
• Ingest and clean data
• Transform into optimal formats
Simplify security management
• Enforce encryption
• Define access policies
• Implement audit logging
Enable self-service and combined analytics
• Analysts discover all data available for analysis from a single data catalog
• Use multiple analytics tools over the same data
[Diagram labels: Athena, Redshift, AI Services, EMR, QuickSight, Data catalog]
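As one concrete reading of "Enforce encryption" and "Define access policies", the sketch below builds an S3 bucket policy that denies uploads that do not request KMS server-side encryption. It only constructs and prints the policy document; the bucket name `example-data-lake` is hypothetical, and the slides do not say which policy the speaker had in mind.

```python
import json


def deny_unencrypted_uploads_policy(bucket_name):
    """Build a bucket policy that rejects PutObject calls without SSE-KMS.

    `bucket_name` is a placeholder supplied by the caller; nothing here
    talks to AWS, the function only returns the policy as a dict.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedObjectUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    # Deny when the upload does not ask for KMS encryption.
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name, used only to show the resulting JSON.
    print(json.dumps(deny_unencrypted_uploads_policy("example-data-lake"), indent=2))
```

The resulting JSON can be attached to a bucket through the console or an SDK; access auditing would be configured separately.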
AWS database services
• Relational: Amazon RDS (commercial and community engines), Aurora
• Key-value: DynamoDB
• Document: DocumentDB
• In-memory: ElastiCache
• Graph: Neptune
• Time-series: Timestream
• Ledger: QLDB
Amazon RDS
Managed relational database service with a choice of six popular database engines
• Available & durable: Automatic Multi-AZ data replication; automated backup, snapshots, failover
• Easy to administer: No need for infrastructure provisioning, installing and maintaining DB software
• Highly scalable: Scale database compute and storage with a few clicks with no application downtime
• Fast & secure: SSD storage and guaranteed provisioned I/O; data encryption at rest and in transit
Amazon QuickSight
First BI service with pay-per-session pricing for everyone in your organization
Serverless, cloud-powered BI service (no servers to manage)
Scale from 10s of users to 100s of thousands of users
Pay only for what you use
• Readers: $0.30/30 min session with a $5/user/month max
• Authors: $18/month/Author
Integrates with S3, Athena, Redshift, RDS, Aurora, & EMR
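The pay-per-session pricing above can be worked through with a short sketch. The prices are the ones quoted on the slide; the function names and the sample usage figures are hypothetical. Amounts are kept in whole cents to avoid floating-point rounding.

```python
# Prices from the slide, in cents.
READER_CENTS_PER_SESSION = 30      # $0.30 per 30-minute session
READER_MONTHLY_CAP_CENTS = 500     # $5 per reader per month maximum
AUTHOR_CENTS_PER_MONTH = 1800      # $18 per author per month


def reader_monthly_cents(sessions):
    """Monthly charge for one reader: per-session price, capped at the maximum."""
    return min(sessions * READER_CENTS_PER_SESSION, READER_MONTHLY_CAP_CENTS)


def monthly_bill_cents(authors, sessions_per_reader):
    """Total monthly charge for `authors` authors plus one entry per reader."""
    readers = sum(reader_monthly_cents(s) for s in sessions_per_reader)
    return authors * AUTHOR_CENTS_PER_MONTH + readers


if __name__ == "__main__":
    # Hypothetical month: 2 authors, one idle reader, one light, one heavy.
    total = monthly_bill_cents(2, [0, 10, 100])
    print(f"Monthly bill: ${total / 100:.2f}")
```

With these numbers a reader hits the cap at 17 sessions in a month, and a reader who never opens a dashboard costs nothing.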
Amazon QuickSight has been innovating quickly
100+ new features released since launch
Features shown: AWS Directory Service Microsoft AD, AWS Directory Service AD Connector, Federated SSO, Just In Time Provisioning, Readers, Groups, Row level security, Private VPC, Cross Account S3 Access, Audit logging with CloudTrail, HIPAA, PCI compliance, One-click upgrade, 25 GB SPICE tables, Calculations in SPICE, Scheduled refresh, Hourly refresh, Athena connector, Spark and Presto Connector, Spark Connector, Snowflake Connector, Teradata Connector, SaaS Connectors, Aurora PostgreSQL, Redshift Spectrum Support, S3 Analytics, Excel, Export to CSV, Custom Date Format, Custom Range, Week Aggregation, Relative Date Filters, Aggregate Filters, Filter Groups, 10K Filter Values, On-screen controls, Parameters, URL Actions, Search, Totals, Count Distinct, Aggregate Calculations, Table calculations, Dashboard Save As, KPI Chart, Combo Charts, Geospatial maps, Tabular Reports, Data labels
Amazon QuickSight—embedded dashboards
Supercharge your applications with embedded dashboards
Fully interactive with drill down, filtering, & external links
No servers to manage, no long-term commitments
Pay for usage with pay-per-session reader pricing
Easy embedding with JavaScript SDK
Discover all the hidden trends and anomalies on millions of metrics
Amazon QuickSight—ML Insights
Example: anomaly detection
• “Sales for office supplies in APAC was 15% above expected.”
• “SMB Segment was the top contributor.”
• “It’s significant because SMB typically only accounts for 30% of sales.”
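The three narrative statements in this example follow a simple pattern: how far the metric is from its expected value, which segment drove the gap, and why that is unusual. The sketch below reproduces only that arithmetic with made-up numbers. It is not how QuickSight computes expected values or detects anomalies; the expected value, the segment breakdown, and the usual shares are all hypothetical inputs.

```python
def percent_above_expected(actual, expected):
    """Deviation of `actual` from `expected`, as a percentage of `expected`."""
    return (actual - expected) * 100 / expected


def top_contributor(deviation_by_segment):
    """Segment that accounts for the largest part of the deviation."""
    return max(deviation_by_segment, key=deviation_by_segment.get)


def narrative(metric, actual, expected, deviation_by_segment, usual_share_pct):
    """Produce the three plain-language statements shown on the slides."""
    pct = percent_above_expected(actual, expected)
    segment = top_contributor(deviation_by_segment)
    return [
        f"{metric} was {pct:.0f}% above expected.",
        f"{segment} was the top contributor.",
        f"It's significant because {segment} typically only accounts "
        f"for {usual_share_pct[segment]}% of sales.",
    ]


if __name__ == "__main__":
    # Hypothetical figures chosen to match the percentages on the slides.
    lines = narrative(
        metric="Sales",
        actual=1150,
        expected=1000,
        deviation_by_segment={"SMB": 90, "Enterprise": 40, "Public sector": 20},
        usual_share_pct={"SMB": 30, "Enterprise": 50, "Public sector": 20},
    )
    print("\n".join(lines))
```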
QuickSight ML Insights vs. traditional BI forecasting
QuickSight ML-powered forecasting
• Captures seasonality and upward trends
• Automatically excludes bad data
• High confidence band
Traditional BI forecasting
• Captures only seasonality
• Missing upward trend
• Confidence band influenced by bad data
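To see why a forecast that captures only seasonality falls behind on a growing series, here is a minimal sketch comparing two toy forecasters on synthetic data. It uses a least-squares line plus seasonal averages, which is a stand-in chosen for illustration and not the algorithm QuickSight uses; the series, the period of 4, and all function names are assumptions.

```python
def fit_line(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...: (slope, intercept)."""
    n = len(ys)
    mean_t = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def seasonal_means(values, period):
    """Average value at each position within the seasonal cycle."""
    return [sum(values[i::period]) / len(values[i::period]) for i in range(period)]


def forecast_seasonal_only(series, period, steps):
    """Repeat the historical seasonal averages; ignores any trend."""
    means = seasonal_means(series, period)
    n = len(series)
    return [means[(n + h) % period] for h in range(steps)]


def forecast_trend_plus_seasonal(series, period, steps):
    """Extrapolate a fitted line and add the seasonal pattern of its residuals."""
    n = len(series)
    slope, intercept = fit_line(series)
    residuals = [y - (intercept + slope * t) for t, y in enumerate(series)]
    season = seasonal_means(residuals, period)
    return [intercept + slope * (n + h) + season[(n + h) % period] for h in range(steps)]


if __name__ == "__main__":
    pattern = [1, -1, -1, 1]                                   # synthetic seasonality
    history = [t + pattern[t % 4] for t in range(12)]          # upward trend + season
    print("seasonal only:    ", forecast_seasonal_only(history, 4, 4))
    print("trend + seasonal: ", forecast_trend_plus_seasonal(history, 4, 4))
```

On this synthetic series the seasonal-only forecast stays near the historical average level, while the trend-aware one continues the upward slope.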
Insights in plain language narrative
Embedded within your dashboard
No more staring at dashboards for hours!
Fully customizable to meet every need
No coding needed. Easy-to-use UI templates.
Amazon QuickSight—ML Insights
Auto-narratives
Most enterprise database & analytics cloud customers
Most startup database & analytics cloud customers
Artificial Intelligence, Machine Learning, Deep Learning, Data Science #buzZZzzzWords… ?!
Artificial Intelligence
• “the science and engineering of making intelligent machines”
(John McCarthy - ~1950s)
Weak AI Strong AI (AGI)
I like artificial intelligence
Lubię sztuczną inteligencję
Machine Learning
“is a field of computer science that gives computers the ability to learn without being explicitly programmed”
(Arthur Samuel - ~1959)
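A small illustration of "learning without being explicitly programmed": instead of hard-coding the Celsius-to-Fahrenheit rule, the sketch below recovers it from a handful of example pairs with an ordinary least-squares fit. The example data and function names are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def make_predictor(xs, ys):
    """Return a function learned from the examples rather than written by hand."""
    slope, intercept = fit_line(xs, ys)
    return lambda x: slope * x + intercept


if __name__ == "__main__":
    # Training examples: temperatures in Celsius and the same in Fahrenheit.
    celsius = [0, 10, 20, 30, 40]
    fahrenheit = [32, 50, 68, 86, 104]
    to_fahrenheit = make_predictor(celsius, fahrenheit)
    print(f"100 C is about {to_fahrenheit(100):.1f} F")
```

The conversion formula never appears in the code; the slope and intercept are estimated from the data.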
Deep Learning
• “is a subarea of machine learning that uses deep neural networks to model complex problems”
[Diagram: Deep Learning nested within Machine Learning, nested within Artificial Intelligence]
Autonomous cars
• Object detection
• Centimeter-accurate positioning
• Real-time per-pixel image segmentation
Text to Speech
Visual Search
Find real value in raw data streams
Customer & Industry Maturity
• Advanced knowledge
• Basic understanding of AI/ML
• Low or no knowledge of AI/ML
Machine Learning modeling process
• Cross Industry Standard Process for Data Mining (CRISP-DM)
• Current de facto process for doing Data Science
• Highlights the cyclical and iterative nature of Data Science
https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
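The cyclical nature of the process can be sketched as a loop over the six CRISP-DM phases, where a failed evaluation sends the project back to the start instead of on to deployment. The phase names come from the CRISP-DM standard described in the linked article; the `is_good_enough` callback and the cycle limit are hypothetical stand-ins for a real project's acceptance criteria.

```python
PHASES = [
    "Business understanding",
    "Data understanding",
    "Data preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]


def crisp_dm_walk(is_good_enough, max_cycles=5):
    """Return the (cycle, phase) steps visited until the model is accepted.

    `is_good_enough(cycle)` is a caller-supplied check run after Evaluation.
    If it never passes, the walk stops after `max_cycles` without deploying.
    """
    visited = []
    for cycle in range(1, max_cycles + 1):
        # Every cycle runs the first five phases in order.
        for phase in PHASES[:-1]:
            visited.append((cycle, phase))
        if is_good_enough(cycle):
            visited.append((cycle, "Deployment"))
            break
    return visited


if __name__ == "__main__":
    # Hypothetical project whose model only passes evaluation on the second cycle.
    for cycle, phase in crisp_dm_walk(lambda cycle: cycle >= 2):
        print(f"cycle {cycle}: {phase}")
```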
Artificial Intelligence @ AWS
Application Services
• Vision: Rekognition, Rekognition Video
• Speech: Polly, Transcribe
• Language: Lex, Translate, Comprehend
• Textract
• Time Series: Forecast
• Recommendation: Personalize
Platform Services
• SageMaker (Ground Truth, RL, Neo, Pipeline)
• Amazon Machine Learning
• Mechanical Turk
• Spark & EMR
• DeepLens
• DeepRacer
• Marketplace
• RoboMaker
Frameworks & Infrastructure
• Frameworks: Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon
• AWS Deep Learning AMI (Ubuntu & Amazon Linux, CUDA 8 & 9)
• Compute: GPU (P2 & P3), CPU, Mobile, IoT (Greengrass), Elastic Inference, Inferentia
Thank you!

Big Data Analytics, Machine Learning and Artificial Intelligence
