
Data Analytics on AWS

The slides from the event in Milan on October 29th, 2015

Published in: Data & Analytics

Data Analytics on AWS

  1. 1. Data Analytics on AWS Danilo Poccia Technical Evangelist @danilop Carlos Conde Sr. Manager Technical Evangelism @caarlco
  2. 2. THE MORE DATA YOU COLLECT THE MORE VALUE YOU CAN DERIVE FROM IT
  3. 3. THE COST OF DATA GENERATION IS FALLING
  4. 4. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE
  5. 5. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE Lower cost, 
 higher throughput
  6. 6. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE Lower cost, 
 higher throughput Highly
 constrained
  7. 7. + ELASTIC AND HIGHLY SCALABLE + NO UPFRONT CAPITAL EXPENSE + ONLY PAY FOR WHAT YOU USE
 + AVAILABLE ON-DEMAND = REMOVE CONSTRAINTS
  8. 8. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE
  9. 9. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE AWS Import / Export
 AWS Direct Connect
  10. 10. Inbound data transfer is free Multipart upload to S3 AWS Direct Connect AWS Import / Export
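Slide 10 mentions multipart upload to S3. A minimal sketch of how a large object could be split into parts is below. It only plans byte ranges and calls no AWS API; the function name `plan_parts` and the 8 MiB default part size are assumptions made for illustration. The 5 MiB minimum part size and the 10,000-part ceiling are the documented S3 multipart limits.

```python
import math

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum size for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts in one upload


def plan_parts(total_size, part_size=8 * 1024 * 1024):
    """Return a list of (part_number, start, end_exclusive) byte ranges."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    # Respect the minimum part size, then grow the part size if the
    # object would otherwise need more than MAX_PARTS parts.
    part_size = max(part_size, MIN_PART_SIZE)
    part_size = max(part_size, math.ceil(total_size / MAX_PARTS))
    parts = []
    start = 0
    number = 1
    while start < total_size:
        end = min(start + part_size, total_size)
        parts.append((number, start, end))
        start = end
        number += 1
    return parts


if __name__ == "__main__":
    for part in plan_parts(20 * 1024 * 1024):
        print(part)
```

Each range could then be uploaded independently and in parallel, which is where the throughput gain of multipart upload comes from.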
  11. 11. Amazon Snowball • Petabyte-scale data transport solution • 50 TB per appliance • 10 Gbps connectivity to device • Tamper-resistant, 256-bit encryption and Trusted Platform Module • Low cost • End-to-end tracking via Amazon SNS, text message or the AWS Console
  12. 12. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE Amazon S3,
 Amazon Glacier,
 Amazon DynamoDB,
 Amazon RDS,
 Amazon Redshift,
 AWS Storage Gateway,
 Data on Amazon EC2
  13. 13. AMAZON S3
 SIMPLE STORAGE SERVICE
  14. 14. CASE STUDY: SPOTIFY ADDS 20,000 TRACKS/DAY TO ITS CATALOGUE
  15. 15. AMAZON 
 DYNAMODB HIGH-PERFORMANCE, FULLY MANAGED NoSQL DATABASE SERVICE
  16. 16. DURABLE & AVAILABLE
 CONSISTENT, DISK-ONLY 
 WRITES (SSD)
  17. 17. LOW LATENCY
 AVERAGE READS < 5MS,
 WRITES < 10MS
  18. 18. NO ADMINISTRATION
  19. 19. CASE STUDY: SHAZAM SUPPORTED 500,000 WRITES/SEC DURING SUPER BOWL
  20. 20. AMAZON
 REDSHIFT FULLY MANAGED, PETABYTE-SCALE DATA WAREHOUSE ON AWS
  21. 21. 30 MINUTES 
 DOWN TO
 12 SECONDS
  22. 22. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE Amazon EC2
 Amazon Elastic MapReduce
  23. 23. AMAZON EC2
 ELASTIC COMPUTE CLOUD
  24. 24. 3 HOURS
 FOR $4828.85/hr
  25. 25. Instead of 
 $20+ MILLION
 in infrastructure
  26. 26. GPU INSTANCES
 G2: 1x NVIDIA Kepler GK104, 8 vCPU (Intel Xeon E5-2670), $0.65/h
 CG1: 2x NVIDIA Fermi M2050, 16 vCPU (Intel Xeon X5570), $2.10/h
  27. 27. ON A SINGLE INSTANCE COMPUTE TIME: 4h
 COST: 4h x $2.1 = $8.4
  28. 28. ON MULTIPLE INSTANCES COMPUTE TIME: 1h
 COST: 1h x 4 x $2.1 = $8.4
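The arithmetic on slides 27 and 28 can be checked with a short sketch. It assumes the job scales perfectly across instances and that billing is a flat hourly rate, which the slides imply but real workloads rarely achieve exactly. The function name `job_cost` is made up for this example.

```python
from decimal import Decimal


def job_cost(hours, instances, hourly_rate):
    """Total cost of running `instances` machines for `hours` each."""
    return hours * instances * hourly_rate


RATE = Decimal("2.10")  # CG1 hourly price quoted on the slides

single = job_cost(4, 1, RATE)    # one instance for 4 hours
parallel = job_cost(1, 4, RATE)  # four instances for 1 hour

if __name__ == "__main__":
    print(single, parallel)
```

Under these assumptions the bill is the same, but the result arrives in a quarter of the time.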
  29. 29. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE Amazon S3,
 Amazon DynamoDB,
 Amazon RDS,
 Amazon Redshift,
 Data on Amazon EC2
  30. 30. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE
  31. 31. Internet of Things
  32. 32. The Number of
 Connected Sensors and Devices
 is Growing Exponentially
  33. 33. Sensors + Actuators
  34. 34. AWS IoT Secure, bi-directional communication
 between Internet-connected things
 (such as sensors, actuators, embedded devices,
 or smart appliances)
 and the AWS cloud over MQTT and HTTP
  35. 35. DEVICE SDK Set of client libraries to connect, authenticate and exchange messages DEVICE GATEWAY Communicate with devices via MQTT and HTTP AUTHENTICATION AUTHORIZATION Secure with mutual authentication and encryption RULES ENGINE Transform messages based on rules and route to AWS Services AWS Services - - - - - 3P Services DEVICE SHADOW Persistent thing state during intermittent connections APPLICATIONS AWS IoT API DEVICE REGISTRY Identity and Management of your things AWS IoT
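The device shadow on slide 35 keeps thing state while a device is offline. A rough sketch of the idea follows: an application records a desired state, the device reports its actual state, and the difference is what the device still has to apply when it reconnects. This is a simplification written for illustration, not the AWS IoT API; the function name and the example keys are hypothetical.

```python
def shadow_delta(desired, reported):
    """Return the desired settings that the device has not reported yet."""
    return {
        key: value
        for key, value in desired.items()
        if reported.get(key) != value
    }


if __name__ == "__main__":
    desired = {"led": "on", "interval": 30}    # set by an application
    reported = {"led": "off", "interval": 30}  # last state sent by the device
    print(shadow_delta(desired, reported))
```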
  36. 36. C-SDK (Ideal for
 embedded OS) JS-SDK (Ideal for Embedded Linux Platforms) Arduino Library (Arduino Yun) Mobile SDK (Android and iOS) AWS IoT
  37. 37. <demo> ... </demo>
  38. 38. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE
  39. 39. GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE Amazon S3,
 Amazon DynamoDB,
 Amazon RDS,
 Amazon Redshift,
 Data on Amazon EC2 Amazon EC2
 Amazon Elastic MapReduce Amazon S3,
 Amazon Glacier,
 Amazon DynamoDB,
 Amazon RDS,
 Amazon Redshift,
 AWS Storage Gateway,
 Data on Amazon EC2 AWS Import / Export
 AWS Direct Connect
  40. 40. GENERATE ➔ ➔ SHARE STREAM PROCESSING
  41. 41. GENERATE ➔ ➔ SHARE STREAM PROCESSING Amazon S3,
 Amazon DynamoDB,
 Amazon RDS,
 Amazon Redshift,
 Data on Amazon EC2 Amazon Kinesis
 Stream Processing on Amazon EC2
  42. 42. FROM DATA TO
 ACTIONABLE INFORMATION
  43. 43. RAW DATA BUSINESS INTELLIGENCE RAW INFORMATION DATA PREDICTIONS
  44. 44. Data Analytics
  45. 45. Data Analytics Value > Costs Storage and Analysis Costs
 are Going Down Making
 New Use Cases Possible
  46. 46. Structured Vs Unstructured Data High Degree
 of Organization Data Model Free Text Multimedia Social Media
  47. 47. Structured Semi-structured Unstructured Data XML JSON
  48. 48. Batch Vs Real-time Data Fixed Dataset Updated in
 Discrete Moments Continuous
 Stream of Data
  49. 49. Batch Report Real-time Alerts Prediction Forecast
  50. 50. Unstructured
 Data
  51. 51. ? Unstructured
 Data Structured
 Data
  52. 52. Unstructured
 Data Structured
 Data Resilient Distributed Datasets (RDDs) Memory Fast Processing Large Quantity of Data Disk Hadoop MapReduce Spark ?
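Slide 52 contrasts Hadoop MapReduce (disk) with Spark RDDs (memory). Both can express the map, group-by-key, reduce pattern, sketched below as a word count in plain single-process Python. None of this is Hadoop or Spark API; the function names are assumptions chosen for readability.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["a b a", "B c"]))
```

In a real cluster the map and reduce steps run on many nodes at once; the difference the slide points at is whether intermediate results are written to disk or kept in memory.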
  53. 53. Amazon
 Elastic MapReduce
 (Amazon EMR) Unstructured
 Data Structured
 Data
  54. 54. Amazon
 Elastic MapReduce
 (Amazon EMR) Structured
 Data Unstructured
 Data Structured
 Data
  55. 55. Amazon
 Elastic MapReduce
 (Amazon EMR) Managed clusters For Hadoop, Spark, Presto
 or any other applications
 in the Apache / Hadoop stack What is
 Amazon EMR?
  56. 56. Amazon
 Elastic MapReduce
 (Amazon EMR) Overview of
 Amazon EMR
 Architecture Storage HDFS EMRFS Local
 File System Data Processing Frameworks Hadoop Spark … Applications and Programs Hive Pig … Cluster Resource Management YARN Agent …
  57. 57. Amazon
 Elastic MapReduce
 (Amazon EMR) Overview of
 Amazon EMR
 Architecture Master
 Instance
 Group Core
 Instance
 Group Task
 Instance
 Group EC2 Spot Instances
  58. 58. Amazon
 Elastic MapReduce
 (Amazon EMR) Hadoop NextGen MapReduce (YARN) https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
  59. 59. Amazon
 Elastic MapReduce
 (Amazon EMR) Spark Cluster Mode (RDDs) http://spark.apache.org/docs/latest/cluster-overview.html
  60. 60. Separate Compute and Storage Resize and shut down
 Amazon EMR clusters with no data loss Point multiple Amazon EMR clusters
 at the same data in Amazon S3 Easily evolve your analytic infrastructure
 as technology evolves Leverage
 Amazon S3 with 
 EMR File System (EMRFS) S3 Bucket Cluster EMR Cluster Cluster EMR Cluster Amazon
 Elastic MapReduce
 (Amazon EMR)
  61. 61. Read-after-write consistency
 Very fast list operations
 (thanks to Amazon DynamoDB) Transparent to applications as s3://… S3 Bucket Cluster EMR Cluster DynamoDB Table Amazon
 Elastic MapReduce
 (Amazon EMR) EMRFS
 makes it easier
 to use Amazon S3
  62. 62. CREATE EXTERNAL TABLE serde_regex( host STRING, referer STRING, agent STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' LOCATION 'some/path/input/' S3 Bucket Cluster EMR Cluster DynamoDB Table Amazon
 Elastic MapReduce
 (Amazon EMR) Going
 from HDFS
 …
  63. 63. S3 Bucket Cluster EMR Cluster DynamoDB Table Amazon
 Elastic MapReduce
 (Amazon EMR) Going
 from HDFS
 to Amazon S3 CREATE EXTERNAL TABLE serde_regex( host STRING, referer STRING, agent STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' LOCATION 's3://bucket/path/input/'
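The point of slides 62 and 63 is that only the `LOCATION` clause changes when a Hive table moves from HDFS to Amazon S3. The sketch below makes that visible by generating both statements from one template. The helper `table_ddl` is hypothetical. The slides show no SerDe properties, so none are added here; a real RegexSerDe table would presumably also need its regex configured.

```python
DDL_TEMPLATE = """CREATE EXTERNAL TABLE serde_regex(
  host STRING,
  referer STRING,
  agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
LOCATION '{location}'"""


def table_ddl(location):
    """Render the table definition from the slides for a given data location."""
    if "'" in location:
        raise ValueError("location must not contain a single quote")
    return DDL_TEMPLATE.format(location=location)


if __name__ == "__main__":
    print(table_ddl("some/path/input/"))      # data in HDFS
    print(table_ddl("s3://bucket/path/input/"))  # same table over Amazon S3
```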
  64. 64. Amazon
 Elastic MapReduce
 (Amazon EMR) Consistent view and fast listing using 
 the optional EMRFS metadata layer List and read-after-write consistency Faster list operations Number of objects Without consistent view With 
 consistent view 1,000,000 147.72 29.70 100,000 12.70 3.69 Tested using a single node cluster with a m3.xlarge instance
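The listing figures on slide 64 can be turned into speedup factors with a few lines. The slide does not state the unit of the timings, so the sketch only computes ratios, which are unit-free. The variable names are made up for this example.

```python
# (without consistent view, with consistent view), keyed by number of objects
timings = {
    1_000_000: (147.72, 29.70),
    100_000: (12.70, 3.69),
}


def speedup(without_view, with_view):
    """How many times faster listing is with the consistent view enabled."""
    return without_view / with_view


speedups = {
    objects: round(speedup(*pair), 2) for objects, pair in timings.items()
}

if __name__ == "__main__":
    for objects, factor in speedups.items():
        print(f"{objects:>9,} objects: {factor}x faster")
```

On these numbers the gain grows with the number of objects, roughly 3.4x at 100,000 objects and roughly 5x at 1,000,000.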
  65. 65. Amazon
 Elastic MapReduce
 (Amazon EMR) EMRFS client-side encryption S3 Bucket Cluster EMR Cluster Cluster EMR Cluster AWS KMS or your custom key vendor Amazon S3 encryption clients EMRFS enabled for
 Amazon S3 client-side encryption
  66. 66. HDFS is still there if you need it. Iterative workloads: if you’re processing the same dataset more than once, consider using Spark & RDDs for this too. Disk I/O intensive workloads: persist data on Amazon S3 and use S3DistCp to copy to/from HDFS for processing. Amazon
 Elastic MapReduce
 (Amazon EMR)
  67. 67. Use S3 as your persistent data store
 Query it using Presto, Hive, Spark, etc. Use Amazon EC2 Spot Instances to save >80% Use Amazon EC2 Reserved Instances
 for steady workloads Use Amazon CloudWatch alarms to notify you
 if a cluster is underutilized, then shut it down:
 e.g. 0 mappers running for >N hours Cost saving tips for Amazon EMR Amazon
 Elastic MapReduce
 (Amazon EMR)
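The last tip on slide 67, shutting a cluster down when no mappers have run for more than N hours, boils down to a simple check over a metric series. The sketch below assumes one sample of the running-mapper count every five minutes; that period and the function name are assumptions for this example. In practice a CloudWatch alarm would evaluate the condition rather than your own code.

```python
def idle_for_at_least(samples, threshold_hours, period_minutes=5):
    """Return True if the most recent samples cover the threshold and are all zero.

    `samples` holds running-mapper counts, oldest first, one per period.
    """
    needed = int(threshold_hours * 60 / period_minutes)
    if needed <= 0 or len(samples) < needed:
        return False  # not enough history to decide
    return all(count == 0 for count in samples[-needed:])


if __name__ == "__main__":
    history = [3] + [0] * 12  # busy once, then idle for a full hour
    print(idle_for_at_least(history, threshold_hours=1))
```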
  68. 68. Resize your cluster,
 or create clusters when needed
 and only pay for compute when you need it Intelligent Scale Down (including YARN / HDFS) Cost saving tips for Amazon EMR Amazon
 Elastic MapReduce
 (Amazon EMR)
  69. 69. Amazon S3 is your Data Lake S3 Bucket Cluster Hive, Pig Cluster Presto Cluster Spark Cluster Ad Hoc Cluster Cascading Logical Separation of Jobs
  70. 70. Amazon
 Elastic MapReduce
 (Amazon EMR) Structured
 Data Unstructured
 Data Structured
 Data
  71. 71. <demo> ... </demo>
  72. 72. A managed service that makes it easy
 to deploy, operate, and scale Elasticsearch
 in the AWS Cloud High availability, patch management, failure detection
 and node replacement, backups, and monitoring Integrated with Logstash and Kibana Scale up and scale down your cluster to deliver optimum performance as data and usage patterns change, paying only for the resources you actually consume Control access to the Elasticsearch APIs using AWS Identity and Access Management (IAM) policies What is
 Amazon ES? Amazon
 Elasticsearch Service
 (Amazon ES)
  73. 73. Amazon ES
 Architecture Amazon
 Elasticsearch Service
 (Amazon ES) Elasticsearch Kibana Amazon
 CloudWatch AWS
 CloudTrail Elastic
 Load Balancing Amazon
 Route 53 Elasticsearch
 APIs AWS Credentials
 (AWS IAM)
  74. 74. <demo> ... </demo>
  75. 75. Structured
 Data
  76. 76. ? Structured
 Data Information
  77. 77. Amazon
 Redshift Structured
 Data Information
  78. 78. Relational Data Warehouse a lot faster a lot simpler a lot cheaper 
 Massively parallel + Petabyte scale
 Fully managed
 HDD and SSD Platforms
 $1,000/TB/Year; starts at $0.25/hour What is
 Amazon Redshift? Amazon
 Redshift
  79. 79. Amazon Redshift Architecture Amazon
 Redshift Compute
 Node Compute
 Node Compute
 Node Leader
 Node SQL Clients / BI Tools Amazon S3 / Amazon DynamoDB / SSH 10GbE Ingestion/Backup JDBC / ODBC
  80. 80. Dramatically less I/O Column storage Data compression Zone maps Direct-attached storage Large data block sizes Amazon Redshift Performance Amazon
 Redshift analyze compression listing;
 Table   | Column         | Encoding
 ---------+----------------+----------
 listing | listid         | delta
 listing | sellerid       | delta32k
 listing | eventid        | delta32k
 listing | dateid         | bytedict
 listing | numtickets     | bytedict
 listing | priceperticket | delta32k
 listing | totalprice     | mostly32
 listing | listtime       | raw
 Zone map example:
 Block 1: 10 | 13 | 14 | 26 | … | 100 | 245 | 324 (min 10, max 324)
 Block 2: 375 | 393 | 417 | … | 512 | 549 | 623 (min 375, max 623)
 Block 3: 637 | 712 | 809 | … | 834 | 921 | 959 (min 637, max 959)
  81. 81. Sort Keys
 and
 Zone Maps Amazon
 Redshift SELECT COUNT(*) FROM LOGS WHERE DATE = '09-JUNE-2013'
 Unsorted blocks: MIN 01-JUNE-2013, MAX 20-JUNE-2013; MIN 08-JUNE-2013, MAX 30-JUNE-2013; MIN 12-JUNE-2013, MAX 20-JUNE-2013; MIN 02-JUNE-2013, MAX 25-JUNE-2013
 Blocks sorted by date: MIN 01-JUNE-2013, MAX 06-JUNE-2013; MIN 07-JUNE-2013, MAX 12-JUNE-2013; MIN 13-JUNE-2013, MAX 18-JUNE-2013; MIN 19-JUNE-2013, MAX 24-JUNE-2013
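The effect of sort keys on zone maps in slide 81 can be reproduced with the min/max values shown there. The sketch keeps one (min, max) pair per block and reads only blocks whose range could contain the queried date. It is an illustration of the pruning idea, not Redshift internals; `blocks_to_scan` is a made-up name.

```python
from datetime import date


def blocks_to_scan(zone_map, target):
    """Return indexes of blocks whose [min, max] range may contain target."""
    return [
        index
        for index, (low, high) in enumerate(zone_map)
        if low <= target <= high
    ]


# Per-block (min, max) dates as shown on the slide.
unsorted_blocks = [
    (date(2013, 6, 1), date(2013, 6, 20)),
    (date(2013, 6, 8), date(2013, 6, 30)),
    (date(2013, 6, 12), date(2013, 6, 20)),
    (date(2013, 6, 2), date(2013, 6, 25)),
]
sorted_blocks = [
    (date(2013, 6, 1), date(2013, 6, 6)),
    (date(2013, 6, 7), date(2013, 6, 12)),
    (date(2013, 6, 13), date(2013, 6, 18)),
    (date(2013, 6, 19), date(2013, 6, 24)),
]
target = date(2013, 6, 9)

if __name__ == "__main__":
    print("unsorted:", blocks_to_scan(unsorted_blocks, target))
    print("sorted:  ", blocks_to_scan(sorted_blocks, target))
```

With unsorted data three of the four blocks must be read for 9 June; with data sorted by date only one block qualifies.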
  82. 82. Parallel and Distributed Amazon
 Redshift Compute
 Node Compute
 Node Compute
 Node Leader
 Node SQL Clients / BI Tools Amazon S3 / Amazon DynamoDB / SSH Query Load / Export / Backup / Restore
  83. 83. Parallel and Distributed Amazon
 Redshift Compute
 Node Compute
 Node Compute
 Node Leader
 Node SQL Clients / BI Tools Amazon S3 / Amazon DynamoDB / SSH Compute
 Node Query Load / Export / Backup / Restore Resize
  84. 84. Load encrypted from S3 SSL to secure data in transit ECDHE perfect forward security Amazon VPC for network isolation Encryption to secure data at rest All blocks on disks & in Amazon S3 encrypted Block key, Cluster key, Master key (AES-256) On-premises HSM & AWS CloudHSM support Audit logging and AWS CloudTrail integration SOC 1/2/3, PCI-DSS, FedRAMP, BAA Amazon Redshift Security Amazon
 Redshift
  85. 85. Amazon Redshift Innovation Amazon
 Redshift Service Launch (2/14) PDX (4/2) Temp Credentials (4/11) DUB (4/25) SOC1/2/3 (5/8) Unload Encrypted Files NRT (6/5) JDBC Fetch Size (6/27) Unload logs (7/5) SHA1 Builtin (7/15) 4 byte UTF-8 (7/18) Sharing snapshots (7/18) Statement Timeout (7/22) Timezone, Epoch, Autoformat (7/25) WLM Timeout/Wildcards (8/1) CRC32 Builtin, CSV, Restore Progress (8/9) Resource Level IAM (8/9) PCI (8/22) UTF-8 Substitution (8/29) JSON, Regex, Cursors (9/10) Split_part, Audit tables (10/3) SIN/SYD (10/8) HSM Support (11/11) Kinesis EMR/HDFS/SSH copy, Distributed Tables, Audit Logging/CloudTrail, Concurrency, Resize Perf., Approximate Count Distinct, SNS Alerts, Cross Region Backup (11/13) Distributed Tables, Single Node Cursor Support, Maximum Connections to 500 (12/13) EIP Support for VPC Clusters (12/28) New query monitoring system tables and diststyle all (1/13) Redshift on DW2 (SSD) Nodes (1/23) Compression for COPY from SSH, Fetch size support for single node clusters, new system tables with commit stats, row_number(), strtol() and query termination (2/13) Resize progress indicator & Cluster Version (3/21) Regex_Substr, COPY from JSON (3/25) 50 slots, COPY from EMR, ECDHE ciphers (4/22) 3 new regex features, Unload to single file, FedRAMP (5/6) Rename Cluster (6/2) Copy from multiple regions, percentile_cont, percentile_disc (6/30) Free Trial (7/1) pg_last_unload_count (9/15) AES-128 S3 encryption (9/29) UTF-16 support (9/29) Well over 100 new features added since launch Release every two weeks Automatic patching
  86. 86. Amazon Redshift Features Amazon
 Redshift Approximate functions User defined functions Machine Learning Data Science Amazon ML
  87. 87. Amazon Redshift Ecosystem Amazon
 Redshift Data Integration Systems Integrators Business Intelligence
  88. 88. Amazon
 Redshift Structured
 Data Information
  89. 89. <demo> ... </demo>
  90. 90. Real-time
 Data
  91. 91. ? Data Stream Real-time
 Information
  92. 92. Amazon
 Kinesis Data Stream Real-time
 Information
  93. 93. A Platform for Streaming Data on AWS What is
 Amazon Kinesis? Amazon
 Kinesis Amazon
 Kinesis
 Streams Amazon
 Kinesis
 Firehose Amazon
 Kinesis
 Analytics
  94. 94. Amazon
 Kinesis
 Streams Amazon
 Kinesis Build your own custom applications
 that process or analyze streaming data
  95. 95. Amazon
 Kinesis
 Streams Amazon
 Kinesis Use the Kinesis Client Library (KCL)
 to consume data from Kinesis Streams
  96. 96. Amazon
 Kinesis
 Streams Amazon
 Kinesis AWS Lambda
 Functions Use AWS Lambda for a serverless architecture
  97. 97. Amazon
 Kinesis
 Streams Amazon
 Kinesis Low latency I/O. Configurable retention period from 1 to 7 days. The maximum size of a data blob is 1 MB. Each shard can support: up to 5 transactions / second and up to 2 MB / second for reads; up to 1,000 records / second and up to 1 MB / second for writes
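The per-shard limits on slide 97 lead directly to a sizing rule: a stream needs enough shards to satisfy whichever limit is hit first. The sketch below uses only the figures from the slide; the function name and the example workload are assumptions for illustration.

```python
import math

# Per-shard limits quoted on the slide.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0
READ_TX_PER_SHARD = 5


def shards_needed(write_mb_per_s, write_records_per_s, read_mb_per_s,
                  read_tx_per_s=0):
    """Smallest shard count that satisfies every per-shard limit."""
    return max(
        1,
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(write_records_per_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
        math.ceil(read_tx_per_s / READ_TX_PER_SHARD),
    )


if __name__ == "__main__":
    # Hypothetical workload: 4.5 MB/s and 3,000 records/s in, 12 MB/s out.
    print(shards_needed(4.5, 3000, 12))
```

In this hypothetical workload reads are the bottleneck, so the read limit determines the shard count.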
  98. 98. <demo> ... </demo>
  99. 99. Amazon
 Kinesis
 Firehose Amazon
 Kinesis Easily load massive volumes
 of streaming data into AWS
  100. 100. <demo> ... </demo>
  101. 101. Amazon
 Kinesis
 Analytics Amazon
 Kinesis Easily analyze streaming data with standard SQL (Coming Soon)
  102. 102. Amazon
 Kinesis Data Stream Real-time
 Information
  103. 103. Learning
 from Data
  104. 104. ? Data Model
  105. 105. Amazon
 Machine Learning
 (Amazon ML) Data Model
  106. 106. Machine learning is the technology that automatically finds patterns in your data and uses them to make predictions for new data points as they become available Your Data + Machine Learning
 = Smart Applications What is
 Machine Learning? Amazon
 Machine Learning
 (Amazon ML)
  107. 107. PERSONALIZE
  108. 108. UNDERSTAND YOUR CUSTOMER Who is my customer really? What does he really like? What is happening with my products? Where do people consume my product?
  109. 109. THREE TYPES OF DATA-DRIVEN ANALYSIS Retrospective analysis and reporting Here-and-now real-time processing and dashboards Predictions to enable smart applications
  110. 110. MACHINE LEARNING Technology that automatically finds patterns in your data and uses them to make predictions for new data points
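As a tiny instance of the definition on slide 110, finding a pattern in data and using it to predict a new data point, the sketch below fits a straight line by ordinary least squares. The data is made up and the technique is chosen only for brevity; it does not describe how Amazon ML builds its models.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new data point."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # pattern: y = 2x + 1
    print(model, predict(model, 5))
```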
  111. 111. MACHINE LEARNING
 IS EVERYWHERE
  112. 112. Amazon.com 1994
  113. 113. @caarlco
  114. 114. @caarlco
  115. 115. AMAZON MACHINE LEARNING Scalable & managed machine learning service
  116. 116. <demo> ... </demo>
  117. 117. BEST PRACTICES & LESSONS LEARNED
  118. 118. BEST PRACTICES USE ALL
 AVAILABLE DATA
 Your company has more data on your users than you think…
  119. 119. Quiz What percentage of data
 do firms use for analytics? A: 12% C: 52% B: 34% D: 68%
  120. 120. Quiz What percentage of data
 do firms use for analytics? A: 12% C: 52% B: 34% D: 68%
  121. 121. BEST PRACTICES ENRICH DATA BASED
 ON SOCIAL NETWORKS
 Users’ friends are valuable
 sources of information
  122. 122. 75% of users select movies based on recommendations
  123. 123. HOMOPHILY People are friends with people like them.
  124. 124. BEST PRACTICES ENRICH DATA BASED
 ON USER ENVIRONMENT
 User’s context heavily 
 influences their behavior.
  125. 125. Geo-location data Device information Time of day and week Metadata from third parties …
  126. 126. BEST PRACTICES ENRICH DATA BASED
 ON USER BEHAVIOR
 The way users interact with the
 UI gives valuable information
  127. 127. BEST PRACTICES THE MORE, THE BETTER
 (SOMETIMES)
 More data usually gives you 
 better prediction accuracy.
  128. 128. BEST PRACTICES MAKE NO PREMATURE ASSUMPTIONS
 We all have preconceived ideas about who our users are … Most often they are wrong.
  129. 129. BEST PRACTICES GET CLOSER
 TO YOUR USERS
 Your customers live in the real world.
 Use IoT to bring your services closer.
  130. 130. Easy. Buy a copper tube ice maker kit
  131. 131. …And a shut-off valve…
  132. 132. …And a T-connector to tap the cold water supply line…
  133. 133. …And a hacksaw to cut the copper pipe…
  134. 134. …Finally, A special drill bit to make a hole in the kitchen floor for the copper tubing.
  135. 135. “It is not the strongest species that survive, nor the most intelligent, but the ones most responsive to change.”
  136. 136. Amazon
 Machine Learning
 (Amazon ML) Data Model
  137. 137. Data
 Orchestration & Visualization
  138. 138. Data Orchestration can be a Task by Itself S3 Bucket Cluster EMR Cluster DynamoDB Table Redshift DB RDS Instance S3 Bucket On Premises
  139. 139. Helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals What is AWS
 Data Pipeline? AWS
 Data Pipeline
  140. 140. Access your data where it’s stored, transform and process it at scale, and efficiently transfer the results to other AWS services What is AWS
 Data Pipeline? AWS
 Data Pipeline
  141. 141. Create complex data processing workloads that are fault tolerant, repeatable, and highly available What is AWS
 Data Pipeline? AWS
 Data Pipeline
  142. 142. <demo> ... </demo>
  143. 143. Helps you migrate databases to AWS easily and securely: the source database remains fully operational during the migration, minimizing downtime to applications that rely on the database What is
 AWS Database
 Migration Service? AWS Database
 Migration Service Customer Premises Application Users AWS Internet VPN AWS Database Migration Service
  144. 144. Migrate off Oracle and SQL Server Move your tables, views, stored procedures and DML to MySQL, MariaDB, and Amazon Aurora AWS Schema Conversion Tool AWS Database
 Migration Service
  145. 145. Know exactly where manual edits are needed AWS Schema Conversion Tool AWS Database
 Migration Service
  146. 146. ? Structured
 Data Visual
  147. 147. AWS Marketplace Structured
 Data Visual
  148. 148. https://aws.amazon.com/marketplace
  149. 149. <demo> ... </demo>
  150. 150. Amazon
 QuickSight Structured
 Data Visual
  151. 151. A very fast, cloud-powered business intelligence (BI) service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from your data What is Amazon QuickSight? Amazon
 QuickSight
  152. 152. First analysis in about 60 seconds Amazon
 QuickSight Business user Sign-in
  153. 153. Amazon
 QuickSight Architecture Amazon
 QuickSight Business User QuickSight API Data Prep Metadata Suggestions Connectors SPICE Business User QuickSight UI Mobile Devices Web Browsers Partner BI products Amazon S3 Amazon Kinesis Amazon DynamoDB Amazon EMR Amazon Redshift Amazon RDS Files Third-party
  154. 154. Point to a Data Source
  155. 155. Visualize in Minutes
  156. 156. Smart Visualizations
  157. 157. Dynamically Optimized Graphics
  158. 158. Get Answers
 Fast Amazon
 QuickSight Amazon QuickSight uses SPICE – a Super-fast, Parallel, In-memory optimized Calculation Engine built from the ground up to generate answers on large datasets
  159. 159. Use AWS Partner
 BI Solutions with Amazon QuickSight Amazon
 QuickSight Amazon QuickSight provides partners
 a simple SQL-like interface to query the data stored in SPICE, so that customers can continue using their existing BI tools while benefiting from the faster performance delivered by SPICE
  160. 160. Tell a Story
 with Your Data Share insights
 and collaborate
 with others Amazon
 QuickSight Securely share your analysis with others in your organization by building interactive stories for collaboration using the storyboard and annotations. Recipients can further explore the data and respond back with their insights and knowledge, making the whole organization efficient and effective.
  161. 161. Amazon
 QuickSight Structured
 Data Visual
  162. 162. Let’s Put Everything Together
  163. 163. Unstructured
 Data
  164. 164. Unstructured
 Data S3 Bucket (unstructured)
  165. 165. Unstructured
 Data S3 Bucket (unstructured)
  166. 166. Unstructured
 Data S3 Bucket (unstructured) Cluster EMR Cluster S3 Bucket (structured) Reporting Apps
  167. 167. Unstructured
 Data S3 Bucket (unstructured) Cluster EMR Cluster S3 Bucket (structured) Reporting Apps
  168. 168. Unstructured
 Data S3 Bucket (unstructured) Cluster EMR Cluster S3 Bucket (structured) Elasticsearch
 Cluster Reporting Apps
  169. 169. Unstructured
 Data S3 Bucket (unstructured) Cluster EMR Cluster S3 Bucket (structured) Elasticsearch
 Cluster Reporting Apps
  170. 170. Unstructured
 Data Structured
 Data S3 Bucket (unstructured) Cluster EMR Cluster S3 Bucket (structured) Elasticsearch
 Cluster Reporting Apps
  171. 171. Unstructured
 Data Structured
 Data S3 Bucket (unstructured) Cluster EMR Cluster S3 Bucket (structured) Elasticsearch
 Cluster Reporting Apps
  172. 172. Unstructured
 Data Structured
 Data S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Reporting Apps
  173. 173. Unstructured
 Data Structured
 Data S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Reporting Apps
  174. 174. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams KCL AWS Lambda
 Functions Reporting Apps
  175. 175. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams KCL AWS Lambda
 Functions Reporting Apps
  176. 176. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams Amazon
 Kinesis
 Firehose KCL AWS Lambda
 Functions Reporting Apps
  177. 177. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams Amazon
 Kinesis
 Firehose KCL AWS Lambda
 Functions Reporting Apps
  178. 178. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams Amazon
 Kinesis
 Firehose Amazon
 Kinesis
 Analytics KCL AWS Lambda
 Functions Reporting Apps
  179. 179. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams Amazon
 Kinesis
 Firehose Amazon
 Kinesis
 Analytics KCL AWS Lambda
 Functions Reporting Apps
  180. 180. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams Amazon
 Kinesis
 Firehose Amazon
 Kinesis
 Analytics KCL AWS Lambda
 Functions Reporting Apps Amazon ML
  181. 181. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams Amazon
 Kinesis
 Firehose Amazon
 Kinesis
 Analytics KCL AWS Lambda
 Functions Reporting Apps Amazon ML
  182. 182. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams Amazon
 Kinesis
 Firehose Amazon
 Kinesis
 Analytics KCL AWS Lambda
 Functions Reporting Apps Amazon ML Amazon
 QuickSight AWS Data
 Pipeline
  183. 183. Unstructured
 Data Structured
 Data Data
 Stream S3 Bucket (unstructured) Cluster EMR Cluster DynamoDB Table S3 Bucket (structured) Redshift DB Elasticsearch
 Cluster Amazon
 Kinesis
 Streams Amazon
 Kinesis
 Firehose Amazon
 Kinesis
 Analytics KCL AWS Lambda
 Functions Reporting Apps Amazon ML Amazon
 QuickSight AWS Data
 Pipeline
  184. 184. Collect Store Analyze AWS Direct
 Connect AWS
 Import/Export
 Disk AWS
 Import/Export
 Snowball Amazon
 Kinesis
 Streams Amazon VPC
 VPN Connection AWS Database
 Migration Service AWS
 Data Pipeline Amazon
 Kinesis
 Firehose Amazon
 Kinesis
 Analytics AWS Storage
 Gateway Amazon S3 Amazon
 Glacier Amazon RDS Amazon
 Redshift Amazon
 Elasticsearch
 Service Amazon
 DynamoDB Amazon EMR Amazon EC2 Amazon EC2 Container Service Amazon ML Amazon
 QuickSight
  185. 185. Start Simple Amazon S3 + Amazon EMR or Amazon S3 + Amazon Redshift
  186. 186. Grow As You Need
  187. 187. Pay Only For What You Use
  188. 188. Data Analytics on AWS Danilo Poccia Technical Evangelist @danilop Carlos Conde Sr. Manager Technical Evangelism @caarlco
