Analytics on AWS: Structured, Unstructured and Streaming

Data comes in a variety of forms, and to gain insight from it you need the right platform in place. AWS has services to cover all types of data, whether you need databases for structured data, Hadoop for unstructured data, or a streaming engine for high-velocity data. This session covers the various data analytics services on AWS and when to use each.

Published in: Technology

  1. Analytics on AWS: Structured, Unstructured and Streaming. Russell Nash, AWS Solutions Architect
  2. Ingest, Store, Process, Analyse
  3. EC2: Linux, Windows
  4. Diagram (INGEST, STORE): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; Analytics Database, Amazon Redshift, Amazon RDS
  5. Amazon RDS (Relational Database Service): Traditional Database, Choice of Engines, Low Admin, Amazon Aurora
  6. Amazon Redshift: MPP SQL Database, Optimised for Analytics, Scalable
  7. Amazon RDS vs Amazon Redshift. Scaling: Vertical vs Horizontal. Workload: Mixed vs Analytical. Volumes: Low to Medium vs Medium to High. Type: SQL Relational for both
  8. Diagram (INGEST, STORE): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; ETL, Amazon Redshift, Amazon RDS
  9. AWS Database Migration Service, ETL, ETL Partners; Source Database; Amazon Redshift, Amazon RDS
  10. Diagram (INGEST, STORE): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; Storage, Search, Amazon Elasticsearch, Amazon Redshift, Amazon RDS
  11. Amazon Elasticsearch: Search and Analytics, Scalable, Integrated – Logstash, Kibana
  12. Diagram (INGEST, STORE): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; Storage, Amazon Elasticsearch, Amazon S3, Amazon Redshift, Amazon RDS
  13. Amazon S3: Object Storage, Low Cost, 11 9’s of durability
  14. PIG, HDFS
  15. PIG, Amazon EMR, Amazon S3, EMRFS, EMR
  16. Instance Types. CPU: c4 family, c3 family. Memory: x1 family, r3 family. Disk/IO: d2 family, i2 family. General: m4 family, m3 family. Workloads: Batch process, Machine learning, Spark and interactive, Large HDFS
  17. Cost & Time. Two charts of # CPUs against Time. Wall clock time: 1 hour. Wall clock time: 10 hours
  18. Spot Price – M3.2XL. On-Demand: $0.75. Spot price: $0.08
  19. Diagram (INGEST, STORE): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; Stream Processor, NoSQL, Amazon Kinesis, Amazon Elasticsearch, Amazon S3, Amazon Redshift, Amazon RDS
  20. Amazon Kinesis Stream: Data Sources; three Availability Zones; AWS Lambda, KCL App, EMR; S3, Redshift, Elasticsearch
  21. Diagram (INGEST, STORE): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; NoSQL, Amazon DynamoDB, Amazon Kinesis, Amazon Elasticsearch, Amazon S3, Amazon Redshift, Amazon RDS
  22. Scalability vs data complexity: RDBMS; NoSQL: key/value, document, graph
  23. Amazon DynamoDB: NoSQL Database, Key/Value + Document, Very Low Latency
  24. Low Latency
  25. Diagram (INGEST, STORE, PROCESS): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB, Amazon Redshift; Interactive, Batch, Streaming; Impala, Amazon Redshift, Amazon EMR, Hadoop
  26. Diagram (INGEST, STORE, PROCESS): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB, Amazon Redshift; Interactive, Batch, Streaming; Impala, Amazon Redshift, PIG, Amazon EMR, Hadoop
  27. FAST, RICH, INDUSTRY SUPPORT, FLEXIBLE
  28. Diagram (INGEST, STORE, PROCESS): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB, Amazon Redshift; Interactive, Batch, Streaming; Impala, Amazon Redshift, PIG, Amazon EMR, Hadoop, AWS Lambda, Kinesis Consumers
  29. Comparison of Query Engines: Amazon Redshift, Amazon Elasticsearch. Data Structure: Semi, Semi, Semi. Languages: Multiple, Full Text Search, SQL, Full SQL. Data Store: S3/HDFS, Local, S3/HDFS, Local. Performance
  30. Comparison of Query Engines: Amazon Redshift, Amazon Elasticsearch. Use Case: General Purpose Processing Engine; SQL Query Engine for S3/HDFS; Fully Featured SQL Analytics Database; Full Text search, Visualization
  31. Diagram (INGEST, STORE, PROCESS, ANALYSE): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB, Amazon Redshift; Interactive, Batch, Streaming; Impala, Amazon Redshift, PIG, Amazon EMR, Hadoop, AWS Lambda, Kinesis Consumers; BI Tools, Amazon Machine Learning, ML, Amazon QuickSight
  32. Diagram (INGEST, STORE, PROCESS, ANALYSE): Devices, Web Servers, App Servers, Mobile, Databases; Database Data, Log Data, High Velocity Data; Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB, Amazon Redshift; Interactive, Batch, Streaming; Impala, Amazon Redshift, PIG, Amazon EMR, Hadoop, AWS Lambda, Kinesis Consumers; BI Tools, Amazon Machine Learning, ML, Amazon QuickSight
  33. Visit bit.ly/aws-tech-chat or download via SoundCloud or iTunes
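
The Cost & Time and Spot Price slides (17 and 18) appear to make an arithmetic point: a cluster billed per node-hour costs roughly the same whether a job runs on few nodes for a long time or on many nodes for a short time. The sketch below illustrates that reading. The node counts (1 and 10), the assumption of perfectly linear scaling, and the treatment of the two slide prices as hourly rates are all assumptions, not figures from the deck; the function name `cluster_cost` is hypothetical.

```python
# Minimal sketch of the cost/time trade-off, under the assumptions stated above.

def cluster_cost(nodes: int, hours: float, price_per_node_hour: float) -> float:
    """Total cost of a cluster billed per node-hour."""
    return nodes * hours * price_per_node_hour

# Prices shown on the Spot Price slide for M3.2XL (assumed to be per hour).
ON_DEMAND_PRICE = 0.75
SPOT_PRICE = 0.08

# Assumed node counts: 1 node for 10 hours versus 10 nodes for 1 hour.
small_slow = cluster_cost(nodes=1, hours=10, price_per_node_hour=ON_DEMAND_PRICE)
large_fast = cluster_cost(nodes=10, hours=1, price_per_node_hour=ON_DEMAND_PRICE)

# The same 10-node, 1-hour run at the spot price.
large_fast_spot = cluster_cost(nodes=10, hours=1, price_per_node_hour=SPOT_PRICE)

print(f"1 node x 10 h, on-demand:  ${small_slow:.2f}")
print(f"10 nodes x 1 h, on-demand: ${large_fast:.2f}")
print(f"10 nodes x 1 h, spot:      ${large_fast_spot:.2f}")
```

Under these assumptions the two on-demand runs cost the same, so the larger cluster buys a shorter wall clock time at no extra cost, and spot pricing lowers the bill further. Real jobs rarely scale perfectly linearly, and spot prices vary over time.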
