Ran Tessler, Manager, Solutions Architecture
Building Big Data Applications on AWS
A Modern Take on Alchemy
Turning Data into Actionable Insights
What to Expect from this Session
Big Data architectural principles
Reference Lambda Architecture
Live demo
Architectural Principles
• Decoupled “data bus”
Data → Store → Process → Answers
• Use the right tool for the job
Latency, throughput, access patterns
• Apply Lambda architecture ideas
Immutable (append-only) log, batch/speed/serving layer
• Leverage AWS managed services
No/low admin
• Be cost conscious
Big data ≠ big cost
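The Lambda architecture ideas listed above (immutable append-only log, batch/speed/serving layers) can be illustrated with a small in-memory sketch. This is not AWS code: the `Event` and `LambdaSketch` names and the page-count use case are hypothetical, chosen only to show how the three layers relate.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """One immutable record in the log (hypothetical shape)."""
    page: str


@dataclass
class LambdaSketch:
    # Append-only master dataset: events are added, never updated or deleted.
    log: list = field(default_factory=list)
    # Batch layer output: recomputed from the whole log on each batch run.
    batch_view: Counter = field(default_factory=Counter)
    # Speed layer output: covers only events that arrived after the last batch run.
    speed_view: Counter = field(default_factory=Counter)

    def append(self, event):
        """Record a new event and update the low-latency view incrementally."""
        self.log.append(event)
        self.speed_view[event.page] += 1

    def run_batch(self):
        """Rebuild the batch view from the full log, then reset the speed view."""
        self.batch_view = Counter(e.page for e in self.log)
        self.speed_view = Counter()

    def query(self, page):
        """Serving layer: merge the batch view with the speed view."""
        return self.batch_view[page] + self.speed_view[page]


sketch = LambdaSketch()
sketch.append(Event("/index.html"))
sketch.append(Event("/index.html"))
sketch.run_batch()                   # batch view now covers both events
sketch.append(Event("/about.html"))  # visible at once through the speed view
print(sketch.query("/index.html"), sketch.query("/about.html"))
```

Because the log is never modified, the batch view can always be rebuilt from scratch, which is what makes it safe to discard the speed view after each batch run.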
Simplify Big Data Processing
Ingest / collect → Store → Process / analyze → Consume / visualize
Time to Answer (data freshness)
Throughput
Demo
http://aws.amazon.com/big-data/use-cases/
Access Log - Common Log Format (CLF)
75.35.230.210 - - [20/Jul/2009:22:22:42 -0700] "GET /images/pigtrihawk.jpg HTTP/1.1" 200 29236
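A minimal sketch of how a Common Log Format entry like the one above could be split into fields. The regular expression, the `parse_clf` name, and the output field names are assumptions made for illustration; they are not taken from the demo.

```python
import re
from datetime import datetime

# host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line):
    """Return a dict of fields for one CLF line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # %b expects English month abbreviations (true in the default C locale).
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # CLF writes "-" when no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # Split the request line into method, path, and protocol.
    method, _, rest = record["request"].partition(" ")
    path, _, protocol = rest.partition(" ")
    record.update(method=method, path=path, protocol=protocol)
    return record


sample = ('75.35.230.210 - - [20/Jul/2009:22:22:42 -0700] '
          '"GET /images/pigtrihawk.jpg HTTP/1.1" 200 29236')
record = parse_clf(sample)
print(record["method"], record["path"], record["status"], record["size"])
```

Returning `None` for lines that do not match lets a later processing step count or skip malformed entries instead of failing on them.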
Did NASA’s STS-69 Mission Land …
… On the right homepage?
Your First Big Data Application on AWS
COLLECT
STORE
PROCESS
ANALYZE & VISUALIZE
Your First Big Data Application on AWS
COLLECT: Amazon Kinesis Firehose
STORE
PROCESS: Amazon EMR with Spark & Hive
ANALYZE & VISUALIZE: Amazon Redshift and Amazon QuickSight
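The four stages can be sketched as plain Python functions to show how data moves between them. These are in-memory stand-ins, not calls to Amazon Kinesis Firehose, Amazon EMR, or Amazon Redshift; every function name, the batch size, and the second and third sample log lines are hypothetical.

```python
from collections import Counter


def collect(lines, batch_size=2):
    """COLLECT: group incoming records into small batches."""
    batch = []
    for line in lines:
        batch.append(line)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def store(batches):
    """STORE: keep every raw batch under its own key, never overwriting."""
    return {f"batch-{i:04d}": batch for i, batch in enumerate(batches)}


def process(objects):
    """PROCESS: turn raw lines into (path, status) rows, skipping malformed ones."""
    for key in sorted(objects):
        for line in objects[key]:
            parts = line.split('"')
            if len(parts) != 3:
                continue
            request, tail = parts[1].split(), parts[2].split()
            if len(request) == 3 and len(tail) == 2 and tail[0].isdigit():
                yield request[1], int(tail[0])


def analyze(rows):
    """ANALYZE: count requests per HTTP status code."""
    return Counter(status for _path, status in rows)


raw_lines = [
    '75.35.230.210 - - [20/Jul/2009:22:22:42 -0700] '
    '"GET /images/pigtrihawk.jpg HTTP/1.1" 200 29236',
    'this line is malformed',                      # hypothetical bad record
    '10.0.0.1 - - [20/Jul/2009:22:23:01 -0700] '
    '"GET /missing.html HTTP/1.1" 404 512',        # hypothetical record
]
status_counts = analyze(process(store(collect(raw_lines))))
print(dict(status_counts))
```

Each stage only depends on the output of the previous one, so any stage could be swapped for a managed service without changing the others, which is the point of the decoupled layout.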
Reference Lambda Architecture
[Diagram labels, grouped by layer; stages shown: Apps → data → store → process]
• Batch Layer: Amazon Kinesis S3 Connector, Amazon S3, Amazon Redshift, Amazon EMR (Presto, Hive, Pig, Spark)
• Speed Layer: Amazon Kinesis, KCL, AWS Lambda, Spark Streaming, Storm
• Serving Layer: Amazon ElastiCache, Amazon DynamoDB, Amazon RDS, Amazon ES
• Also shown: Amazon ML
Back to our demo…
COLLECT: Amazon Kinesis Firehose
STORE
PROCESS: Amazon EMR with Spark & Hive
ANALYZE & VISUALIZE: Amazon Redshift and Amazon QuickSight
DIY
Download all steps: http://bit.ly/29fhcwu
http://aws.amazon.com/big-data/use-cases/
tesslerr@amazon.com
