Rapid Prototyping for Big Data with AWS

Deep dive into the most talked about topic in the software industry — how to design and prototype complex Big Data solutions by leveraging the Cloud.

Published in: Data & Analytics
  1. RAPID PROTOTYPING FOR BIG DATA WITH AWS
     Tuesday, March 15, 2016, 8 AM PST / 4 PM BST / 5 PM CEST
     webinar@softserveinc.com
  2. SPEAKERS
     • Serge Haziyev, VP of Technology Services, SoftServe
     • Taras Bachynskyy, Data Architect, SoftServe
     • Vadim Astakhov, Solution Architect, Amazon Web Services
     • Ariel Weil, VP of Marketing and Business Development, Yottaa
  3. AGENDA
     • Big Data Prototyping
     • AWS as a Prototype Accelerator
     • Case study
     • Questions
  4. TYPICAL BIG DATA CHALLENGES
     Data Sources (from unstructured to structured): Archives, Docs, Business Apps, Media, Social Networks, Public Web, Data Storages, Machine Log Data, Sensor Data
     Velocity, Variety, Volume, Complexity: HIGH / MEDIUM / LOW
     Architecture Concerns:
     • Scalability
     • Performance
     • Extensibility
     • Data Quality
     • Fault-Tolerance and Availability
     • Security
     • Cost
     • Skills Availability
  5. WHY IS PROTOTYPING IMPORTANT?
     Typical signs to start prototyping:
     • Requirements are uncertain
     • Technologies are new
     • No comparable system has been previously developed
     • No full buy-in from the business
     Image caption: "They said they didn’t need a prototype"
  6. TYPES OF PROTOTYPES
     • Throwaway Prototype (Proof-of-Concept)
     • Horizontal Prototype
     • Vertical Evolutionary Prototype
     • Minimum Viable Product (MVP)
  7. WHEN AND WHY TO PROTOTYPE?
     Find more info at: “Strategic Prototyping for Developing Big Data Systems”, IEEE Software, March-April, 2016
     Project timeline (When?): Initial Architecture Analysis, Vertical Evolutionary Prototype, PoC, MVP, Rapid Horizontal Prototype
     Goals (Why?):
     • Identification of missing, conflicting or ambiguous architectural requirements
     • Creation of initial architecture design and selection of candidate technologies
     • Confirmation of user interface requirements and system scope
     • Demonstration version of the system to obtain buy-in from the business
     • Integration of selected technologies
     • Clarification of complex requirements
     • Testing critical functionality and quality attribute scenarios
     • Validation of technologies and scenarios that pose risks (PoCs)
     • Getting early feedback from end users and updating the product accordingly
     • Presentation of a working version to a trade show or customer event
     • Evaluation of team progress and alignment
  8. AGENDA
     • Big Data Prototyping Challenges
     • AWS as a Prototype Accelerator
     • Case study
     • Questions
  9. BIG DATA CHALLENGES
     Volume, Velocity, Variety
     Big Data; Real-time Big Data
  10. SIMPLIFY BIG DATA PROCESSING
     Stages: Ingest, Collect, Process, Analyze
     Data in, Answers out; Time
  11. AWS BIG DATA TECHNOLOGIES
     Layers: Ingest, Store, Process, Automate
     Services: EC2, EMR, Redshift, S3, Amazon Kinesis, Glacier, DynamoDB, AWS Direct Connect, AWS Import/Export, AWS Data Pipeline, VPN/Public Web
  12. BIG DATA PROCESSING
     Stages: Collect, Store, Process, Analyze
     Categories: Data Collection and Storage, Data Processing, Event Processing, Data Analysis
     Services: S3, Kinesis, DynamoDB, RDS (Aurora), AWS Lambda, KCL Apps, EMR, Redshift, Machine Learning
     Data in, Answers out
  13. DATA STRUCTURE AND QUERY TYPES VS STORAGE TECHNOLOGY
     Structured – Simple Query:
     • NoSQL: Amazon DynamoDB
     • Cache: Amazon ElastiCache (MemCached/Redis - PubSub)
     Structured – Complex Query:
     • SQL: Amazon RDS
     • DW SQL: Amazon Redshift
     Unstructured – No Query:
     • Cloud Storage: Amazon S3, Amazon Glacier
     Unstructured – Custom Query:
     • Search: Amazon CloudSearch
     • Hadoop/HDFS: Amazon Elastic MapReduce
     Axis labels: Complexity; Query Structure Complexity
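The quadrants on slide 13 can be read as a lookup from (data structure, query type) to candidate storage services. The sketch below is only an illustration of that reading: the dictionary layout and the function name `candidate_stores` are hypothetical, and it is not an AWS API or the speakers' tooling.

```python
from typing import Dict, List, Tuple

# Quadrants as listed on the slide. Keys are (data structure, query type);
# the key spelling and this table layout are assumptions for illustration.
STORAGE_BY_WORKLOAD: Dict[Tuple[str, str], List[str]] = {
    ("structured", "simple"): ["Amazon DynamoDB", "Amazon ElastiCache"],
    ("structured", "complex"): ["Amazon RDS", "Amazon Redshift"],
    ("unstructured", "none"): ["Amazon S3", "Amazon Glacier"],
    ("unstructured", "custom"): ["Amazon CloudSearch", "Amazon Elastic MapReduce"],
}


def candidate_stores(structure: str, query: str) -> List[str]:
    """Return the services the slide places in the matching quadrant."""
    key = (structure.strip().lower(), query.strip().lower())
    if key not in STORAGE_BY_WORKLOAD:
        raise ValueError(f"no quadrant for structure={structure!r}, query={query!r}")
    # Return a copy so callers cannot mutate the lookup table.
    return list(STORAGE_BY_WORKLOAD[key])


if __name__ == "__main__":
    print(candidate_stores("Structured", "Complex"))
```

A lookup like this only narrows the candidates; the slide that follows adds data temperature as a second selection criterion.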
  14. DATA CHARACTERISTICS: HOT, WARM, COLD
                   Hot        Warm      Cold
     Volume        MB–GB      GB–TB     PB
     Item size     B–KB       KB–MB     KB–TB
     Latency       ms         ms, sec   min, hrs
     Durability    Low–High   High      Very High
     Request rate  Very High  High      Low
     Cost/GB       $$-$       $-¢¢      ¢
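As a rough illustration, the hot/warm/cold table on slide 14 can be encoded as data and queried by the latency a workload can tolerate. The class and function names below (`TemperatureProfile`, `tiers_for_latency`) are hypothetical; the values are copied from the slide and no numeric thresholds are assumed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TemperatureProfile:
    """One column of the slide's table (names are illustrative)."""
    name: str
    volume: str
    item_size: str
    latency_units: Tuple[str, ...]
    durability: str
    request_rate: str


# Values transcribed from the slide; nothing here is measured or derived.
PROFILES = (
    TemperatureProfile("Hot", "MB-GB", "B-KB", ("ms",), "Low-High", "Very High"),
    TemperatureProfile("Warm", "GB-TB", "KB-MB", ("ms", "sec"), "High", "High"),
    TemperatureProfile("Cold", "PB", "KB-TB", ("min", "hrs"), "Very High", "Low"),
)


def tiers_for_latency(unit: str) -> List[str]:
    """List the tiers whose latency row on the slide mentions the given unit."""
    return [p.name for p in PROFILES if unit in p.latency_units]


if __name__ == "__main__":
    for unit in ("ms", "sec", "min", "hrs"):
        print(unit, tiers_for_latency(unit))
```

Keeping the table as data rather than as if/else thresholds avoids inventing cut-off values that the slide does not state.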
  15. WHAT DATA STORE SHOULD I USE?
     Data: Hot, Warm, Cold
  16. YOUR BIG DATA APPLICATION ON AWS
     • Log4J
     • EMR-Kinesis Connector
     • Hive with Amazon S3
     • Amazon Redshift parallel COPY from Amazon S3
     • Amazon Kinesis processing state
  17. AGENDA
     • Big Data Prototyping Challenges
     • AWS as a Prototype Accelerator
     • Case study
     • Questions
  18. YOTTAA CREATES AN ABSTRACTION LAYER ON TOP OF INFRASTRUCTURE, APP & VISITOR BROWSER
  19. YOTTAA’S PROXY-BASED SOLUTION SEES EVERY VISITOR REQUEST & INFRASTRUCTURE RESPONSE
     Diagram components: Primary Web (www) Domain, Visitor Browser, YOTTAA Network, WAF, Incumbent CDN, Resource Domain(s), 3rd Party WAF (if present), 3rd Party Domain(s), Asset Optimization, Non-optimized Assets
  20. REAL-TIME WEB ANALYTICS – LOB & IT USE CASES TO DRIVE YOTTAA’S BUSINESS FORWARD
     “The Business”: Customer Journey
     • User experience
     • Visitor Targeting
     • Vendor Attribution
     • Business Agility
     IT & Operations: Service Levels
     • Speed
     • Scalability
     • Security
     • Standards
  21. THE SOLUTION: IMPACTANALYTICS™ BIG DATA ANALYTICS FOR ACTIONABLE INSIGHT
     Complete Visibility
     • Centralized log delivery & analytics
     • Role-based Access Control
     • Dual-factor authentication
     • Account lockout
     Actionable Insights
     • Real-time traffic & threat analysis
     • Event management
     • In-line actions via Yottaa Portal
  22. TECHNICAL SOLUTION
     Architecture Drivers
     • Volume (> 100 TB scale)
     • Throughput (> 20K/sec)
     • Performance (low latency)
     • Exploratory analytics
     • Near Real-time (5 sec latency)
     • Historical view (5 years data)
     Solution: Lambda Architecture
     Combine different techniques:
     • Stream (recent data) – hot data
     • Batch (all data) – cold and warm
     Diagram: Web Logs; Master Data; Batch Layer (Batch Processing, Batch View); Speed Layer (Stream Processing, Real-time View); Serving Layer; Velocity, Variety, Volume
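The Lambda Architecture named on slide 22 combines a batch view over all data with a real-time view over recent data, and merges them in a serving layer. The toy sketch below shows only that idea, using in-memory page-hit counts; the event shape, the cutoff, and all function names are assumptions for illustration and do not reflect the Yottaa implementation.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical event shape: (timestamp in seconds, requested URL).
Event = Tuple[int, str]


def batch_view(master_data: Iterable[Event], cutoff: int) -> Counter:
    """Batch layer: recompute counts over every event older than the cutoff."""
    return Counter(url for ts, url in master_data if ts < cutoff)


def realtime_view(stream: Iterable[Event], cutoff: int) -> Counter:
    """Speed layer: count only the recent events the batch run has not covered."""
    return Counter(url for ts, url in stream if ts >= cutoff)


def serve(batch: Counter, realtime: Counter) -> Dict[str, int]:
    """Serving layer: answer queries by merging the two views."""
    return dict(batch + realtime)


if __name__ == "__main__":
    events = [(1, "/home"), (2, "/cart"), (7, "/home"), (9, "/home")]
    cutoff = 5  # the batch view covers timestamps below 5
    merged = serve(batch_view(events, cutoff), realtime_view(events, cutoff))
    print(merged)
```

In a real deployment the two views would live in separate stores and the cutoff would advance each time a batch run completes; here both are plain in-memory counters.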
  23. TECHNICAL SOLUTION. PROTOTYPE
     Technologies
     • S3
     • EMR
     • EC2
     • Redshift
     • Elasticsearch
     • Kafka
     • Flume
     Prototyping Phase: Speed Layer > 80% scenarios
     • Validate Elasticsearch aggregates
     • Compatibility & integration
     • Performance and load testing
     Diagram (PoC & Vertical Prototype): Web Logs; Logs Collector; Event Broker; Kafka-Elastic Consumer; Elastic Log Parser; Elastic Driver; EMR; S3; Redshift; Batch Layer; Speed Layer; Serving Layer
  24. BIG DATA PROTOTYPING
     • -as-a-Service = time-to-market
     • “on demand” model = cost economy
     • rich services portfolio
     • elasticity
     • Identify risks
     • Choose prototyping approach
     • Evaluate your decisions
     • Achieve required quality attributes
     • Determine needed hardware and configuration
     • Workload and concurrency matters
     Diagram labels: ARCHITECTURE DRIVERS, DECISIONS, ASSOCIATED SCENARIOS, ESTIMATED CAPACITY, SOLUTION RISKS; Analysis, Design, Evaluation; Required Actions, Why AWS, Big Data Challenges; Environment, Experience, Strategy
  25. AGENDA
     • Big Data Prototyping Challenges
     • AWS as a Prototype Accelerator
     • Case study
     • Questions
  26. QUESTIONS & ANSWERS
     e-mail your questions to webinar@softserveinc.com
  27. THANK YOU
     www.softserveinc.com
