Architecting for Big Data with AWS
1. BIG DATA WITH AWS
www.blazeclan.com
Phone: +91 20 6529 0035 | e-mail: info@blazeclan.com
- By Stany Simon
2. Agenda
Introduction to Big Data
When starting a Big Data Project
Cost vs. ROI
Architecting for Big Data
3. What is Big Data
Big data is an evolving term that describes any voluminous amount of
structured, semi-structured and unstructured data that has the potential to be
mined for information.
Big Data is less about the data itself and more about what you do with the
data.
4. Data vs. Information
Data is raw, unorganized facts
Information is derived from data
Data is useless unless useful information is derived from it
5. The 3 Vs of Big Data
Velocity: Speed at which data is created, processed & analyzed
Variety: Varying formats of the data (structured / unstructured)
Volume: Size of the data to be handled
6. Are the 3 V's enough to define Big Data today?
Veracity
Variability
Visualization
Value
Validity
Volatility
7. Where has Big Data Analytics helped?
Solving the Big Pie Mystery
8. Where has Big Data Analytics helped?
Faster Fast Food
9. Where has Big Data Analytics helped?
Who stole my Cheese?
10. Where has Big Data Analytics helped?
Tooth Fairy
11. When starting a Big Data Project
12. Identify a clear business need and value.
Build a strong data infrastructure to host and manage data.
Consider the time taken to reach a useful outcome.
[Figure: a Big Data Project in terms of Time, Money and Outcome]
13. Big Data Today….
Estimated to grow to 40 zettabytes by 2020.
Investment expected to top $114 billion by 2018.
Data market reached $27.36B in 2014.
Big Data market will top $84B in 2026.
BUT....
14. Gartner predicted that through 2017, 60% of Big Data projects would fail to go beyond piloting and experimentation and would be abandoned.
BUT....
30% of Big Data projects: 65% failure, 35% successful (5% with useful insights)
15. Case one
Brief: A retailer that was running a range of Big Data projects.
Objective: Mining all of their stock and purchase data.
Outcome: Importance of identifying a clear need and value
16. Case two
Brief: Web-based startup for mothers, focused on child development.
Objective: Collection of data from multiple sources for adaptive learning & behavioral pattern analysis.
Outcome: Importance of identifying a clear need, value and budget
17. Points to Ponder
Does this data help your organization with its business decisions?
How to capture the Data?
Where to store?
How to analyze?
Cost vs. ROI: Value
18. Big Data Break Down
Flow of Data
Ingest -> Store -> Process/Analyze -> Visualize
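The four phases of the data flow can be illustrated with a toy pipeline. This is only a minimal sketch: the function names, the comma-separated input format and the in-memory list used as "storage" are assumptions made for illustration, not part of the original deck or of any AWS service.

```python
from collections import Counter

def ingest(raw_lines):
    """Ingest: parse raw 'user, action' lines, skipping malformed ones."""
    records = []
    for line in raw_lines:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) == 2 and all(parts):
            records.append((parts[0], parts[1]))
    return records

def store(records, storage):
    """Store: append records to a storage backend (a plain list here)."""
    storage.extend(records)
    return storage

def analyze(storage):
    """Process/Analyze: count how often each action occurs."""
    return Counter(action for _user, action in storage)

def visualize(counts):
    """Visualize: render counts as text bars, most frequent action first."""
    return [f"{action:<10} {'#' * n}" for action, n in counts.most_common()]

# Hypothetical sample input, including one malformed line.
raw = ["alice, view", "bob, purchase", "alice, view", "malformed line", "carol, view"]
storage = store(ingest(raw), [])
report = visualize(analyze(storage))
for row in report:
    print(row)
```

Each phase only depends on the output of the previous one, so any single stage can be swapped for a managed service without touching the others.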
22. Points to Ponder
Size of the data to be stored.
Time for which the data needs to be stored.
Cost of storage per GB.
Criticality of the data in terms of security & recovery.
Availability of data.
Data Structure
Query Pattern
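The first three points (size, retention time, cost per GB) combine into a simple estimate. The sketch below assumes a flat monthly price per GB; the tier names and prices are hypothetical placeholders, not real AWS pricing.

```python
def storage_cost(size_gb, months, price_per_gb_month):
    """Total cost of keeping size_gb stored for the given number of months."""
    return size_gb * months * price_per_gb_month

# Hypothetical tiers and prices, for illustration only (not real AWS pricing).
tiers = {"frequent_access": 0.025, "archive": 0.005}

size_gb = 10_000   # assumed data size: 10 TB
months = 24        # assumed retention period

for name, price in tiers.items():
    print(f"{name}: ${storage_cost(size_gb, months, price):,.2f}")
```

Running the same numbers against several tiers makes the trade-off between cost, availability and criticality of the data easier to discuss.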
25. Phase 3: Process/ Analyse
Batch Processing
• Hourly, daily, weekly reports
• Works on huge data sets in one go
• Response Time: Might take a few hours to answer your questions
Real-Time Processing
• Minute-based reports
• Works on very small data sets
• Response Time: Takes a few seconds to answer your questions
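The difference between the two styles can be sketched on a tiny example. The event data, `batch_report` and `RealTimeCounter` below are hypothetical names chosen for illustration; real workloads would use dedicated batch and streaming tools rather than in-memory Python.

```python
from collections import defaultdict

# Hypothetical sample data: (minute, page) pairs.
events = [(0, "home"), (0, "cart"), (1, "home"), (1, "home"), (2, "cart")]

def batch_report(all_events):
    """Batch: scan the complete data set in one go and return final counts."""
    counts = defaultdict(int)
    for _minute, page in all_events:
        counts[page] += 1
    return dict(counts)

class RealTimeCounter:
    """Real-time: update a small running state as each event arrives."""

    def __init__(self):
        self.counts = defaultdict(int)

    def on_event(self, minute, page):
        self.counts[page] += 1
        return dict(self.counts)  # an up-to-date answer after every event

batch = batch_report(events)

live = RealTimeCounter()
snapshots = [live.on_event(minute, page) for minute, page in events]
```

The batch function gives one answer after reading everything, while the real-time counter gives a partial answer after every event; once all events have arrived, both agree.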
27. Phase 4: Visualize
Value to the users
28. Points to Ponder
Business:
Value of the insights from the project
Architecture:
• Frequency of data flowing in
• Size of data coming in
• Tools to be used for processing the data
• Scaling the infrastructure and high availability (HA) as per requirement
• Maintenance
• Cost of maintenance
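The first two architecture points, frequency and size of incoming data, can be turned into a rough volume estimate before choosing tools. This is a back-of-the-envelope sketch; the workload figures are hypothetical and the function name is an assumption made for illustration.

```python
def daily_ingest_gb(events_per_second, avg_event_bytes):
    """Approximate data volume arriving per day, in gigabytes (10**9 bytes)."""
    seconds_per_day = 24 * 60 * 60
    return events_per_second * avg_event_bytes * seconds_per_day / 10**9

# Hypothetical workload: 2,000 events per second at 500 bytes each.
per_day_gb = daily_ingest_gb(2_000, 500)
per_year_tb = per_day_gb * 365 / 1_000

print(f"{per_day_gb:.1f} GB/day, {per_year_tb:.1f} TB/year")
```

An estimate like this feeds directly into the storage, scaling and maintenance-cost questions listed above.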
29. Why AWS?
Scalability
Elasticity
Pay as you go
Maintenance & Management