BIG DATA WITH AWS
Architecting for Big Data with AWS
www.blazeclan.com
Phone: +91 20 6529 0035 | e-mail: info@blazeclan.com
- By Stany Simon
Agenda
Introduction to Big Data
When starting a Big Data Project
Cost Vs. ROI
Architecting for Big Data
What is Big Data
Big data is an evolving term that describes any voluminous amount of
structured, semi-structured and unstructured data that has the potential to be
mined for information.
Big Data is less about the data itself and more about what you do with the
data.
Data vs. Information
Data is raw, unorganized facts
Information is derived from data
Data is useless unless useful information is derived from it
The 3 Vs of Big Data
Velocity: Speed at which data is created, processed & analyzed
Variety: Varying formats of the data (structured / unstructured)
Volume: Size of the data to be handled
Are the 3 V's enough to define Big Data today?
Veracity
Variability
Visualization
Value
Validity
Volatility
Where has Big Data Analytics helped?
Solving the Big Pie Mystery
Where has Big Data Analytics helped?
Faster Fast Food
Where has Big Data Analytics helped?
Who stole my Cheese?
Where has Big Data Analytics helped?
Tooth Fairy
When starting a Big Data Project
Identify clear business need and value.
Build a strong data infrastructure to host and manage data.
Time taken for a useful outcome.
Big Data Project: Time, Money, Outcome
Big Data Today….
Estimated to grow to 40 zettabytes by 2020.
Investment expected to top $114 billion by 2018.
Data market reached $27.36B in 2014.
Big Data market will top $84B in 2026.
BUT....
Gartner predicted that through 2017, 60% of
Big Data projects will fail to go beyond piloting
and experimentation and will be abandoned.
BUT....
3. 30% of Big Data projects, 65% failure, 35% successful (5% with useful insights)
Case one
Brief: A retailer engaged in a range of Big Data projects.
Objective: Mining all of their stock and purchase data.
Importance of Identifying clear need and value
Outcome:
Case two
Brief: Web-based startup focused on mothers
on child development
Objective: Collection of data from multiple
sources for adaptive learning & behavioral
pattern analysis.
Outcome:
Importance of Identifying clear need and value
Budget
Points to Ponder
Does this data help your organization with its business decisions?
How to capture the Data?
Where to store?
How to analyze?
Cost vs. ROI- Value
Big Data Break Down
Flow of Data
Ingest → Store → Process/Analyze → Visualize
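The four-stage flow can be sketched as a chain of small functions. This is only an illustration: the in-memory list standing in for a data store, the comma-separated record format, and the sample records are all assumptions, not part of the deck.

```python
from collections import Counter


def ingest(raw_lines):
    """Ingest: parse raw 'source,event' lines into dictionaries."""
    records = []
    for line in raw_lines:
        source, event = line.strip().split(",")
        records.append({"source": source, "event": event})
    return records


def store(records, storage):
    """Store: append parsed records to a list standing in for a data store."""
    storage.extend(records)
    return storage


def process(storage):
    """Process/Analyze: count stored events per source."""
    return Counter(record["source"] for record in storage)


def visualize(counts):
    """Visualize: render the counts as a plain-text bar chart."""
    lines = []
    for source, count in sorted(counts.items()):
        lines.append(f"{source:<6}{'#' * count} {count}")
    return "\n".join(lines)


# Hypothetical sample input, not taken from the slides.
raw = ["web,click", "iot,reading", "web,purchase"]
storage = store(ingest(raw), [])
chart = visualize(process(storage))
print(chart)
```

In a real deployment each stage would be a separate service or tool, and the right choice for each depends on the points raised in the phases that follow.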
Phase 1: Ingestion
Transactional
Orders, Invoices, Travel Records
File
Server Logs
Stream
IOT
Phase 2: Storage
Database Storage
Cloud Storage
Stream Storage
AWS Services for Storage
Amazon CloudSearch
Amazon DynamoDB
Amazon ElastiCache
Amazon Elasticsearch Service
Amazon Kinesis
Amazon RDS
Amazon Glacier
Amazon S3
Amazon Redshift
Points to Ponder
Size of the data to be stored.
Time for which the data needs to be stored.
Cost of storage per GB.
Criticality of the data in terms of security & recovery.
Availability of data.
Data Structure
Query Pattern
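Three of these points (size, retention time, cost per GB) combine into a simple estimate. The sketch below assumes a fixed-size data set and flat per-GB monthly pricing; the tier names and prices are hypothetical placeholders, not actual AWS rates.

```python
def monthly_storage_cost(size_gb, price_per_gb_month):
    """Cost of holding size_gb for one month at a flat per-GB price."""
    return size_gb * price_per_gb_month


def retention_cost(size_gb, months, price_per_gb_month):
    """Total cost of keeping a fixed-size data set for the whole retention period."""
    return monthly_storage_cost(size_gb, price_per_gb_month) * months


# Hypothetical prices in USD per GB per month, for illustration only.
PRICE_PER_GB_MONTH = {"hot": 0.025, "archive": 0.005}

size_gb = 1000
months = 12
for tier, price in PRICE_PER_GB_MONTH.items():
    total = retention_cost(size_gb, months, price)
    print(f"{tier}: {total:.2f} USD over {months} months")
```

The estimate ignores request, retrieval and transfer charges, which is why availability and query pattern still need to be weighed separately.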
Traditional Architecture
Client Tier
Web-App Tier
Data Storage Tier
SQL
Storage Architecture
Storage Tier
SQL NoSQL Search Cache
Phase 3: Process/Analyze
Batch Processing
• Hourly, daily, weekly reports
• Works on very large data sets in one go
• Response Time: Might take a few hours to answer your questions
Real-Time Processing
• Minute based reports
• Works on very small data sets
• Response Time: Takes a few seconds to answer your questions
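The difference between the two modes can be illustrated in a few lines. This is a minimal sketch, assuming events arrive as (timestamp in seconds, value) pairs in ascending time order; the class name, the 60-second window, and the sample events are hypothetical.

```python
from collections import deque


def batch_total(events):
    """Batch: process the complete data set in one go."""
    return sum(value for _, value in events)


class SlidingWindowSum:
    """Real-time: keep a running total over only the most recent window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()
        self.total = 0

    def add(self, timestamp, value):
        """Add one event and return the total for the current window."""
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total


# Hypothetical events: (timestamp in seconds, value).
events = [(0, 5), (30, 3), (70, 4), (100, 1)]

window = SlidingWindowSum(window_seconds=60)
running = [window.add(t, v) for t, v in events]

print("batch total:", batch_total(events))
print("running window totals:", running)
```

The batch function needs the whole data set before it can answer, while the windowed version answers after every event but only ever sees a small, recent slice of the data.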
AWS Services
Amazon EMR
Amazon Kinesis
Amazon Redshift
AWS Lambda
Phase 4: Visualize
Value to the users
Points to Ponder
Business :
Value of the insights from the project
Architecture:
• Frequency of data flowing in
• Size of data coming in
• Tools to be used for processing the data
• Scaling the Infra & HA as per requirement
• Maintenance
• Cost of maintenance
Why AWS?
Scalability
Elasticity
Pay as you go
Maintenance & Management
Tooth Fairy-Architecture

 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

Architecting for Big Data with AWS

  • 1. BIG DATA WITH AWS Architecting for Big Data with AWS www.blazeclan.com Phone: +91 20 6529 0035 | e-mail: info@blazeclan.com - By Stany Simon
  • 2. Agenda: Introduction to Big Data; When starting a Big Data Project; Cost vs. ROI; Architecting for Big Data
  • 3. What is Big Data? Big data is an evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information. Big Data is less about the data itself and more about what you do with the data.
  • 4. Data vs. Information: Data is raw, unorganized facts. Information is derived from data. Data is useless unless useful information is derived from it.
  • 5. The 3 Vs of Big Data. Velocity: speed at which data is created, processed and analyzed. Variety: varying formats of the data (structured / unstructured). Volume: size of the data to be handled.
  • 6. Are the 3 Vs enough to define Big Data today? Veracity, Variability, Visualization, Value, Validity, Volatility.
  • 7. Where has Big Data Analytics helped? Solving the Big Pie Mystery
  • 8. Where has Big Data Analytics helped? Faster Fast Food
  • 9. Where has Big Data Analytics helped? Who stole my Cheese?
  • 10. Where has Big Data Analytics helped? Tooth Fairy
  • 11. When starting a Big Data Project
  • 12. Identify a clear business need and value. Build a strong data infrastructure to host and manage data. Time taken for a useful outcome. (Diagram labels: Time, Money, Outcome, Big Data Project)
  • 13. Big Data Today: Estimated to grow to 40 zettabytes by 2020. Investment expected to top $114 billion by 2018. Data market reached $27.36B in 2014. Big Data market will top $84B in 2026. BUT...
  • 14. Gartner predicted that through 2017, 60% of Big Data projects will fail to go beyond piloting and experimentation and will be abandoned. BUT... 3. 30% of Big Data projects, 65% failure, 35% successful (5% with useful insights)
  • 15. Case one. Brief: A retailer that was running a range of Big Data projects. Objective: Mining all of their stock and purchase data. Importance of identifying clear need and value. Outcome:
  • 16. Case two. Brief: Web-based startup for mothers, focused on child development. Objective: Collection of data from multiple sources for adaptive learning and behavioral pattern analysis. Outcome: Importance of identifying clear need and value. Budget
  • 17. Points to Ponder: Does this data help your organization with its business decisions? How to capture the data? Where to store it? How to analyze it? Cost vs. ROI: value.
  • 18. Big Data Break Down. Flow of Data: Ingest, Store, Process/Analyze, Visualize.
  • 19. Phase 1: Ingestion. Transactional: Orders, Invoices, Travel Records. File: Server Logs. Stream: IoT.
  • 20. Phase 2: Storage. Database Storage, Cloud Storage, Stream Storage.
  • 21. AWS Services for Storage: Amazon CloudSearch, Amazon DynamoDB, Amazon ElastiCache, Amazon Elasticsearch Service, Amazon Kinesis, Amazon RDS, Amazon Glacier, Amazon S3, Amazon Redshift.
  • 22. Points to Ponder: Size of the data to be stored. Time for which the data needs to be stored. Cost of storage per GB. Criticality of the data in terms of security and recovery. Availability of data. Data structure. Query pattern.
  • 23. Traditional Architecture: Client Tier, Web-App Tier, Data Storage Tier (SQL).
  • 24. Storage Architecture. Storage Tier: SQL, NoSQL, Search, Cache.
  • 25. Phase 3: Process/Analyze. Batch Processing: • Hourly, daily, weekly reports • Works on huge data sets in one go • Response time: might take a few hours to answer your questions. Real-Time Processing: • Minute-based reports • Works on very small data sets • Response time: takes a few seconds to answer your questions.
  • 26. AWS Services: Amazon EMR, Amazon Kinesis, Amazon Redshift, AWS Lambda.
  • 27. Phase 4: Visualize. Value to the users.
  • 28. Points to Ponder. Business: Value of the insights from the project. Architecture: • Frequency of data flowing in • Size of data coming in • Tools to be used for processing the data • Scaling the infrastructure and HA as per requirement • Maintenance • Cost of maintenance
  • 29. Why AWS? Scalability, Elasticity, Pay as you go, Maintenance & Management.
  • 30. Tooth Fairy: Architecture
  • 31. www.blazeclan.com Phone: +91 20 6529 0035 | e-mail: info@blazeclan.com
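
The contrast in slide 25 between batch processing (large data sets in one go) and real-time processing (small, recent data sets) can be illustrated with a minimal Python sketch. This is not taken from the deck: the `Event` type, the `batch_total` function, the `SlidingWindowTotal` class, and the 60-second window are hypothetical names and values chosen for illustration, and a production system would more likely use services such as those listed in slide 26.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical ingested record, e.g. one order."""
    timestamp: int   # seconds since an arbitrary start
    amount: float


def batch_total(events):
    """Batch style: scan the whole stored data set in one pass."""
    return sum(e.amount for e in events)


class SlidingWindowTotal:
    """Real-time style: keep a running total over recent events only."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()
        self._total = 0.0

    def add(self, event):
        """Add a new event, expire old ones, and return the current total."""
        self._events.append(event)
        self._total += event.amount
        cutoff = event.timestamp - self.window_seconds
        # Drop events that have fallen out of the time window.
        while self._events and self._events[0].timestamp <= cutoff:
            expired = self._events.popleft()
            self._total -= expired.amount
        return self._total


if __name__ == "__main__":
    events = [Event(0, 10.0), Event(30, 5.0), Event(70, 2.5)]
    print("batch total:", batch_total(events))
    window = SlidingWindowTotal(window_seconds=60)
    for e in events:
        print("window total at t=%d:" % e.timestamp, window.add(e))
```

The batch function revisits every stored record, so its cost grows with the data set, whereas the windowed total does a small amount of work per incoming event. That trade-off is one way to read the response-time difference the slide describes.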