SlideShare a Scribd company logo
1 of 29
Download to read offline
BIG DATA ANALYTICS
REFERENCE ARCHITECTURES AND
CASE STUDIES
Relational vs. Non-Relational Architecture
2
Relational Non-Relational
• Rational
• Predictable
• Traditional
• Agile
• Flexible
• Modern
Agenda
3
Big Data
Challenges
Big Data
Reference
Architectures
Case
Studies
Tips for
Designing
Big Data
Solutions
Big Data Challenges
4
UNSTRUCTURED
STRUCTURED
HIGH
MEDIUM
LOW
Archives Docs Business
Apps
Media Social
Networks
Public
Web
Data
Storages
Machine
Log Data
Sensor
Data
Data Storages
RDBMS, NoSQL, Hadoop, file systems
etc.
Machine Log Data
Application logs, event logs, server
data, CDRs, clickstream data etc.
Sensor Data
Smart electric meters, medical
devices, car sensors, road cameras
etc.
Archives
Scanned documents, statements,
medical records, e-mails etc..
Docs
XLS, PDF, CSV, HTML, JSON etc.
Business Apps
CRM, ERP systems, HR, project
management etc.
Social Networks
Twitter, Facebook, Google+,
LinkedIn etc.
Public Web
Wikipedia, news, weather, public
finance etc
Media
Images, video, audio etc.
Velocity Variety Volume
Complexity
Big Data Analytics
5
Traditional Analytics (BI) Big Data Analytics
Focus on
Data Sets
Supports
• Descriptive analytics
• Diagnosis analytics
• Limited data sets
• Cleansed data
• Simple models
• Large scale data sets
• More types of data
• Raw data
• Complex data models
• Predictive analytics
• Data Science
Causation: what happened,
and why?
Correlation: new insight
More accurate answers
vs
Big Data Analytics Use Cases
6
Data
Discovery
Business
Reporting
Real Time
Intelligence
Data Quality
Self Service
Business Users
Intelligent Agents
Consumers
Low Latency 
Reliability
Volume
Performance
Data Scientists/
Analysts
Big Data Analytics Reference Architectures
7
Architecture Drivers: Reference Architectures: 
▪ Volume
▪ Sources
▪ Throughput
▪ Latency
▪ Extensibility
▪ Data Quality
▪ Reliability
▪ Security
▪ Self-Service
▪ Cost
▪ Extended Relational
▪ Non-Relational
▪ Hybrid
Relational Reference Architecture
8
Web Services
Mobile
Devices
Native
Desktop
Web
Browsers
Advanced
Analytics
OLAP Cubes
Query &
Reporting
Operational
Data Stores
Data Marts
Data
Warehouses
Replication
API/ODBC
Messaging
ETL
Unstructured
Semi-
Structured
Data Sources Integration Data Storages Analytics Presentation
Structured
9
Extended Relational
Reference Architecture
Web Services
Mobile
Devices
Native
Desktop
Web
Browsers
Advanced
Analytics
OLAP Cubes
Query &
Reporting
Operational
Data Stores
Data Marts
Data
Warehouses
Replication
API/ODBC
Messaging
ETL
Unstructured
Semi-
Structured
Data Sources Integration Data Storages Analytics Presentation
Structured
Key components affected with Big Data challenges
Non-Relational Reference Architecture
10
Web Services
Mobile
Devices
Native
Desktop
Web
Browsers
Advanced
Analytics
Map Reduce
Query &
Reporting
Search Engines
Distributed File
Systems
NoSQL
Databases
API
Messaging
ETL
Unstructured
Semi-
Structured
Data Sources Integration Data Storages Analytics Presentation
Structured
Key components introduced with non-relational movement
Extended Relational vs. Non-Relational Architecture
11
Architecture Drivers
Extended 
Relational
Non‐Relational
Large data volume
Self‐service (ad‐hoc reporting)
Unstructured data processing
High data model extensibility
High data quality and consistency
Extensive security
Reliability and fault‐tolerance
Low latency (near‐real time)
Low cost
Skills availability
Extended Relational vs. Non-Relational Architecture
12
Architecture Drivers
Extended 
Relational
Non‐Relational
Large data volume
Self‐service (ad‐hoc reporting)
Unstructured data processing
High data model extensibility
High data quality and consistency
Extensive security
Reliability and fault‐tolerance
Low latency (near‐real time)
Low cost
Skills availability
Extended Relational vs. Non-Relational Architecture
13
Architecture Drivers
Extended 
Relational
Non‐Relational
Large data volume
Self‐service (ad‐hoc reporting)
Unstructured data processing
High data model extensibility
High data quality and consistency
Extensive security
Reliability and fault‐tolerance
Low latency (near‐real time)
Low cost
Skills availability
Relational vs. Non-Relational Architecture
14
Relational Non-Relational
• Rational
• Predictable
• Traditional
• Agile
• Flexible
• Modern
Data
Discovery
Business
Reporting
Real Time
Intelligence
Big Data Analytics Use Cases
15
Business Users
Intelligent Agents
Consumers
Performance
Volume
Data Scientists
Data Discovery: Non-Relational Architecture
16
Web Services
Mobile
Devices
Native
Desktop
Web
Browsers
Advanced
Analytics
Map Reduce
Query &
Reporting
Search Engines
Distributed File
Systems
NoSQL
Databases
API
Messaging
ETL
Unstructured
Semi-
Structured
Data Sources Integration Data Storages Analytics Presentation
Structured
Data
Discovery
Business
Reporting
Real Time
Intelligence
Big Data Analytics Use Cases
17
Intelligent Agents
Consumers
Data Scientists
Data Quality
Self Service
Business Users
Business Reporting: Hybrid Architecture
18
Web Services
Mobile
Devices
Native
Desktop
Web
Browsers
Map Reduce
SQL Query &
Reporting
Distributed File
Systems
API
Messaging
ETL
Unstructured
Semi-
Structured
Data Sources Integration Data Storages Analytics Presentation
Structured
Relational
DWH/DM
Advanced
Analytics
Search Engines
Extended Relational components Non-relational components
Data
Discovery
Business
Reporting
Real Time
Intelligence
Big Data Analytics Use Cases
19
Data Scientists Business Users
Intelligent Agents
Consumers
Low Latency 
Reliability
Lambda Architecture
20
Source:
21
Business Goals:
 Provide visual environment for building
custom mobile application
 Charge customers based on the platform
they are using, number of consumers’
applications etc.
Business Area:
Cloud based platform for building, deploying,
hosting and managing of mobile applications
Case Study #1: Usage & Billing Analysis
Architectural Decisions
22
▪ Volume (> 10 TB)
▪ Sources (Semi-structured - JSON)
▪ Throughput (> 10K/sec)
▪ Latency (2 min)
▪ Extensibility (Custom metrics)
▪ Data Quality (Consistency)
▪ Reliability (24/7)
▪ Security (Multitenancy)
▪ Self-Service (Ad-Hoc reports)
▪ Cost (The less the better )
▪ Constraints (Public Cloud)
Architecture Drivers:
Trade-off:
//
Extended
Relational
Non-Relational
Extensibility ‐ +
Data Quality + ‐
Self-Service + ‐
 Extended Relational Architecture
 Extensibility via Pre‐allocated 
Fields pattern 
Solution Architecture
23
Technologies:
• Amazon Redshift
• Amazon SQS
• Amazon S3
• Elastic Beanstalk
• Jaspersoft BI Professional
• Python
24
Business Goals:
 Build in-house Analytics Platform for ROI measurement
and performance analysis of every product and feature
delivered by the e-commerce platform;
 Provide the ability to understand how end-users are
interacting with service content, products, and features on
sites;
 Do clickstream analysis;
 Perform A/B Testing
Business Area:
Retail. A platform for e-commerce and
collecting feedbacks from customers
Case Study #2: Clickstream for retail website
//
Extended
Relational
Non-
Relational
Volume/Scalability +/‐ +
Throughput + +
Self-Service + +/‐
Extensibility ‐ +
Architectural Decisions
25
▪ Volume (45 TB)
▪ Sources (Semi-structured - JSON)
▪ Throughput (> 20K/sec)
▪ Latency (1 hour)
▪ Extensibility (Custom tags)
▪ Data Quality (Not critical)
▪ Reliability (24/7)
▪ Security (Multitenancy)
▪ Self-Service (Canned reports, Data
science)
▪ Cost (The less the better )
▪ Constraints (Public Cloud)
Architecture Drivers:
Trade-off:
 Non‐Relational Architecture
 Reporting via Materialized View
pattern
Solution Architecture
26
Technologies:
• Amazon S3
• Flume
• Hadoop/HDFS, MapReduce
• HBase
• Oozie
• Hive
Node 1
Node 2
Node N
Tips for Designing Big Data Solutions
27
 Understand data users and sources
 Discover architecture drivers
 Select proper reference architecture
 Do trade-off analysis, address cons
 Map reference architecture to technology stack
 Prototype, re-evaluate architecture
 Estimate implementation efforts
 Set up devops practices from the very beginning
 Advance in solution development through “small wins”
 Be ready for changes, big data technologies are evolving
rapidly
28
▪ Leading global Product and
Application Development partner
founded in 1993
▪ 3,300+ employees across North
America, Ukraine and Western
Europe
▪ Thousands of successful outsourcing
projects!
SaaS/Cloud Solutions . Mobility Solutions . UX/UI
BI/Analytics/Big Data . Software Architecture . Security
Clients include:
Thank You!
29
SoftServe US Office
One Congress Plaza,
111 Congress Avenue, Suite 2700 Austin, TX
78701
Tel: 512.516.8880
Contacts
Serhiy Haziyev: shaziyev@softserveinc.com
Olha Hrytsay: ohrytsay@softserveinc.com

More Related Content

What's hot

Data Virtualization - Enabling Next Generation Analytics
Data Virtualization - Enabling Next Generation AnalyticsData Virtualization - Enabling Next Generation Analytics
Data Virtualization - Enabling Next Generation AnalyticsDenodo
 
Digital intelligence satish bhatia
Digital intelligence satish bhatiaDigital intelligence satish bhatia
Digital intelligence satish bhatiaSatish Bhatia
 
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Denodo
 
Empowering your Enterprise with a Self-Service Data Marketplace (ASEAN)
Empowering your Enterprise with a Self-Service Data Marketplace (ASEAN)Empowering your Enterprise with a Self-Service Data Marketplace (ASEAN)
Empowering your Enterprise with a Self-Service Data Marketplace (ASEAN)Denodo
 
GDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationGDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationDenodo
 
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?Denodo
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionDenodo
 
Denodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with OktopusDenodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with OktopusDenodo
 
3 Reasons Data Virtualization Matters in Your Portfolio
3 Reasons Data Virtualization Matters in Your Portfolio3 Reasons Data Virtualization Matters in Your Portfolio
3 Reasons Data Virtualization Matters in Your PortfolioDenodo
 
Solution Architecture US healthcare
Solution Architecture US healthcare Solution Architecture US healthcare
Solution Architecture US healthcare sumiteshkr
 
Ten Pillars of World Class Data Virtualization
Ten Pillars of World Class Data VirtualizationTen Pillars of World Class Data Virtualization
Ten Pillars of World Class Data VirtualizationDenodo
 
Big Data Expo 2015 - Barnsten Why Data Modelling is Essential
Big Data Expo 2015 - Barnsten Why Data Modelling is EssentialBig Data Expo 2015 - Barnsten Why Data Modelling is Essential
Big Data Expo 2015 - Barnsten Why Data Modelling is EssentialBigDataExpo
 
Fixing data science & Accelerating Artificial Super Intelligence Development
 Fixing data science & Accelerating Artificial Super Intelligence Development Fixing data science & Accelerating Artificial Super Intelligence Development
Fixing data science & Accelerating Artificial Super Intelligence DevelopmentManojKumarR41
 
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...Denodo
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationDenodo
 
Agile Data Management with Enterprise Data Fabric (ASEAN)
Agile Data Management with Enterprise Data Fabric (ASEAN)Agile Data Management with Enterprise Data Fabric (ASEAN)
Agile Data Management with Enterprise Data Fabric (ASEAN)Denodo
 
A Big Data Journey
A Big Data JourneyA Big Data Journey
A Big Data JourneyPaul Boal
 
Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Denodo
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure CloudCaserta
 
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)Denodo
 

What's hot (20)

Data Virtualization - Enabling Next Generation Analytics
Data Virtualization - Enabling Next Generation AnalyticsData Virtualization - Enabling Next Generation Analytics
Data Virtualization - Enabling Next Generation Analytics
 
Digital intelligence satish bhatia
Digital intelligence satish bhatiaDigital intelligence satish bhatia
Digital intelligence satish bhatia
 
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
 
Empowering your Enterprise with a Self-Service Data Marketplace (ASEAN)
Empowering your Enterprise with a Self-Service Data Marketplace (ASEAN)Empowering your Enterprise with a Self-Service Data Marketplace (ASEAN)
Empowering your Enterprise with a Self-Service Data Marketplace (ASEAN)
 
GDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationGDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data Virtualization
 
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An Introduction
 
Denodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with OktopusDenodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with Oktopus
 
3 Reasons Data Virtualization Matters in Your Portfolio
3 Reasons Data Virtualization Matters in Your Portfolio3 Reasons Data Virtualization Matters in Your Portfolio
3 Reasons Data Virtualization Matters in Your Portfolio
 
Solution Architecture US healthcare
Solution Architecture US healthcare Solution Architecture US healthcare
Solution Architecture US healthcare
 
Ten Pillars of World Class Data Virtualization
Ten Pillars of World Class Data VirtualizationTen Pillars of World Class Data Virtualization
Ten Pillars of World Class Data Virtualization
 
Big Data Expo 2015 - Barnsten Why Data Modelling is Essential
Big Data Expo 2015 - Barnsten Why Data Modelling is EssentialBig Data Expo 2015 - Barnsten Why Data Modelling is Essential
Big Data Expo 2015 - Barnsten Why Data Modelling is Essential
 
Fixing data science & Accelerating Artificial Super Intelligence Development
 Fixing data science & Accelerating Artificial Super Intelligence Development Fixing data science & Accelerating Artificial Super Intelligence Development
Fixing data science & Accelerating Artificial Super Intelligence Development
 
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow Presentation
 
Agile Data Management with Enterprise Data Fabric (ASEAN)
Agile Data Management with Enterprise Data Fabric (ASEAN)Agile Data Management with Enterprise Data Fabric (ASEAN)
Agile Data Management with Enterprise Data Fabric (ASEAN)
 
A Big Data Journey
A Big Data JourneyA Big Data Journey
A Big Data Journey
 
Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure Cloud
 
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
 

Similar to big_data_case_studies.pdf

Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...SoftServe
 
Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Denodo
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dataconomy Media
 
Neo4j GraphTalks - Einführung in Graphdatenbanken
Neo4j GraphTalks - Einführung in GraphdatenbankenNeo4j GraphTalks - Einführung in Graphdatenbanken
Neo4j GraphTalks - Einführung in GraphdatenbankenNeo4j
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudJames Serra
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarioskcmallu
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview Rajesh Menon
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2RojaT4
 
Big Data with Not Only SQL
Big Data with Not Only SQLBig Data with Not Only SQL
Big Data with Not Only SQLPhilippe Julio
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptxElsonPaul2
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Nathan Bijnens
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneySai Paravastu
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitecturePerficient, Inc.
 
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02email2jl
 
Data Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWSData Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWSDenodo
 
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data VirtualizationKASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data VirtualizationDenodo
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
Big and fast data strategy 2017 jr
Big and fast data strategy 2017 jrBig and fast data strategy 2017 jr
Big and fast data strategy 2017 jrJonathan Raspaud
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxAIMLSEMINARS
 

Similar to big_data_case_studies.pdf (20)

Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
 
Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
Neo4j GraphTalks - Einführung in Graphdatenbanken
Neo4j GraphTalks - Einführung in GraphdatenbankenNeo4j GraphTalks - Einführung in Graphdatenbanken
Neo4j GraphTalks - Einführung in Graphdatenbanken
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloud
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2
 
Big Data with Not Only SQL
Big Data with Not Only SQLBig Data with Not Only SQL
Big Data with Not Only SQL
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptx
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
 
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
 
Data Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWSData Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWS
 
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data VirtualizationKASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Big and fast data strategy 2017 jr
Big and fast data strategy 2017 jrBig and fast data strategy 2017 jr
Big and fast data strategy 2017 jr
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptx
 

More from vishal choudhary (20)

SE-Lecture1.ppt
SE-Lecture1.pptSE-Lecture1.ppt
SE-Lecture1.ppt
 
SE-Testing.ppt
SE-Testing.pptSE-Testing.ppt
SE-Testing.ppt
 
SE-CyclomaticComplexityand Testing.ppt
SE-CyclomaticComplexityand Testing.pptSE-CyclomaticComplexityand Testing.ppt
SE-CyclomaticComplexityand Testing.ppt
 
SE-Lecture-7.pptx
SE-Lecture-7.pptxSE-Lecture-7.pptx
SE-Lecture-7.pptx
 
Se-Lecture-6.ppt
Se-Lecture-6.pptSe-Lecture-6.ppt
Se-Lecture-6.ppt
 
SE-Lecture-5.pptx
SE-Lecture-5.pptxSE-Lecture-5.pptx
SE-Lecture-5.pptx
 
XML.pptx
XML.pptxXML.pptx
XML.pptx
 
SE-Lecture-8.pptx
SE-Lecture-8.pptxSE-Lecture-8.pptx
SE-Lecture-8.pptx
 
SE-coupling and cohesion.ppt
SE-coupling and cohesion.pptSE-coupling and cohesion.ppt
SE-coupling and cohesion.ppt
 
SE-Lecture-2.pptx
SE-Lecture-2.pptxSE-Lecture-2.pptx
SE-Lecture-2.pptx
 
SE-software design.ppt
SE-software design.pptSE-software design.ppt
SE-software design.ppt
 
SE1.ppt
SE1.pptSE1.ppt
SE1.ppt
 
SE-Lecture-4.pptx
SE-Lecture-4.pptxSE-Lecture-4.pptx
SE-Lecture-4.pptx
 
SE-Lecture=3.pptx
SE-Lecture=3.pptxSE-Lecture=3.pptx
SE-Lecture=3.pptx
 
Multimedia-Lecture-Animation.pptx
Multimedia-Lecture-Animation.pptxMultimedia-Lecture-Animation.pptx
Multimedia-Lecture-Animation.pptx
 
MultimediaLecture5.pptx
MultimediaLecture5.pptxMultimediaLecture5.pptx
MultimediaLecture5.pptx
 
Multimedia-Lecture-7.pptx
Multimedia-Lecture-7.pptxMultimedia-Lecture-7.pptx
Multimedia-Lecture-7.pptx
 
MultiMedia-Lecture-4.pptx
MultiMedia-Lecture-4.pptxMultiMedia-Lecture-4.pptx
MultiMedia-Lecture-4.pptx
 
Multimedia-Lecture-6.pptx
Multimedia-Lecture-6.pptxMultimedia-Lecture-6.pptx
Multimedia-Lecture-6.pptx
 
Multimedia-Lecture-3.pptx
Multimedia-Lecture-3.pptxMultimedia-Lecture-3.pptx
Multimedia-Lecture-3.pptx
 

Recently uploaded

Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 

Recently uploaded (20)

Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 

big_data_case_studies.pdf

  • 1. BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
  • 2. Relational vs. Non-Relational Architecture 2 Relational Non-Relational • Rational • Predictable • Traditional • Agile • Flexible • Modern
  • 4. Big Data Challenges 4 UNSTRUCTURED STRUCTURED HIGH MEDIUM LOW Archives Docs Business Apps Media Social Networks Public Web Data Storages Machine Log Data Sensor Data Data Storages RDBMS, NoSQL, Hadoop, file systems etc. Machine Log Data Application logs, event logs, server data, CDRs, clickstream data etc. Sensor Data Smart electric meters, medical devices, car sensors, road cameras etc. Archives Scanned documents, statements, medical records, e-mails etc.. Docs XLS, PDF, CSV, HTML, JSON etc. Business Apps CRM, ERP systems, HR, project management etc. Social Networks Twitter, Facebook, Google+, LinkedIn etc. Public Web Wikipedia, news, weather, public finance etc Media Images, video, audio etc. Velocity Variety Volume Complexity
  • 5. Big Data Analytics 5 Traditional Analytics (BI) Big Data Analytics Focus on Data Sets Supports • Descriptive analytics • Diagnosis analytics • Limited data sets • Cleansed data • Simple models • Large scale data sets • More types of data • Raw data • Complex data models • Predictive analytics • Data Science Causation: what happened, and why? Correlation: new insight More accurate answers vs
  • 6. Big Data Analytics Use Cases 6 Data Discovery Business Reporting Real Time Intelligence Data Quality Self Service Business Users Intelligent Agents Consumers Low Latency  Reliability Volume Performance Data Scientists/ Analysts
  • 7. Big Data Analytics Reference Architectures 7 Architecture Drivers: Reference Architectures:  ▪ Volume ▪ Sources ▪ Throughput ▪ Latency ▪ Extensibility ▪ Data Quality ▪ Reliability ▪ Security ▪ Self-Service ▪ Cost ▪ Extended Relational ▪ Non-Relational ▪ Hybrid
  • 8. Relational Reference Architecture 8 Web Services Mobile Devices Native Desktop Web Browsers Advanced Analytics OLAP Cubes Query & Reporting Operational Data Stores Data Marts Data Warehouses Replication API/ODBC Messaging ETL Unstructured Semi- Structured Data Sources Integration Data Storages Analytics Presentation Structured
  • 9. 9 Extended Relational Reference Architecture Web Services Mobile Devices Native Desktop Web Browsers Advanced Analytics OLAP Cubes Query & Reporting Operational Data Stores Data Marts Data Warehouses Replication API/ODBC Messaging ETL Unstructured Semi- Structured Data Sources Integration Data Storages Analytics Presentation Structured Key components affected with Big Data challenges
  • 10. Non-Relational Reference Architecture 10 Web Services Mobile Devices Native Desktop Web Browsers Advanced Analytics Map Reduce Query & Reporting Search Engines Distributed File Systems NoSQL Databases API Messaging ETL Unstructured Semi- Structured Data Sources Integration Data Storages Analytics Presentation Structured Key components introduced with non-relational movement
  • 11. Extended Relational vs. Non-Relational Architecture 11 Architecture Drivers Extended  Relational Non‐Relational Large data volume Self‐service (ad‐hoc reporting) Unstructured data processing High data model extensibility High data quality and consistency Extensive security Reliability and fault‐tolerance Low latency (near‐real time) Low cost Skills availability
  • 12. Extended Relational vs. Non-Relational Architecture 12 Architecture Drivers Extended  Relational Non‐Relational Large data volume Self‐service (ad‐hoc reporting) Unstructured data processing High data model extensibility High data quality and consistency Extensive security Reliability and fault‐tolerance Low latency (near‐real time) Low cost Skills availability
  • 13. Extended Relational vs. Non-Relational Architecture 13 Architecture Drivers Extended  Relational Non‐Relational Large data volume Self‐service (ad‐hoc reporting) Unstructured data processing High data model extensibility High data quality and consistency Extensive security Reliability and fault‐tolerance Low latency (near‐real time) Low cost Skills availability
  • 14. Relational vs. Non-Relational Architecture 14 Relational Non-Relational • Rational • Predictable • Traditional • Agile • Flexible • Modern
  • 15. Data Discovery Business Reporting Real Time Intelligence Big Data Analytics Use Cases 15 Business Users Intelligent Agents Consumers Performance Volume Data Scientists
  • 16. Data Discovery: Non-Relational Architecture 16 Web Services Mobile Devices Native Desktop Web Browsers Advanced Analytics Map Reduce Query & Reporting Search Engines Distributed File Systems NoSQL Databases API Messaging ETL Unstructured Semi- Structured Data Sources Integration Data Storages Analytics Presentation Structured
  • 17. Data Discovery Business Reporting Real Time Intelligence Big Data Analytics Use Cases 17 Intelligent Agents Consumers Data Scientists Data Quality Self Service Business Users
  • 18. Business Reporting: Hybrid Architecture 18 Web Services Mobile Devices Native Desktop Web Browsers Map Reduce SQL Query & Reporting Distributed File Systems API Messaging ETL Unstructured Semi- Structured Data Sources Integration Data Storages Analytics Presentation Structured Relational DWH/DM Advanced Analytics Search Engines Extended Relational components Non-relational components
  • 19. Data Discovery Business Reporting Real Time Intelligence Big Data Analytics Use Cases 19 Data Scientists Business Users Intelligent Agents Consumers Low Latency  Reliability
  • 21. 21 Business Goals:  Provide visual environment for building custom mobile application  Charge customers based on the platform they are using, number of consumers’ applications etc. Business Area: Cloud based platform for building, deploying, hosting and managing of mobile applications Case Study #1: Usage & Billing Analysis
  • 22. Architectural Decisions 22 ▪ Volume (> 10 TB) ▪ Sources (Semi-structured - JSON) ▪ Throughput (> 10K/sec) ▪ Latency (2 min) ▪ Extensibility (Custom metrics) ▪ Data Quality (Consistency) ▪ Reliability (24/7) ▪ Security (Multitenancy) ▪ Self-Service (Ad-Hoc reports) ▪ Cost (The less the better ) ▪ Constraints (Public Cloud) Architecture Drivers: Trade-off: // Extended Relational Non-Relational Extensibility ‐ + Data Quality + ‐ Self-Service + ‐  Extended Relational Architecture  Extensibility via Pre‐allocated  Fields pattern 
  • 23. Solution Architecture 23 Technologies: • Amazon Redshift • Amazon SQS • Amazon S3 • Elastic Beanstalk • Jaspersoft BI Professional • Python
  • 24. 24 Business Goals:  Build in-house Analytics Platform for ROI measurement and performance analysis of every product and feature delivered by the e-commerce platform;  Provide the ability to understand how end-users are interacting with service content, products, and features on sites;  Do clickstream analysis;  Perform A/B Testing Business Area: Retail. A platform for e-commerce and collecting feedbacks from customers Case Study #2: Clickstream for retail website
  • 25. // Extended Relational Non- Relational Volume/Scalability +/‐ + Throughput + + Self-Service + +/‐ Extensibility ‐ + Architectural Decisions 25 ▪ Volume (45 TB) ▪ Sources (Semi-structured - JSON) ▪ Throughput (> 20K/sec) ▪ Latency (1 hour) ▪ Extensibility (Custom tags) ▪ Data Quality (Not critical) ▪ Reliability (24/7) ▪ Security (Multitenancy) ▪ Self-Service (Canned reports, Data science) ▪ Cost (The less the better ) ▪ Constraints (Public Cloud) Architecture Drivers: Trade-off:  Non‐Relational Architecture  Reporting via Materialized View pattern
  • 26. Solution Architecture 26 Technologies: • Amazon S3 • Flume • Hadoop/HDFS, MapReduce • HBase • Oozie • Hive Node 1 Node 2 Node N
  • 27. Tips for Designing Big Data Solutions 27  Understand data users and sources  Discover architecture drivers  Select proper reference architecture  Do trade-off analysis, address cons  Map reference architecture to technology stack  Prototype, re-evaluate architecture  Estimate implementation efforts  Set up devops practices from the very beginning  Advance in solution development through “small wins”  Be ready for changes, big data technologies are evolving rapidly
  • 28. 28 ▪ Leading global Product and Application Development partner founded in 1993 ▪ 3,300+ employees across North America, Ukraine and Western Europe ▪ Thousands of successful outsourcing projects! SaaS/Cloud Solutions . Mobility Solutions . UX/UI BI/Analytics/Big Data . Software Architecture . Security Clients include:
  • 29. Thank You! 29 SoftServe US Office One Congress Plaza, 111 Congress Avenue, Suite 2700 Austin, TX 78701 Tel: 512.516.8880 Contacts Serhiy Haziyev: shaziyev@softserveinc.com Olha Hrytsay: ohrytsay@softserveinc.com