SlideShare a Scribd company logo
1 of 36
SOFTSERVE’S
HADOOP DEMO LAB
Aug, 2014
Agenda
1) Who we are & What we do
2) Why we started this project
3) High-Level Task Overview
4) Task Analysis
5) Solution Architecture
6) Trade-off Analysis
7) Development Aspects
WHO WE ARE
&
WHAT WE DO
4
▪ Leading global Product and
Application Development partner
founded in 1993
▪ 3,300+ employees across North
America, Ukraine and Western
Europe
▪ Thousands of successful outsourcing
projects!
SaaS/Cloud Solutions . Mobility Solutions . UX/UI
BI/Analytics/Big Data . Software Architecture . Security
Clients include:
Why SoftServe
• Dedicated Architecture Group (including BI and BigData, 40+ architects)
• Demo Hadoop Environment
• Reference architecture library
• 10+ successful BI/BigData projects
• Certified Big Data engineers (Hadoop, MongoDB)
• Partnership with major RDBMS, BI and BigData vendors
What we do: Services
1) Design & Assessment
2) Optimization & Modernization
3) POC & Prototyping
4) Development and Quality Control
5) Production and Non-Production Support
Technology Stack
Data Warehouse
Data Integration
Big Data and NoSQL
BI Platforms
Big Data Analytics Expertise
WHY WE STARTED THIS PROJECT
Demo Lab: Why?
1) Increase Internal Experience
2) Create Reference Solution w/o NDA Limitations
3) Get Playground for Tests
4) Provide Demo Environment for Customers (using their data)
5) Decrease time to Deliver (by introducing automation)
HADOOP DEMO LAB:
HIGH-LEVEL TASK OVERVIEW
Demo Lab: Input Data
Data Volume
270-300 Web Servers (Apache HTTPD)
447 392 events per minute
644 245 094 events / day
~100-250 bytes per event
150GB of data per day
Log Types
1) Apache HTTPD access log
2) Apache HTTPD error log
3) Service log (CPU, RAM, Disk I/O, Disk Space)
4) Application server servlet log
Retention
Last 30 days: Raw data
Last 24 hours: per minute aggregation
Whole period: per hour aggregation
Demo Lab: Log Data Examples
Access log:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
Error log:
[Sun Mar 7 20:58:27 2004] [info] [client 64.242.88.10] (104)Connection reset by peer: client
stopped connection before send body completed
[Sun Mar 7 21:16:17 2004] [error] [client 24.70.56.49] File does not exist:
/home/httpd/twiki/view/Main/WebHome
Vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 305416 260688 29160 2356920 2 2 4 1 0 0 6 1 92 2 0
iostat
Linux 2.6.32-100.28.5.el6.x86_64 (dev-db) 07/09/2011
avg-cpu: %user %nice %system %iowait %steal %idle
5.68 0.00 0.52 2.03 0.00 91.76
TASK ANALYSIS
Architecture Drivers: Use Cases
Architecture Drivers: Quality Attributes (1/3)
Architecture Drivers: Quality Attributes (2/3)
Architecture Drivers: Quality Attributes (3/3)
Architecture Drivers: Limitations
Demo Lab: Marketecture
SOLUTION ARCHITECTURE
Reference Architecture Selection
Options:
• Traditional Relational
• Extended Relational
• Non-Relational
• Lambda Architecture (Hybrid)
• Data Refinery (Hybrid)
Lambda Architecture:
• Simultaneous access to real-time and historical data
• Isolated Design and Development
• Increased Fault Tolerance
Lambda Architecture
Data Flow
Infrastructure View
TRADE-OFF ANALYSIS
Distribution Selection
Hive Stinger vs Impala
Compression Ratio
Access Speed
BI Tool Selection
Options:
• Tableau
• JasperSoft
• Microstrategy
• QlikView
Microstrategy:
• Powerful and Feature-Rich BI Tool
• 31 days trial period w/o trial key
• Well-integrated with Hadoop (and Impala)
• Easy to install in a silent-mode (command-line)
DEVELOPMENT ASPECTS
Automatization
80% 20%
Development and Debugging F&P Testing, Demo
Local Development Cloud Development
Log Generator
4200 events / second (File source)
5500 events / second (TCP source)
Accurate Sizing
100k/min
50k/min
20k/min
200k
/min
Calculator!
Reference
35Click to add the title
1) Install Hadoop (CDH4) on 5 nodes with VMWare, CDH4, Cloudera Manager 4
https://www.youtube.com/watch?v=CobVqNMiqww
2) Puppet & Vagrant Tutorial
http://puppetlabs.com/blog/puppet-and-vagrant-tutorial
3) Hardware for Hadoop
http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.0/bk_cluster-planning-guide
/content/hardware-selection-for-hbase.html
4) How to Refine and Visualize Server Log Data
http://hortonworks.com/hadoop-tutorial/how-to-refine-and-visualize-server-log-data/
5) Hadoop Cluster Sizing
http://hortonworks.com/wp-content/uploads/downloads/2013/06/
Hortonworks.ClusterConfigGuide.1.0.pdf
Thank You!
36
SoftServe US Office
One Congress Plaza,
111 Congress Avenue, Suite 2700 Austin, TX
78701
Tel: 512.516.8880
Contacts
Valentyn Kropov
vkrop@softserveinc.com
Tel: 866.687.3588 x4341

More Related Content

Similar to SoftServe's Hadoop Demo Lab Architecture and Development

Microsoft Machine Learning Server. Architecture View
Microsoft Machine Learning Server. Architecture ViewMicrosoft Machine Learning Server. Architecture View
Microsoft Machine Learning Server. Architecture ViewDmitry Petukhov
 
Azure Nights August2017
Azure Nights August2017Azure Nights August2017
Azure Nights August2017Michael Frank
 
Data(?)Ops with CircleCI
Data(?)Ops with CircleCIData(?)Ops with CircleCI
Data(?)Ops with CircleCIJinwoong Kim
 
Mid-Term Presentation
Mid-Term Presentation Mid-Term Presentation
Mid-Term Presentation HarshJivani2
 
Oracle_Patching_Untold_Story_Final_Part2.pdf
Oracle_Patching_Untold_Story_Final_Part2.pdfOracle_Patching_Untold_Story_Final_Part2.pdf
Oracle_Patching_Untold_Story_Final_Part2.pdfAlex446314
 
Moving to the next neth server ui by @davideprincipi #neth17
Moving to the next neth server ui by @davideprincipi #neth17Moving to the next neth server ui by @davideprincipi #neth17
Moving to the next neth server ui by @davideprincipi #neth17NethServer
 
Webinar: Untethering Compute from Storage
Webinar: Untethering Compute from StorageWebinar: Untethering Compute from Storage
Webinar: Untethering Compute from StorageAvere Systems
 
Brk3288 sql server v.next with support on linux, windows and containers was...
Brk3288 sql server v.next with support on linux, windows and containers   was...Brk3288 sql server v.next with support on linux, windows and containers   was...
Brk3288 sql server v.next with support on linux, windows and containers was...Bob Ward
 
KoprowskiT_SQLSatMoscow_WASDforBeginners
KoprowskiT_SQLSatMoscow_WASDforBeginnersKoprowskiT_SQLSatMoscow_WASDforBeginners
KoprowskiT_SQLSatMoscow_WASDforBeginnersTobias Koprowski
 
Code for Startup MVP (Ruby on Rails) Session 1
Code for Startup MVP (Ruby on Rails) Session 1Code for Startup MVP (Ruby on Rails) Session 1
Code for Startup MVP (Ruby on Rails) Session 1Henry S
 
ContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
ContainerCon EU 2016 - Software-Defined Storage and Container SchedulersContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
ContainerCon EU 2016 - Software-Defined Storage and Container SchedulersDavid vonThenen
 
#SPSBrussels 2017 vincent biret #azure #functions microsoft #flow
#SPSBrussels 2017 vincent biret #azure #functions microsoft #flow#SPSBrussels 2017 vincent biret #azure #functions microsoft #flow
#SPSBrussels 2017 vincent biret #azure #functions microsoft #flowVincent Biret
 
Introduction to Microsoft Flow and Azure Functions
Introduction to Microsoft Flow and Azure FunctionsIntroduction to Microsoft Flow and Azure Functions
Introduction to Microsoft Flow and Azure FunctionsBIWUG
 
Le novità di SQL Server 2022
Le novità di SQL Server 2022Le novità di SQL Server 2022
Le novità di SQL Server 2022Gianluca Hotz
 
State of GeoServer 2015
State of GeoServer 2015State of GeoServer 2015
State of GeoServer 2015Jody Garnett
 
2010/09 - Database Architechs - Performance & Tuning Tool
2010/09 - Database Architechs - Performance & Tuning Tool2010/09 - Database Architechs - Performance & Tuning Tool
2010/09 - Database Architechs - Performance & Tuning ToolDatabase Architechs
 
Monitoring IO performance with iostat and pt-diskstats
Monitoring IO performance with iostat and pt-diskstatsMonitoring IO performance with iostat and pt-diskstats
Monitoring IO performance with iostat and pt-diskstatsBen Mildren
 
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogicRakuten Group, Inc.
 
Design & Secure Your Cloud Infrastructure
Design & Secure Your Cloud Infrastructure Design & Secure Your Cloud Infrastructure
Design & Secure Your Cloud Infrastructure Anoop Nair
 

Similar to SoftServe's Hadoop Demo Lab Architecture and Development (20)

Microsoft Machine Learning Server. Architecture View
Microsoft Machine Learning Server. Architecture ViewMicrosoft Machine Learning Server. Architecture View
Microsoft Machine Learning Server. Architecture View
 
Azure Nights August2017
Azure Nights August2017Azure Nights August2017
Azure Nights August2017
 
DR_PRESENT 1
DR_PRESENT 1DR_PRESENT 1
DR_PRESENT 1
 
Data(?)Ops with CircleCI
Data(?)Ops with CircleCIData(?)Ops with CircleCI
Data(?)Ops with CircleCI
 
Mid-Term Presentation
Mid-Term Presentation Mid-Term Presentation
Mid-Term Presentation
 
Oracle_Patching_Untold_Story_Final_Part2.pdf
Oracle_Patching_Untold_Story_Final_Part2.pdfOracle_Patching_Untold_Story_Final_Part2.pdf
Oracle_Patching_Untold_Story_Final_Part2.pdf
 
Moving to the next neth server ui by @davideprincipi #neth17
Moving to the next neth server ui by @davideprincipi #neth17Moving to the next neth server ui by @davideprincipi #neth17
Moving to the next neth server ui by @davideprincipi #neth17
 
Webinar: Untethering Compute from Storage
Webinar: Untethering Compute from StorageWebinar: Untethering Compute from Storage
Webinar: Untethering Compute from Storage
 
Brk3288 sql server v.next with support on linux, windows and containers was...
Brk3288 sql server v.next with support on linux, windows and containers   was...Brk3288 sql server v.next with support on linux, windows and containers   was...
Brk3288 sql server v.next with support on linux, windows and containers was...
 
KoprowskiT_SQLSatMoscow_WASDforBeginners
KoprowskiT_SQLSatMoscow_WASDforBeginnersKoprowskiT_SQLSatMoscow_WASDforBeginners
KoprowskiT_SQLSatMoscow_WASDforBeginners
 
Code for Startup MVP (Ruby on Rails) Session 1
Code for Startup MVP (Ruby on Rails) Session 1Code for Startup MVP (Ruby on Rails) Session 1
Code for Startup MVP (Ruby on Rails) Session 1
 
ContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
ContainerCon EU 2016 - Software-Defined Storage and Container SchedulersContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
ContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
 
#SPSBrussels 2017 vincent biret #azure #functions microsoft #flow
#SPSBrussels 2017 vincent biret #azure #functions microsoft #flow#SPSBrussels 2017 vincent biret #azure #functions microsoft #flow
#SPSBrussels 2017 vincent biret #azure #functions microsoft #flow
 
Introduction to Microsoft Flow and Azure Functions
Introduction to Microsoft Flow and Azure FunctionsIntroduction to Microsoft Flow and Azure Functions
Introduction to Microsoft Flow and Azure Functions
 
Le novità di SQL Server 2022
Le novità di SQL Server 2022Le novità di SQL Server 2022
Le novità di SQL Server 2022
 
State of GeoServer 2015
State of GeoServer 2015State of GeoServer 2015
State of GeoServer 2015
 
2010/09 - Database Architechs - Performance & Tuning Tool
2010/09 - Database Architechs - Performance & Tuning Tool2010/09 - Database Architechs - Performance & Tuning Tool
2010/09 - Database Architechs - Performance & Tuning Tool
 
Monitoring IO performance with iostat and pt-diskstats
Monitoring IO performance with iostat and pt-diskstatsMonitoring IO performance with iostat and pt-diskstats
Monitoring IO performance with iostat and pt-diskstats
 
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
 
Design & Secure Your Cloud Infrastructure
Design & Secure Your Cloud Infrastructure Design & Secure Your Cloud Infrastructure
Design & Secure Your Cloud Infrastructure
 

Recently uploaded

Challengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya Shirtrahman018755
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...SofiyaSharma5
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012rehmti665
 
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts servicevipmodelshub1
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...aditipandeya
 
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on DeliveryCall Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Deliverybabeytanya
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024APNIC
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...APNIC
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Sheetaleventcompany
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebJames Anderson
 
Gram Darshan PPT cyber rural in villages of india
Gram Darshan PPT cyber rural  in villages of indiaGram Darshan PPT cyber rural  in villages of india
Gram Darshan PPT cyber rural in villages of indiaimessage0108
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Roomdivyansh0kumar0
 
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts servicesonalikaur4
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGAPNIC
 
Radiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsRadiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsstephieert
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Dana Luther
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersDamian Radcliffe
 

Recently uploaded (20)

Challengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
 
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
 
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
 
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on DeliveryCall Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Gram Darshan PPT cyber rural in villages of india
Gram Darshan PPT cyber rural  in villages of indiaGram Darshan PPT cyber rural  in villages of india
Gram Darshan PPT cyber rural in villages of india
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
 
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOG
 
Radiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsRadiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girls
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
 

SoftServe's Hadoop Demo Lab Architecture and Development

  • 2. Agenda 1) Who we are & What we do 2) Why we started this project 3) High-Level Task Overview 4) Task Analysis 5) Solution Architecture 6) Trade-off Analysis 7) Development Aspects
  • 4. 4 ▪ Leading global Product and Application Development partner founded in 1993 ▪ 3,300+ employees across North America, Ukraine and Western Europe ▪ Thousands of successful outsourcing projects! SaaS/Cloud Solutions . Mobility Solutions . UX/UI BI/Analytics/Big Data . Software Architecture . Security Clients include:
  • 5. Why SoftServe • Dedicated Architecture Group (including BI and BigData, 40+ architects) • Demo Hadoop Environment • Reference architecture library • 10+ successful BI/BigData projects • Certified Big Data engineers (Hadoop, MongoDB) • Partnership with major RDBMS, BI and BigData vendors
  • 6. What we do: Services 1) Design & Assessment 2) Optimization & Modernization 3) POC & Prototyping 4) Development and Quality Control 5) Production and Non-Production Support
  • 7. Technology Stack Data Warehouse Data Integration Big Data and NoSQL BI Platforms
  • 8. Big Data Analytics Expertise
  • 9. WHY WE STARTED THIS PROJECT
  • 10.
  • 11. Demo Lab: Why? 1) Increase Internal Experience 2) Create Reference Solution w/o NDA Limitations 3) Get Playground for Tests 4) Provide Demo Environment for Customers (using their data) 5) Decrease time to Deliver (by introducing automation)
  • 13. Demo Lab: Input Data Data Volume 270-300 Web Servers (Apache HTTPD) 447 392 events per minute 644 245 094 events / day ~100-250 bytes per event 150GB of data per day Log Types 1) Apache HTTPD access log 2) Apache HTTPD error log 3) Service log (CPU, RAM, Disk I/O, Disk Space) 4) Application server servlet log Retention Last 30 days: Raw data Last 24 hours: per minute aggregation Whole period: per hour aggregation
  • 14. Demo Lab: Log Data Examples Access log: 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 Error log: [Sun Mar 7 20:58:27 2004] [info] [client 64.242.88.10] (104)Connection reset by peer: client stopped connection before send body completed [Sun Mar 7 21:16:17 2004] [error] [client 24.70.56.49] File does not exist: /home/httpd/twiki/view/Main/WebHome Vmstat procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 305416 260688 29160 2356920 2 2 4 1 0 0 6 1 92 2 0 iostat Linux 2.6.32-100.28.5.el6.x86_64 (dev-db) 07/09/2011 avg-cpu: %user %nice %system %iowait %steal %idle 5.68 0.00 0.52 2.03 0.00 91.76
  • 17. Architecture Drivers: Quality Attributes (1/3)
  • 18. Architecture Drivers: Quality Attributes (2/3)
  • 19. Architecture Drivers: Quality Attributes (3/3)
  • 23. Reference Architecture Selection Options: • Traditional Relational • Extended Relational • Non-Relational • Lambda Architecture (Hybrid) • Data Refinery (Hybrid) Lambda Architecture: • Simultaneous access to real-time and historical data • Isolated Design and Development • Increased Fault Tolerance
  • 29. Hive Stinger vs Impala Compression Ratio Access Speed
  • 30. BI Tool Selection Options: • Tableau • JasperSoft • Microstrategy • QlikView Microstrategy: • Powerful and Feature-Rich BI Tool • 31 days trial period w/o trial key • Well-integrated with Hadoop (and Impala) • Easy to install in a silent-mode (command-line)
  • 32. Automatization 80% 20% Development and Debugging F&P Testing, Demo Local Development Cloud Development
  • 33. Log Generator 4200 events / second (File source) 5500 events / second (TCP source)
  • 35. Reference 35Click to add the title 1) Install Hadoop (CDH4) on 5 nodes with VMWare, CDH4, Cloudera Manager 4 https://www.youtube.com/watch?v=CobVqNMiqww 2) Puppet & Vagrant Tutorial http://puppetlabs.com/blog/puppet-and-vagrant-tutorial 3) Hardware for Hadoop http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.0/bk_cluster-planning-guide /content/hardware-selection-for-hbase.html 4) How to Refine and Visualize Server Log Data http://hortonworks.com/hadoop-tutorial/how-to-refine-and-visualize-server-log-data/ 5) Hadoop Cluster Sizing http://hortonworks.com/wp-content/uploads/downloads/2013/06/ Hortonworks.ClusterConfigGuide.1.0.pdf
  • 36. Thank You! 36 SoftServe US Office One Congress Plaza, 111 Congress Avenue, Suite 2700 Austin, TX 78701 Tel: 512.516.8880 Contacts Valentyn Kropov vkrop@softserveinc.com Tel: 866.687.3588 x4341

Editor's Notes

  1. All of these vendors are used in our projects. Among them Global leaders (Oracle, IBM, IBI, SAS, SAP) and many niche players (Infobright, Pentaho etc.) With proprietary software and open source