MULTI-TENANT HADOOP
THE CHALLENGE OF
MAINTAINING HIGH SLAS
Edouard ROUSSEAUX
EDF-DTEO-DSIT-ITO-DATACENTER
Big Data Tech Lead
SUMMARY
Multi-tenant Hadoop - The challenge of maintaining high SLAs | April 2018
1. EDF CONTEXT
PRESENTATION
2. BIG DATA STRATEGY OF EDF-DSIT (IT PRODUCTION DEPARTMENT)
HISTORY
ARCHITECTURE CHOICES
BIG DATA SERVICE OFFER
3. CHALLENGES TO TAKE UP
CURRENT STATE
DIAGNOSTIC
FOCUS
4. ACTION PLAN
TECHNICAL POINTS
ORGANIZATIONAL POINTS
EDF CONTEXT
A WORLD LEADER IN LOW-CARBON ENERGY, THE EDF GROUP BRINGS TOGETHER ALL THE BUSINESSES OF
PRODUCTION, TRADE AND ELECTRICITY NETWORKS.
EDF-DSIT PROVIDES IT-SERVICES TO SUPPORT
THE GROUP IN ITS DIGITAL TRANSFORMATION
BIG DATA STRATEGY OF EDF-DSIT
A SHARED DATALAKE
ARCHITECTURE CHOICES
Shared Platform for all Group businesses:
- Centralization of data
- Economic efficiency
- Sharing / Cross Analyzing data between Business
- Simplify operations
A specific platform per type of use case, in
order to guarantee SLAs, performance and flexibility:
- Development / Pre-production
- Production (mainly used as a backend for production
applications)
- Backup / Disaster Recovery
- Analytics (soon)
These architectural choices have a very strong
impact on the performance of our infrastructures and
applications
HISTORY OF BIG DATA AT EDF
We started in 2012 with a first cluster (Hadoop v1)
- The Trade division wanted to start cross-analyzing data
- 4 recycled hosts
… exponential growth …
Today:
- 3 physical environments (a 4th one soon)
- We are using Hortonworks (HDP, HDF)
- 200 hosts
- HDFS ≈ 1.4 PB (usable space)
- YARN ≈ 14.6 TB of RAM / 4,600 vCores
- HBase ≈ 8.2 TB of RAM
- Biggest HBase table ≈ 90 TB (8k regions)
BIG DATA STRATEGY OF EDF-DSIT
BIG DATA SERVICE OFFER
AN OFFER THAT FOUND ITS AUDIENCE
− 5 business divisions, 50 production applications
− 10k+ YARN jobs per day (in production)
− 250 HBase tables / 25k HBase regions
− 150+ Hive DBs
− 500+ users
− 24/7 applications
− 1 HA applications
− All kinds of applications:
• Batch (ELT)
• Streaming / Real-time
• OLAP
• OLTP
• Big volumes / Small volumes
• Critical / Non-critical applications
« BIG DATA » SERVICE OFFER
Business-oriented offer:
- A price catalog
- Very simple units of work: TB and vCores
- Global SLA on shared services available on our clusters
(HDFS, HBase, Kafka, …)
- Organization, processes, …
« Self-service » consumption by the business lines
The design of this service offer also has a strong impact on
the performance of infrastructures and applications
OUR BIG DATA INFRASTRUCTURE HAS BECOME ESSENTIAL TO OUR BUSINESSES
CHALLENGES TO TAKE UP
AN OFFER « VICTIM » OF ITS OWN SUCCESS
SITUATION LAST SUMMER
Many critical use cases for businesses
Hard time maintaining the expected level of service :
- Instability / Unavailability
- Difficulty communicating
[Chart: Availability of services in 2017 (%), by month from June to November, for Hive, HBase, HDFS and YARN]
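To make monthly availability percentages easier to interpret, the conversion to hours of downtime can be sketched as below. This is a minimal illustration and not part of the original deck: the function name `downtime_hours`, the 30-day month and the sample values are assumptions.

```python
def downtime_hours(availability_pct: float, days_in_month: int = 30) -> float:
    """Return the hours of unavailability implied by a monthly availability figure.

    Assumption: the service is expected to be up 24/7, so the reference
    period is days_in_month * 24 hours.
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    total_hours = days_in_month * 24
    return total_hours * (100.0 - availability_pct) / 100.0


if __name__ == "__main__":
    # Illustrative values only; they are not read off the chart.
    for pct in (99.9, 99.0, 96.0):
        print(f"{pct}% over 30 days -> {downtime_hours(pct):.1f} h of downtime")
```

Seen this way, the gap between 99.9 % and 96 % is the gap between well under an hour and more than a full day of unavailability per month.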
CHALLENGES TO TAKE UP
DIAGNOSTIC
TECHNICAL ISSUES
Improve the way we operate our shared Big Data Infrastructures:
- Insufficient Metrics
- Monitoring didn’t evolve as fast as application types
- Complex diagnostics
- Sizing forecasts were not accurate
Some applications were developed with anti-patterns
Lack of rigor when putting applications in production
- Scale-out and Performance tests
- No code review
Internal billing based on storage and CPU (insufficient) :
- Does not accurately reflect cluster usage
- Does not encourage a virtuous use of clusters
NEED FOR FURTHER TECHNICAL &
PRODUCTION SUPPORT
CHALLENGES TO TAKE UP
DIAGNOSTIC
ORGANIZATIONAL ISSUES
« Self-Service » Offer:
- Partial control of what runs on infrastructure
- Lack of a global view of cluster usage
SLA not accurate enough:
- No concept of degraded service
- Business and operations teams perceive the service level differently
Improvement of our production skills
- Shift in skill type required
Shared infrastructure:
- Governance challenge (change management, ...)
- Business use cases impact each other
- Accurate resource allocation is essential (according to needs)
Capacity Planning :
- Not sufficiently detailed
- Not easy to quantify the potential impact of business use cases
(Restricted vision / Sizing)
NEED MORE COORDINATION BETWEEN TEAMS
CHALLENGES TO TAKE UP
IN HINDSIGHT
EVOLVING CLUSTER USAGE
Until Summer 2017 : Only « ELT » usage
New type of business use cases :
- Intensive use of HBASE
- Huge volumes
- Critical use cases
Lack of anticipation:
- Impact of those new use cases not assessed
- More than 1,000 regions per RegionServer
- HDFS saturation
- HBase collapsed
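The region-density problem above comes down to simple arithmetic, sketched here as a capacity check. This is only an illustration under stated assumptions: the helper names are hypothetical, the 25,000-region total is the figure quoted earlier in the deck, and the target of 200 regions per server is an assumed value, not a recommendation from the source.

```python
def regions_per_server(total_regions: int, region_servers: int) -> float:
    """Average number of regions hosted by each RegionServer."""
    if region_servers <= 0:
        raise ValueError("region_servers must be positive")
    return total_regions / region_servers


def servers_needed(total_regions: int, max_regions_per_server: int) -> int:
    """Smallest number of RegionServers keeping the average at or below the limit."""
    if max_regions_per_server <= 0:
        raise ValueError("max_regions_per_server must be positive")
    # Integer ceiling division, avoids floating-point rounding.
    return -(-total_regions // max_regions_per_server)


if __name__ == "__main__":
    total = 25_000  # regions, as quoted in the service offer slide
    print(servers_needed(total, 1_000))  # limit the deck says was exceeded
    print(servers_needed(total, 200))    # hypothetical, stricter target
```

Tracking this ratio as a metric is one way to catch the kind of drift described on this slide before it turns into an outage.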
[Chart: Availability of services in 2017 (%), by month from June to November, for Hive, HBase, HDFS and YARN]
COMPELLING EVENT FOR CHANGE
ACTION PLAN
TECHNICAL POINTS
VENDOR SUPPORT
Expertise
- Diagnostic aid
- Improving infrastructure operations
- Metrics
- Tools
- Sizing / Scaling
- Improvement of application ops
Internal skill improvement
Communication
- Facilitate internal communication between teams
- Best practices
- Shared Diagnostics
- Facilitate dialogue with management
- Investment decisions
- Technical evolution decisions
RESOURCES ISOLATION / SHARING
Fine-grained management of YARN queues:
- Queues per business division
- Sub-queues per application or use case?
- Overbooking? Pre-emption?
Resource isolation:
- Node labels
- RegionServer groups
- Containerization of Elasticsearch/Solr
Resource protection mechanisms:
- HDFS quotas
- HBase quotas
- Kafka quotas
- ZooKeeper observers
Evaluate multi-tenancy to match business requirements and
enterprise strategy.
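The idea of one YARN queue per business division, with room for "overbooking", can be sketched as a small generator of Capacity Scheduler properties. This is a hedged illustration, not the configuration used at EDF: the queue names and percentages are hypothetical, and only the standard `yarn.scheduler.capacity.root.*` property names (`queues`, `capacity`, `maximum-capacity`) are relied upon.

```python
from typing import Dict, Tuple


def capacity_scheduler_properties(
    queues: Dict[str, Tuple[float, float]]
) -> Dict[str, str]:
    """Build Capacity Scheduler properties for top-level queues.

    queues maps a queue name to (guaranteed capacity %, maximum capacity %).
    A maximum above the guaranteed share lets a queue borrow idle resources,
    which is one way to read the "overbooking" question on the slide.
    """
    total = sum(capacity for capacity, _ in queues.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"guaranteed capacities must sum to 100, got {total}")

    props = {"yarn.scheduler.capacity.root.queues": ",".join(queues)}
    for name, (capacity, maximum) in queues.items():
        if not capacity <= maximum <= 100.0:
            raise ValueError(f"invalid maximum capacity for queue {name!r}")
        prefix = f"yarn.scheduler.capacity.root.{name}"
        props[f"{prefix}.capacity"] = f"{capacity:g}"
        props[f"{prefix}.maximum-capacity"] = f"{maximum:g}"
    return props


if __name__ == "__main__":
    # Hypothetical divisions and shares, for illustration only.
    example = capacity_scheduler_properties(
        {"trading": (40.0, 60.0), "retail": (40.0, 60.0), "shared": (20.0, 100.0)}
    )
    for key, value in example.items():
        print(f"{key}={value}")
```

Validating that guaranteed shares sum to 100 before deployment keeps allocation decisions explicit, which matters when several tenants compete for the same cluster.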
ACTION PLAN
CONFIGURATION & ARCHITECTURE
ACTION PLAN
SERVICES AVAILABILITY
ACTION PLAN
MONITORING – HBASE TABLE
ACTION PLAN
MONITORING – HBASE REGION SERVER
ACTION PLAN
MONITORING – KAFKA
ACTION PLAN
MONITORING – HBASE REPLICATION (EVOLUTION OF BAD ROWS)
ACTION PLAN
ORGANIZATION
COMMUNICATION
Build and lead enterprise communities
- Templates / Git, …
- Feedback sharing
Improve cross-team communication
Share development guides
Create shared metrics
- Dashboards
- Health Check
- Application Status
ACTION PLAN
ORGANIZATION
MANAGEMENT
Reorganization of Application Management :
- Centralized team
- Global vision of jobs
- Centralization of application logs
- Tools
CAPACITY PLANNING
Set up an « industrial » capacity planning process
- Definition of the right metrics to predict
- Tools
- Regular meetings to review the plan
SERVICE OFFER
Change billing to reflect actual usage
- Memory and CPU usage for HBase
- Number of regions
- Markup for small files
MANAGE EXPECTATIONS & ANTICIPATE
REQUIREMENTS
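A usage-based billing model along the lines listed above could be sketched as follows. Everything here is an assumption made for illustration: the field names, the rate structure and the sample numbers are hypothetical and do not come from the EDF price catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    storage_tb: float       # HDFS storage consumed
    vcores: int             # YARN vCores reserved
    hbase_memory_gb: float  # RegionServer memory attributed to the tenant
    hbase_vcores: int       # CPU attributed to the tenant's HBase workload
    hbase_regions: int      # number of HBase regions owned
    small_files: int        # files below some agreed size threshold


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices, per month."""
    per_tb: float
    per_vcore: float
    per_hbase_gb: float
    per_hbase_vcore: float
    per_region: float
    per_1000_small_files: float


def monthly_charge(usage: Usage, rates: Rates) -> float:
    """Sum the original units (TB, vCores) and the usage-based additions."""
    base = usage.storage_tb * rates.per_tb + usage.vcores * rates.per_vcore
    hbase = (
        usage.hbase_memory_gb * rates.per_hbase_gb
        + usage.hbase_vcores * rates.per_hbase_vcore
        + usage.hbase_regions * rates.per_region
    )
    small_files_markup = usage.small_files / 1000 * rates.per_1000_small_files
    return base + hbase + small_files_markup


if __name__ == "__main__":
    usage = Usage(10, 20, 64, 4, 500, 2000)
    rates = Rates(100, 10, 2, 10, 0.5, 5)
    print(monthly_charge(usage, rates))
```

Pricing regions and small files separately is what gives tenants an incentive to avoid the patterns that stress HBase and the HDFS NameNode, which storage and CPU pricing alone does not capture.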
ACTION PLAN
POSITIVE EFFECTS
[Chart: Availability of services (%), by month from June to February, for Hive, HBase, HDFS and YARN]
KEY TAKEAWAYS
Service offer & multi-tenancy must be managed globally
Communication & coordination at all levels are essential
Get help!
THANK YOU !
QUESTIONS

Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
Pravash Chandra Das
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
Shinana2
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
saastr
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Jeffrey Haguewood
 

Recently uploaded (20)

Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
 

Multi-tenant Hadoop - the challenge of maintaining high SLAs

  • 1. MULTI-TENANT HADOOP: THE CHALLENGE OF MAINTAINING HIGH SLAs
      Edouard ROUSSEAUX, EDF-DTEO-DSIT-ITO-DATACENTER, Big Data Tech Lead
  • 2. SUMMARY (Multi-tenant Hadoop - The challenge of maintaining high SLAs | April 2018)
      1. EDF CONTEXT: Presentation
      2. BIG DATA STRATEGY OF EDF-DSIT (IT PRODUCTION DEPARTMENT): History, Architecture choices, Big Data service offer
      3. CHALLENGES TO TAKE UP: Current state, Diagnostic, Focus
      4. ACTION PLAN: Technical points, Organizational points
  • 3. EDF CONTEXT
      A world leader in low-carbon energy, EDF Group brings together all the businesses of production, trading and electricity networks.
      EDF-DSIT provides IT services to support the Group in its digital transformation.
  • 4. BIG DATA STRATEGY OF EDF-DSIT: A SHARED DATALAKE
      ARCHITECTURE CHOICES
      Shared platform for all Group businesses:
      - Centralization of data
      - Economic efficiency
      - Sharing / cross-analyzing data between businesses
      - Simplified operations
      A specific platform per type of use case, in order to guarantee SLA, performance and flexibility:
      - Development / Pre-production
      - Production (mainly used as a backend for production applications)
      - Backup / Disaster Recovery
      - Analytics (soon)
      These architectural choices have a very strong impact on the performance of our infrastructures and applications.
      HISTORY OF BIG DATA AT EDF
      We started in 2012 with a first cluster (Hadoop v1):
      - The Trading division wanted to start cross-analyzing data
      - 4 recycled hosts
      … exponential growth …
      Today:
      - 3 physical environments (a 4th one soon)
      - We are using Hortonworks (HDP, HDF)
      - 200 hosts
      - HDFS ≈ 1.4 PB (usable space)
      - YARN ≈ 14.6 TB of RAM / 4600 vCores
      - HBase ≈ 8.2 TB of RAM
      - Biggest HBase table ≈ 90 TB (8k regions)
  • 5. BIG DATA STRATEGY OF EDF-DSIT: BIG DATA SERVICE OFFER
      AN OFFER THAT FOUND ITS AUDIENCE
      - 5 business divisions
      - 50 production applications
      - 10k+ YARN jobs per day (in production)
      - 250 HBase tables / 25k HBase regions
      - 150+ Hive databases
      - 500+ users
      - 24/7 applications
      - 1 HA application
      - All kinds of application types: batch (ELT), streaming / real-time, OLAP, OLTP, big volumes / small volumes, critical / non-critical applications
      « BIG DATA » SERVICE OFFER
      Business-oriented offer:
      - A price catalog
      - Very simple units of work: TB and vCores
      - Global SLA on the shared services available on our clusters (HDFS, HBase, Kafka, …)
      - Organization, processes, …
      « Self-service » consumption by the business units.
      The design of this service offer also has a strong impact on the performance of infrastructures and applications.
      OUR BIG DATA INFRASTRUCTURES HAVE BECOME ESSENTIAL TO OUR BUSINESSES
  • 6. CHALLENGES TO TAKE UP: AN OFFER « VICTIM » OF ITS OWN SUCCESS
      SITUATION LAST SUMMER
      Many critical use cases for the businesses.
      Hard time maintaining the expected level of service:
      - Instability / Unavailability
      - Difficulty communicating
      [Chart: Availability of services in 2017, in % (axis from 95 to 101), June to November, for Hive, HBase, HDFS and YARN]
  • 7. CHALLENGES TO TAKE UP: DIAGNOSTIC
      TECHNICAL ISSUES
      Improve the way we operate our shared Big Data infrastructures:
      - Insufficient metrics
      - Monitoring did not evolve as fast as the application types
      - Complex diagnostics
      - Sizing forecasts were not accurate
      Some applications were developed with anti-patterns.
      Lack of rigor when putting applications into production:
      - Scale-out and performance tests
      - No code review
      Internal billing based on storage and CPU (insufficient):
      - Does not accurately reflect cluster usage
      - Does not encourage a virtuous use of the clusters
      NEED FOR FURTHER TECHNICAL & PRODUCTION SUPPORT
  • 8. CHALLENGES TO TAKE UP: DIAGNOSTIC
      ORGANIZATIONAL ISSUES
      « Self-service » offer:
      - Partial control of what runs on the infrastructure
      - Lack of a global vision of cluster usage
      SLA not accurate enough:
      - No concept of degraded service
      - Business and operations teams do not perceive the service level in the same way
      Improvement of our production skills:
      - Shift in the type of skills required
      Shared infrastructure:
      - Governance challenge (change management, ...)
      - Business use cases impact each other
      - Accurate resource allocation is essential (according to needs)
      Capacity planning:
      - Not sufficiently detailed
      - Not easy to quantify the potential impact of business use cases (restricted vision / sizing)
      NEED FOR MORE COORDINATION BETWEEN TEAMS
  • 9. CHALLENGES TO TAKE UP: IN HINDSIGHT
      EVOLVING CLUSTER USAGE
      Until summer 2017: only « ELT » usage.
      New types of business use cases:
      - Intensive use of HBase
      - Huge volumes
      - Critical use cases
      Lack of anticipation:
      - Impact of those new use cases not qualified
      - More than 1000 regions per Region Server
      - HDFS saturation
      - HBase collapsed
      [Chart: Availability of services in 2017, in % (axis from 95 to 101), June to November, for Hive, HBase, HDFS and YARN]
      COMPELLING EVENT FOR CHANGE
  • 10. ACTION PLAN: TECHNICAL POINTS
      VENDOR SUPPORT
      Expertise:
      - Diagnostic aid
      - Improving the operational aspects of the infrastructure: metrics, tools, sizing / scaling
      - Improvement of application ops
      Internal skill improvement
      Communication:
      - Facilitate internal communication between teams: best practices, shared diagnostics
      - Facilitate dialogue with management: investment decisions, technical evolution decisions
      RESOURCES ISOLATION / SHARING
      Fine-grained management of YARN queues:
      - Queues per business division
      - Sub-queues per application or use case?
      - Overbooking? Pre-emption?
      Resource isolation:
      - Node labels
      - Region Server groups
      - Containerization of Elasticsearch / Solr
      Resource protection mechanisms:
      - HDFS quotas
      - HBase quotas
      - Kafka quotas
      - ZooKeeper observers
      Evaluate multi-tenancy to match business requirements and enterprise strategy.
  • 11. ACTION PLAN: CONFIGURATION & ARCHITECTURE
  • 12. ACTION PLAN: SERVICES AVAILABILITY
  • 13. ACTION PLAN: MONITORING – HBASE TABLE
  • 14. ACTION PLAN: MONITORING – HBASE REGION SERVER
  • 15. ACTION PLAN: MONITORING – KAFKA
  • 16. ACTION PLAN: MONITORING – HBASE REPLICATION (EVOLUTION OF BAD ROWS)
  • 17. ACTION PLAN: ORGANIZATION
      COMMUNICATION
      Build and lead enterprise communities:
      - Templates / Git, …
      - Feedback sharing
      Improve cross-team communication.
      Share development guides.
      Create shared metrics:
      - Dashboards
      - Health checks
      - Application status
  • 18. ACTION PLAN: ORGANIZATION
      MANAGEMENT
      Reorganization of application management:
      - Centralized team
      - Global vision of jobs
      - Centralization of application logs
      - Tools
      CAPACITY PLANNING
      Setup of an « industrial » capacity planning:
      - Definition of the right metrics to predict
      - Tools
      - Regular meetings to review the planning
      SERVICE OFFER
      Modification of billing to reflect actual usage:
      - Memory and CPU usage for HBase
      - Number of regions
      - Markup for small files
      MANAGE EXPECTATIONS & ANTICIPATE REQUIREMENTS
  • 19. ACTION PLAN: POSITIVE EFFECTS
      [Chart: Availability of services, in % (axis from 95 to 101), June to February, for Hive, HBase, HDFS and YARN]
  • 20. THINGS TO TAKE AWAY
      - Service offer & multi-tenancy must be managed globally
      - Communication & coordination at all levels are essential
      - Get help!
  • 21. THANK YOU! QUESTIONS?
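
The monthly availability percentages that the deck tracks per service can be easier to discuss with business teams when expressed as hours of downtime. The following is a minimal sketch of that arithmetic, not something taken from the slides: it assumes availability is measured over wall-clock time in a 30-day month (the deck does not say how it was measured), and the function name `downtime_hours` is purely illustrative.

```python
def downtime_hours(availability_pct: float, period_days: int = 30) -> float:
    """Hours of unavailability implied by an availability percentage.

    Assumption: availability is a fraction of wall-clock time over the
    period; planned maintenance windows are not treated separately.
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1.0 - availability_pct / 100.0) * period_days * 24.0


if __name__ == "__main__":
    # Sample inputs chosen for illustration only.
    for pct in (96.0, 99.5, 99.9):
        print(f"{pct}% availability -> {downtime_hours(pct):.1f} h down per 30 days")
```

Under these assumptions, a month at 96% availability corresponds to more than a full day of outage, while 99.9% corresponds to under an hour, which is one way to make the notion of "degraded service" concrete when refining an SLA.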