Testing high-availability
telecom-grade systems
By Attila Fekete
Test Automation Day 2013 - Rotterdam
Page ▪ 2
About me
János György Kemény
Neumann János
Puskás Tivadar
Page ▪ 3
About me
Hungary
Page ▪ 4
Agenda
1. Definition of High-Availability
2. Design for High-Availability
3. Maintain High-Availability
4. Final note
Definition of High-Availability
Page ▪ 6
Warning…
Page ▪ 7
What do we call a high-availability telecom-grade system?
What isn’t HA…
f@%#ed up
Page ▪ 8
What do we call high-availability telecom-grade system?
What isn’t HA…
“Good Enough for Us”
Page ▪ 9
What do we call high-availability telecom-grade system?
What isn’t HA…
f@%#ed up
You are fired
Page ▪ 10
What do we call high-availability telecom-grade system?
“Federal Standard 1037C and MIL-STD-188 define telecommunications availability as a ratio of the time a module can be used (if a use request existed) over a period of time. It is a ratio of uptime to total time.”
What HA is…
Page ▪ 11
What do we call high-availability telecom-grade system?
Scheduled downtime:
Any event initiated by Operation and
Maintenance personnel
Unscheduled downtime:
▪ Software failure
▪ Hardware failure
▪ Environmental anomaly
Types of downtime
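As a rough illustration of the two downtime categories, the sketch below computes availability as uptime over total time from a list of outage records. The `Outage` type, its field names, and the sample figures are hypothetical. Whether scheduled downtime counts against availability is a policy choice, so it is exposed as a flag here rather than decided.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    minutes: float
    scheduled: bool  # True if initiated by Operation and Maintenance personnel


def availability(outages, period_minutes, count_scheduled=True):
    """Return uptime / total time for the observation period."""
    down = sum(o.minutes for o in outages
               if count_scheduled or not o.scheduled)
    return (period_minutes - down) / period_minutes


YEAR_MINUTES = 365 * 24 * 60  # 525600 minutes in a non-leap year

# Hypothetical sample data, for illustration only
outages = [Outage(3.0, scheduled=True), Outage(2.0, scheduled=False)]
```

Calling `availability(outages, YEAR_MINUTES, count_scheduled=False)` considers only the unscheduled two minutes.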
Page ▪ 12
What do we call high-availability telecom-grade system?
Availability                  Downtime per year
90% ("one nine")              36.5 days
99% ("two nines")             3.65 days
99.9% ("three nines")         8.76 hours
99.99% ("four nines")         52.56 minutes
99.999% ("five nines")        5.26 minutes
99.9999% ("six nines")        31.5 seconds
99.99999% ("seven nines")     3.15 seconds
What HA is…
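The downtime figures follow directly from the availability percentage. A minimal sketch, assuming a 365-day year (the function and constant names are illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # non-leap year


def downtime_seconds_per_year(availability_percent):
    """Allowed downtime per year for a given availability in percent."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


for percent in (90, 99, 99.9, 99.99, 99.999, 99.9999, 99.99999):
    seconds = downtime_seconds_per_year(percent)
    print(f"{percent}%: {seconds:.2f} s = {seconds / 60:.2f} min")
```

Each additional nine divides the allowed downtime by ten.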
Page ▪ 13
What do we call high-availability telecom-grade system?
Events to be handled
▪ HW failures
▪ SW failures
▪ On-line reconfigurations
▪ Network connection problems
▪ Extreme load levels
▪ Natural disasters
What an HA system should cope with
Page ▪ 14
What do we call high-availability telecom-grade system?
Available all the time
▪ Literally no service unavailability
▪ Literally no data loss
  ▪ billing information
  ▪ user profiles
Characteristics - part 1
Page ▪ 15
What do we call high-availability telecom-grade system?
Online upgrade, patching, replacement
▪ Hardware
▪ Operating system
▪ Middle-ware
▪ Application
Characteristics - part 2
Page ▪ 16
What do we call high-availability telecom-grade system?
Ability to recover after
▪ SW crashes
▪ HW failures
▪ Overload situations
▪ Network outage
Stability
▪ Until taken out of service
Characteristics – part 3
Design for High-Availability
Page ▪ 18
No single point of failure
How to achieve High-Availability?
Page ▪ 19
Design for high-availability
Example
Page ▪ 20
Design for high-availability
Redundancy
▪ ISP connection to:
  ▪ its redundant peers
  ▪ any surrounding system
▪ Every piece of HW it is built from
▪ Every single SW component
▪ Relevant data
▪ Whole node / entity
What must be redundant?
Page ▪ 21
Design for high-availability
Redundancy
▪ Active/active
▪ Active/passive
▪ N+1
▪ N+M
▪ N-to-1
▪ N-to-N
Types of redundancy
Page ▪ 22
Design for high-availability
Active/Active
▪ All entities handle requests
▪ In case of failure, the traffic is taken over by the remaining entities
Types of redundancy
Page ▪ 23
Design for high-availability
Active/Passive
▪ Only one of them is online
▪ Failover node is brought online if the primary fails
Types of redundancy
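A minimal sketch of active/passive behaviour, assuming a single primary and a single standby. The class and node names are hypothetical, and failure detection is reduced to an explicit `report_failure` call:

```python
class ActivePassivePair:
    """Only one node is online; the standby takes over if the active one fails."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby
        self.healthy = {primary: True, standby: True}

    def report_failure(self, node):
        self.healthy[node] = False
        # Fail over only if the failed node was active and the standby is usable
        if node == self.active and self.healthy[self.standby]:
            self.active, self.standby = self.standby, self.active

    def handle(self, request):
        if not self.healthy[self.active]:
            raise RuntimeError("no healthy node available")
        return f"{self.active} handled {request}"
```

After `report_failure("node-a")` on a pair built from `"node-a"` and `"node-b"`, requests are served by `node-b`; a second failure leaves no node to serve them, which is why a pair alone protects against one fault at a time.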
Page ▪ 24
Design for high-availability
N+1
▪ Single extra failover node
▪ Also called roaming-spare
▪ Takes over the role of the failing one
Types of redundancy
Page ▪ 25
Design for high-availability
N+M
▪ More than one extra failover node
▪ To increase redundancy
Types of redundancy
Page ▪ 26
Design for high-availability
N-to-1
▪ Stand-by node becomes active temporarily
▪ Also called dedicated spare
▪ The same node becomes the failover node again after the original node is restored
Types of redundancy
Page ▪ 27
Design for high-availability
N-to-N
▪ Combination of N+M and Active/Active
▪ Load is redistributed among the remaining active nodes
Types of redundancy
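The N-to-N redistribution step can be sketched as follows. An even split of the failed node's load is an assumption made for illustration; a real system may weight the split by node capacity. The function name and the load units are hypothetical.

```python
def redistribute(load_by_node, failed):
    """Spread the failed node's load evenly over the remaining active nodes."""
    survivors = [node for node in load_by_node if node != failed]
    if not survivors:
        raise RuntimeError("no active node left")
    share = load_by_node[failed] / len(survivors)
    return {node: load_by_node[node] + share for node in survivors}
```

With three nodes carrying 30 units each, losing one leaves two nodes at 45 units, so every node needs enough headroom to absorb its share.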
Page ▪ 28
Design for high-availability
Recovery mechanisms:
▪ Process restart
▪ Processor board restart
▪ Cluster restart
Recovery time:
▪ Short (milliseconds..seconds..minutes)
Ability to recover
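One way to combine the recovery mechanisms is an escalation ladder that tries the cheapest action first. The slide lists the mechanisms but not their order of use, so the escalation order, the `recover` function, and the `try_action` callback below are assumptions:

```python
RECOVERY_LADDER = [
    "process restart",
    "processor board restart",
    "cluster restart",
]


def recover(try_action):
    """Try each mechanism in order; return the first one that succeeds, else None.

    try_action is a caller-supplied callable that performs the named action
    and returns True if the service is healthy afterwards.
    """
    for action in RECOVERY_LADDER:
        if try_action(action):
            return action
    return None
```

The smaller the scope of the action that succeeds, the shorter the recovery time tends to be.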
Maintain high-availability
Page ▪ 30
Verify and Maintain high-availability
Robustness, Recovery,…
[Figure: Test Harness, two instances each of Process A and Process B, two failure marks (X), and a New Process A]
Page ▪ 31
Verify and Maintain high-availability
Scenarios
▪ SW crashes
▪ HW component failures
▪ Node failures
▪ Connection problems
▪ Maintenance activities
Robustness, Recovery,… - scenarios
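The SW-crash scenario can be sketched as a fault-injection test: kill a process, then measure how long it takes until a replacement is running. The `Supervisor` class below is a toy stand-in for the system under test, and all names and the timeout are hypothetical; the real harness described in this talk is TTCN-3-based.

```python
import subprocess
import sys
import time

WORKER_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]


class Supervisor:
    """Toy system under test: restarts its worker process if it has died."""

    def __init__(self):
        self.worker = subprocess.Popen(WORKER_CMD)

    def ensure_running(self):
        if self.worker.poll() is not None:  # worker has exited
            self.worker = subprocess.Popen(WORKER_CMD)

    def stop(self):
        self.worker.kill()
        self.worker.wait()


def crash_test(supervisor, timeout=5.0):
    """Kill the worker; return the recovery time in seconds, or None on timeout."""
    old_worker = supervisor.worker
    old_worker.kill()  # inject the SW crash
    old_worker.wait()
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        supervisor.ensure_running()
        replaced = supervisor.worker is not old_worker
        if replaced and supervisor.worker.poll() is None:
            return time.monotonic() - start
        time.sleep(0.01)
    return None
```

A test would assert that `crash_test` returns a value below the recovery-time budget, then call `stop()` to clean up.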
Page ▪ 32
Verify and Maintain high-availability
Load and stability test environment
[Figure: SUT, MTAS, XDMS, and Simulators 1 to 15]
Page ▪ 33
Verify and Maintain high-availability
Type of stresses
▪ Few-hour overload situations (1.5x engineered load)
▪ One-hour heavy load (4x engineered load)
Load Test and Stability Test – level of stress
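The two stress levels translate into target traffic rates as sketched below. The profile names, the 3-hour duration chosen for "few-hour", and the example engineered rate are assumptions; only the 1.5x and 4x factors and the one-hour duration come from the slide.

```python
STRESS_PROFILES = {
    "overload": {"factor": 1.5, "hours": 3},  # "few-hour": 3 h is an assumed value
    "heavy": {"factor": 4.0, "hours": 1},
}


def target_rate(engineered_rate, profile):
    """Call set-ups per second to generate for the given stress profile."""
    return engineered_rate * STRESS_PROFILES[profile]["factor"]


def total_calls(engineered_rate, profile):
    """Total call set-ups generated over the whole run."""
    hours = STRESS_PROFILES[profile]["hours"]
    return target_rate(engineered_rate, profile) * hours * 3600
```

For a hypothetical engineered load of 1000 call set-ups per second, the heavy-load run targets 4000 per second for one hour.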
Page ▪ 34
Verify and Maintain high-availability
Specification
▪ Simulates several million subscribers (5-15 million)
▪ Simulates several thousand call set-ups per second (5000-6000) while handling ongoing sessions
▪ Simulates a large part of the telephony network
▪ Scalable
▪ Test harness is TTCN-3-based and developed in-house
Load Test and Stability Test - test environment
Page ▪ 35
Verify and Maintain high-availability
Maintain and improve availability
Shorter runs every night
Long runs during the weekend
Page ▪ 36
Final Note
Design for High-Availability
Maintain High-Availability
High-Availability System
Page ▪ 37
Do You Have Any Questions?
