Troubleshooting the Virtual Router - Run and Get
Diagnostics
CCC / ApacheCon NA Las Vegas 19
Boris Stoyanov
Software Engineer in Test
boris.stoyanov@shapeblue.com
twitter: @shapeblue
about me
•Break Stuff @ ShapeBlue
•Background:
•More than 10 years in Software Development
and Testing
•Specialize in:
•Test Management
•Automated Testing
•Testing Frameworks
•Joined ShapeBlue and CloudStack in 2016
•Recently invited to join PMC
Troubleshooting the Virtual Router - Run and Get
Diagnostics
The Virtual Router is essential to
the CloudStack networking model
Get Diagnostics
Allows you to execute commands and gather log and config files from
the VR and System VMs
Get Diagnostics - Defaults
VR – ‘diagnostics.data.router.defaults’ global setting
“[IPTABLES], [IFCONFIG], [ROUTE], /etc/dnsmasq.conf, /etc/resolv.conf, /etc/haproxy.conf,
/etc/hosts.conf, /etc/dnsmasq-resolv.conf, /var/log/cloud.log, /var/log/routerServiceMonitor.log,
/var/log/dnsmasq.log”
System VMs – ‘diagnostics.data.systemvm.defaults’ global setting
“[IPTABLES], [IFCONFIG], [ROUTE], /usr/local/cloud/systemvm/conf/agent.properties,
/usr/local/cloud/systemvm/conf/consoleproxy.properties, /var/log/cloud.log”
Get Diagnostics - Customs
You can retrieve custom files or execute custom scripts on the VR
• Custom scripts must be enclosed in brackets ‘[]’ and must be
located in ‘/opt/cloud/bin/’
• Files must be specified by their absolute path
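The two rules above can be sketched as a small validator. This is only an illustration of the bracket and absolute-path conventions from the slide; the function names `classify_entry` and `parse_entries` are hypothetical and are not part of CloudStack.

```python
from typing import List, Tuple

# Directory where scripts are expected to live on the system VM (from the slide).
SCRIPT_DIR = "/opt/cloud/bin/"


def classify_entry(entry: str) -> Tuple[str, str]:
    """Classify one entry as ("script", name) or ("file", absolute_path).

    Hypothetical helper: bracketed entries are treated as scripts,
    entries starting with "/" as files; anything else is rejected.
    """
    entry = entry.strip()
    if entry.startswith("[") and entry.endswith("]"):
        name = entry[1:-1].strip()
        # A script is referenced by name only; it is looked up in SCRIPT_DIR.
        if not name or "/" in name:
            raise ValueError("invalid script name: %r" % entry)
        return ("script", name)
    if entry.startswith("/"):
        return ("file", entry)
    raise ValueError("file entries need an absolute path: %r" % entry)


def parse_entries(setting: str) -> List[Tuple[str, str]]:
    """Split a comma-separated list and classify each non-empty entry."""
    return [classify_entry(e) for e in setting.split(",") if e.strip()]


if __name__ == "__main__":
    for kind, value in parse_entries("[IPTABLES], /etc/resolv.conf"):
        print(kind, value)
```

A relative path such as `etc/hosts` is rejected with a `ValueError`, which mirrors the requirement that files be given by absolute path.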
Get Diagnostics - Settings
diagnostics.data.gc.enable
Enables the garbage collector background task that deletes old files
from secondary storage. Requires a management server restart.
true/false
diagnostics.data.gc.interval
The interval, in seconds, at which the garbage collector background
task runs. Requires a management server restart.
3600
diagnostics.data.retrieval.timeout
Overall data retrieval timeout, in seconds.
86400 (1 day)
diagnostics.data.max.file.age
The maximum time, in seconds, a file can stay in secondary storage
before it is deleted.
86400
diagnostics.data.disable.threshold
The secondary storage disk utilisation threshold for file retrieval
(0.9 = 90%). Used to find a secondary storage with enough free space;
an exception is thrown if no suitable secondary store is found.
0.9
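How the file-age and threshold settings interact can be illustrated with a minimal sketch. It is not the CloudStack implementation: the helper names, the tuple layout for a store, and the strict comparisons are all assumptions made for illustration; only the two values come from the slide.

```python
from typing import Iterable, Tuple

MAX_FILE_AGE = 86400      # diagnostics.data.max.file.age, in seconds
DISABLE_THRESHOLD = 0.9   # diagnostics.data.disable.threshold, as a fraction


def is_expired(created_at: float, now: float, max_age: int = MAX_FILE_AGE) -> bool:
    """Return True when a file has been stored for longer than max_age seconds."""
    return now - created_at > max_age


def pick_store(stores: Iterable[Tuple[str, int, int]],
               threshold: float = DISABLE_THRESHOLD) -> str:
    """Return the first store whose utilisation is below the threshold.

    Each store is a hypothetical (name, used_bytes, total_bytes) tuple.
    Raises RuntimeError when no store qualifies, mirroring the exception
    mentioned in the setting description.
    """
    for name, used, total in stores:
        if total > 0 and used / total < threshold:
            return name
    raise RuntimeError("no secondary storage below the utilisation threshold")


if __name__ == "__main__":
    print(pick_store([("store-a", 95, 100), ("store-b", 50, 100)]))
    print(is_expired(created_at=0, now=90000))
```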
Get Diagnostics - more info
• PR has been submitted: 3350
• https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+G
Get Diagnostics - limitations
• For successful file retrieval, the operator must specify the correct
absolute path of the file to be retrieved.
• A running SSVM is required to generate the public file download
URL.
• The zone where the target VM is running must have at least one
secondary storage with disk utilisation below 90%.
• Only system VMs are supported.
• Any script executed as part of this API must be present on the
system VM under the directory ‘/opt/cloud/bin/’.
• The API response is only a download URL.
Demo
Run Diagnostics
Allows the admin to execute network utility commands remotely on
any System VM (VR, SSVM, CPVM)
Run Diagnostics - supported
commands
• ping - test whether a destination is reachable from the System VM
• traceroute - check the path and transit hops to a destination
• arping - test whether a destination is reachable through a specific NIC
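A request for one of the three commands above might be assembled as in the sketch below. The API command name `runDiagnostics` and the parameter names `targetid`, `type`, `ipaddress` and `params` are assumptions based on my reading of the feature, so verify them against the API reference for your CloudStack version. Request signing and the HTTP call are deliberately left out, and the helper name is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Commands listed on the slide.
SUPPORTED_COMMANDS = ("ping", "traceroute", "arping")


def build_run_diagnostics_query(target_id: str, command: str, destination: str,
                                extra_params: Optional[str] = None) -> str:
    """Build an unsigned query string for a Run Diagnostics request.

    Parameter names are assumed, not confirmed by the slides.
    """
    if command not in SUPPORTED_COMMANDS:
        raise ValueError("unsupported command: %r" % command)
    params = {
        "command": "runDiagnostics",  # assumed API command name
        "targetid": target_id,        # ID of the VR, SSVM or CPVM
        "type": command,
        "ipaddress": destination,
    }
    if extra_params:
        params["params"] = extra_params  # extra arguments for the utility
    # Sort keys so the output is deterministic.
    return urlencode(sorted(params.items()))


if __name__ == "__main__":
    print(build_run_diagnostics_query("vm-uuid", "ping", "8.8.8.8"))
```

Rejecting anything outside the supported list on the client side keeps a typo from turning into a failed API call.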
Run Diagnostics - more info
• PR 2833 - Merged in 4.12
• Feature Spec:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+Remote+Diagnostics+API
• Blog: https://www.shapeblue.com/troubleshooting-cloudstack-virtual-routers/
Demo
Credits
• Dingane Hlaluku, Rohit Yadav and the ShapeBlue Dev Team
• ACS Community – Code reviews and further testing
We’re hiring!
https://www.shapeblue.com/careers/
Q&A
