Troubleshooting CloudStack
Rajesh Battala, Likitha Shetty & Sailaja Mada
Wednesday, December 18, 2013
Troubleshooting Apache Cloudstack by Sailaja, Rajesh, and Likitha

Troubleshooting Apache Cloudstack

  1. Troubleshooting CloudStack. Rajesh Battala, Likitha Shetty & Sailaja Mada. Wednesday, December 18, 2013
  2. Agenda
     ACS developer
       - ACS error codes
       - Debugging tips in ACS development
       - SSVM troubleshooting
       - ACS ports
     ACS Cloud Admin
       - Install, Configuration & Deployment
       - Log analysis
       - Important Global Config Parameters
       - Best Practices
       - Cloud Database
       - Reusing Hypervisors
     References
     Q&A
  3. ACS Developer
  4. ACS error codes
     - Client error codes
         public static final int MALFORMED_PARAMETER_ERROR = 430;
         public static final int PARAM_ERROR = 431;
         public static final int UNSUPPORTED_ACTION_ERROR = 432;
         public static final int PAGE_LIMIT_EXCEED = 433;
     - Server error codes
         public static final int INTERNAL_ERROR = 530;
         public static final int ACCOUNT_ERROR = 531;
         public static final int ACCOUNT_RESOURCE_LIMIT_ERROR = 532;
         public static final int INSUFFICIENT_CAPACITY_ERROR = 533;
         public static final int RESOURCE_UNAVAILABLE_ERROR = 534;
         public static final int RESOURCE_ALLOCATION_ERROR = 534;
         public static final int RESOURCE_IN_USE_ERROR = 536;
         public static final int NETWORK_RULE_CONFLICT_ERROR = 537;
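When an API call fails, the first question is usually whether the code points at the request or at the server. A minimal sketch of that triage, assuming only the two numeric ranges listed on the slide (430 to 433 and 530 to 537); `acs_error_class` is a hypothetical helper name, not part of CloudStack:

```sh
#!/bin/sh
# Classify a CloudStack API error code using the ranges from the slide.
# acs_error_class is an illustrative helper, not a CloudStack command.
acs_error_class() {
    code=$1
    # Reject empty or non-numeric input before doing integer comparisons.
    case $code in
        ''|*[!0-9]*) echo unknown; return 0 ;;
    esac
    if [ "$code" -ge 430 ] && [ "$code" -le 433 ]; then
        echo client      # malformed or invalid request parameters
    elif [ "$code" -ge 530 ] && [ "$code" -le 537 ]; then
        echo server      # capacity, resource or internal errors
    else
        echo unknown
    fi
}
```

For example, `acs_error_class 431` prints `client`, which suggests checking the request parameters before digging into the management server log.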
  5. Debugging tips in CS development
     - Generally, use Eclipse to attach a debugger to the management server
     - SystemVM agents
         - kill the running process
         - add -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8787 to /usr/local/cloud/systemvm/_run.sh
         - open port 8787
         - start the java process: ./run.sh
     - Usage
         - To check whether events are being logged, check usage_events in the cloud DB
         - To start the usage server in a dev setup: mvn -pl usage -Drun -Dpid=$$
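The SystemVM agent steps are easy to get subtly wrong, for example by adding the debug option twice. A small sketch, assuming the option string from the slide; `jdwp_opts` and `has_jdwp` are hypothetical helper names:

```sh
#!/bin/sh
# Build the JDWP option string from the slide for a given port (default 8787).
jdwp_opts() {
    port=${1:-8787}
    echo "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=$port"
}

# Succeed if the given launcher script already mentions a JDWP agent,
# so the option is not added a second time.
has_jdwp() {
    grep -q -e '-Xrunjdwp' "$1"
}
```

A typical use would be to run `has_jdwp` against the agent launcher script before editing it, then paste the output of `jdwp_opts` into the java command line by hand.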
  6. SSVM troubleshooting
     - Login: ssh -i /root/.ssh/id_rsa.cloud -p 3922 root@ip, where ip is the link-local address on XenServer and the private IP on VMware
     - Script to check the health of the SSVM: /usr/local/cloud/systemvm/ssvm-check.sh
     - Check that port 8250 is open
     - Check that the global configuration value 'host' is correctly set to the management server IP
     - Check agent status: service cloud status
     - Logs can be found at /var/log/cloud/cloud.log
     - Template status can be found in the template_store_ref DB table
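The port 8250 check can be scripted from inside the SSVM. A sketch under the assumption that an `nc` supporting `-z` (probe only) and `-w` (timeout in seconds) is installed; `port_open` and `check_agent_port` are hypothetical helper names:

```sh
#!/bin/sh
# Probe a TCP port without sending data.
# Assumes an nc that supports -z and -w; returns non-zero if nc is missing.
port_open() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# Report whether the management server accepts agent connections on 8250.
check_agent_port() {
    mgmt=$1
    if port_open "$mgmt" 8250; then
        echo "8250 reachable on $mgmt"
    else
        echo "8250 NOT reachable on $mgmt"
    fi
}
```

Run it with the address stored in the 'host' global setting; a "NOT reachable" result points at either a wrong 'host' value or a firewall between the SSVM and the management server.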
  7. And a couple more …
     - DB Encryption
         To decrypt the database secret key, use the following:
         java -classpath /usr/share/java/cloud-jasypt-1.8.jar org.jasypt.intf.cli.JasyptPBEStringDecryptionCLI decrypt.sh input=<encryptedValue> password=<secretKey> verbose=false
         (where secretKey is the value in the /etc/cloudstack/management/key file)
     - GUI timeout
         - Default timeout is 15 minutes
         - To increase the timeout, edit /usr/share/cloud/management/webapps/client/WEB-INF/web.xml to add
             <session-config>
               <session-timeout>60</session-timeout>
             </session-config>
         - Restart the server
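Before restarting the server it helps to confirm that the edit to web.xml took effect. A minimal sketch, assuming the opening tag, value and closing tag of session-timeout sit on one line; `session_timeout` is a hypothetical helper name:

```sh
#!/bin/sh
# Print the session timeout (in minutes) configured in a web.xml file.
# Prints nothing if no <session-timeout> element is found on a single line.
session_timeout() {
    sed -n 's#.*<session-timeout> *\([0-9][0-9]*\) *</session-timeout>.*#\1#p' "$1"
}
```

Calling `session_timeout` on the edited web.xml should print the new value; empty output means the element is missing or split across lines.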
  8. ACS Ports
     - Management Server
         - 8080: Primary GUI / Authentication API port
         - 8096: User/Client Management Server (unauthenticated)
         - 8787: CloudStack (Tomcat) debug socket
         - 9090: CloudStack Management Cluster Interface
     - SystemVM Agent
         - 3922: SystemVM to Management (secure)
         - 8250: SystemVM to Management (unsecure)
     - MySQL Server
         - 3306: MySQL Server
     - Hypervisor
         - 22/443: XenServer
         - 22: KVM
         - 443: vCenter
     - 7080: AWS API server
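When reading a list of listening or blocked ports, a lookup keyed on the port number saves going back to the slide. The sketch below encodes only the ports listed above; `acs_port_role` and `annotate_ports` are hypothetical helper names:

```sh
#!/bin/sh
# Map a port number to its role, using only the ports named on the slide.
acs_port_role() {
    case $1 in
        8080) echo "Management Server: primary GUI / authentication API" ;;
        8096) echo "Management Server: user/client API (unauthenticated)" ;;
        8787) echo "Management Server: Tomcat debug socket" ;;
        9090) echo "Management Server: cluster interface" ;;
        3922) echo "SystemVM agent: SystemVM to Management (secure)" ;;
        8250) echo "SystemVM agent: SystemVM to Management (unsecure)" ;;
        3306) echo "MySQL Server" ;;
        22)   echo "Hypervisor: XenServer, KVM" ;;
        443)  echo "Hypervisor: XenServer, vCenter" ;;
        7080) echo "AWS API server" ;;
        *)    echo "unknown" ;;
    esac
}

# Read one port number per line on stdin and print it with its role.
annotate_ports() {
    while read -r port; do
        echo "$port $(acs_port_role "$port")"
    done
}
```

Feeding it a column of port numbers, one per line, labels each port that CloudStack is expected to use and marks the rest as unknown.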
  9. ACS Administrator
  10. ACS Administrator
      - Install, Configuration & Deployment
      - Log analysis
      - Important Global Config Parameters
      - Best Practices
      - Reuse of Hypervisors
      - Cloud database
  11. Install, Configuration & Deployment Issues
      ? Failed to log in to the ACS Management server
          - 4.2 requires a minimum of 2 GB RAM
          - Redeploy the DB and start cloudstack-setup-management
      ? Issue with instances in an isolated network
          - VLAN trunking in the switch port configuration
      ? Failed to deploy instances
          - Insufficient resources: analyze the management server log
  12. Install, Configuration & Deployment Issues
      ? Failed to add host
          - XCP host: copy the echo plugin
          - Host license
          - Compatible host while creating the cluster of hosts
      ? Host/Storage pool in avoid set
          - Reachability issues
          - Timeout
          - Capacity of the storage pool / host
          - Alert state
      ? Move XS hosts from Alert state
          - Unmanage the cluster with the affected host.
          - Clear the host tags of the affected host:
            xe host-param-clear param-name=tags uuid=<UUID of affected host>
          - Manage the cluster with the affected host.
  13. Install, Configuration & Deployment Issues
      ? Host in Alert state
          - Monitor host root disk usage
      ? Host/Storage pool in avoid set
          - Reachability issues
          - Timeout
          - Capacity of the storage pool / host
          - Alert state
      ? Move XS hosts from Alert state
          - Unmanage the cluster with the affected host.
          - Clear the host tags of the affected host:
            xe host-param-clear param-name=tags uuid=<UUID of affected host>
          - Manage the cluster with the affected host.
  14. Logs
      - Management Server logs
          - /var/log/cloudstack/management/management-server.log
          - /var/log/cloudstack/management/apilog.log
      - SSVM
          - /var/log/cloud/cloud.out
      - KVM CloudStack agent
          - /var/log/cloudstack/agent/agent.log
      - vSphere logs
          - /var/log/hostd.log (host log)
          - /var/log/vmkernel.log (kernel log)
          - /var/log/vpxa.log (agent log)
      - XenServer logs
          - /var/log/SMlog
          - /var/log/xensource.log
      - /etc/cloudstack/management/log4j-cloud.xml
          - Set the priority to TRACE
          - Levels: FATAL, ERROR, WARN, INFO, DEBUG, TRACE
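A quick per-level count is a useful first pass over any of these log4j-style logs before reading them line by line. A rough sketch, assuming each entry carries its level as a whole word; `count_levels` is a hypothetical helper name, and a message that merely mentions a level name is counted too:

```sh
#!/bin/sh
# Count lines per log level in a log4j-style log file.
# Matching is by whole word, so a message that contains the word ERROR
# at another level is counted as well; treat the numbers as a rough guide.
count_levels() {
    for level in FATAL ERROR WARN INFO DEBUG TRACE; do
        # grep -c prints 0 and exits non-zero when nothing matches.
        n=$(grep -c -w -- "$level" "$1" || true)
        echo "$level $n"
    done
}
```

Running `count_levels` on the management server log before and after reproducing a failure shows quickly whether new ERROR or WARN entries appeared.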
  15. Global Config Parameters
      Parameter (value shown on the slide): description
      - expunge.delay (60): Determines how long (in seconds) to wait before actually expunging a destroyed VM. The default value equals the default value of expunge.interval.
      - expunge.interval (60): The interval (in seconds) to wait before running the expunge thread.
      - expunge.workers (1): Number of workers performing expunge.
      - network.gc.interval (600): Seconds to wait before checking for networks to shut down.
      - network.gc.wait (600): Time (in seconds) to wait before shutting down a network that is not in use.
      - pool.storage.allocated.capacity.disablethreshold (1): Percentage (as a value between 0 and 1) of allocated storage utilization above which allocators stop using the pool because too little allocated storage is available.
      - secstorage.allowed.internal.sites: Comma-separated list of CIDRs internal to the datacenter that can host template download servers. Note that 0.0.0.0 is not a valid site.
      - wait (1800): Time in seconds to wait for control commands to return.
      - vmware.vcenter.session.timeout (12000): VMware client timeout in seconds.
      - integration.api.port (8096): Default API port.
      - storage.cleanup.interval (86400): The interval (in seconds) to wait before running the storage cleanup thread.
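The disable threshold is a ratio check: allocated capacity divided by total capacity, compared with the configured fraction. A sketch of that arithmetic, assuming both capacities are in the same unit; `pool_over_threshold` is a hypothetical helper name and 0.85 in the usage note is only an example value:

```sh
#!/bin/sh
# Succeed (exit 0) if allocated/total is above the given threshold.
# Arguments: allocated capacity, total capacity, threshold between 0 and 1.
# A total of 0 is treated as "not over threshold" to avoid dividing by zero.
pool_over_threshold() {
    awk -v a="$1" -v t="$2" -v th="$3" \
        'BEGIN { exit !(t > 0 && a / t > th) }'
}
```

For instance, `pool_over_threshold 90 100 0.85` succeeds, meaning a pool with 90 of 100 units allocated would be skipped by the allocators at a threshold of 0.85.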
  16. Best Practices
      - Switch port configurations (VLANs must be trunked).
      - Restrict the IP addresses that can access storage, to avoid data loss.
      - Monitor host disk space.
      - All hosts must be 64-bit and must support HVM (Intel-VT or AMD-V enabled). All hosts within a cluster must be homogeneous.
      - The volumes used for primary and secondary storage should be accessible from the Management Server and the hypervisors. These volumes should allow root users to read and write data. They must be for the exclusive use of CloudStack and should not contain any other data.
      - With Advanced Networking, separate subnets must be used for private and public networks.
      - The Management Servers communicate with the XenServers on ports 22 (SSH) and 80 (HTTP).
      - The Management Servers communicate with VMware vCenter servers on port 443 (HTTPS).
      - The Management Servers communicate with the KVM servers on port 22 (SSH).
  17. Reusing Hypervisors
      XenServer
        • xe vm-uninstall --multiple --force
        • Unmount storage
        • xe vif-unplug uuid=<uuid>
        • xe vif-destroy uuid=<uuid>
        • xe network-destroy uuid=<cloud link-local uuid>
        • sh /opt/xensource/bin/cloud-clean-vlan.sh
        • Disable cloud tags created on the host
      VMware
        • Delete all instances
        • Delete templates
        • Unmount datastores
        • Remove all cloud networks
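On a XenServer host with many leftover VIFs, the unplug and destroy steps have to be repeated per UUID. A dry-run sketch that only prints the commands, assuming the UUIDs arrive as one comma-separated string (the format that `xe vif-list --minimal` produces); `vif_cleanup_cmds` is a hypothetical helper name:

```sh
#!/bin/sh
# Print, without executing, the unplug and destroy commands for each VIF.
# Argument: comma-separated list of VIF UUIDs.
vif_cleanup_cmds() {
    printf '%s\n' "$1" | tr ',' '\n' | while read -r uuid; do
        # Skip empty entries, e.g. when the list itself is empty.
        [ -n "$uuid" ] || continue
        echo "xe vif-unplug uuid=$uuid"
        echo "xe vif-destroy uuid=$uuid"
    done
}
```

Printing the commands first keeps the cleanup reviewable; the output can be inspected and then run by hand on the host being recycled.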
  18. Cloud Database
      - op_dc_vnet_alloc
      - op_dc_ip_address_alloc
      - user_ip_address
      - image_store
      - vm_template
      - template_store_ref
      - volume
      - storage_pool
      - host
      - vm_instance
      - nics
      - network_offering
      - physical_network_traffic_types
  19. Troubleshooting CloudStack
  20. References
      o https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM%2C+templates%2C+Secondary+storage+troubleshooting
      o https://cwiki.apache.org/confluence/display/CLOUDSTACK/Ports+used+by+CloudStack
      o http://dlafferty.blogspot.in/2013/08/using-cloudstacks-log-files-xenserver.html
  21. Get Involved
      Web: http://cloudstack.apache.org/
      Mailing Lists: cloudstack.apache.org/mailing-lists.html
      IRC: irc.freenode.net:6667 #cloudstack
      Twitter: @cloudstack
      LinkedIn: www.linkedin.com/groups/CloudStack-Users-Group-3144859
      If it didn’t happen on the mailing list, it didn’t happen.
