Aarna networks debugging oom failures webinar

Slides that go along with the Webinar conducted as part of the ONAP University on May-21-2018. The slide deck describes the issues we faced with ONAP Amsterdam release, and how we worked around the issues.

Published in: Technology

  1. Debugging OOM Failures: An ONAP Webinar © 2018 Aarna Networks, Inc.
  2. Some Introductions
     Vivekanandan Muthukrishnan (Vivek)
       ● Over 17 years of experience in OpenStack, storage, network, and server virtualization
     Amar Kapadia
       ● Author of “Understanding OPNFV”
       ● NFV specialist with a background in OpenStack, Ceph, and dataplane acceleration technologies
  3. Goals
     To create a lab environment for:
       A) ONAP training
       B) Development purposes
  4. Requirements/Solution
     ● Requirements:
       ○ Light footprint
       ○ Self-service or instructor-provided infrastructure for online and in-class users
     ● Solution:
       ○ A laptop was ruled out because ONAP is quite heavy
       ○ So we went with a Google Cloud Platform (GCP) VM
       ○ GCP offers both options: self-service or instructor-provided (ssh)
       ○ Considered but set aside public OpenStack clouds: they are more complicated and regional in nature (though easy to add on for specific needs)
       ○ Finally decided to create an ONAP+OPNFV all-in-one VM
  5. GCP VM Architecture
     [Architecture diagram; recoverable labels: 192.168.x.x, SOCKS5]
  6. Our Approach
     ● Used OOM to get a lighter footprint
       ○ Amsterdam release
       ○ Rancher to deploy Kubernetes
       ○ Scripts to allow users to deploy a subset of ONAP projects vs. all projects
     ● Used OPNFV for OpenStack
       ○ Euphrates release
       ○ Apex installer
       ○ Basic scenario: no HA, no SDN, no features
  7. OOM Issues Summary
     Issues
       ● AAI containers failed to transition to Running state
       ● SDC UI fails to load
       ● SDC Service Distribution Error
       ● VID Service Deployment Error
       ● VID ADD VNF Error
       ● SDNC user creation failed
       ● Robot init_robot failed with missing attributes
     Tools Used
       ● Healthchecks
       ● Robot logs
       ● Container logs
     Outcomes
       ● Single-command workaround scripts
  8. OOM Issues & Workaround
     ● AAI containers failed to transition to Running state
       ○ NAMESPACE NAME READY STATUS RESTARTS AGE
       ○ onap aai-service-5698ddc455-s25dd 0/1 Init:0/1 33 6h
     ● Workaround: bounce AAI containers
       ○ cd oom/kubernetes/oneclick/
       ○ source ./setenv.bash
       ○ ./deleteAll.bash -n onap -a aai
       ○ ./createAll.bash -n onap -a aai
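
The AAI symptom and workaround on slide 8 can be sketched in Python. This is a minimal sketch under stated assumptions, not the presenters' script: `find_unready_pods`, `oneclick_bounce_command`, and `bounce` are hypothetical helper names, and the parser assumes the six-column pod listing shown on the slide. The bash commands it assembles are the ones listed on the slide.

```python
import subprocess


def find_unready_pods(listing: str) -> list[tuple[str, str, str]]:
    """Return (namespace, pod, status) for rows whose READY count is short.

    Assumes the column layout shown on the slide:
    NAMESPACE NAME READY STATUS RESTARTS AGE
    """
    unready = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] == "NAMESPACE":
            continue  # skip the header and anything that is not a pod row
        namespace, name, ready, status = fields[:4]
        ready_count, _, total = ready.partition("/")
        if ready_count != total:
            unready.append((namespace, name, status))
    return unready


def oneclick_bounce_command(app: str, namespace: str = "onap") -> str:
    """Build the delete-then-create sequence from the slide as one bash line."""
    return " && ".join([
        "cd oom/kubernetes/oneclick/",
        "source ./setenv.bash",
        f"./deleteAll.bash -n {namespace} -a {app}",
        f"./createAll.bash -n {namespace} -a {app}",
    ])


def bounce(app: str, namespace: str = "onap") -> None:
    """Run the sequence in a single bash process so setenv.bash stays in effect."""
    subprocess.run(["bash", "-c", oneclick_bounce_command(app, namespace)], check=True)
```

Running everything inside one `bash -c` call matters here because `source ./setenv.bash` only affects the shell that executes it. Note that the parser would also flag pods that finished normally with a short READY count, so its output is a starting point for inspection rather than a verdict.
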
  9. OOM Issues & Workaround
     ● SDC UI fails to load
       ○ SDC UI Failed to load
       ○ Status code : 503 Service Unavailable
     ● Workaround
       ○ Bounce SDC containers in this specific order: SDC → CS → BE → FE
       ○ Wait for each container to come up before bouncing the next dependent container
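
The "bounce in order, wait for each" rule on slide 9 can be sketched as a small loop. This is an illustrative sketch only: `bounce_in_order` is a hypothetical name, the `bounce` and `is_running` callables are left to the caller (for example, wrappers around `kubectl delete pod` and a pod status check), and the timeout and polling values are assumptions rather than figures from the webinar.

```python
import time
from typing import Callable, Sequence


def bounce_in_order(
    components: Sequence[str],
    bounce: Callable[[str], None],
    is_running: Callable[[str], bool],
    timeout_s: float = 600.0,
    poll_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Bounce each component in turn, waiting for it before touching the next."""
    for component in components:
        bounce(component)
        waited = 0.0
        while not is_running(component):
            if waited >= timeout_s:
                raise TimeoutError(f"{component} did not come back within {timeout_s}s")
            sleep(poll_s)
            waited += poll_s


# Order taken from the slide; these are the slide's labels, not actual pod names.
SDC_BOUNCE_ORDER = ["SDC", "CS", "BE", "FE"]
```

Passing the two callables in keeps the ordering logic separate from how a container is actually restarted, so the same loop can be reused for the other "bounce in order" workarounds.
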
  10. OOM Issues & Workaround
      ● SDC Service Distribution Error
        ○ Error Code : POL5000
        ○ Status Code : 500
      ● Workaround
        ○ Bounce global-kafka container
        ○ Bounce dmaap & dmaap-listener containers
        ○ Bounce SDC containers in order
        ○ Bounce the vnc-portal & portalwidgets containers
  11. OOM Issues & Workaround
      ● VID Service Deployment Error
        ○ "requestType": "createInstance"
        ○ "requestStatus": "SVC3001Resource not found...
      ● Workaround
        ○ Bounce model-loader-service
        ○ Run Robot scripts init_robot & init
  12. OOM Issues & Workaround
      ● VID ADD VNF Error
        ○ Error : SVC0002 400 Bad request Error
      ● Workaround
        ○ Bounce model-loader-service
        ○ Bounce mso container
        ○ Bounce SDC containers in order
        ○ Execute Robot init_robot & init
        ○ Redistribute the service
  13. OOM Issues & Workaround
      ● SDNC user creation failed
        ○ Database connection issue
      ● Workaround
        ○ Bounce sdnc container
        ○ Bounce sdnc-portal container
  14. OOM Issues & Workaround
      ● Robot init_robot failed with missing attributes
        ○ Missing attribute GLOBAL_INJECTED_UBUNTU_1604_IMAGE
        ○ Missing attribute GLOBAL_INJECTED_UBUNTU_1404_IMAGE
      ● Workaround
        ○ Update file /dockerdata-nfs/onap/robot/eteshare/config/vm_properties.py
        ○ Add these lines after the GLOBAL_INJECTED_SDNC_PORTAL_IP_ADDR line
          ■ GLOBAL_INJECTED_UBUNTU_1604_IMAGE = "xenial"
          ■ GLOBAL_INJECTED_UBUNTU_1404_IMAGE = "trusty"
        ○ Add these JSON attributes under GLOBAL_INJECTED_PROPERTIES
          ■ "GLOBAL_INJECTED_UBUNTU_1604_IMAGE" : "xenial"
          ■ "GLOBAL_INJECTED_UBUNTU_1404_IMAGE" : "trusty"
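
The manual edit on slide 14 could be scripted roughly as follows. This is a sketch under assumptions: `add_image_attributes` and `patch_file` are hypothetical names, and the code assumes that vm_properties.py holds one assignment per line plus a `GLOBAL_INJECTED_PROPERTIES` dictionary with one entry per line, which the slide implies but does not show. Check the real file layout before relying on it.

```python
import re
from pathlib import Path

# Attribute names and values as given on the slide.
NEW_IMAGES = [
    ("GLOBAL_INJECTED_UBUNTU_1604_IMAGE", "xenial"),
    ("GLOBAL_INJECTED_UBUNTU_1404_IMAGE", "trusty"),
]
ANCHOR = "GLOBAL_INJECTED_SDNC_PORTAL_IP_ADDR"


def add_image_attributes(source: str) -> str:
    """Insert the missing image attributes after both occurrences of the anchor."""
    out = []
    for line in source.splitlines():
        stripped = line.strip()
        indent = line[: len(line) - len(line.lstrip())]
        if stripped.startswith(ANCHOR):
            # Top-level assignment of the anchor attribute.
            out.append(line)
            for name, value in NEW_IMAGES:
                if not re.search(rf"^\s*{name}\s*=", source, re.MULTILINE):
                    out.append(f'{indent}{name} = "{value}"')
        elif stripped.startswith(f'"{ANCHOR}"'):
            # Dictionary entry for the anchor; make sure it ends with a comma.
            out.append(line if line.rstrip().endswith(",") else line.rstrip() + ",")
            for name, value in NEW_IMAGES:
                if f'"{name}"' not in source:
                    out.append(f'{indent}"{name}" : "{value}",')
        else:
            out.append(line)
    return "\n".join(out) + ("\n" if source.endswith("\n") else "")


def patch_file(path: str) -> None:
    """Rewrite the properties file in place with the attributes added."""
    target = Path(path)
    target.write_text(add_image_attributes(target.read_text()))
```

The function skips attributes that are already present, so running it twice leaves the file unchanged. It would be called as `patch_file("/dockerdata-nfs/onap/robot/eteshare/config/vm_properties.py")`, ideally after taking a backup copy.
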
  15. OOM Debugging Approaches
      ● Check SDC BE & FE health status
        ○ curl -s -X GET "http://<ONAP SERVER IP>:30205/sdc2/rest/healthCheck" | python -m json.tool
        ○ curl -s -X GET "http://<ONAP SERVER IP>:30206/sdc1/rest/healthCheck" | python -m json.tool
      ● Check Robot container logs
        ○ Robot portal URL = http://<ONAP SERVER IP>:30209/logs/demo/
        ○ username/password = test/test
      ● Connect to the target container & check its logs (folder /var/log/onap)
        ○ kubectl exec -it -n {NAME SPACE} {CONTAINER ID} -- /bin/bash
        ○ kubectl describe pod -n {NAME SPACE} {CONTAINER ID}
      ● Bouncing a failed container
        ○ kubectl delete pod -n {NAME SPACE} {CONTAINER ID}
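
The two SDC health checks on slide 15 can also be run from Python instead of curl. This is a minimal sketch: the ports and paths come from the slide, while `SDC_HEALTHCHECKS`, `healthcheck_url`, `pretty`, and `fetch_healthcheck` are hypothetical names and the 10-second timeout is an assumption. It makes no assumption about the fields in the JSON response and only pretty-prints it.

```python
import json
import urllib.request

# Ports and paths as listed on the slide.
SDC_HEALTHCHECKS = {
    "BE": (30205, "/sdc2/rest/healthCheck"),
    "FE": (30206, "/sdc1/rest/healthCheck"),
}


def healthcheck_url(server_ip: str, component: str) -> str:
    """Build the health-check URL for the SDC back end or front end."""
    port, path = SDC_HEALTHCHECKS[component]
    return f"http://{server_ip}:{port}{path}"


def pretty(json_text: str) -> str:
    """Indent JSON for reading, similar to piping through `python -m json.tool`."""
    return json.dumps(json.loads(json_text), indent=4)


def fetch_healthcheck(server_ip: str, component: str, timeout_s: float = 10.0) -> str:
    """Fetch one health check and return the indented JSON body."""
    url = healthcheck_url(server_ip, component)
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return pretty(response.read().decode("utf-8"))
```

With `urllib`, an error status such as the 503 described for the SDC UI is raised as `urllib.error.HTTPError` rather than returned, so a caller would typically wrap `fetch_healthcheck` in a `try` block.
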
  16. Debugging Outcome
      ● Created easy-to-use scripts that address each issue
      ● Training participants are given instructions on which workaround to run in a particular error situation
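
The "which workaround for which error" guidance from slide 16 can be pictured as a lookup table. This is only an illustration of the idea, not the scripts handed to participants: `WORKAROUNDS` and `suggest_workaround` are hypothetical names, and the table covers just the three issues for which the slides give an error code.

```python
# Error codes and steps copied from slides 10 to 12.
WORKAROUNDS = {
    "POL5000": [  # SDC Service Distribution Error
        "Bounce global-kafka container",
        "Bounce dmaap & dmaap-listener containers",
        "Bounce SDC containers in order",
        "Bounce the vnc-portal & portalwidgets containers",
    ],
    "SVC3001": [  # VID Service Deployment Error
        "Bounce model-loader-service",
        "Run Robot scripts init_robot & init",
    ],
    "SVC0002": [  # VID ADD VNF Error
        "Bounce model-loader-service",
        "Bounce mso container",
        "Bounce SDC containers in order",
        "Execute Robot init_robot & init",
        "Redistribute the service",
    ],
}


def suggest_workaround(error_text: str) -> list[str]:
    """Return the steps for the first known error code found in the text."""
    for code, steps in WORKAROUNDS.items():
        if code in error_text:
            return steps
    return []  # unknown error: fall back to health checks and container logs


if __name__ == "__main__":
    for step in suggest_workaround("Error Code : POL5000"):
        print(step)
```
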
  17. About Aarna
      ONAP-related products, training, and professional services
      Founded in 2017 by serial entrepreneurs (NFVI/VIM experts)
      Located in San Jose, CA & Bangalore, India
      60 participants trained in 8 sessions
      2 development distros & 4 packaged services
  18. Thank You!
      www.aarnanetworks.com
      vmuthukrishnan@aarnanetworks.com
      akapadia@aarnanetworks.com