
Anurag Awasthi - Machine Learning applications for CloudStack


While machine learning and data mining have had a profound impact on how we model applications and use data for better product consumption, there is scope for extending prediction algorithms to lower levels as well. Some useful applications of machine learning in ACS could be exploring better resource allocation that is aware of usage statistics, predicting faults, load balancing, etc. In this talk we will:
* take a broad overview of what machine learning/data mining is and how it is being used in today's tech ecosystem
* explore ways in which we can make ACS more efficient
* discuss some recent advancements from the research community in how ML can benefit datacenters

Published in: Technology

  1. 1. Applications of ML/AI in CloudStack Anurag Awasthi Software Engineer at ShapeBlue anurag.awasthi@shapeblue.com
  2. 2. $whoami ● Software Engineer @ ShapeBlue and contributor to Apache CloudStack ○ Things I have worked on - CloudStack feature development, KVM, VR. ○ Relatively new to ACS ● Formerly at Twitter, PocketGems, Microsoft Research. ○ Diverse experiences - Backend, Web, iOS, Android, Machine Learning. ● Loves programming (github.com/anuragaw), dogs and trekking
  3. 3. Introduction ● Aim of this talk ○ Introduce, explore, ignite a debate, demystify. ○ Don’t aim to make Data scientists but to propose ML in ACS ● Machine learning toolbox ○ Common tools, examples ● Current Scenario ● Possible use cases in ACS ● Conclusion
  4. 4. Machine learning toolbox ● What is Machine Learning? ○ Data, data and more data ○ Classification ○ Prediction ○ Underlying loss function: Optimization ● Supervised vs Unsupervised ● Common models ○ Support Vector Machines ○ Decision trees ○ Neural networks ○ Logistic regression ○ Deep learning
  5. 5. Machine learning toolbox ● What is Machine Learning? ○ Data, data and more data ○ Classification ○ Prediction ○ Underlying loss function: Optimization ● Supervised vs Unsupervised ● Common models ○ Support Vector Machines ○ Decision trees ○ Neural networks ○ Logistic regression ○ Deep learning Y = m * X + c
  6. 6. Machine learning toolbox ● What happens when the number of dimensions grows?
       X1   X2   X3   X4   X5   X6   Y
       0    1    2    3    4    5    -12
       2    9    7    9    11   0    32
       5    4    0    22   -1   9    21
       ...  ...  ...  ...  ...  ...  ...
  7. 7. Machine learning toolbox ●
  8. 8. Machine learning toolbox : SVM ● Support vector machines
  9. 9. Machine learning toolbox : Decision Trees http://jcsites.juniata.edu/faculty/rhodes/ida/decisionTrees.html
  10. 10. Machine learning toolbox : Decision Trees http://jcsites.juniata.edu/faculty/rhodes/ida/decisionTrees.html
  11. 11. Machine learning toolbox : Neural Networks Variants: ANN, CNN, RNN, DNN etc https://towardsdatascience.com/understanding-neural-networks-from-neuron-to-rnn-cnn-and-deep-learning-cd88e90e0a90
  12. 12. Machine learning toolbox : Neural Networks Multi Layer Perceptron- https://www.3blue1brown.com/
  13. 13. Current scenario ● Automated control system implemented at Google to cool its data centers autonomously ● Claimed energy savings of 30% ● Methodology: ○ PUE = Power to facility / Power of IT Equipment ○ According to the Uptime Institute, the typical data center has an average PUE of 2.5 ○ The input features (DC input variables) included the IT load, weather conditions, number of chillers and cooling towers running, equipment setpoints, etc. ○ "Using the machine learning framework developed in this paper, we are able to predict DC PUE within 0.004 +/- 0.005, approximately 0.4 percent error for a PUE of 1.1" https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42542.pdf
  14. 14. Current scenario ● Facebook uses ML in data center operations and job scheduling. ● Other companies - IBM, Huawei and HPE (which has written a white paper about ML reducing downtime) ● Litbit has developed the first AI-powered data center operator, Dac. ○ Dac will be able to use strategically placed Internet of Things (IoT) sensors to detect loose electric wires in the server rooms or water leakages in a cooling system.
  15. 15. Current scenario ● AI-based DCIM tools are on the rise as well. By gathering raw IoT data, centralizing it, and using AI algorithms to identify patterns, they generate actionable information that gives operators clear visibility into IT and infrastructure asset behavior, along with recommended corrective actions. E.g. Hewlett Packard’s InfoSight. ● Maya HTT's data center infrastructure management software, Datacenter Clarity LC, uses AI-powered tools to analyze individual servers to detect anomalies and opportunities for optimization.
  16. 16. Possible use cases in ACS ● Time for ACS to evolve? ● Don’t need ML scientists in community but need some awareness. ● Perhaps start a dialogue in...
  17. 17. Possible use cases in Apache CloudStack ● Energy aware CloudStack ● Load balancing techniques ○ Network traffic congestion can be avoided ● VM host placement & migration ○ More aware VM deployment ● Volume host placement & migration ○ More available storage ● Host failure prediction and maintenance ○ Trigger migrations in case of failures and notify admins
  18. 18. Possible use cases in ACS ● Smarter Router ○ AI-driven security tools injected into VRs ○ Predict failures and self-heal, rather than being purely service driven ● Log analysis for recommendations ○ Generated logs can provide meaningful information for possible healing actions ○ Can help some, if not all, cases to self-heal ● Architecture-wise, separate plugins leveraging existing open source ML tools can be used
  19. 19. Conclusion ● Challenges: ○ Community needs to be aware of future trends ○ Data: ■ Where is the data? (Some effort in logging host level performance) ■ How to manage data? ● Next Steps: ○ Open a dialogue on users@ and dev@ to gather opinions ○ Add API support for non-ACS users/experts to help train models, export data
  20. 20. Thoughts/Questions? Thank you! (anurag.awasthi@shapeblue.com, github: @anuragaw)
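
The toolbox slides mention an underlying loss function, optimization, and the line Y = m * X + c. As a minimal sketch of how those pieces fit together, the Python below fits m and c by gradient descent on a mean squared error loss. The function names, the toy data, the learning rate, and the step count are all assumptions made for illustration; none of them come from the talk.

```python
def mse_loss(m, c, xs, ys):
    """Mean squared error of the line y = m*x + c on the data."""
    return sum((m * x + c - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, learning_rate=0.01, steps=5000):
    """Fit y = m*x + c by plain gradient descent on the MSE loss."""
    m, c = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to m and c.
        grad_m = sum(2 * (m * x + c - y) * x for x, y in zip(xs, ys)) / n
        grad_c = sum(2 * (m * x + c - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the loss.
        m -= learning_rate * grad_m
        c -= learning_rate * grad_c
    return m, c


# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

m, c = fit_line(xs, ys)
print(f"m={m:.3f}, c={c:.3f}, loss={mse_loss(m, c, xs, ys):.6f}")
```

With more input columns (X1 to X6 on slide 6), the same idea applies with one weight per column instead of a single m; in practice an existing library would likely be used rather than a hand-written loop.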

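
The PUE definition on slide 13 (power to facility divided by power of IT equipment) and the quoted prediction error are easy to check with a few lines of arithmetic. The sketch below is only illustrative: the function names and the kilowatt figures are hypothetical, while the PUE values 2.5 and 1.1 and the 0.004 error are the ones quoted on the slide.

```python
def pue(facility_power_kw, it_power_kw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_power_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return facility_power_kw / it_power_kw


def overhead_fraction(pue_value):
    """Share of facility power that does not reach IT equipment."""
    return 1.0 - 1.0 / pue_value


def relative_error_percent(abs_error, value):
    """Absolute error expressed as a percentage of the value."""
    return 100.0 * abs_error / value


# Hypothetical facility drawing 2500 kW to run 1000 kW of IT load.
typical = pue(2500.0, 1000.0)
print(f"PUE: {typical}")
print(f"Overhead share at that PUE: {overhead_fraction(typical):.0%}")

# Slide figures: mean error of 0.004 on a PUE of 1.1.
print(f"Relative error: {relative_error_percent(0.004, 1.1):.2f}%")
```

The last line comes out a little under 0.4 percent, which is consistent with the "approximately 0.4 percent" wording in the quoted paper.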