A Configuration Crawler for Cloud Appliances
Michael Menzel, Markus Klems, Hoang Anh-Le, Stefan Tai

eOrganization Research Group
Karlsruhe Institute of Technology (KIT)
March 27, 2013, International Conference on Cloud Engineering (IC2E)
Agenda
1. Foundations, Motivation & Existing Work
2. Method: A Configuration Crawler
3. Validation: Implementation for AWS EC2
4. Conclusion & Outlook

Cloud Appliances and Configuration Meta-Data

MOTIVATION & FOUNDATIONS


Cloud Appliances in Compute IaaS*
• Differently configured Virtual Machine Images
  – Operating System only: the VM Image contains just the Operating System
  – Full/Partial Software Stack: the VM Image adds Libraries, Software Platforms, and Executables & Data on top of the Operating System

* Infrastructure as a Service (IaaS)

Appliances in Today’s Public Clouds
• Not all Providers offer Appliances
[Figure: Rackspace, GoGrid, and AWS EC2 arranged between Simple VM Images (Centralized Packaging), Both, and Cloud Appliances (Decentralized Packaging)]

• Engaged Users create many Appliances
[Figure: Top 3 public AMI owners in US-East-1, April 13, 2012]

Meta-Data on Cloud Appliances
• There is Meta-Data, but not on Configuration




• Crawling needed to gain more information
Applications
• Interoperability: Convert Appliances to
  Configuration Management Manifests

• Decision Support: Consider Configuration
  Data in Virtual Machine Selection

• Statistics: Aggregate Configuration Data


Existing Work
• Meta-Data bundled with VM Image Files [1]

• Configuration Mgmt. to upgrade Appliances [2]

• Chef Ohai and Puppet Facter to collect installed
  libraries in Systems
       – For most Operating Systems
       – For most Package Managers

[1] D. Lutterkort and M. McLoughlin, “Manageable virtual appliances,” Linux Symposium, 2007.
[2] R. Filepp, L. Shwartz, C. Ward, R. Kearney, K. Cheng, C. Young, and Y. Ghosheh, “Image selection as a service for cloud computing environments,” in Service-Oriented Computing and Applications (SOCA), 2010 IEEE International Conference on, Dec. 2010, pp. 1–8.

A METHOD FOR CRAWLING
VIRTUAL APPLIANCE CONFIGURATIONS

Method for Configuration Crawling
• Procedure Model for Crawling Virtual Appliance Configurations
[Figure: procedure model diagram; legend: Parameter Input, Operation, Data Artifact]

Discovering
• Retrieve Meta-Data via Compute Cloud API
• Filter out ineligible Virtual Appliances
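
The discovery step above can be sketched as a filter over image meta-data. This is a minimal illustration and not the authors' Ruby Discoverer: the `ImageMeta` fields, the eligibility rules, and the sample catalog are all hypothetical, and a real run would fill the catalog from the provider's compute API instead of a hard-coded list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageMeta:
    """Hypothetical subset of the meta-data a compute cloud API returns."""
    image_id: str
    owner: str
    platform: str  # e.g. "linux" or "windows"
    public: bool


def discover(images, blacklist, supported_platforms=("linux",)):
    """Return the images that are eligible for crawling."""
    eligible = []
    for image in images:
        if image.image_id in blacklist:
            continue  # explicitly excluded
        if not image.public:
            continue  # cannot be instantiated by the crawler
        if image.platform not in supported_platforms:
            continue  # no agent available for this platform (assumed rule)
        eligible.append(image)
    return eligible


# Hard-coded stand-in for an API response.
catalog = [
    ImageMeta("img-001", "owner-a", "linux", True),
    ImageMeta("img-002", "owner-b", "windows", True),
    ImageMeta("img-003", "owner-a", "linux", False),
    ImageMeta("img-004", "owner-c", "linux", True),
]
eligible = discover(catalog, blacklist={"img-004"})
print([image.image_id for image in eligible])
```

Keeping the filter rules in one function makes it easy to add further exclusion criteria without touching the retrieval code.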




Crawling Configuration Data
• Split Function allows parallel processing
• Instantiate & Crawl multiple Virtual Appliances in parallel
• Leverage configuration mgmt. Agents to detect configuration
• Collect configuration meta-data from the started Appliance Instance

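
The split-and-crawl-in-parallel idea can be sketched roughly as follows. The function names are hypothetical, and `crawl_appliance` is only a placeholder for starting an instance, running a configuration management agent on it, collecting the agent's output, and terminating the instance; no cloud API is called here.

```python
from concurrent.futures import ThreadPoolExecutor


def split(image_ids, n_chunks):
    """Distribute image IDs round-robin over n_chunks work lists."""
    return [image_ids[i::n_chunks] for i in range(n_chunks)]


def crawl_appliance(image_id):
    """Placeholder: a real crawler would instantiate the appliance,
    run the agent, and return the configuration data it reports."""
    return {"image_id": image_id, "packages": {}}


def crawl_chunk(chunk):
    """Crawl the appliances of one work list sequentially."""
    return [crawl_appliance(image_id) for image_id in chunk]


def crawl_all(image_ids, n_workers=3):
    """Crawl all work lists in parallel and merge the results."""
    chunks = split(image_ids, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results_per_chunk = list(pool.map(crawl_chunk, chunks))
    return [record for chunk in results_per_chunk for record in chunk]


records = crawl_all(["img-001", "img-002", "img-003", "img-004", "img-005"])
print(sorted(record["image_id"] for record in records))
```

Threads are sufficient in this sketch because a crawler of this kind would spend most of its time waiting on remote instances rather than computing.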
Data Persistence
• Centralized storage of crawled configuration meta-data

• A persistent, centralized data store enables reuse of the data in several applications

Data Model
• Centralized storage of configuration meta-data needs a common schema

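
The data model diagram from this slide is not preserved, so the following is only a guess at what a common record shape could look like. The class and field names are hypothetical and do not come from the authors' model. The record is serialized to JSON because the implementation described later stores JSON data.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Package:
    """One installed software package (hypothetical fields)."""
    name: str
    version: str


@dataclass
class ApplianceConfiguration:
    """One crawled appliance (hypothetical fields)."""
    image_id: str
    provider: str
    operating_system: str
    packages: list = field(default_factory=list)


record = ApplianceConfiguration(
    image_id="img-001",
    provider="example-cloud",
    operating_system="ubuntu-12.04",
    packages=[Package("ruby", "1.9.3"), Package("nginx", "1.2.1")],
)

# asdict() converts nested dataclasses too, so the result is plain JSON data.
document = json.dumps(asdict(record), sort_keys=True)
print(document)
```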
Application: Decision Support
• Employ Config. Meta-Data in Requirement
  Definitions for Appliance Selections
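
As a rough sketch of how configuration meta-data could feed a requirement definition, the example below selects appliances by operating system and minimum package versions. The record layout, the requirement format, and the sample data are assumptions made for illustration, and the version comparison only handles purely numeric, dot-separated versions.

```python
def parse_version(text):
    """Turn "1.9.3" into (1, 9, 3); numeric dot-separated versions only."""
    return tuple(int(part) for part in text.split("."))


def satisfies(config, requirements):
    """Check one appliance configuration against a requirement definition."""
    if "os" in requirements and config["os"] != requirements["os"]:
        return False
    for name, minimum in requirements.get("packages", {}).items():
        installed = config["packages"].get(name)
        if installed is None:
            return False  # required package is missing
        if parse_version(installed) < parse_version(minimum):
            return False  # installed version is too old
    return True


# Hypothetical crawled meta-data, keyed by image ID.
appliances = {
    "img-001": {"os": "ubuntu", "packages": {"ruby": "1.9.3", "nginx": "1.2.1"}},
    "img-002": {"os": "ubuntu", "packages": {"ruby": "1.8.7"}},
    "img-003": {"os": "centos", "packages": {"ruby": "1.9.3"}},
}
requirements = {"os": "ubuntu", "packages": {"ruby": "1.9.0"}}

matches = [
    image_id
    for image_id, config in appliances.items()
    if satisfies(config, requirements)
]
print(matches)
```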




Application: Interoperability
• Generate Manifests from Config. Meta-Data
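
A manifest generator could look roughly like this. The sketch assumes the crawled meta-data reduces to a mapping from package name to version and emits Puppet `package` resources pinned to those versions; the function name and sample data are hypothetical, and a complete manifest would also need to cover services, files, and package sources.

```python
def to_puppet_manifest(packages):
    """Render {name: version} as Puppet package resources, sorted by name."""
    resources = []
    for name in sorted(packages):
        resources.append(
            f"package {{ '{name}':\n  ensure => '{packages[name]}',\n}}"
        )
    return "\n\n".join(resources) + "\n"


manifest = to_puppet_manifest({"nginx": "1.2.1", "ruby": "1.9.3"})
print(manifest)
```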




VALIDATION WITH
PROOF-OF-CONCEPT

Implementation for AWS EC2 [3]
• Ruby Discoverer with filter & blacklist

• Ruby Crawler on EC2 Instances injects Chef Ohai [4] into instantiated Appliances
    – Ohai requires Ruby
    – Intermediate Result Collection to AWS S3

• Crawling one Appliance takes 21 min. on average and costs 1 EC2 instance-hour

• MongoDB stores the JSON Data; a copy on Google App Engine serves the Web App
[3] Available at http://github.com/myownthemepark/ami-crawler
[4] http://wiki.opscode.com/display/chef/Ohai

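
The two per-appliance figures above (21 minutes on average, one EC2 instance-hour) allow a back-of-the-envelope estimate for a larger crawl. The sketch below is only arithmetic: the appliance count and the number of parallel crawlers are hypothetical inputs, every appliance is assumed to take exactly the average time, and crawlers are assumed to work in lockstep rounds.

```python
import math


def estimate_crawl(n_appliances, parallel_crawlers=1,
                   minutes_per_appliance=21, instance_hours_per_appliance=1):
    """Estimate billed instance-hours and wall-clock hours for a crawl."""
    rounds = math.ceil(n_appliances / parallel_crawlers)
    return {
        "instance_hours": n_appliances * instance_hours_per_appliance,
        "wall_clock_hours": rounds * minutes_per_appliance / 60,
    }


# Hypothetical crawl: 1000 appliances, 10 crawlers working in parallel.
estimate = estimate_crawl(1000, parallel_crawlers=10)
print(estimate)
```

Under these assumptions, parallel crawling shortens the wall-clock time but leaves the number of billed instance-hours unchanged.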
Find it online!
You can find the Crawler Database as a Web App at myownthemepark.com
... enhancing it continuously.

CONCLUSION & OUTLOOK


Conclusion
• Crawling Configuration Data of Cloud
  Appliances is feasible
  – Proposed a procedure and data model
  – Validated the approach with a Proof-of-Concept


• Several Applications for collected
  Configuration Meta-Data of Appliances
  – Configuration Manifests for Interoperability
  – Statistics and Decision Support

Outlook
• Extend implementation with support for more
  Cloud compute services

• Use Crawler Data in Decision Support
  Frameworks for Web Applications (e.g.,
  CloudGenius [5])



[5] M. Menzel and R. Ranjan, “CloudGenius: Decision Support for Web Server Cloud Migration,” in Proceedings of the 21st International Conference on World Wide Web. New York, NY, USA: ACM, 2012.

Discussion on the findings

THANK YOU!
TIME FOR QUESTIONS AND COMMENTS

Contact Me

For Questions, Discussions,
or Initiating Research Exchange:
Michael Menzel
Karlsruhe Institute of Technology (KIT)
Englerstr. 11
76131 Karlsruhe



Email: menzel@kit.edu
More slides

BACKUP


Related Work
• Security Analysis:
  – T. Garfinkel and M. Rosenblum, “A virtual machine introspection based architecture for intrusion detection,” in NDSS, 2003.

• Configuration Management:
  – R. Filepp, L. Shwartz, C. Ward, R. Kearney, K. Cheng, C. Young, and Y. Ghosheh, “Image selection as a service for cloud computing environments,” in Service-Oriented Computing and Applications (SOCA), 2010 IEEE International Conference on, Dec. 2010, pp. 1–8.
  – K. Magoutis, M. Devarakonda, N. Joukov, and N. G. Vogl, “Galapagos: Model-driven discovery of end-to-end application-storage relationships in distributed systems,” IBM Journal of Research and Development, vol. 52, no. 4.5, pp. 367–377, July 2008.
  – IBM, “Tivoli application dependency discovery manager,” http://www-01.ibm.com/software/tivoli/products/taddm/, accessed 25th April 2012.
  – A. V. Dastjerdi, S. G. H. Tabatabaei, and R. Buyya, “An Effective Architecture for Automated Appliance Management System Applying Ontology-Based Cloud Discovery,” in Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, IEEE Computer Society. IEEE, 2010, pp. 104–112.

• Meta-Data in VM Image Files:
  – D. Lutterkort and M. McLoughlin, “Manageable virtual appliances,” Linux Symposium, 2007.

Appliances in Today’s Public Clouds
• Cloud Appliances
  – Centralized Packaging
  – Decentralized Packaging
• Simple VM Images
  – Centralized Packaging

Appliances in AWS’s Public Cloud
• Amazon accounts for >50,000 AMIs, growing daily

• AMIs differ in multiple attributes, including their software configuration

AWS AMIs in Regions

AWS Decentralized AMI Creation

Full Procedure Model
