Pivotal HD as a Cloud Foundry Service

Published in: Technology, News & Politics
  • As a user of Cloud Foundry, you’re probably aware of its openness as a way to avoid vendor lock-in. Just as fundamental is the notion that the Cloud Foundry PaaS itself can be extended to enhance its value proposition.
  • It’s through exactly this extensibility that Pivotal CF allows Hadoop to exist as a complementary service for application developers. With this “enhanced” PaaS, the application developer gets more than hosting support, domain names, and single-node services like Postgres or MySQL: he or she can now leverage large Hadoop clusters for analytics. But how exactly does this work? What is this extensibility we’re talking about?
  • At the core of this extensibility is the communication between the Cloud Controller and a Service Broker, whose responsibility is to negotiate service capabilities on behalf of the other nodes comprising the service, whatever that service may be. This communication establishes three exchanges. Catalog Management: declaring what service is available and which variants of it CF administrators can request (think shared MySQL servers versus dedicated MySQL servers). Provisioning: the act of reserving resources on the cluster. Binding: the act of enabling particular apps to access the cluster. What’s key to note about this communication is the flexibility of the protocol, which allows provisioning to be service-defined. This is going to be critical as we start to look at what it means to treat Hadoop as a service. For a complex, distributed service like Hadoop, there are many possible configurations and use cases for how a cluster exists in an enterprise. We’ll start with what’s likely the most accessible and straightforward approach to provisioning Hadoop.
  • Our first of many variants of Hadoop-as-a-service consists of a shared, static HDFS cluster that is BOSH-deployed, along with the service broker, on the same infrastructure as your Cloud Foundry PaaS.
  • In this model, provision requests received by the Service Broker are propagated to the various sub-components of the cluster.
  • Ultimately, the act of provisioning will have reserved resources on each of the Hadoop components. For example, on HDFS, some amount of space will have been reserved on the filesystem; and with HAWQ, a database will have been created to house SQL data. The ensuing bind requests will allow apps to gain access to the HDFS subfolder, to that HAWQ database, and so on.
  • Shared cluster that is BOSH deployed side-by-side with your CF.
  • A stepping stone to dynamic MapReduce queries: the ability, through a simple API, to spin up a cluster, send the MapReduce job, execute it, return the analysis, and tear down the cluster.
  • Pivotal HD as a Cloud Foundry Service
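
The three exchanges described in the notes above (catalog, provisioning, binding) and the way a provision request fans out to the Hadoop sub-components can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the actual PHD Service Broker: the class names, the `p-hd` service name, the `shared` plan, and the credential keys are all assumptions made up for the example, and a real broker would expose these operations over HTTP to the Cloud Controller rather than as plain method calls.

```python
from dataclasses import dataclass, field


@dataclass
class HdfsComponent:
    """Hypothetical stand-in for HDFS: reserves one subfolder per instance."""
    folders: dict = field(default_factory=dict)

    def provision(self, instance_id):
        self.folders[instance_id] = f"/cf/{instance_id}"

    def bind(self, instance_id, app_id):
        # A real component would also grant app_id access to the folder.
        return {"hdfs_path": self.folders[instance_id]}


@dataclass
class HawqComponent:
    """Hypothetical stand-in for HAWQ: creates one database per instance."""
    databases: dict = field(default_factory=dict)

    def provision(self, instance_id):
        self.databases[instance_id] = f"db_{instance_id}"

    def bind(self, instance_id, app_id):
        return {"hawq_database": self.databases[instance_id]}


class PhdServiceBroker:
    """Illustrative broker that negotiates on behalf of the cluster components."""

    def __init__(self, components):
        self.components = components
        self.bindings = {}

    def catalog(self):
        # Catalog management: declare the service and its variants (plans).
        # The names here are assumptions, not the real catalog entries.
        return {"services": [{"name": "p-hd", "plans": [{"name": "shared"}]}]}

    def provision(self, instance_id):
        # Provisioning: propagate the request to every sub-component.
        for component in self.components:
            component.provision(instance_id)

    def bind(self, instance_id, app_id):
        # Binding: collect the access details each component hands back.
        credentials = {}
        for component in self.components:
            credentials.update(component.bind(instance_id, app_id))
        self.bindings[(instance_id, app_id)] = credentials
        return {"credentials": credentials}


broker = PhdServiceBroker([HdfsComponent(), HawqComponent()])
broker.provision("inst-1")
credentials = broker.bind("inst-1", "app-42")["credentials"]
print(credentials)
```

Because provisioning is service-defined, other variants (exclusive clusters, dynamic provisioning) would change only what each component's `provision` does, while the catalog, provision, and bind exchanges keep the same shape.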

    2. Pivotal Hadoop on Cloud Foundry
       • Ashwin Kumar, Pivotal
    3. Apache Hadoop
       • Open source software for reliable, scalable, distributed computing
       • HDFS - A distributed file system for “large” I/O
       • YARN - A framework for resource scheduling and management
       • MapReduce - Popular paradigm for parallel batch processing
    4. Pivotal Hadoop
       • Enterprise grade Hadoop distribution
         • Cluster Management and Monitoring
         • Bulk Data Loader
         • Extensions for Virtualization
       • Advanced Database Services
         • World’s Fastest SQL on Hadoop
         • 100% SQL Compliance
    5. Pivotal HD for Cloud Foundry
       • Provision Hadoop resources to power data-centric Cloud Foundry Apps
         • Park unstructured data on HDFS.
         • Execute batch processing via MapReduce.
         • Perform deep, complex analytics in SQL using HAWQ.
    6. Extensibility in PaaS (diagram: Pivotal CF)
    7. Extensibility in PaaS (diagram: Pivotal CF, Pivotal HD)
    8. Cloud Foundry Service API (diagram: Pivotal CF with Cloud Controller; Pivotal HD with PHD Service Broker)
    9. Pivotal HD Service (diagram: PHD Service Broker; HDFS, Hive, YARN, HBase, HAWQ, ZooKeeper; nine Slave Nodes)
    10. Pivotal HD Service (diagram: PHD Service Broker; HDFS, Hive, YARN, HBase, HAWQ, ZooKeeper; nine Slave Nodes)
    11. Pivotal HD Service (diagram: PHD Service Broker; HDFS, Hive, YARN, HBase, HAWQ, ZooKeeper; nine Slave Nodes)
    12. Pivotal HD Service: Shared Clusters; Bare Metal Installs; Negotiation across Multiple Clusters; Exclusive Clusters; Dynamic Provisioning
    13. Pivotal HD Service: Shared Clusters; Bare Metal Installs; Negotiation across Multiple Clusters; Exclusive Clusters; Dynamic Provisioning
    14. Pivotal HD Service (diagram: PHD Service Broker; HDFS, Hive, YARN, HBase, HAWQ, ZooKeeper; nine Slave Nodes)
    15. Pivotal HD Service: Shared Clusters; Bare Metal Installs; Negotiation across Multiple Clusters; Exclusive Clusters; Dynamic Provisioning
    16. Pivotal HD Service (diagram: PHD Service Broker; three clusters, each with HDFS, YARN, HAWQ)
    17. Pivotal HD Service (diagram: PHD Service Broker; three clusters, each with HDFS, YARN, HAWQ)
    18. Pivotal HD Service: Shared Clusters; Bare Metal Installs; Negotiation across Multiple Clusters; Exclusive Clusters; Dynamic Provisioning
    19. Pivotal HD Service (diagram: PHD Service Broker; three clusters, each with HDFS, YARN, HAWQ)
    20. Pivotal HD Service (diagram: PHD Service Broker; three clusters, each with HDFS, YARN, HAWQ)
    21. Pivotal HD Service: Shared Clusters; Bare Metal Installs; Negotiation across Multiple Clusters; Exclusive Clusters; Dynamic Provisioning
    22. Pivotal HD Service (diagram: PHD Service Broker; two clusters, each with HDFS, YARN, HAWQ)
    23. Pivotal HD Service (diagram: PHD Service Broker; three clusters, each with HDFS, YARN, HAWQ)
    24. Conclusion
        • Pivotal HD
          • Deployable by BOSH
          • Exposed as a Cloud Foundry service
        • Data intensive apps are coming
        • Only possible through extensibility of Cloud Foundry
