Transcript of "Unit i introduction to grid computing"
UNIT 1: INTRODUCTION TO GRID COMPUTING
DEFINITION Grid computing is the federation of computer resources from multiple administrative domains to reach a common goal. “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”
What is Grid Computing? Who Needs It? An Illustrative Example Grid Users Current Grids
What is Grid Computing? Computational Grids Homogeneous (e.g., Clusters) Heterogeneous (e.g., with one-of-a-kind instruments) Cousins of Grid Computing Methods of Grid Computing
Computational Grids A network of geographically distributed resources including computers, peripherals, switches, instruments, and data. Each user should have a single login account to access all resources. Resources may be owned by diverse organizations.
Computational Grids Grids are typically managed by gridware. Gridware can be viewed as a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance, availability).
Cousins of Grid Computing Parallel Computing Distributed Computing Peer-to-Peer Computing Many others: Cluster Computing, Network Computing, Client/Server Computing, Internet Computing, etc...
Distributed Computing People often ask: Is Grid Computing a fancy new name for the concept of distributed computing? In general, the answer is “no.” Distributed Computing is most often concerned with distributing the load of a program across two or more processes.
Peer-to-Peer Computing Sharing of computer resources and services by direct exchange between systems. Computers can act as clients or servers depending on what role is most efficient for the network.
Distributed Supercomputing Combining multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer. Tackle problems that cannot be solved on a single system.
High-Throughput Computing Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.
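The high-throughput idea above can be sketched in a few lines. This is a minimal illustration, not a grid implementation: a local thread pool stands in for idle grid processors, and `screen_compound` and its scoring formula are hypothetical placeholders for a real independent task.

```python
from concurrent.futures import ThreadPoolExecutor


def screen_compound(compound_id):
    """Hypothetical independent task: compute a score for one compound."""
    score = (compound_id * 37 % 101) / 100.0  # placeholder arithmetic, not real chemistry
    return compound_id, score


def run_high_throughput(task_ids, max_workers=4):
    """Run loosely coupled tasks on whatever workers are free.

    The tasks share no state, so they can complete in any order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(screen_compound, task_ids))


results = run_high_throughput(range(10))
```

Because no task depends on another, adding workers raises throughput without any change to the task code, which is the property high-throughput grids exploit.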
On-Demand Computing Uses grid capabilities to meet short-term requirements for resources that are not locally accessible. Models real-time computing demands.
Collaborative Computing Concerned primarily with enabling and enhancing human-to-human interactions. Applications are often structured in terms of a virtual shared space.
Logistical Networking Global scheduling and optimization of data movement. Contrasts with traditional networking, which does not explicitly model storage resources in the network. Called "logistical" because of the analogy it bears with the systems of warehouses, depots, and distribution channels.
Who Needs Grid Computing? A chemist may utilize hundreds of processors to screen thousands of compounds per hour. Teams of engineers worldwide pool resources to analyze terabytes of structural data. Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands.
An Illustrative Example Tiffany Moisan, a NASA research scientist, collected microbiological samples in the tidewaters around Wallops Island, Virginia. She needed the high-performance microscope located at the National Center for Microscopy and Imaging Research (NCMIR), University of California, San Diego.
She sent the samples to San Diego and used NPACI’s Telescience Grid and NASA’s Information Power Grid (IPG) to view and control the output of the microscope from her desk on Wallops Island. Thus, in addition to viewing the samples, she could move the platform holding them and make adjustments to the microscope.
Contd… The microscope produced a huge dataset of images. This dataset was stored using a storage resource broker on NASA’s IPG. Moisan was able to run algorithms on this very dataset while watching the results in real time.
Grid Users Grid developers Tool developers Application developers End Users System Administrators
Grid Developers Very small group. Implementers of a grid “protocol” who provide the basic services required to construct a grid.
Tool Developers Implement the programming models used by application developers. Implement basic services similar to conventional computing services: user authentication/authorization, process management, and data access and communication.
Application Developers Construct grid-enabled applications for end-users who should be able to use these applications without concern for the underlying grid. Provide programming models that are appropriate for grid environments and services that programmers can rely on when developing (higher-level) applications.
System Administrators Balance local and global concerns. Manage grid components and infrastructure. Some tasks still not well delineated due to the high degree of sharing required.
ADVANTAGE Can solve larger, more complex problems in a shorter time Easier to collaborate with other organizations Make better use of existing hardware
DISADVANTAGE Grid software and standards are still evolving Learning curve to get started Non-interactive job submission
The Grid: Present, Past, Future Grid computing has a number of derivatives, distinguished by the resources they share and by their architecture. 1. Compute Grids 2. Data Grids 3. Science Grids 4. Access Grids 5. Knowledge Grids 6. Cluster Grids 7. Terra Grids 8. Commodity Grids
1. Compute Grids vendors: GridGain (Professional Open Source), JPPF (Open Source) 2. Data Grids vendors: Oracle Coherence (Commercial), GemStone (Commercial), GigaSpaces (Commercial), JBossCache (Professional Open Source), EhCache (Open Source)
Data Functional data requirements for Grid Computing applications are: •Integration of multiple distributed, heterogeneous, and independently managed data sources •Data transfer mechanisms •Data caching and/or replication mechanisms to minimize network traffic •Data discovery mechanisms •Data encryption and integrity •Backup/restore mechanisms and policies
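The data integrity requirement listed above can be illustrated with a checksum comparison after a transfer. This is a minimal sketch using Python's standard `hashlib` and `hmac` modules. The function names and the idea that the sender publishes a SHA-256 digest alongside the data are assumptions for illustration, not something the slides specify.

```python
import hashlib
import hmac


def sha256_digest(data):
    """Return the hex SHA-256 digest of a bytes payload."""
    return hashlib.sha256(data).hexdigest()


def verify_transfer(received, expected_digest):
    """Check that received bytes match the digest published by the sender."""
    return hmac.compare_digest(sha256_digest(received), expected_digest)


payload = b"grid dataset block 0001"        # hypothetical data block
published = sha256_digest(payload)          # digest the sender would publish
intact = verify_transfer(payload, published)
corrupted = verify_transfer(payload + b"!", published)
```

A checksum detects accidental corruption only. Protection against deliberate tampering additionally needs a keyed or signed digest.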
Computation Functional computational requirements for grid applications are: •Independent management of computing resources •Intelligent and transparent selection of computing resources •Availability and dynamic resource configuration •Failure detection and failover mechanisms •Secure resource management, access, and integrity
Computational and Data Grids Data requirements in the early grid solutions: Data discovery. Access to databases, utilizing metadata and other attributes of the data. The provisioning of computing facilities for high-speed data movement. Flexible data access and data filtering capabilities.
Current Grid Activities The resources shared in present-day grids vary: 1. Computing power 2. Data 3. Hardware 4. Software 5. Network services
Dynamic benefits of coordinated resource sharing in a virtual organization.
The usage patterns found within each of the virtual organizations. A virtual organization for weather prediction. For example, this virtual organization requires resources such as weather prediction software applications to perform the mandatory environmental simulations associated with predicting weather. A virtual organization for financial modeling. For example, this virtual organization requires resources such as software modeling tools for performing a multitude of financial analytics, virtualized blades to run the above software, and access to data storage facilities for storing and accessing data.
Requirements for Grid Computing architecture fall into three categories: 1. Resource categories 2. Virtual organization 3. Users/Applications
Resources must be capable of providing facilities for the following scenarios: Dynamic discovery of computing resources, based on their capabilities and functions. Immediate allocation and provisioning of these resources, based on their availability and the user demands or requirements. The management of these resources to meet the required service level agreements (SLAs). The provisioning of multiple autonomic features for the resources, such as self-diagnosis, self-healing, self-configuring, and self-management. The provisioning of secure access methods to the resources, and bindings with the local security mechanisms based upon the autonomic control policies.
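Dynamic discovery by capability, the first scenario above, can be sketched as a filter over a registry of resource advertisements. The `ResourceAd` record, the capability strings, and the in-memory registry are all hypothetical. A real grid would query an information service rather than a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceAd:
    """Hypothetical advertisement a resource publishes about itself."""
    name: str
    capabilities: frozenset
    available: bool


def discover(registry, required):
    """Return names of available resources offering every required capability."""
    required = frozenset(required)
    return [r.name for r in registry
            if r.available and required <= r.capabilities]


registry = [
    ResourceAd("node-a", frozenset({"cpu", "gpu"}), True),
    ResourceAd("node-b", frozenset({"cpu"}), True),
    ResourceAd("node-c", frozenset({"cpu", "gpu", "storage"}), False),
]
gpu_nodes = discover(registry, {"cpu", "gpu"})
```

Filtering on availability as well as capability reflects the second scenario: allocation depends on what a resource can do and on whether it is free right now.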
A virtual organization must be capable of providing facilities for: Virtual task forces, or groups, to solve specific problems associated with the virtual organization. Dynamic collection of resources from heterogeneous providers based upon users' needs and the sophistication levels of the problems. Dynamic identification and automatic resolution of a wide variety of problems, with automation of event correlation, linking the specific problems to the required resource and service providers. The dynamic provisioning and management capabilities of the resources required to meet the SLAs. The formation of a secured federation (or governance model) and common management model for all of the resources belonging to the virtual organization. The secure delegation of user credentials and identity mapping to the local domain(s). The management of resources, including utilization and allocation, to meet a budget and other economic criteria.
Users/applications typically found in Grid Computing environments must support the following: The clear and unambiguous identification of the problem. The identification and mapping of the resources. The ability to sustain the required levels of QoS, while adhering to the anticipated and necessary SLAs. The capability to collect feedback regarding resource status, including updates for the environment's respective applications.
GRID APPLICATIONS Grid computing applications can be aligned by their common needs: Application partitioning, which involves breaking the problem into discrete pieces. Discovery and scheduling of tasks and workflow. Data communication, distributing the problem data where and when it is required. Provisioning and distributing application code to specific system nodes.
Contd… Results management, assisting in the decision processes of the environment. Autonomic features such as self-configuration, self-optimization, self-recovery, and self-management. Let us explore some of these grid applications and their usage patterns.
Schedulers Responsible for the management of jobs, such as allocating the resources needed for a specific job, parallel execution of tasks, data management, and service level management. Schedulers form a hierarchical structure, with meta schedulers as the root and other schedulers as the leaves. Meta schedulers or cluster schedulers are used for parallel execution.
A scheduler embodies local, meta-level, and cluster schedulers. [Diagram: USER and META SCHEDULER, with JOB links to a LOCAL SCHEDULER, a META SCHEDULER, and a CLUSTER SCHEDULER]
Contd… Jobs submitted to grid computing applications are evaluated based on their service level requirements. This involves complex workflow management and data movement activities that occur on a regular basis.
Contd… Schedulers must provide capabilities in areas such as: advanced resource reservation, SLA validation and enforcement, monitoring of job execution and status, and rescheduling and corrective action.
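The scheduler hierarchy described above, with a meta scheduler at the root and other schedulers as leaves, might be modeled as follows. This is a toy sketch under stated assumptions: the class names, the CPU-count capacity model, and the "most free CPUs wins" policy are illustrative choices, not part of any real grid scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cpus: int


@dataclass
class LeafScheduler:
    """Stands in for a local or cluster scheduler (hypothetical model)."""
    name: str
    free_cpus: int
    queue: list = field(default_factory=list)

    def can_run(self, job):
        return job.cpus <= self.free_cpus

    def submit(self, job):
        self.free_cpus -= job.cpus   # reserve capacity for the job
        self.queue.append(job.name)


@dataclass
class MetaScheduler:
    """Root of the hierarchy: forwards each job to one child scheduler."""
    children: list

    def submit(self, job):
        candidates = [c for c in self.children if c.can_run(job)]
        if not candidates:
            return None              # no child can satisfy the request
        target = max(candidates, key=lambda c: c.free_cpus)
        target.submit(job)
        return target.name


meta = MetaScheduler([LeafScheduler("local", 4), LeafScheduler("cluster", 16)])
placements = [meta.submit(Job("a", 8)),
              meta.submit(Job("b", 2)),
              meta.submit(Job("c", 32))]
```

Here jobs `a` and `b` land on the cluster scheduler and job `c` is rejected because no child has 32 free CPUs. A real scheduler would queue, reserve, or reschedule such a job instead of dropping it.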
Resource Broker The resource broker provides the pairing service between the service requester and the service provider. This pairing enables selection of the best available resource. The broker collects information from the respective resources and uses it for pairing.
Resource Broker [Diagram: the USER contacts the RESOURCE BROKER, which gathers INFORMATION from RESOURCE 1 and RESOURCE 2, SELECTs a RESOURCE and a SCHEDULER, and the SCHEDULER EXECUTEs the TASK on the resource]
Contd… The resource broker provides feedback to users on the available resources. The resource broker may select a suitable scheduler for the execution of tasks.
Contd… The pairing process in a resource broker involves allocation and support functions, such as: allocating the appropriate resource for task execution, and supporting users' deadline and budget constraints for scheduling optimization.
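The pairing step with deadline and budget constraints can be sketched as a filter followed by a cost minimization. The `Resource` fields, the linear cost model, and the "cheapest feasible resource" rule are assumptions made for illustration. A production broker would weigh many more attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical description a resource reports to the broker."""
    name: str
    speed: float          # work units processed per hour
    cost_per_hour: float


def select_resource(resources, work_units, deadline_hours, budget):
    """Pick the cheapest resource that meets both the deadline and the budget."""
    feasible = []
    for r in resources:
        hours = work_units / r.speed
        cost = hours * r.cost_per_hour
        if hours <= deadline_hours and cost <= budget:
            feasible.append((cost, hours, r))
    if not feasible:
        return None       # the broker reports that no pairing is possible
    return min(feasible, key=lambda t: (t[0], t[1]))[2]


pool = [Resource("resource-1", 10.0, 1.0), Resource("resource-2", 40.0, 5.0)]
tight = select_resource(pool, work_units=100, deadline_hours=4, budget=20)
relaxed = select_resource(pool, work_units=100, deadline_hours=12, budget=20)
impossible = select_resource(pool, work_units=100, deadline_hours=1, budget=20)
```

With a 4-hour deadline only the faster `resource-2` qualifies. Relaxing the deadline to 12 hours lets the broker choose the cheaper `resource-1`.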
Load balancing Load balancing features must always be integrated into any system in order to avoid processing delays and overcommitment of resources. Load balancing may be built in connection with the resource broker and schedulers. Load balancing involves partitioning jobs, identifying resources, and queueing jobs.
Contd… Load balancing is used to run jobs in parallel. It supports failure detection and management. It redistributes jobs to other resources if needed.
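The queueing and redistribution behaviour described above can be sketched with a greedy least-loaded assignment. The node names, job costs, and the specific greedy policy are illustrative assumptions. Failure detection itself is out of scope here and is represented only by naming the failed node.

```python
import heapq


def balance(jobs, nodes):
    """Queue each (job, cost) pair on the currently least-loaded node."""
    heap = [(0, n) for n in sorted(nodes)]
    heapq.heapify(heap)
    queues = {n: [] for n in nodes}
    # Placing the most expensive jobs first tends to even out the load.
    for job, cost in sorted(jobs, key=lambda j: -j[1]):
        load, node = heapq.heappop(heap)
        queues[node].append((job, cost))
        heapq.heappush(heap, (load + cost, node))
    return queues


def redistribute(queues, failed):
    """Move the failed node's jobs onto the least-loaded surviving nodes."""
    survivors = {n: list(q) for n, q in queues.items() if n != failed}
    for job, cost in queues[failed]:
        target = min(survivors,
                     key=lambda n: (sum(c for _, c in survivors[n]), n))
        survivors[target].append((job, cost))
    return survivors


jobs = [("j1", 5), ("j2", 3), ("j3", 2), ("j4", 2)]
queues = balance(jobs, ["n1", "n2"])
after_failure = redistribute(queues, "n2")
```

`redistribute` returns a new mapping and leaves the original queues untouched, which makes it easy to compare the state before and after a failure.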
Grid portals Grid portals are like web portals: they provide uniform access to grid resources. Grid portals provide 1. Resource access 2. Scheduling capabilities 3. Monitoring of status information
Contd… Some examples of grid portal capabilities are 1. Querying databases 2. File transfer facilities 3. Managing jobs through job status feedback 4. Security management 5. Providing personalized solutions
Integrated solutions An integrated solution combines existing advanced middleware and application functionality to provide high-performance results across the grid computing environment. It supports more complex uses of the grid, such as coordinated and optimized resource sharing, enhanced security management, and cost optimization. It achieves flexibility by utilizing the infrastructure provided by application and middleware frameworks.
GRID INFRASTRUCTURE Grid infrastructure forms the core foundation for successful grid applications. Grid computing infrastructure components must address several potentially complicated areas across many stages of implementation: 1. Security 2. Resource management 3. Information services 4. Data management
[Diagram: GRID APPLICATIONS on top; GRID MIDDLEWARE on either side of SECURITY, RESOURCE MANAGEMENT, INFORMATION SERVICES, and DATA MANAGEMENT; HOSTING ENVIRONMENT at the bottom]
Security The heterogeneous nature of resources and their complicated policies lead to complex security schemes. These computing resources are hosted in differing security domains and on heterogeneous platforms. Security requirements: data integrity, confidentiality, and information privacy.
Contd… Grid computing data exchange must be protected using secure communication channels, including SSL/TLS, and secure message exchange mechanisms such as WS-Security. Security infrastructure: Grid Security Infrastructure (GSI).
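As a small illustration of the secure-channel requirement above, Python's standard `ssl` module can build a client-side TLS context that verifies the server certificate and hostname. This shows plain TLS only. It is not an implementation of GSI or WS-Security, and the choice of TLS 1.2 as the minimum version is an assumption made for the example.

```python
import ssl


def make_client_context():
    """Build a TLS client context with certificate and hostname checks on."""
    context = ssl.create_default_context()   # secure defaults for clients
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocols
    return context


context = make_client_context()
```

A client would pass this context to `context.wrap_socket(sock, server_hostname=...)` before exchanging any grid data over the connection.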
Resource management Resource management is the selection of the correct resource from the grid resource pool. Selection is based entirely on the SLA.
Information services Provide valuable information about the grid computing infrastructure resources. These services depend entirely on resource availability, capacity, and utilization. The information is valuable and mandatory feedback for the resource managers. Grid solutions are constructed to reflect portals. Metrics are helpful for SLAs.
Data management Data forms the single most important asset in a grid computing system. Data may be input to a resource or output from a resource. Data must be near the computation where it is used. Data storage mechanisms: Storage Area Network (SAN), network file system, virtual database.
Contd… Developers and providers must factor several considerations into selecting the most appropriate data management mechanism for the grid computing infrastructure. These include: 1. size of data repositories 2. resource geographical distribution 3. security requirements 4. schemes for replication 5. caching facilities
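Two of the considerations above, replication and caching, serve the earlier point that data must be near the computation. A minimal sketch follows, assuming a hypothetical catalog that maps each dataset to replica sites and their latency from the compute node. The class, site names, and latency figures are invented for illustration.

```python
class ReplicaCatalog:
    """Toy catalog: choose the nearest replica and cache what was fetched."""

    def __init__(self, replicas):
        self.replicas = replicas     # dataset -> {site: latency in ms}
        self.cache = {}              # local copies kept near the computation
        self.remote_fetches = 0

    def nearest_site(self, dataset):
        sites = self.replicas[dataset]
        return min(sites, key=sites.get)

    def read(self, dataset):
        if dataset not in self.cache:
            site = self.nearest_site(dataset)
            self.remote_fetches += 1             # simulated network transfer
            self.cache[dataset] = f"{dataset}@{site}"
        return self.cache[dataset]


catalog = ReplicaCatalog({"climate": {"site-a": 80, "site-b": 12}})
first = catalog.read("climate")
second = catalog.read("climate")
```

The second read is served from the local cache, so only one simulated remote fetch occurs. That is the network-traffic saving that replication and caching are meant to deliver.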