Distributed Computing
Comprehensive study of parallel, cluster, distributed, grid and cloud computing paradigms

Presentation Transcript

  • Distributed Computing. Sudarsun Santhiappan, sudarsun@{burning-glass.com, gmail.com}, Burning Glass Technologies, Kilpauk, Chennai 600010
  • Technology is Changing...
    • Computational power doubles every 18 months
    • Networking bandwidth and speed double every 9 months
    • How to tap the benefits of this Technology ?
    • Should we grow as an Individual ?
    • Should we grow as a Team ?
  • The Coverage Today
    • Parallel Processing
    • Multiprocessor or Multi-Core Computing
    • Symmetric Multiprocessing
    • Cluster Computing {PVM}
    • Distributed Computing {TAO, OpenMP}
    • Grid Computing {Globus Toolkit}
    • Cloud Computing {Amazon EC2}
  • Parallel Computing
    • It is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently in parallel.
    • Multi-Core, Multiprocessor SMP, Massively Parallel Processing (MPP) Computers
    • Is it easy to write a parallel program ?
  • Cluster Computing
    • A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer
    • Operate in shared memory mode (mostly)
    • Tightly coupled with high-speed networking, mostly with optical fiber channels.
    • HA, Load Balancing, Compute Clusters
    • Can we Load Balance using DNS ?
  • Distributed Computing
    • Wikipedia : It deals with hardware and software systems containing more than one processing element or storage element, concurrent processes, or multiple programs, running under a loosely or tightly controlled regime
  • Grid Computing
    • Wikipedia: A form of distributed computing whereby a super and virtual computer is composed of a cluster of networked, loosely-coupled computers, acting in concert to perform large tasks.
    • pcwebopedia.com : Unlike conventional networks that focus on communication among devices, grid computing harnesses unused processing cycles of all computers in a network for solving problems too intensive for any stand-alone machine.
    • IBM: Grid computing enables the virtualization of distributed computing and data resources such as processing, network bandwidth and storage capacity to create a single system image, granting users and applications seamless access to vast IT capabilities. Just as an Internet user views a unified instance of content via the Web, a grid user essentially sees a single, large virtual computer.
    • Sun: Grid Computing is a computing infrastructure that provides dependable, consistent, pervasive and inexpensive access to computational capabilities.
  • Cloud Computing
    • Wikipedia: It is a style of computing in which dynamically scalable and often virtualised resources are provided as a service over the Internet.
    • Infrastructure As A Service (IaaS)
    • Platform As A Service (PaaS)
    • Software as a Service (SaaS)
    • Provide common business applications online accessible from a web browser.
    • Amazon Elastic Compute Cloud (EC2), Google Apps
  • Hardware: IBM p690 Regatta
    • 32 POWER4 CPUs (1.1 GHz), 32 GB RAM, 218 GB internal disk
    • OS: AIX 5.1
    • Peak speed: 140.8 GFLOP/s*
    • Programming model: shared-memory multithreading (OpenMP); also supports MPI
    * GFLOP/s: billion floating point operations per second
  • Hardware: Pentium4 Xeon Cluster
    • 270 Pentium4 XeonDP CPUs, 270 GB RAM, 8,700 GB disk
    • OS: Red Hat Linux Enterprise 3
    • Peak speed: 1.08 TFLOP/s*
    • Programming model: distributed multiprocessing (MPI)
    * TFLOP/s: trillion floating point operations per second
  • Hardware: Itanium2 Cluster (schooner.oscer.ou.edu, new arrival!)
    • 56 Itanium2 1.0 GHz CPUs, 112 GB RAM, 5,774 GB disk
    • OS: Red Hat Linux Enterprise 3
    • Peak speed: 224 GFLOP/s*
    • Programming model: distributed multiprocessing (MPI)
    * GFLOP/s: billion floating point operations per second
  • Vector Processing
    • It is based on array processors, where the instruction set includes operations that can be applied to many data elements simultaneously
    • Example: Finding Scalar dot product between two vectors
    • Is vector processing a parallel computing model?
    • What are the limitations of Vector processing ?
    • Used extensively in video processing and games...
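    A minimal C++ sketch (not from the original slides) of the scalar dot product mentioned above: every iteration applies the same multiply-accumulate to a pair of elements, which is exactly the pattern a vector (SIMD) unit or an auto-vectorizing compiler can execute on several elements at once. The function name is illustrative.

      #include <cstddef>

      // Scalar dot product: the same operation is applied to every element
      // pair, so a vectorizing compiler can process several elements per
      // instruction (e.g., using SIMD registers).
      double dot(const double* a, const double* b, std::size_t n) {
          double sum = 0.0;
          for (std::size_t i = 0; i < n; ++i)
              sum += a[i] * b[i];
          return sum;
      }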
  • Pipelined Processing
    • The fundamental idea is to split the processing of a computer instruction into a series of independent steps, with storage at the end of each step.
    • This allows the computer's control circuitry to issue instructions at the processing rate of the slowest step, which is much faster than the time needed to perform all steps at once.
    • A non-pipeline architecture is inefficient because some CPU components (modules) are idle while another module is active during the instruction cycle
    • Processors with pipelining are organized inside into stages which can semi-independently work on separate jobs
  • Parallel Vs Pipelined Processing
    • Parallel processing: less inter-processor communication; more complicated processor hardware
    • Pipelined processing: more inter-processor communication; simpler processor hardware
    [Diagram: different operation types (colors) applied to data streams a, b, c, d across processors P1-P4 over time]
  • Data Dependence
    • Parallel processing requires NO data dependence between processors
    • Pipelined processing will involve inter-processor communication
  • Typical Computing Elements
    [Diagram: layered view of a multi-processor computing system: hardware (processors), microkernel, operating system, threads interface (processes and threads), programming paradigms, applications]
  • Why Parallel Processing ?
    • Computation requirements are ever increasing; for instance -- visualization, distributed databases, simulations, scientific prediction (ex: climate, earthquake), etc.
    • Sequential architectures reaching physical limitation (speed of light, thermodynamics)
    • Limit on the number of transistors per square inch
    • Limit on inter-component link capacitance
  • Symmetric Multiprocessing SMP
    • Involves a multiprocessor computer architecture where two or more identical processors can connect to a single shared main memory
    • Kernel can execute on any processor
    • Typically each processor does self-scheduling from the pool of available processes or threads
    • Scalability problems in Uniform Memory Access
    • NUMA to improve speed, but limitations on data migration
    • Intel, AMD processors are SMP units
    • What is ASMP ?
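    A minimal C++11 sketch (not from the slides) of what self-scheduling from a pool of threads looks like from user space: each std::thread is a schedulable entity that an SMP kernel is free to run on any of the identical processors. The worker count and messages are illustrative.

      #include <thread>
      #include <vector>
      #include <cstdio>

      int main() {
          // One worker per hardware context; the SMP kernel may place each
          // thread on any of the identical processors.
          unsigned n = std::thread::hardware_concurrency();
          if (n == 0) n = 2;   // hardware_concurrency() may return 0

          std::vector<std::thread> workers;
          for (unsigned i = 0; i < n; ++i)
              workers.emplace_back([i] { std::printf("worker %u running\n", i); });
          for (auto& t : workers)
              t.join();
          return 0;
      }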
  • SISD : A Conventional Computer
    • Speed is limited by the rate at which computer can transfer information internally.
    Ex: PC, Macintosh, workstations
    [Diagram: a single processor with one instruction stream, one data input and one data output]
  • The MISD Architecture
    • More of an intellectual exercise than a practical configuration. Few built, but commercially not available
    [Diagram: a single data stream passing through processors A, B and C, each driven by its own instruction stream]
  • SIMD Architecture
    • Ex: CRAY vector-processing machines, Intel MMX (multimedia support)
    • Ci <= Ai * Bi
    [Diagram: one instruction stream driving processors A, B and C, each with its own data input and output stream]
  • MIMD Architecture
    • Unlike SISD and MISD, an MIMD computer works asynchronously
    • Shared-memory (tightly coupled) MIMD
    • Distributed-memory (loosely coupled) MIMD
    [Diagram: processors A, B and C, each with its own instruction stream, data input stream and data output stream]
  • Shared Memory MIMD machine
    • Communication: Source Processor writes data to GM & destination retrieves it.
    • Limitation: reliability & expandability; a memory component or any processor failure affects the whole system.
    • Increase of processors leads to memory contention.
    Ex.: Silicon Graphics supercomputers
    [Diagram: processors A, B and C connected over memory buses to a global memory system]
  • Distributed Memory MIMD
    • Communication : IPC on High Speed Network.
    • Network can be configured to ... Tree, Mesh, Cube, etc.
    • Unlike Shared MIMD
      • Readily expandable
      • Highly reliable (any CPU failure does not affect the whole system)
    [Diagram: processors A, B and C, each with its own memory system and memory bus, connected to one another by IPC channels]
  • Laws of caution.....
    • Speed of computers is proportional to the square of their cost
      i.e., speed = cost², so cost = √speed
    • Speedup by a parallel computer increases as the logarithm of the number of processors
      • Speedup = log2(no. of processors)
    [Graphs: speedup S versus number of processors P (log P curve); speed versus cost C (speed = cost² curve)]
  • Micro Kernel based Operating Systems for High Performance Computing
    • Three approaches to building OS....
      • Monolithic OS
      • Layered OS
      • Microkernel based OS
    • A client-server (microkernel) OS is suitable for MPP systems
    • Simplicity, flexibility and high performance are crucial for the OS.
  • Monolithic Operating System
    • Better application Performance
    • Difficult to extend
    Ex: MS-DOS
    [Diagram: application programs in user mode calling a single block of system services running in kernel mode over the hardware]
  • Layered OS
    • Easier to enhance
    • Each layer of code accesses the lower-level interface
    • Lower application performance
    Ex: UNIX
    [Diagram: application programs (user mode) above layered system services, process scheduling, memory & I/O device management, and the hardware (kernel mode)]
  • Microkernel/Client Server OS (for MPP Systems)
    • Tiny OS kernel providing basic primitives (process, memory, IPC)
    • Traditional services become subsystems
    • Application performance competitive with a monolithic OS
    • OS = Microkernel + User Subsystems
    Ex: Mach, PARAS, Chorus, etc.
    [Diagram: a client application and thread library in user space exchange send/reply messages, via the microkernel, with file-server, network-server and display-server subsystems; only the microkernel runs in kernel mode]
  • What are Micro Kernels ?
    • Small operating system core
    • Contains only essential core operating systems functions
    • Many services traditionally included in the operating system are now external subsystems
      • Device drivers
      • File systems
      • Virtual memory manager
      • Windowing system
      • Security services
  • HPC Cluster Architecture
    [Diagram: a frontend node on the public Ethernet, connected over a private Ethernet, an optional application network, and net-addressable power distribution units to the compute nodes]
  • Most Critical Problems with Clusters
    • The largest problem in clusters is software skew
      • When software configuration on some nodes is different than on others
      • Small differences (minor version numbers on libraries) can cripple a parallel program
    • The second most important problem is lack of adequate job control of the parallel process
      • Signal propagation
      • Cleanup
  • Top 3 Problems with Software Packages
    • Software installation works only in interactive mode
      • Needs significant work by the end user
    • Often reasonable default settings are not available
      • Extremely time consuming to provide values
      • Should be provided by package developers but …
    • Package is required to be installed on a running system
      • Means multi-step operation: install + update
      • Intermediate state can be insecure
  • Clusters Classification..1
    • Based on Focus (in Market)
      • High Performance (HP) Clusters
        • Grand Challenging Applications
      • High Availability (HA) Clusters
        • Mission Critical applications
  • HA Cluster: Server Cluster with "Heartbeat" Connection
  • Clusters Classification..2
    • Based on Workstation/PC Ownership
      • Dedicated Clusters
      • Non-dedicated clusters
        • Adaptive parallel computing
        • Also called Communal multiprocessing
  • Clusters Classification..3
    • Based on Node Architecture ..
      • Clusters of PCs (CoPs)
      • Clusters of Workstations (COWs)
      • Clusters of SMPs (CLUMPs)
  • Building Scalable Systems: Clusters of SMPs (CLUMPs)
    [Chart: performance of SMP systems vs. four-processor servers in a cluster]
  • Clusters Classification..4
    • Based on Node OS Type ..
      • Linux Clusters (Beowulf)
      • Solaris Clusters (Berkeley NOW)
      • NT Clusters (HPVM)
      • AIX Clusters (IBM SP2)
      • SCO/Compaq Clusters (Unixware)
      • Digital VMS Clusters, HP clusters
  • Clusters Classification..5
    • Based on Processor Architecture and Node Type ..
    • Homogeneous Clusters
      • All nodes will have similar configuration
    • Heterogeneous Clusters
      • Nodes based on different processors and running different Operating Systems
  • Cluster Implementation
    • What is Middleware ?
    • What is Single System Image ?
    • Benefits of Single System Image
  • What is Cluster Middle-ware ?
    • An interface between user applications and cluster hardware and OS platform.
    • Middle-ware packages support each other at the management, programming, and implementation levels.
    • Middleware Layers:
      • SSI Layer
      • Availability Layer: it enables cluster services such as
        • checkpointing, automatic failover, recovery from failure, and
        • fault-tolerant operation among all cluster nodes.
  • Middleware Design Goals
    • Complete Transparency
      • Lets the user see a single cluster system..
        • Single entry point, ftp, telnet, software loading...
    • Scalable Performance
      • Easy growth of cluster
        • no change of API & automatic load distribution.
    • Enhanced Availability
      • Automatic Recovery from failures
        • Employ checkpointing & fault tolerant technologies
      • Handle consistency of data when replicated..
  • What is Single System Image (SSI) ?
    • A single system image is the illusion, created by software or hardware, that a collection of computing elements appears as a single computing resource.
    • SSI makes the cluster appear like a single machine to the user, to applications, and to the network.
    • A cluster without a SSI is not a cluster
  • Benefits of Single System Image
    • Usage of system resources transparently
    • Improved reliability and higher availability
    • Simplified system management
    • Reduction in the risk of operator errors
    • User need not be aware of the underlying system architecture to use these machines effectively
  • Distributed Computing
    • No shared memory
    • Communication among processes
      • Send a message
      • Receive a message
    • Asynchronous
    • Synchronous
    • Synergy among processes
  • Messages
    • Messages are sequences of bytes moving between processes
    • The sender and receiver must agree on the type structure of values in the message
    • “Marshalling”: data layout so that there is no ambiguity, such as “four chars” vs. “one integer”.
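    A minimal sketch of manual marshalling in C/C++ (the function names are illustrative): the sender writes a 32-bit integer into the message buffer in network byte order, so the receiver cannot misread those four bytes as anything else, regardless of either host's endianness.

      #include <arpa/inet.h>   // htonl / ntohl
      #include <cstring>
      #include <cstdint>

      // Marshal a 32-bit integer into a byte buffer in network byte order.
      std::size_t marshal_int32(std::uint8_t* buf, std::int32_t value) {
          std::uint32_t wire = htonl(static_cast<std::uint32_t>(value));
          std::memcpy(buf, &wire, sizeof(wire));
          return sizeof(wire);
      }

      // Unmarshal it on the receiving side.
      std::int32_t unmarshal_int32(const std::uint8_t* buf) {
          std::uint32_t wire;
          std::memcpy(&wire, buf, sizeof(wire));
          return static_cast<std::int32_t>(ntohl(wire));
      }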
  • Message Passing
    • Process A sends a data buffer as a message to process B.
    • Process B waits for a message from A, and when it arrives copies it into its own local memory.
    • No memory shared between A and B.
  • Message Passing
    • Obviously,
      • Messages cannot be received before they are sent.
      • A receiver waits until there is a message.
    • Asynchronous
      • Sender never blocks, even if infinitely many messages are waiting to be received
      • Semi-asynchronous is a practical version of above with large but finite amount of buffering
  • Message Passing: Point to Point
    • Q: send(m, P)
      • Send message m to process P
    • P: recv(x, Q)
      • Receive message from process Q, and place it in variable x
    • The message data
      • Type of x must match that of m
      • As if x := m
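    The same point-to-point pattern expressed with MPI (one possible concrete realization of the notation above, not its definition): process Q sends m to process P, and P receives it into x. The tag and values are illustrative.

      #include <mpi.h>
      #include <cstdio>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          if (rank == 0) {                       // process Q
              int m = 42;
              MPI_Send(&m, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   // send(m, P)
          } else if (rank == 1) {                // process P
              int x;
              MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);       // recv(x, Q); as if x := m
              std::printf("P received %d from Q\n", x);
          }
          MPI_Finalize();
          return 0;
      }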
  • Broadcast
    • One sender Q, multiple receivers P
    • Not all receivers may receive at the same time
    • Q: broadcast (m)
      • Send message m to all processes P
    • P: recv(x, Q)
      • Receive message from process Q, and place it in variable x
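    With MPI the same broadcast is a collective call: every process calls MPI_Bcast, and the root's value ends up in x everywhere (a sketch, not from the slides; the root rank and value are illustrative).

      #include <mpi.h>
      #include <cstdio>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          int x = (rank == 0) ? 99 : 0;                   // root Q holds m
          MPI_Bcast(&x, 1, MPI_INT, 0, MPI_COMM_WORLD);   // every P ends up with x == m
          std::printf("rank %d has x = %d\n", rank, x);

          MPI_Finalize();
          return 0;
      }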
  • Synchronous Message Passing
    • Sender blocks until receiver is ready to receive.
    • Cannot send messages to self.
    • No buffering.
  • Asynchronous Message Passing
    • Sender never blocks.
    • Receiver receives when ready.
    • Can send messages to self.
    • Infinite buffering.
  • Message Passing
    • Speed not so good
      • Sender copies message into system buffers.
      • Message travels the network.
      • Receiver copies message from system buffers into local memory.
      • Special virtual memory techniques help.
    • Programming Quality
      • Less error-prone compared to shared memory
  • Distributed Programs
    • Spatially distributed programs
      • A part here, a part there, …
      • Parallel
      • Synergy
    • Temporally distributed programs
      • Compute half today, half tomorrow
      • Combine the results at the end
    • Migratory programs
      • Have computation, will travel
  • Technological Bases of Distributed+Parallel Programs
    • Spatially distributed programs
      • Message passing
    • Temporally distributed programs
      • Shared memory
    • Migratory programs
      • Serialization of data and programs
  • Technological Bases for Migratory programs
    • Same CPU architecture
      • X86, PowerPC, MIPS, SPARC, …, JVM
    • Same OS + environment
    • Be able to “checkpoint”
      • suspend, and
      • then resume computation
      • without loss of progress
  • Message Passing Libraries
    • Programmer is responsible for initial data distribution, synchronization, and sending and receiving information
    • Parallel Virtual Machine (PVM)
    • Message Passing Interface (MPI)
    • Bulk Synchronous Parallel model (BSP)
  • BSP: Bulk Synchronous Parallel model
    • Divides computation into supersteps
    • In each superstep a processor can work on local data and send messages.
    • At the end of the superstep, a barrier synchronization takes place and all processors receive the messages which were sent in the previous superstep
  • BSP: Bulk Synchronous Parallel model
    • http://www.bsp-worldwide.org/
    • Book: Rob H. Bisseling, “Parallel Scientific Computation: A Structured Approach using BSP and MPI,” Oxford University Press, 2004, 324 pages, ISBN 0-19-852939-2.
  • BSP Library
    • Small number of subroutines to implement
      • process creation,
      • remote data access, and
      • bulk synchronization.
    • Linked to C, Fortran, … programs
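    A hedged sketch of one superstep using the classic BSPlib C interface (bsp_init / bsp_begin / bsp_pid / bsp_sync); exact header names and linking depend on the BSP implementation in use, and the printed message is illustrative.

      #include <bsp.h>      // classic BSPlib interface (implementation-dependent)
      #include <cstdio>

      static void spmd_part() {
          bsp_begin(bsp_nprocs());              // process creation
          int p = bsp_pid();
          int n = bsp_nprocs();

          // Superstep: purely local work (a real program would also issue
          // bsp_put/bsp_get calls here for remote data access).
          std::printf("process %d of %d working on local data\n", p, n);

          bsp_sync();                           // barrier: end of the superstep;
                                                // communication becomes visible here
          bsp_end();
      }

      int main(int argc, char** argv) {
          bsp_init(spmd_part, argc, argv);      // must precede bsp_begin
          spmd_part();
          return 0;
      }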
  • Portable Batch System (PBS)
    • Prepare a .cmd file
      • naming the program and its arguments
      • properties of the job
      • the needed resources 
    • Submit .cmd to the PBS Job Server: qsub command 
    • Routing and Scheduling: The Job Server
      • examines .cmd details to route the job to an execution queue.
      • allocates one or more cluster nodes to the job
      • communicates with the Execution Servers (mom's) on the cluster to determine the current state of the nodes. 
      • When all of the needed resources are allocated, passes the .cmd on to the Execution Server on the first node allocated (the "mother superior").
    • Execution Server
      • will login on the first node as the submitting user and run the .cmd file in the user's home directory. 
      • Run an installation defined prologue script.
      • Gathers the job's output to the standard output and standard error
      • It will execute installation defined epilogue script.
      • Delivers stdout and stderr to the user.
  • TORQUE, an open source PBS
    • Tera-scale Open-source Resource and QUEue manager (TORQUE) enhances OpenPBS
    • Fault Tolerance
      • Additional failure conditions checked/handled
      • Node health check script support
    • Scheduling Interface
    • Scalability
      • Significantly improved server to MOM communication model
      • Ability to handle larger clusters (over 15 TF/2,500 processors)
      • Ability to handle larger jobs (over 2000 processors)
      • Ability to support larger server messages
    • Logging
    • http://www.supercluster.org/projects/torque/
  • PVM, and MPI
    • Message passing primitives
    • Can be embedded in many existing programming languages
    • Architecturally portable
    • Open-sourced implementations
  • Parallel Virtual Machine ( PVM )
    • PVM enables a heterogeneous collection of networked computers to be used as a single large parallel computer.
    • Older than MPI
    • Large scientific/engineering user community
    • http://www.csm.ornl.gov/pvm/
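    A minimal sketch using the standard PVM 3 C API: each task enrolls in the virtual machine, checks whether it was spawned by a parent task, and leaves. The printed messages are illustrative.

      #include <pvm3.h>
      #include <cstdio>

      int main() {
          int mytid  = pvm_mytid();      // enroll this process in PVM, get its task id
          int parent = pvm_parent();     // task id of the spawning task, or PvmNoParent

          if (parent == PvmNoParent)
              std::printf("master task t%x\n", mytid);
          else
              std::printf("worker task t%x spawned by t%x\n", mytid, parent);

          pvm_exit();                    // leave the virtual machine
          return 0;
      }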
  • Message Passing Interface (MPI)
    • http://www-unix.mcs.anl.gov/mpi/
    • MPI-2.0 http://www.mpi-forum.org/docs/
    • MPICH: www.mcs.anl.gov/mpi/mpich/ by Argonne National Laboratory and Mississippi State University
    • LAM: http://www.lam-mpi.org/
    • http://www.open-mpi.org/
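    A minimal MPI program (it compiles against any of the implementations listed above) showing the basic initialize / rank / finalize structure.

      #include <mpi.h>
      #include <cstdio>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);                   // start the MPI runtime

          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);     // who am I?
          MPI_Comm_size(MPI_COMM_WORLD, &size);     // how many processes?
          std::printf("hello from rank %d of %d\n", rank, size);

          MPI_Finalize();                           // shut down the MPI runtime
          return 0;
      }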
  • OpenMP for shared memory
    • Shared-memory parallel programming API
    • User gives hints as directives to the compiler
    • http://www.openmp.org
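    A minimal OpenMP sketch: the pragma is the "hint" the slide mentions, and the compiler turns the loop into multithreaded shared-memory code (compile with -fopenmp or the equivalent flag). The array size and data are illustrative.

      #include <omp.h>
      #include <cstdio>

      int main() {
          const int N = 1000000;
          static double a[N], b[N];   // zero-initialized static storage
          double sum = 0.0;

          // The directive is only a hint: without OpenMP support it is ignored
          // and the loop simply runs sequentially.
          #pragma omp parallel for reduction(+:sum)
          for (int i = 0; i < N; ++i)
              sum += a[i] * b[i];

          std::printf("dot product = %f (up to %d threads)\n",
                      sum, omp_get_max_threads());
          return 0;
      }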
  • SPMD
    • Single program, multiple data
    • Contrast with SIMD
    • Same program runs on multiple nodes
    • May or may not be lock-step
    • Nodes may be of different speeds
    • Barrier synchronization
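    A sketch of the SPMD pattern with MPI (one of several ways to realize it): every node runs the same program, branches on its rank, and the barrier provides the synchronization point mentioned above.

      #include <mpi.h>
      #include <cstdio>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          // Same program on every node, but the behavior depends on the rank,
          // and nodes are not in lock-step between barriers.
          if (rank == 0)
              std::printf("rank 0: coordinating\n");
          else
              std::printf("rank %d: computing its share of the data\n", rank);

          MPI_Barrier(MPI_COMM_WORLD);   // barrier synchronization
          MPI_Finalize();
          return 0;
      }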
  • Condor
    • Cooperating workstations: come and go.
    • Migratory programs
      • Checkpointing
      • Remote IO
    • Resource matching
    • http://www.cs.wisc.edu/condor/
  • Migration of Jobs
    • Policies
      • Immediate-Eviction
      • Pause-and-Migrate
    • Technical Issues
      • Check-pointing: Preserving the state of the process so it can be resumed.
      • Migrating from one architecture to another
  • OpenMosix Distro
    • Quantian Linux
      • Boot from DVD-ROM
      • Compressed file system on DVD
      • Several GB of cluster software
      • http://dirk.eddelbuettel.com/quantian.html
    • Live CD/DVD or Single Floppy Bootables
      • http://bofh.be/clusterknoppix/
      • http://sentinix.org/
      • http://itsecurity.mq.edu.au/chaos/
      • http://openmosixloaf.sourceforge.net/
      • http://plumpos.sourceforge.net/
      • http://www.dynebolic.org/
      • http://bccd.cs.uni.edu/
      • http://eucaristos.sourceforge.net/
      • http://gomf.sourceforge.net/
    • Can be installed on HDD
  • What is openMOSIX?
    • An open source enhancement to the Linux kernel
    • Cluster with come-and-go nodes
    • System image model: Virtual machine with lots of memory and CPU
    • Granularity: Process
    • Improves the overall (cluster-wide) performance.
    • Multi-user, time-sharing environment for the execution of both sequential and parallel applications
    • Applications unmodified (no need to link with special library)
  • What is openMOSIX?
    • Execution environment:
      • farm of diskless x86 based nodes
      • UP (uniprocessor), or
      • SMP (symmetric multi processor)
      • connected by standard LAN (e.g., Fast Ethernet)
    • Adaptive resource management to dynamic load characteristics
      • CPU, RAM, I/O, etc.
    • Linear scalability
  • Users’ View of the Cluster
    • Users can start from any node in the cluster, or the sysadmin sets up a few nodes as login nodes
    • Round-robin DNS: “hpc.clusters” with many IPs assigned to same name
    • Each process has a Home-Node
      • Migrated processes always appear to run at the home node, e.g., “ps” shows all your processes, even if they run elsewhere
  • MOSIX architecture
    • network transparency
    • preemptive process migration
    • dynamic load balancing
    • memory sharing
    • efficient kernel communication
    • probabilistic information dissemination algorithms
    • decentralized control and autonomy
  • A two tier technology
    • Information gathering and dissemination
      • Support scalable configurations by probabilistic dissemination algorithms
      • Same overhead for 16 nodes or 2056 nodes
    • Pre-emptive process migration that can migrate any process, anywhere, anytime - transparently
      • Supervised by adaptive algorithms that respond to global resource availability
      • Transparent to applications, no change to user interface
  • Tier 1: Information gathering and dissemination
    • In each unit of time (e.g., 1 second) each node gathers information about:
      • CPU(s) speed, load and utilization
      • Free memory
      • Free proc-table/file-table slots
    • Info sent to a randomly selected node
    • Scalable - more nodes better scattering
  • Tier 2: Process migration
    • Load balancing: reduce variance between pairs of nodes to improve the overall performance
    • Memory ushering: migrate processes from a node that nearly exhausted its free memory, to prevent paging
    • Parallel File I/O: bring the process to the file-server, direct file I/O from migrated processes
  • Network transparency
    • The user and applications are provided a virtual machine that looks like a single machine.
    • Example: Disk access from diskless nodes on fileserver is completely transparent to programs
  • Preemptive process migration
    • Any user’s process, transparently and at any time, can migrate to any other node.
    • The migrating process is divided into:
      • a system context (deputy) that may not be migrated from the home workstation (UHN);
      • a user context (remote) that can be migrated to a diskless node
  • Splitting the Linux process
    • System context (environment) - site dependent- “home” confined
    • Connected by an exclusive link for both synchronous (system calls) and asynchronous (signals, MOSIX events)
    • Process context (code, stack, data) - site independent - may migrate
    [Diagram: the deputy remains in the kernel and userland of the local (home) node, while the remote part runs on a diskless node; the two halves communicate over the openMOSIX link]
  • Dynamic load balancing
    • Initiates process migrations in order to balance the load of farm
    • responds to variations in the load of the nodes, runtime characteristics of the processes, number of nodes and their speeds
    • makes continuous attempts to reduce the load differences among nodes
    • the policy is symmetrical and decentralized
      • all of the nodes execute the same algorithm
      • the reduction of the load differences is performed independently by any pair of nodes
  • The ACE ORB
    • What Is CORBA?
    • CORBA Basics
      • Clients, Servers, and Servants
      • ORBs and POAs
      • IDL and the Role of IDL Compilers
      • IORs
      • Tying it all together
    • Overview of ACE/TAO
    • CORBA Services
      • Naming Service
      • Trading Service
      • Event Service
    • Multi-Threaded Issues Using CORBA
  • What Is CORBA?
    • Common Object Request Broker Architecture
      • Common Architecture
      • Object Request Broker – ORB
    • Specification from the OMG
      • http://www.omg.org/technology/documents/corba_spec_catalog.htm
      • Must be implemented before usable
  • What Is CORBA?
    • More specifically:
      • “(CORBA) is a standard defined by the Object Management Group (OMG) that enables software components written in multiple computer languages and running on multiple computers to work together” (1)
      • Allows for Object Interoperability, regardless of:
        • Operating Systems
        • Programming Language
        • Takes care of Marshalling and Unmarshalling of Data
      • A method to perform Distributed Computing
  • What Is CORBA? Program A
    • Running on a Windows PC
    • Written in Java
    Program B
    • Running on a Linux Machine
    • Written in C++
    CORBA
  • CORBA Basics: Clients, Servers, and Servants
    • CORBA Clients
      • An Application (program)
      • Request services from Servant object
        • Invoke a method call
      • Can exist on a different computer from Servant
        • Can also exist on same computer, or even within the same program, as the Servant
      • Implemented by Software Developer
  • CORBA Basics: Clients, Servers, and Servants
    • CORBA Servers
      • An Application (program)
      • Performs setup needed to get Servants configured properly
        • ORB’s, POA’s
      • Instantiates and starts Servants object(s)
      • Once configuration done and Servant(s) running, Clients can begin to send messages
      • Implemented by Software Developer
  • CORBA Basics: Clients, Servers, and Servants
    • Servants
      • Objects
      • Implement interfaces
      • Respond to Client requests
      • Exists within the same program as the Server that created and started it
      • Implemented by Software Developer
  • ORB’s and POA’s
    • ORB: Object Request Broker
      • The “ORB” in “CORBA”
        • At the heart of CORBA
      • Enables communication
      • Implemented by ORB Vendor
        • An organization that implements the CORBA Specification (a company, a University, etc.)
      • Can be viewed as an API/Framework
        • Set of classes and methods
      • Used by Clients and Servers to properly setup communication
        • Client and Server ORB’s communicate over a network
        • Glue between Client and Server applications
  • ORB’s and POA’s
    • POA: Portable Object Adapter
      • A central CORBA goal: Programs using different ORB’s (provided by different ORB Vendors) can still communicate
      • The POA was adopted as the solution
      • Can be viewed as an API/Framework
        • Set of classes and methods
      • Sits between ORB’s and Servants
        • Glue between Servants and ORBs
      • Job is to:
        • Receive messages from ORB’s
        • Activate the appropriate Servant
        • Deliver the message to the Servant
  • CORBA Basics: IDL
    • IDL: The Interface Definition Language
      • Keyword: Definition
        • No “executable” code (cannot implement anything)
        • Very similar to C++ Header Files
        • Language independent from Target Language
          • Allows Client and Server applications to be written in different (several) languages
      • A “contract” between Clients and Servers
        • Both MUST have the exact same IDL
        • Specifies messages and data that can be sent by Clients and received by Servants
      • Written by Software Developer
  • CORBA Basics: IDL
    • Used to define interfaces (i.e. Servants)
      • Classes and methods that provide services
    • IDL Provides…
      • Primitive Data Types (int, float, boolean, char, string)
      • Ability to compose primitives into more complex data structures
      • Enumerations, Unions, Arrays, etc.
      • Object-Oriented Inheritance
  • CORBA Basics: IDL
    • IDL Compilers
      • Converts IDL files to target language files
      • Done via Language Mappings
        • Useful to understand your Language Mapping scheme
      • Target language files contain all the implementation code that facilitates CORBA-based communication
        • More or less “hides” the details from you
      • Creates client “stubs” and Server “skeletons”
      • Provided by ORB Vendor
  • CORBA Basics: IDL
    [Diagram: the IDL compiler takes an IDL file and generates client stub files and server skeleton files in the target language (C++, Java, etc.)]
    • Client programs use the classes in the client stub files to send messages to the Servant objects (association)
    • Servant objects inherit from classes in the server skeleton files to receive messages from the client programs (inheritance)
  • CORBA Basics: IDL
    • Can also generate empty Servant class files
    [Example: the IDL compiler converting an interface to C++ in this case]
  • CORBA Basics: IOR’s
    • IOR: Interoperable Object Reference
      • Can be thought of as a “Distributed Pointer”
      • Unique to each Servant
      • Used by ORB’s and POA’s to locate Servants
        • For Clients, used to find Servants across networks
        • For Servers, used to find proper Servant running within the application
      • Opaque to Client and Server applications
        • Only meaningful to ORB’s and POA’s
        • Contains information about IP Address, Port Numbers, networking protocols used, etc.
      • The difficult part is obtaining them
        • This is the purpose/reasoning behind developing and using CORBA Services
  • CORBA Basics: IOR’s
    • Can be viewed in “stringified” format, but…
      • Still not very meaningful
  • CORBA Basics: IOR’s
    • Standardized, to some degree:
      • The part standardized by the OMG:
        • Used by client-side ORB’s to locate server-side (destination) ORB’s
        • Contains the information needed to make the physical connection
      • The part NOT standardized by the OMG (proprietary to ORB Vendors):
        • Used by server-side ORB’s and POA’s to locate destination Servants
  • CORBA Basics: Tying it All Together
    [Diagram: logically, the Client program uses the IOR (Servant reference) to send Message(Data) straight to the Servant in the Server program; actually, the message passes through the client-side ORB, over the network to the server-side ORB and POA, which deliver it to the Servant]
    • Once ORB’s and POA’s are set up and configured properly, this transparency is possible
    • ORB’s communicate over network
    • POA’s activate servants and deliver messages
  • Overview of ACE/TAO
    • ACE: Adaptive Communications Environment
      • Object-Oriented Framework/API
      • Implements many concurrent programming design patterns
      • Can be used to build more complex communications-based packages
        • For example, an ORB
  • Overview of ACE/TAO
    • TAO: The ACE ORB
      • Built on top of ACE
      • A CORBA implementation
      • Includes many (if not all) CORBA features specified by the OMG
        • Not just an ORB
        • Provides POA’s, CORBA Services, etc.
      • Object-Oriented Framework/API
  • CORBA Services: The Naming Service
    • The CORBA Naming Service is similar to the White Pages (phone book)
    • Servants place their “names,” along with their IOR’s, into the Naming Service
      • The Naming Service stores these as (name, IOR) pairs
    • Later, Clients obtain IOR’s from the Naming Service by passing the name of the Servant object to it
      • The Naming Service returns the IOR
    • Clients may then use the IOR to make requests
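    A hedged sketch of the client side of this lookup, using the classic CORBA C++ mapping as implemented by ORBs such as TAO; the Hello interface, its say_hello() operation, the registered name, and the generated header names are hypothetical, and exception handling is omitted.

      #include "orbsvcs/CosNamingC.h"   // Naming Service stubs (TAO header path; may differ per ORB)
      #include "HelloC.h"               // hypothetical stub generated from a Hello IDL interface

      int main(int argc, char* argv[]) {
          CORBA::ORB_var orb = CORBA::ORB_init(argc, argv);

          // Ask the ORB for the Naming Service, then look the Servant up by name.
          CORBA::Object_var ns_obj = orb->resolve_initial_references("NameService");
          CosNaming::NamingContext_var naming =
              CosNaming::NamingContext::_narrow(ns_obj.in());

          CosNaming::Name name;
          name.length(1);
          name[0].id = CORBA::string_dup("HelloService");   // the name the Servant registered

          // The Naming Service returns the IOR; _narrow turns it into a typed reference.
          CORBA::Object_var obj = naming->resolve(name);
          Hello_var hello = Hello::_narrow(obj.in());       // hypothetical interface

          hello->say_hello();                               // hypothetical operation
          orb->destroy();
          return 0;
      }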
  • CORBA Services: The Trading Service
    • The CORBA Trading Service is similar to the Yellow Pages (phone book)
    • Servants place a description of the services they can provide (i.e. their “Trades”), along with their IOR’s, into the Trading Services
      • The Trading Service stores these
    • Clients obtain IOR’s from the Trading Service by passing the type(s) of Services they require
      • The Trading Service returns an IOR
    • Clients may then use the IOR to make requests
  • Multi-Threaded Issues Using CORBA
    • Server performance can be improved by using multiple threads
      • GUI Thread
      • Listening Thread
      • Processing Thread
    • Can also use multiple ORBs and POAs to improve performance
      • Requires a multi-threaded solution
  • What is Grid Computing?
    • Computational Grids
      • Homogeneous (e.g., Clusters)
      • Heterogeneous (e.g., with one-of-a-kind instruments)
    • Cousins of Grid Computing
    • Methods of Grid Computing
  • Computational Grids
    • A network of geographically distributed resources including computers, peripherals, switches, instruments, and data.
    • Each user should have a single login account to access all resources.
    • Resources may be owned by diverse organizations.
  • Computational Grids
    • Grids are typically managed by gridware.
    • Gridware can be viewed as a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance, availability…)
  • Cousins of Grid Computing
    • Parallel Computing
    • Distributed Computing
    • Peer-to-Peer Computing
    • Many others: Cluster Computing, Network Computing, Client/Server Computing, Internet Computing, etc...
  • Distributed Computing
    • People often ask: Is Grid Computing a fancy new name for the concept of distributed computing?
    • In general, the answer is “no.” Distributed Computing is most often concerned with distributing the load of a program across two or more processes.
  • Peer-to-Peer Computing
    • Sharing of computer resources and services by direct exchange between systems.
    • Computers can act as clients or servers depending on what role is most efficient for the network.
  • Methods of Grid Computing
    • Distributed Supercomputing
    • High-Throughput Computing
    • On-Demand Computing
    • Data-Intensive Computing
    • Collaborative Computing
    • Logistical Networking
  • Distributed Supercomputing
    • Combining multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer.
    • Tackle problems that cannot be solved on a single system.
  • High-Throughput Computing
    • Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.
  • On-Demand Computing
    • Uses grid capabilities to meet short-term requirements for resources that are not locally accessible.
    • Models real-time computing demands.
  • Data-Intensive Computing
    • The focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases.
    • Particularly useful for distributed data mining.
  • Collaborative Computing
    • Concerned primarily with enabling and enhancing human-to-human interactions.
    • Applications are often structured in terms of a virtual shared space.
  • Logistical Networking
    • Global scheduling and optimization of data movement.
    • Contrasts with traditional networking, which does not explicitly model storage resources in the network.
    • Called &quot;logistical&quot; because of the analogy it bears with the systems of warehouses, depots, and distribution channels.
  • Globus
    • A collaboration of Argonne National Laboratory’s Mathematics and Computer Science Division, the University of Southern California’s Information Sciences Institute, and the University of Chicago's Distributed Systems Laboratory.
    • Started in 1996 and is gaining popularity year after year.
  • Globus
    • A project to develop the underlying technologies needed for the construction of computational grids.
    • Focuses on execution environments for integrating widely-distributed computational platforms, data resources, displays, special instruments and so forth.
  • The Globus Toolkit
    • The Globus Resource Allocation Manager (GRAM)
      • Creates, monitors, and manages services.
      • Maps requests to local schedulers and computers.
    • The Grid Security Infrastructure (GSI)
      • Provides authentication services.
  • The Globus Toolkit
    • The Monitoring and Discovery Service (MDS)
      • Provides information about system status, including server configurations, network status, and locations of replicated datasets, etc.
    • Nexus and globus_io
      • provides communication services for heterogeneous environments.
  • What are Clouds?
    • Clouds are “Virtual Clusters” (“Virtual Grids”) of possibly “Virtual Machines”
      • They may cross administrative domains or may “just be a single cluster”; the user cannot and does not want to know
    • Clouds support access (lease of) computer instances
      • Instances accept data and job descriptions (code) and return results that are data and status flags
    • Each Cloud is a “Narrow” (perhaps internally proprietary) Grid
    • Clouds can be built from Grids
    • Grids can be built from Clouds
  • Virtualization and Cloud Computing
    • The Virtues of Virtualization
      • Portable environments, enforcement and isolation, fast to deploy, suspend/resume, migration…
    • Cloud computing
      • SaaS: software as a service
      • Service: provide me with a workspace
      • Virtualization makes it easy to provide a workspace/VM
    • Cloud computing
      • resource leasing, utility computing, elastic computing
      • Amazon’s Elastic Compute Cloud (EC2)
    • Is this real? Or is this just a proof-of-concept?
      • Successfully used commercially on a large scale
      • More experience for scientific applications
    Virtual Workspaces: http://workspace.globus.org
  • Two major types of cloud
    • Compute and Data Cloud
      • EC2, Google MapReduce, science clouds
      • Provision platform for running science codes
      • Open source infrastructure: workspace, eucalyptus, hub0
      • Virtualization: providing environments as VMs
    • Hosting Cloud
      • Google App Engine
      • High availability, fault tolerance, robustness, etc., for Web capabilities
      • Community example: IU hosting environment (quarry)
    Virtual Workspaces: http://workspace.globus.org
  • Technical Questions on Clouds
    • How is data-compute affinity tackled in clouds?
      • Co-locate data and compute clouds?
      • Lots of optical fiber i.e. “just” move the data?
    • What happens in clouds when demand for resources exceeds capacity – is there a multi-day job input queue?
      • Are there novel cloud scheduling issues?
    • Do we want to link clouds (or ensembles as atomic clouds)? If so, how and with what protocols?
    • Is there an intranet cloud, e.g., “cloud in a box” software to manage a personal (cores on my future 128-core laptop), department, or enterprise cloud?
  • Thanks Much..
    • 99% of the slides are taken from the Internet from various Authors. Thanks to all of them!
    • Sudarsun Santhiappan
    • Director – R & D
    • Burning Glass Technologies
    • Kilpauk, Chennai 600010