Lecture slides / notes.

  • The Grid Computing discipline involves the actual networking services and connections of a potentially unlimited number of ubiquitous computing devices within a "grid." This approach to computing can be most simply thought of as a massively large power "utility" grid, such as the one that provides power to our homes and businesses every day. This delivery of utility-based power has become second nature to many of us, worldwide. We know that by simply walking into a room and turning on the lights, the power will be directed to the proper devices of our choice for that moment in time. In this same utility fashion, Grid Computing openly seeks and is capable of adding an infinite number of computing devices into any grid environment, adding to the computing capability and problem resolution tasks within the operational grid environment.
    Utility-connected systems, also called "grid-connected" or "grid-tied" systems, are for homes or commercial buildings that are connected to an electric utility. They are designed to provide anywhere from a modest part to all of the building's total electricity needs.
    Utility computing: Utility computing is a service provisioning model in which a service provider makes computing resources and infrastructure management available to the customer as needed, and charges them for specific usage rather than a flat rate. Like other types of on-demand computing (such as grid computing), the utility model seeks to maximize the efficient use of resources and/or minimize associated costs. The word utility is used to make an analogy to other services, such as electrical power, that seek to meet fluctuating customer needs, and charge for the resources based on usage rather than on a flat-rate basis.
    This approach, sometimes known as pay-per-use or metered services, is becoming increasingly common in enterprise computing and is sometimes used for the consumer market as well, for Internet service, Web site access, file sharing, and other applications.
    Another version of utility computing is carried out within an enterprise. In a shared pool utility model, an enterprise centralizes its computing resources to serve a larger number of users without unnecessary redundancy.
    In general, provisioning means "providing". In telecommunications terminology, provisioning means providing a product or service, such as wiring or bandwidth.
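The difference between metered and flat-rate charging described above can be sketched in a few lines. This is only an illustration: the function names, the rate, and the flat fee are hypothetical and do not come from any real provider.

```python
def metered_charge(cpu_hours_used: float, rate_per_cpu_hour: float) -> float:
    """Pay-per-use: the customer is billed only for what was consumed."""
    return cpu_hours_used * rate_per_cpu_hour


def cheaper_plan(cpu_hours_used: float, rate_per_cpu_hour: float, flat_fee: float) -> str:
    """Compare a metered bill with a flat fee for the same billing period."""
    metered = metered_charge(cpu_hours_used, rate_per_cpu_hour)
    return "metered" if metered < flat_fee else "flat"


# Hypothetical numbers: 0.5 currency units per CPU hour versus a flat fee of 80.
light_month = cheaper_plan(100, 0.5, 80)   # light usage
heavy_month = cheaper_plan(200, 0.5, 80)   # heavy usage
```

With fluctuating demand, the metered plan tends to favour customers whose usage is low in most periods, which is the efficiency argument the utility model relies on.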

Presentation Transcript

  • Lectures on Grid Computing. Tuğba Taşkaya-Temizel, Prof. K. Ahmad. January 2005
  • Grid Computing Everywhere
    • Business: Sectors like financial services, industrial manufacturing, energy…
    • Humanitarian works
    • Research: Health, Aerospace, Astronomy, Finance…
    • Government
  • Grid Computing
    • The internet took 20 years to be taken seriously by business. By comparison the grid is happening far more rapidly. Tom Hawk, IBM
    • Insight Research says the worldwide market for grid technology and services is doubling every year and will reach $5 billion by 2008.
    • Grid computing is just one of the technologies the UK government says, in its latest report, should receive more support and funding. (December 17, 2003)
  • Grid Computing
    • "We really do believe that grid computing is real," CEO of Hewlett-Packard Carly Fiorina said. "It is driving the R&D in our industry. For the first time our energy is focused on something else than building a killer app or a hot box. We are more focused on making system that combines the best of IT and business. Imagine what is possible." (September 11, 2003)
    • "The Grid will be the major new direction for IT," said Geoff Brown, technical director for ATS Core Technologies at Oracle . (October 28, 2002)
    • A network of high-voltage transmission lines and connections that supply electricity from a number of generating stations to various distribution centres in a country or a region, so that no consumer is dependent on a single station.
    • (Term) used of any network that serves a similar purpose for other services.
  • DEFINITIONS: Grid? GRID: The Grid is envisaged to be ‘the computing and data management infrastructure that will provide the electronic underpinning for a global society in business, government, science and entertainment’ Berman, Fox and Hey (2003:9)
  • DEFINITIONS: Grid? GRID: A virtual information processing environment where the user has the ‘illusion’ of a seamless single-source computing power which is actually distributed.
  • Why should you care?
    • Ian Foster explains why we should care about Grids in three points:
    Vision, Reality, Future
  • Why should you care?
    • Grid is a disruptive technology [ Vision ]
      • It ushers in a virtualized, collaborative, distributed world.
    • Two interrelated opportunities
      • 1) Enhance economy, flexibility, access by virtualizing computing resources
      • 2) Deliver entirely new capabilities by integrating distributed resources
    Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
  • Why should you care? Virtualization
    Application Virtualization
    • Automatically connect applications to services
    • Dynamic & intelligent provisioning
    Infrastructure Virtualization
    • Dynamic & intelligent provisioning
    • Automatic failover
    Source: The Grid: Blueprint for a New Computing Infrastructure (2nd Edition), 2004
  • Why should you care? Distributed System Integration: UK e-Science Centres. Source: http://www.nesc.ac.uk/centres/
  • Why should you care? The real and specific problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.
    Source: “The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001
  • Why should you care? Terminology
    • Grid has strong links with “Utility Computing”, “Autonomic Computing” and “Service Oriented Architecture”.
  • Why should you care?
    • Grid addresses pain points now [ Reality ]
      • Grids are built not bought, but are delivering real benefits in commercial settings
      • Low utilization of enterprise resources
      • High cost of provisioning for peak demand
      • Inadequate resources prevent use of advanced applications
      • Lack of information integration
    Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
  • Why should you care? Early Commercial Applications
    • Leading adopters (Oct 2003) *
      • Financial services: 31%
      • Life sciences: 26%
      • Manufacturing: 18%
    * Grids 2004: From Rocket Science To Business Service, The 451 Group
    • “Gridified” Infrastructure
      • Financial Services: Derivatives Analysis, Statistical Analysis, Portfolio Risk Analysis
      • Manufacturing: Mechanical/Electronic Design, Process Simulation, Finite Element Analysis, Failure Analysis
      • LS / Bioinformatics: Cancer Research, Drug Discovery, Protein Folding, Protein Sequencing
      • Other: Web Applications, Weather Analysis, Code Breaking/Simulation, Academic
      • Energy: Seismic Analysis, Reservoir Analysis
      • Entertainment: Digital Rendering, Massive Multi-Player Games, Streaming Media
    Grid Services Market Opportunity 2005. Sources: IDC, 2000 and Bear Stearns - Internet 3.0 - 5/01. Analysis by SAI
    Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
  • Why should you care? Grid Deployment Strategies
    • A range of excellent commercial & open source products for resource federation
      • Federate enterprise computing resources
      • Federate enterprise information resources
      • Globus Toolkit®: inter-enterprise sharing
    • But, “Grids are built, not bought”
      • Integration with other enterprise systems is needed to deliver complete solution
    • Start small & with well-defined ROI case
      • Grow based on experience
    Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
  • Data Grids for High Energy Physics. Fastest particle accelerator: Large Hadron Collider
    • When completed in 2005, CERN's Large Hadron Collider will send protons and ions from hydrogen nuclei rushing through a 17-mile circular tunnel at speeds of up to 52,200,000 miles per hour.
    Image courtesy Christian Richters. Source: Wired News
  • Data Grids for High Energy Physics (image courtesy Harvey Newman, Caltech)
    • There is a “bunch crossing” every 25 nsecs. There are 100 “triggers” per second. Each triggered event is ~1 MByte in size.
    • Online System and Offline Processor Farm (~20 TIPS)
    • Tier 0: CERN Computer Centre (HPSS)
    • Tier 1: FermiLab (~4 TIPS) and the France, Italy and Germany Regional Centres (HPSS)
    • Tier 2: Tier2 Centres (~1 TIPS each), including Caltech (~1 TIPS)
    • Institutes (~0.25 TIPS), each with a physics data cache
    • Tier 4: Physicist workstations (Pentium II 300 MHz)
    • Link rates labelled in the figure: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec (or Air Freight, deprecated), ~1 MBytes/sec
    • Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
    • 1 TIPS is approximately 25,000 SpecInt95 equivalents
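The figures quoted on this slide can be checked with a little arithmetic. The sketch below uses only the numbers stated on the slide (25 ns between bunch crossings, 100 triggers per second, ~1 MByte per triggered event); the variable names and the decimal convention 1 TByte = 1,000,000 MBytes are assumptions made for the illustration.

```python
BUNCH_CROSSING_INTERVAL_NS = 25   # from the slide
TRIGGERS_PER_SECOND = 100         # from the slide
EVENT_SIZE_MBYTES = 1             # "~1 MByte" on the slide, treated as exact here

NS_PER_SECOND = 1_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60

# How many bunch crossings happen each second.
crossings_per_second = NS_PER_SECOND // BUNCH_CROSSING_INTERVAL_NS

# Data rate of the events that survive the trigger.
recorded_mbytes_per_second = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# Fraction of crossings that are kept.
kept_fraction = TRIGGERS_PER_SECOND / crossings_per_second

# Volume written per day at that rate, assuming 1 TByte = 1,000,000 MBytes.
recorded_tbytes_per_day = recorded_mbytes_per_second * SECONDS_PER_DAY / 1_000_000

print(crossings_per_second, recorded_mbytes_per_second, kept_fraction, recorded_tbytes_per_day)
```

The recorded rate of 100 MBytes per second agrees with the "~100 MBytes/sec" label in the figure, and the tiny kept fraction shows why the trigger is needed before any data reaches the Grid tiers.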
  • Mathematicians Solve NUG30: Quadratic Assignment Problem
    MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin. Source: Shawn McKee, The Grid: The Future of High Energy Physics Computing?, January 7, 2002
    Figure: four locations (Location 1, Location 2, Location 3, Location 4)
    • The distances are:
      • d(1,2) = 22,
      • d(1,3) = 53,
      • d(2,3) = 40,
      • d(3,4) = 55.
    • The required flows between facilities are:
      • f(2,4) = 1,
      • f(1,4) = 2,
      • f(1,2) = 3,
      • f(3,4) = 4.
    The permutation p corresponding to this graphical solution is ( 2, 1, 4, 3 ).
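The cost of the assignment on this slide can be computed directly from the listed distances and flows. The sketch below is a minimal illustration, assuming that the permutation lists which facility sits at each location (entry i is the facility at location i) and that distances and flows are symmetric. For this particular permutation the opposite reading gives the same placement. The function name `qap_cost` is hypothetical.

```python
# Distances between locations and required flows between facilities, as listed on the slide.
DISTANCE = {(1, 2): 22, (1, 3): 53, (2, 3): 40, (3, 4): 55}
FLOW = {(2, 4): 1, (1, 4): 2, (1, 2): 3, (3, 4): 4}


def qap_cost(permutation):
    """Sum of flow * distance over all facility pairs with a listed flow.

    permutation[i] is assumed to be the facility placed at location i + 1.
    """
    location_of = {facility: index + 1 for index, facility in enumerate(permutation)}
    total = 0
    for (facility_a, facility_b), flow in FLOW.items():
        pair = tuple(sorted((location_of[facility_a], location_of[facility_b])))
        if pair not in DISTANCE:
            # The slide does not list every distance, so refuse to guess.
            raise ValueError(f"distance between locations {pair} is not given")
        total += flow * DISTANCE[pair]
    return total


cost = qap_cost((2, 1, 4, 3))
print(cost)  # 3*22 + 1*53 + 2*40 + 4*55 = 419
```

NUG30 is the same kind of problem with 30 facilities and 30 locations, where the number of possible permutations makes exhaustive evaluation infeasible.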
  • Mathematicians Solve NUG30
    • Looking for the solution to the NUG30 quadratic assignment problem
    • An informal collaboration of mathematicians and computer scientists
    • Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
      • NUG30 Solution: 14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25, 22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23
    MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin. Source: Shawn McKee, The Grid: The Future of High Energy Physics Computing?, January 7, 2002
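The Condor-G figures can be put in perspective with a short calculation. The sketch below assumes that "7 days" means seven full days of wall-clock time; the variable names are illustrative only.

```python
CPU_SECONDS_DELIVERED = 3.46e8   # from the slide
WALL_CLOCK_DAYS = 7              # from the slide, assumed to be full days
PEAK_PROCESSORS = 1009           # from the slide

SECONDS_PER_DAY = 24 * 60 * 60

wall_clock_seconds = WALL_CLOCK_DAYS * SECONDS_PER_DAY

# Average number of processors busy at any moment during the run.
average_processors = CPU_SECONDS_DELIVERED / wall_clock_seconds

# How the average compares with the reported peak.
average_to_peak_ratio = average_processors / PEAK_PROCESSORS

# The same work expressed as years on a single processor (365-day years).
single_cpu_years = CPU_SECONDS_DELIVERED / (365 * SECONDS_PER_DAY)

print(round(average_processors), round(average_to_peak_ratio, 2), round(single_cpu_years, 1))
```

Under these assumptions the run averaged a little under 600 processors, so roughly a decade of single-processor work was completed in a week.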
  • Network for Earthquake Engineering Simulation
    • NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
    • On-demand access to experiments, data streams, computing, archives, collaboration
    NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
  • The 13.6 TF TeraGrid: Computing at 40 Gb/s
    • Sites: NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, Argonne
    • Each site shows site resources, archival storage (HPSS or UniTree) and external networks
    TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne. www.teragrid.org
  • iVDGL: International Virtual Data Grid Laboratory. U.S. PIs: Avery, Foster, Gardner, Newman, Szalay. www.ivdgl.org. Image courtesy of http://www.sdss.org/news/releases/20050111.yardstick.html
    • Sloan Digital Sky Survey is the most ambitious astronomical survey project ever undertaken.
    • The survey will map in detail one-quarter of the entire sky, determining the positions and absolute brightnesses of more than 100 million celestial objects.
    • It will also measure the distances to more than a million galaxies and quasars
  • iVDGL: International Virtual Data Grid Laboratory. U.S. PIs: Avery, Foster, Gardner, Newman, Szalay. www.ivdgl.org
    Map legend: Tier0/1 facility, Tier2 facility, Tier3 facility; 10 Gbps link, 2.5 Gbps link, 622 Mbps link, other link
  • Why should you care?
    • An open Grid is to your advantage [ Future ]
      • Standards are being defined now that will determine the future of this technology
    Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
  • Grid Vision, Marketing, and Reality
    • Vision
      • Computing & data resources can be shared like content on the Web
    • Marketing
      • Have we got a [Data, compute, knowledge, information, desktop, PC, enterprise, cluster, …] Grid for you!
    • Reality
      • Commercial products mostly noninteroperable
      • Open source tools offer de facto standards, but are also far from a complete solution
    Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
  • Standards Matter!
    • Open, standard protocols
      • Enable interoperability
      • Avoid product/vendor lock-in
      • Enable innovation/competition on end points
      • Enable ubiquity
    • In Grid space, must address how we
      • Describe, discover, & access resources
      • Monitor, manage, & coordinate resources
      • Account & charge for resources
      • For many different types of resource
    Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
  • Open Grid Services Architecture
    • Define a service-oriented architecture …
      • the key to effective virtualization
    • … that addresses vital “Grid” requirements
      • AKA utility, on-demand, system management, collaborative computing
      • in particular, distributed service management
    • … building on Web services standards
      • extending those standards where needed
    Source: “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002
  • Latest Step Forward: WS-Resource Framework
    • A family of six Web services specifications
      • A design pattern to specify how to use Web services to access “stateful” components
      • Message-based publish-subscribe to Web services
    WS-Resource Framework (figure labels): Groups, References, Notification, Faults, Properties, Lifetime
    Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
  • WS-Resource Framework Completes Grid-WS Convergence
    • The definition of WSRF means that Grid and Web communities can move forward on a common base
    • Grid and Web started far apart in apps & tech and have been converging
    Figure labels: Grid, Web, GT1, GT2, OGSI, HTTP, WSDL, WS-*, WSDL 2, WSDM, WSRF
    Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
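The idea of a stateless service giving access to "stateful" components can be illustrated with a small in-memory sketch. This is not the WSRF API and involves no SOAP or XML. The class and method names are hypothetical, and only three of the figure's labels (Properties, Lifetime, Notification) are modelled. The current time is passed in explicitly so that the example is deterministic.

```python
import itertools


class ResourceExpired(KeyError):
    """Raised when a resource key is unknown or its lifetime has ended."""


class StatefulResourceService:
    """The service object holds no per-client state; all state lives in
    resources that callers address by key on every request."""

    def __init__(self):
        self._resources = {}
        self._ids = itertools.count(1)

    def create(self, now, lifetime_seconds):
        key = f"resource-{next(self._ids)}"
        self._resources[key] = {
            "properties": {},                        # Properties
            "expires_at": now + lifetime_seconds,    # Lifetime
            "subscribers": [],                       # Notification
        }
        return key

    def _lookup(self, key, now):
        resource = self._resources.get(key)
        if resource is None or now >= resource["expires_at"]:
            self._resources.pop(key, None)  # destroy expired resources lazily
            raise ResourceExpired(key)
        return resource

    def subscribe(self, key, callback, now):
        self._lookup(key, now)["subscribers"].append(callback)

    def set_property(self, key, name, value, now):
        resource = self._lookup(key, now)
        resource["properties"][name] = value
        for callback in resource["subscribers"]:
            callback(key, name, value)  # publish the change to subscribers

    def get_property(self, key, name, now):
        return self._lookup(key, now)["properties"][name]


service = StatefulResourceService()
job = service.create(now=0, lifetime_seconds=60)
events = []
service.subscribe(job, lambda key, name, value: events.append((key, name, value)), now=1)
service.set_property(job, "jobs_running", 3, now=2)
```

The point of the pattern is that the key, rather than the service instance, identifies the state, so any replica of the service could answer a request given access to the same resource store.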
  • The Evolution of the GRID
    • 1980’s: Parallel computing clusters - improved performance from tightly coupled clusters and data sharing
    • 1990’s: Grid 1: Extend the advances in parallel computing to geographically distributed systems
    • 2000: Grid II: Grid is a platform for integrating loosely coupled applications: some components running in parallel and some for linking disparate resources largely developed in the serial von Neumann paradigm - storage, visualisation, a-d/d-a converters and sensors
  • The Evolution of the GRID
    • Currently there are (clusters) of very powerful computing/ communications systems
    • (i) Systems for acquiring digital data and processing data ( Amazon.com or Oracle clusters )
    • (ii) Systems for analysing and visualising information ( CERN’s large hadron collider, Protein Synthesis systems )
    • (iii) Systems for imaging, analysis and visualisation for distributed data ( weather prediction, satellite based military civilian systems )
    • (iv) Systems that can link Sensors and predict on real-time information ( military systems, video surveillance )
  • The Evolution of the GRID
    • Developments in networking technologies, operating systems, clustered databases, application services and device technologies have enabled developers to build distributed systems with literally millions of nodes for providing:
    • Web-based services for personal and commercial transactions
    • Content delivery networks that can cache web-pages seamlessly
    • Wireless networks have spawned ad-hoc distributed systems that when linked to wide-area networks lead to a complex distributed system.
    • Problems of efficiency, reliability, accessibility and security are not addressed in ‘global’ terms.
  • The Evolution of the GRID
    Timeline figure, 1960 to 2000, with a COMPUTING track and a Communication track. Labels: Sputnik, ARPANET, Email, Ethernet, TCP/IP, IETF, Internet Era, WWW Era, Mosaic, HTML, W3C, XML, Web Services, XEROX PARC worm, Mainframes, Minicomputers, Crays, MPPs, PCs, Workstations, PC Clusters, WS Clusters, PDAs, HTC, P2P, Grids
    Source: www.gridbus.org
  • The Evolution of the GRID
    Systems by increasing PERFORMANCE + QoS: Personal Device, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Grid
    Administrative Barriers:
    • Individual
    • Group
    • Department
    • Campus
    • State
    • National
    • Globe
    • Inter Planet
    • Universe
    Source: www.gridbus.org
  • The Evolution of the GRID
    • Grid is being developed not only to make distributed resources available to the end-user but also to co-ordinate such usage for sharing and aggregation of resources.
  • The Evolution of the GRID
    • Moore’s law improvements in computing produce highly functional end-systems
    • The internet and burgeoning wired and wireless networks provide wide-spread connectivity
    • Changing modes of working and problem solving emphasise teamwork and computation
    • Network growth produces dramatic changes in topology and geography
  • The Evolution of the GRID
    • The first generation involved proprietary solutions for sharing high-performance computing resources
    • The second generation introduced middleware to cope with scale and heterogeneity
    • The third generation introduced a service-oriented approach leading to commercial projects in addition to the scientific projects now collectively known as e-Science
  • The Evolution of the GRID
    • The first generation
      • FAFNER, I-WAY
    • The second generation
      • Technologies: Globus, Legion
      • Distributed object systems (Jini and RMI, the Common Component Architecture Forum)
      • Grid resource brokers and schedulers
      • Grid portals
      • Integrated systems
      • Peer-to-Peer computing
    • The third generation
      • Service-oriented architecture (web services, OGSA, Agents)
      • Information aspects: relation with the World Wide Web
      • Live information systems
  • The Evolution of the GRID: Globus Toolkit® History
    • DARPA, NSF, and DOE begin funding Grid work
    • NASA begins funding Grid work, DOE adds support
    • The Grid: Blueprint for a New Computing Infrastructure published
    • GT 1.0.0 Released
    • Early Application Successes Reported
    • NSF & European Commission Initiate Many New Grid Projects
    • Anatomy of the Grid Paper Released
    • Significant Commercial Interest in Grids
    • Physiology of the Grid Paper Released
    • GT 2.0 Released
    Does not include downloads from: NMI, UK eScience, EU Datagrid, IBM, Platform, etc.
    Source: Ian Foster’s presentation on “The First 50 Years”, British Computer Society, Lovelace Medal Award Presentation, May, 2003
  • Building blocks of the Grid
    • Networks
    • Computational ‘nodes’ on the Grid
    • Pulling it all together
    • Common infrastructure: standards
  • GRID: Key Issues
    • Resources: Discovery, Allocation, Scheduling
    • Availability: Access, Security, Networks
    • Efficiency: Economy, Management, Administration
    • Hardware: Computers, Services, Networks
    • Application: Development, Testing
  • GRID: Key Issues: Sharing
    • A biochemist will be able to exploit 10,000 computers to screen 100,000 compounds in an hour
    • 1,000 physicists worldwide will be able to pool resources for petaop analyses of petabytes of data
    • A multidisciplinary analysis in aerospace that couples code and data across geographically distributed organisations may be possible
    • Civil engineers collaborate to design, execute, and analyse shake table experiments
    • Climate scientists will be able to visualise, annotate, and analyse terabyte simulation datasets
  • GRID: Key Issues: Sharing. Online Access to Scientific Instruments
    • Advanced Photon Source
    • real-time collection
    • tomographic reconstruction
    • archival storage
    • wide-area dissemination
    • desktop & VR clients with shared controls
    DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago
  • MORE DEFINITIONS
    • Resource
    • Network protocol
    • Network enabled service
    • Application Programming Interface (API)
    • Software Development Kit (SDK)
    • Syntax
  • MORE DEFINITIONS : Resource
    • An entity that is to be shared
      • E.g., computers, storage, data, software
    • Does not have to be physical entity
      • E.g., Condor pool, distributed file system,…
    • Defined in terms of interfaces, not devices
      • E.g. schedulers such as LSF and PBS define a compute resource
      • Open/close/read/write define access to a distributed file system, e.g. NFS, AFS, DFS
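The point that a resource is defined by its interface rather than by a device can be sketched as follows. The `ComputeResource` interface and both implementations are hypothetical stand-ins, not the interface of LSF, PBS or Condor. One implementation models a single machine and the other a pool, and the client code cannot tell them apart.

```python
from typing import Protocol


class ComputeResource(Protocol):
    """What a client needs from a compute resource: submit work, ask about it."""

    def submit(self, command: str) -> int: ...
    def status(self, job_id: int) -> str: ...


class SingleMachine:
    """A physical entity: one machine with one job queue."""

    def __init__(self) -> None:
        self._jobs: list[str] = []

    def submit(self, command: str) -> int:
        self._jobs.append(command)
        return len(self._jobs) - 1

    def status(self, job_id: int) -> str:
        return "queued" if 0 <= job_id < len(self._jobs) else "unknown"


class MachinePool:
    """Not a physical entity: a pool that spreads jobs over its members."""

    def __init__(self, machines: list[SingleMachine]) -> None:
        self._machines = machines
        self._placement: list[tuple[int, int]] = []  # pool job id -> (machine index, local id)

    def submit(self, command: str) -> int:
        index = len(self._placement) % len(self._machines)  # round-robin placement
        local_id = self._machines[index].submit(command)
        self._placement.append((index, local_id))
        return len(self._placement) - 1

    def status(self, job_id: int) -> str:
        if not 0 <= job_id < len(self._placement):
            return "unknown"
        index, local_id = self._placement[job_id]
        return self._machines[index].status(local_id)


def submit_and_check(resource: ComputeResource, command: str) -> str:
    """Client code written against the interface only."""
    return resource.status(resource.submit(command))


single_result = submit_and_check(SingleMachine(), "echo hello")
pool_result = submit_and_check(MachinePool([SingleMachine(), SingleMachine()]), "echo hello")
```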
  • MORE DEFINITIONS : Network protocol
    • A formal description of message formats and a set of rules for message exchange
      • Rules may define sequence of message exchanges
      • Protocol may define state-change in endpoint, e.g. file system state change
    • Good protocols designed to do one thing
      • Protocols can be layered
    • Examples of protocols
      • IP, TCP, TLS (was SSL), HTTP, Kerberos
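A toy example may make "message formats plus rules for exchange" and "protocols can be layered" more concrete. The two layers below are invented for illustration and are not any of the protocols listed above. The lower layer only frames bytes with a length prefix; the upper layer defines a request format and the rule that every request gets exactly one reply.

```python
import struct

# Lower layer: framing. Message format = 4-byte big-endian length, then the payload.


def frame(payload: bytes) -> bytes:
    return struct.pack("!I", len(payload)) + payload


def unframe(data: bytes) -> tuple[bytes, bytes]:
    """Return (payload, remaining bytes) for the first frame in data."""
    if len(data) < 4:
        raise ValueError("incomplete header")
    (length,) = struct.unpack("!I", data[:4])
    if len(data) < 4 + length:
        raise ValueError("incomplete frame")
    return data[4:4 + length], data[4 + length:]


# Upper layer: a request/reply protocol carried inside frames.
# Rule: a "GET <name>" request is answered by "OK <content>" or "ERR <reason>".


def encode_request(name: str) -> bytes:
    return frame(f"GET {name}".encode("ascii"))


def reply_to(request_bytes: bytes, files: dict[str, bytes]) -> bytes:
    payload, _rest = unframe(request_bytes)
    verb, _space, name = payload.decode("ascii").partition(" ")
    if verb != "GET":
        return frame(b"ERR bad-request")
    if name not in files:
        return frame(b"ERR not-found")
    return frame(b"OK " + files[name])


store = {"notes.txt": b"hello grid"}
reply, _ = unframe(reply_to(encode_request("notes.txt"), store))
missing, _ = unframe(reply_to(encode_request("absent.txt"), store))
```

Each layer does one thing: the framing layer knows nothing about GET, and the request layer never counts bytes.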
  • MORE DEFINITIONS : Network enabled services
    • Implementation of a protocol that defines a set of capabilities
      • Protocol defines interaction with service
      • All services require protocols
      • Not all protocols are used to provide services (e.g. IP, TLS)
    • Examples: FTP and Web servers
  • MORE DEFINITIONS : Application Programming Interface (API)
    • A specification for a set of routines to facilitate application development
    • Spec often language specific (or IDL)
      • Routine name, number, order and type of arguments; mapping to language constructs
      • Behaviour or function of routine
    • Examples
      • GSS API (security), MPI (message passing)
  • MORE DEFINITIONS : Software Development Kit (SDK)
    • A particular instantiation of API
    • SDK consists of libraries and tools
      • Provides implementation of API specification
    • Can have multiple SDKs for an API
    • Examples of SDKs
      • MPICH, Motif Widgets
  • MORE DEFINITIONS : Syntax
    • Rules for encoding information, e.g.
      • XML, Condor ClassAds, Globus RSL
    • Distinct from protocols
      • One syntax may be used by many protocols
    • Syntaxes may be layered
      • E.g., Condor ClassAds -> XML->ASCII
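Layered syntaxes can be sketched with the Python standard library. In the example below, a dictionary stands in for an advertisement of resource attributes; it is encoded as XML, and the XML text is then encoded as ASCII bytes. The element and attribute names are made up for the illustration and are not the real ClassAd or RSL formats.

```python
import xml.etree.ElementTree as ET


def ad_to_xml(ad: dict[str, str]) -> str:
    """First layer: express the record in XML syntax."""
    root = ET.Element("ad")
    for name, value in ad.items():
        attribute = ET.SubElement(root, "attr", name=name)
        attribute.text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_ascii(xml_text: str) -> bytes:
    """Second layer: express the XML text as ASCII bytes.

    Raises UnicodeEncodeError if the text contains non-ASCII characters.
    """
    return xml_text.encode("ascii")


def ascii_to_ad(data: bytes) -> dict[str, str]:
    """Undo both layers in reverse order."""
    root = ET.fromstring(data.decode("ascii"))
    return {child.get("name"): child.text or "" for child in root}


machine_ad = {"OpSys": "LINUX", "Memory": "512"}
wire_bytes = xml_to_ascii(ad_to_xml(machine_ad))
decoded_ad = ascii_to_ad(wire_bytes)
```

The same bytes could be carried by many different protocols, which is the sense in which syntax is distinct from protocol.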
  • References
    • Berman F., Fox G., Hey T. (2003) Grid Computing: Making the Global Infrastructure a Reality, Chichester, John Wiley & Sons
    • http://www.computing.surrey.ac.uk/courses/csm23/list.html
  • CSM23 Assessment and Weighting (Components of Assessment: Method(s), Percentage weighting)
    • Annotated bibliography: Students are required to write a 200 word summary of each of 5 key research papers. 10%
    • Oral Examination: 20%
    • Laboratory Exercise: Students are required to implement small-scale laboratory homework during the semester. 20%
    • Project: Students are expected to implement a Grid project and write an IEEE formatted report about their projects. In addition, the students are asked to give a presentation. Implementation: 20%, IEEE Report: 20%, Presentation: 10%
  • CSM23 Timetable (Date: Topic, Lecturer)
    • 17/01/2005: Overview and Motivation, Mrs. Tugba Taskaya-Temizel
    • 24/01/2005: Grid Architecture, Technologies and Resource Allocation, Mrs. Tugba Taskaya-Temizel
    • 31/01/2005: Peer to Peer Computing, Dr Nick Antonopoulos
    • 7/02/2005: Parallel Computing Security, Dr. Roger M A Peel, Dr. James Heather
    • 14/02/2005: Parallel Computing Security, Dr. Roger M A Peel, Dr. James Heather
    • 21/02/2005: Grid Applications, Mrs. Tugba Taskaya-Temizel
    • 28/02, 7/03, 14/03, 21/03: Seminars, Mrs. Tugba Taskaya-Temizel