PARALLEX – THE SUPER COMPUTER

A PROJECT REPORT

Submitted by
Mr. AMIT KUMAR
Mr. ANKIT SINGH
Mr. SUSHANT BHADKAMKAR

in partial fulfillment for the award of the degree of
BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE

GUIDE: MR. ANIL KADAM

AISSMS'S COLLEGE OF ENGINEERING, PUNE
UNIVERSITY OF PUNE
2007 - 2008
CERTIFICATE

Certified that this project report "Parallex - The Super Computer" is the bonafide work of

Mr. AMIT KUMAR (Seat No. :: B3*****7)
Mr. ANKIT SINGH (Seat No. :: B3*****8)
Mr. SUSHANT BHADKAMKAR (Seat No. :: B3*****2)

who carried out the project work under my supervision.

Prof. M. A. Pradhan          Prof. Anil Kadam
HEAD OF DEPARTMENT           GUIDE
Acknowledgment

The success of any project is never limited to the individual undertaking it; it is the collective effort of the people around that individual that spells success. Some key people played a vital role in paving the way for the success of this project, and we take this opportunity to express our sincere thanks and gratitude to them.

We would like to thank all the faculty members (teaching and non-teaching) of the Computer Engineering Department of AISSMS College of Engineering, Pune. Our project guide, Prof. Anil Kadam, was very generous with his time and knowledge. We are grateful to Mr. Shasikant Athavale, who was a source of constant motivation and inspiration for us. We are very thankful for the valuable suggestions constantly given by Prof. Nitin Talhar and Ms. Sonali Nalamwar, which proved to be very helpful to the success of our project. Our deepest gratitude goes to Prof. M. A. Pradhan for her thoughtful comments and her gentle support throughout our academics.

We would also like to thank the college authorities for providing us with full support regarding the lab, the network and the related software.
Abstract

Parallex is a parallel processing cluster consisting of control nodes and execution nodes. Our implementation removes all the requirements of kernel-level modification and kernel patches needed to run a Beowulf cluster system. There can be many control nodes in a typical Parallex cluster, and these control nodes no longer just monitor: they also take part in execution if resources permit. We have removed the restrictions of kernel, architecture and platform dependence, making our cluster system work with completely different sets of CPU powers, operating systems and architectures, and without the use of any existing parallel libraries such as MPI and PVM.

With a radically new perspective on how a parallel system is supposed to be, we have implemented our own distribution algorithms and parallel algorithms aimed at ease of administration and simplicity of usage, without compromising efficiency. With a fully modular 7-step design we attack the traditional complications and deficiencies of existing parallel systems, such as redundancy, scheduling, cluster accounting and parallel monitoring.

A typical Parallex cluster may consist of a few old 386s running NetBSD, some modern Intel dual-core machines running Linux, and some server-class MIPS processors running IRIX, all working in parallel with full homogeneity.
Table of Contents

LIST OF FIGURES
LIST OF TABLES
1. A General Introduction
   1.1 Basic concepts
   1.2 Promises and challenges
       1.2.1 Processing technology
       1.2.2 Networking technology
       1.2.3 Software tools and technology
   1.3 Current scenario
       1.3.1 End user perspectives
       1.3.2 Industrial perspective
       1.3.3 Developers, researchers & scientists perspective
   1.4 Obstacles and why we don't have 10 GHz today
   1.5 Myths and realities: 2 x 3 GHz < 6 GHz
   1.6 The problem statement
   1.7 About PARALLEX
   1.8 Motivation
   1.9 Features of PARALLEX
   1.10 Why our design is an "alternative" to parallel systems
   1.11 Innovation
2. REQUIREMENT ANALYSIS
   2.1 Determining the overall mission of Parallex
   2.2 Functional requirements for the Parallex system
   2.3 Non-functional requirements for the system
3. PROJECT PLAN
4. SYSTEM DESIGN
5. IMPLEMENTATION DETAIL
   5.1 Hardware architecture
   5.2 Software architecture
   5.3 Description of software behavior
       5.3.1 Events
       5.3.2 States
6. TECHNOLOGIES USED
   6.1 General terms
7. TESTING
8. COST ESTIMATION
9. USER MANUAL
   9.1 Dedicated cluster setup
       9.1.1 BProc configuration
       9.1.2 Bringing up BProc
       9.1.3 Building the phase 2 image
       9.1.4 Loading the phase 2 image
       9.1.5 Using the cluster
       9.1.6 Managing the cluster
       9.1.7 Troubleshooting techniques
   9.2 Shared cluster setup
       9.2.1 DHCP
       9.2.2 NFS
             9.2.2.1 Running NFS
       9.2.3 SSH
             9.2.3.1 Using SSH
       9.2.4 Host file and name service
   9.3 Working with PARALLEX
10. CONCLUSION
11. FUTURE ENHANCEMENT
12. REFERENCES
APPENDIX A
APPENDIX B
GLOSSARY
MEMORABLE JOURNEY (PHOTOS)
PARALLEX ACHIEVEMENTS

I. LIST OF FIGURES
1.1 High-performance distributed system
1.2 Transistor count vs. clock speed
4.1 Design framework
4.2 Parallex design
5.1 Parallel system H/W architecture
5.2 Parallel system S/W architecture
7.1 Cyclomatic diagram for the system
7.2 System usage pattern
7.3 Histogram
7.4 One frame from complex rendering on Parallex: simulation of an explosion

II. LIST OF TABLES
1.1 Project plan
7.1 Logic/coverage/decision testing
7.2 Functional test
7.3 Console test cases
7.4 Black box testing
7.5 Benchmark results
Chapter 1. A General Introduction

1.1 BASIC CONCEPTS

The last two decades spawned a revolution in the world of computing: a move away from central mainframe-based computing to network-based computing. Today, servers are fast achieving the levels of CPU performance, memory capacity, and I/O bandwidth once available only in mainframes, at cost orders of magnitude below that of a mainframe. Servers are being used to solve computationally intensive problems in science and engineering that once belonged exclusively to the domain of supercomputers. A distributed computing system is the system architecture that makes a collection of heterogeneous computers, workstations, or servers act and behave as a single computing system. In such a computing environment, users can uniformly access and name local or remote resources, and run processes from anywhere in the system, without being aware of which computers their processes are running on. Distributed computing systems have been studied extensively by researchers, and a great many claims and benefits have been made for using such systems. In fact, it is hard to rule out any desirable feature of a computing system that has not been claimed to be offered by a distributed system [24]. However, the current advances in processing and networking technology and software tools make it feasible to achieve the following advantages:

• Increased performance. The existence of multiple computers in a distributed system allows applications to be processed in parallel and thus improves application and system performance. For example, the performance of a file system can be improved by replicating its functions over several computers; the file replication allows several applications to access that file system in parallel. Furthermore, file replication distributes network traffic associated with file access across the various sites and thus reduces network contention and queuing delays.

• Sharing of resources. Distributed systems are cost-effective and enable efficient access to all system resources. Users can share special-purpose and sometimes expensive hardware and software resources such as database servers, compute servers, virtual reality servers, multimedia information servers, and printer servers, to name just a few.

• Increased extendibility. Distributed systems can be designed to be modular and adaptive so that for certain computations the system will configure itself to include a large number of computers and resources, while in other instances it will consist of just a few resources. Furthermore, limitations in file system capacity and computing power can be overcome by adding more computers and file servers to the system incrementally.

• Increased reliability, availability, and fault tolerance. The existence of multiple computing and storage resources in a system makes it attractive and cost-effective to introduce fault tolerance to distributed systems. The system can tolerate the failure of one computer by allocating its tasks to another available computer. Furthermore, by replicating system functions and/or resources, the system can tolerate one or more component failures.

• Cost-effectiveness. The performance of computers has been approximately doubling every two years, while their cost has decreased by half every year during the last decade. Furthermore, the emerging high-speed network technologies [e.g., wavelength-division multiplexing, asynchronous transfer mode (ATM)] will make the development of distributed systems attractive in terms of the price/performance ratio compared to that of parallel computers.

These advantages cannot be achieved easily, because designing a general-purpose distributed computing system is several orders of magnitude more difficult than designing a centralized computing system. Designing a reliable general-purpose distributed system involves a large number of options and decisions, such as the physical system configuration, communication network and computing platform characteristics, task scheduling and resource allocation policies and mechanisms, consistency control, concurrency control, and security, to name just a few. The difficulties can be attributed to many factors related to the lack of maturity in the distributed computing field, the asynchronous and independent behavior of the systems, and the geographic dispersion of the system resources. These are summarized in the following points:

• There is a lack of proper understanding of distributed computing theory. The field is relatively new, and we need to design and experiment with a large number of general-purpose reliable distributed systems with different architectures before we can master the theory of designing such computing systems. One interesting explanation for the lack of understanding of the design process of distributed systems was given by Mullender, who compared the design of a distributed system to the design of a reliable national railway system, which took a century and a half to be fully understood and mature. Similarly, distributed systems (which have been around for approximately two decades) need to evolve through several generations of different design architectures before their designs, structures, and programming techniques can be fully understood and mature.

• The asynchronous and independent behavior of the system resources and/or (hardware and software) components complicates the control software that aims at making them operate as one centralized computing system. If the computers are structured in a master-slave relationship, the control software is easier to develop and system behavior is more predictable. However, this structure is in conflict with the distributed system property that requires computers to operate independently and asynchronously.

• The use of a communication network to interconnect the computers introduces another level of complexity. Distributed system designers not only have to master the design of the computing systems and system software and services, but also have to master the design of reliable communication networks, how to achieve synchronization and consistency, and how to handle faults in a system composed of geographically dispersed heterogeneous computers. The number of resources involved in a system can vary from a few to hundreds, thousands, or even hundreds of thousands of computing and storage resources.

Despite these difficulties, there has been limited success in designing special-purpose distributed systems such as banking systems, online transaction systems, and point-of-sale systems. However, the design of a general-purpose reliable distributed system that has the advantages of both centralized systems (accessibility, management, and coherence) and networked systems (sharing, growth, cost, and autonomy) is still a challenging task. Kleinrock makes an interesting analogy between human-made computing systems and the brain. He points out that the brain is organized and structured very differently from our present computing machines. Nature has been extremely successful in implementing distributed systems that are far more intelligent and impressive than any computing machines humans have yet devised. We have succeeded in manufacturing highly complex devices capable of high-speed computation and massive accurate memory, but we have not gained sufficient understanding of distributed systems; our systems are still highly constrained and rigid in their construction and behavior. The gap between natural and man-made systems is huge, and more research is required to bridge this gap and to design better distributed systems.

In the next section we present a design framework to better understand the architectural design issues involved in developing and implementing high-performance distributed computing systems. A high-performance distributed system (HPDS) (Figure 1.1) includes a wide range of computing resources, such as workstations, PCs, minicomputers, mainframes, supercomputers, and other special-purpose hardware units. The underlying network interconnecting the system resources can span LANs, MANs, and even WANs, can have different topologies (e.g., bus, ring, full connectivity, random interconnect), and can support a wide range of communication protocols.
Fig. 1.1 High-performance distributed system

1.2 PROMISES AND CHALLENGES OF PARALLEL AND DISTRIBUTED SYSTEMS

The proliferation of high-performance systems and the emergence of high-speed networks (terabit networks) have attracted a lot of interest in parallel and distributed computing. The driving forces toward this end will be
(1) the advances in processing technology,
(2) the availability of high-speed networks, and
(3) the increasing research effort directed toward the development of software support and programming environments for distributed computing.
Further, with the increasing requirements for computing power and the diversity of those requirements, it is apparent that no single computing platform will meet all of them. Consequently, future computing environments need to capitalize on and effectively utilize the existing heterogeneous computing resources. Only parallel and distributed systems provide the potential of achieving such an integration of resources and technologies in a feasible manner while retaining desired usability and flexibility. Realization of this potential, however, requires advances on a number of fronts: processing technology, network technology, and software tools and environments.

1.2.1 Processing Technology

Distributed computing relies to a large extent on the processing power of the individual nodes of the network. Microprocessor performance has been growing at a rate of 35 to 70 percent during the last decade, and this trend shows no indication of slowing down in the current decade. The enormous power of future generations of microprocessors, however, cannot be utilized without corresponding improvements in memory and I/O systems. Research in main-memory technologies, high-performance disk arrays, and high-speed I/O channels is therefore critical to efficiently utilize the advances in processing technology and to develop cost-effective high-performance distributed computing.

1.2.2 Networking Technology

The performance of distributed algorithms depends to a large extent on the bandwidth and latency of communication among the work nodes. Achieving high bandwidth and low latency involves not only fast hardware, but also efficient communication protocols that minimize the software overhead. Developments in high-speed networks provide gigabit bandwidths over local area networks as well as wide area networks at moderate cost, thus increasing the geographical scope of high-performance distributed systems.

The problem of providing the required communication bandwidth for distributed computational algorithms is now relatively easy to solve given the mature state of fiber-optic and optoelectronic device technologies. Achieving the necessary low latencies, however, remains a challenge. Reducing latency requires progress on a number of fronts. First, current communication protocols do not scale well to a high-speed environment. To keep latencies low, it is desirable to execute the entire protocol stack, up to the transport layer, in hardware. Second, the communication interface of the operating system must be streamlined to allow direct transfer of data from the network interface to the memory space of the application program. Finally, the speed of light (approximately 5 microseconds per kilometer) poses the ultimate limit to latency. In general, achieving low latency requires a two-pronged approach:

1. Latency reduction. Minimize protocol-processing overhead by using streamlined protocols executed in hardware and by improving the network interface of the operating system.

2. Latency hiding. Modify the computational algorithm to hide latency by pipelining communication and computation.

These problems are now perhaps most fundamental to the success of parallel and distributed computing, a fact that is increasingly being recognized by the research community.

1.2.3 Software Tools and Environments

The development of parallel and distributed applications is a nontrivial process and requires a thorough understanding of the application and the architecture. Although a parallel and distributed system provides the user with enormous computing power and a great deal of flexibility, this flexibility implies increased degrees of freedom which have to be optimized in order to fully exploit the benefits of the distributed system. For example, during software development, the developer is required to select the optimal hardware configuration for the particular application, the best decomposition of the problem on the hardware configuration selected, the best communication and synchronization strategy to be used, and so on. The set of reasonable alternatives that have to be evaluated in such an environment is very large, and selecting the best alternative among them is a nontrivial task. Consequently, there is a need for a set of simple and portable software development tools that can assist the developer in appropriately distributing the application computations to make efficient use of the underlying computing resources. Such a set of tools should span the software life cycle and must support the developer during each stage of application development, starting from the specification and design formulation stages, through the programming, mapping, distribution, scheduling, tuning, and debugging stages, up to the evaluation and maintenance stages.
1.3 Current Scenario

The current scenario of parallel systems can be viewed from three perspectives. A common concept that applies to all of them is the idea of Total Ownership Cost (TOC). TOC is by far the most common scale on which a level of computer processing is assessed worldwide. TOC is defined as the ratio of the total cost of implementation and maintenance to the net throughput the parallel cluster delivers:

    TOC = (Total cost of implementation and maintenance) / (Net system throughput, in floating-point operations per second)

1.3.1 End User Perspectives

Activities such as rendering, Adobe Photoshop applications and other everyday processes come under this category. As the need for processing power increases day by day, hardware cost increases with it. From the end user's perspective, a parallel system aims to reduce expense and avoid complexity. At this stage we are trying to implement a parallel system which is more cost effective and user friendly. For the end user, however, TOC is less important in most cases, because a parallel cluster is rarely owned by a single user; in that case the net throughput of the parallel system becomes the most crucial factor.

1.3.2 Industrial Perspective

In the corporate sector, parallel systems are extensively implemented. Such parallel systems consist of machines that may, in theory if not in practice, have to handle millions of nodes. From the industrial point of view, the parallel system aims at resource isolation, replacing large-scale dedicated commodity hardware and mainframes. Corporate users often place TOC as the primary criterion on which a parallel cluster is judged. As scalability increases, the cost of owning parallel clusters shoots up to unmanageable heights, and our primary aim in this area is to bring down the TOC as much as possible.
1.3.3 Developers, Researchers & Scientists Perspective

Scientific applications such as 3D simulations, large-scale scientific rendering, intense numerical calculations, complex programming logic, and large-scale implementations of algorithms (BLAS and FFT libraries) require levels of processing and calculation that no modern-day dedicated vector CPU could possibly meet. Consequently, parallel systems have proven to be the only efficient alternative for keeping pace with modern scientific advancement and research. TOC is rarely a matter of concern here.

1.4 Obstacles and why we don't have 10 GHz today

Fig 1.2 Transistor count vs. clock speed

CPU performance growth as we have known it hit a wall. Figure 1.2 graphs the history of Intel chip introductions by clock speed and number of transistors. The number of transistors continues to climb, at least for now. Clock speed, however, is a different story.
Around the beginning of 2003, you'll note a disturbing sharp turn in the previous trend toward ever-faster CPU clock speeds. We have added lines to show the limit trends in maximum clock speed; instead of continuing on the previous path, as indicated by the thin dotted line, there is a sharp flattening. It has become harder and harder to exploit higher clock speeds due to not just one but several physical issues, notably heat (too much of it and too hard to dissipate), power consumption (too high), and current leakage problems.

Sure, Intel has samples of their chips running at even higher speeds in the lab, but only by heroic efforts, such as attaching hideously impractical quantities of cooling equipment. You won't have that kind of cooling hardware in your office any day soon, let alone on your lap while computing on the plane.

1.5 Myths and Realities: 2 x 3 GHz < 6 GHz

So a dual-core CPU that combines two 3 GHz cores practically offers 6 GHz of processing power. Right?

Wrong. Even having two threads running on two physical processors doesn't mean getting two times the performance. Similarly, most multi-threaded applications won't run twice as fast on a dual-core box. They should run faster than on a single-core CPU; the performance gain just isn't linear, that's all.

Why not? First, there is coordination overhead between the cores to ensure cache coherency (a consistent view of cache, and of main memory) and to perform other handshaking. Today, a two- or four-processor machine isn't really two or four times as fast as a single CPU even for multi-threaded applications. The problem remains essentially the same even when the CPUs in question sit on the same die.

Second, unless the two cores are running different processes, or different threads of a single process that are well written to run independently and almost never wait for each other, they won't be well utilized. (Despite this, we will speculate that today's single-threaded applications as actually used in the field could actually see a performance boost for most users by going to a dual-core chip, not because the extra core is actually doing anything useful, but because it is running the adware and spyware that infest many users' systems and are otherwise slowing down the single CPU that the user has today. We leave it up to you to decide whether adding a CPU to run your spyware is the best solution to that problem.)

If you're running a single-threaded application, then the application can only make use of one core. There should be some speedup as the operating system and the application can run on separate cores, but typically the OS isn't going to be maxing out the CPU anyway, so one of the cores will be mostly idle. (Again, the spyware can share the OS's core most of the time.)
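The non-linear scaling described above can be made concrete with Amdahl's law, which the report does not cite explicitly but which is the standard way to bound such speedups: if a fraction P of a program can run in parallel and the rest is serial, the best possible speedup on N cores is

    Speedup = 1 / ((1 - P) + P / N)

For example, if 90% of the work parallelizes (P = 0.9), two 3 GHz cores give at most 1 / (0.1 + 0.9/2) ≈ 1.82 times the single-core performance, noticeably less than the naive factor of 2.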
1.6 The problem statement

So now let us summarize and define the problem statement:

• Since the growth in processing requirements is far greater than the growth in CPU power, and since the silicon chip is fast approaching its full capacity, the implementation of parallel processing at every level of computing becomes inevitable.

• There is a need for a single and complete clustering solution which requires minimum user interference but at the same time supports editing/modification to suit the user's requirements.

• There should be no need to modify existing applications.

• The parallel system must be able to support different platforms.

• The system should be able to fully utilize all the available hardware resources without the need to buy any extra or special kind of hardware.

1.7 About PARALLEX

While the term parallel is often used to describe clusters, they are more correctly described as a type of distributed computing. Typically, the term parallel computing refers to tightly coupled sets of computation. Distributed computing is usually used to describe computing that spans multiple machines or multiple locations. When several pieces of data are being processed simultaneously in the same CPU, this might be called a parallel computation, but would never be described as a distributed computation. Multiple CPUs within a single enclosure might be used for parallel computing, but would not be an example of distributed computing. When talking about systems of computers, the term parallel usually implies a homogeneous collection of computers, while distributed computing typically implies a more heterogeneous collection. Computations that are done asynchronously are more likely to be called distributed than parallel. Clearly, the terms parallel and distributed lie at either end of a continuum of possible meanings. In any given instance, the exact meanings depend upon the context. The distinction is more one of connotation than of clearly established usage.

Parallex is both a parallel and a distributed cluster, because it supports both multiple CPUs within a single enclosure and a heterogeneous collection of computers.

1.8 Motivation

The motivation behind this project is to provide a cheap and easy-to-use solution that caters to the high-performance computing requirements of organizations without the need to install any expensive hardware.

In many organizations, including our college, we have observed that when old systems are replaced by newer ones, the older ones are generally dumped or sold at throwaway prices. We also wanted to find a solution to use this "silicon waste" effectively. These wasted resources can easily be added to our system as the processing need increases, because the parallel system is linearly scalable and hardware independent. Thus the intent is to have an environment-friendly and effective solution that utilizes all the available CPU power to execute applications faster.

1.9 Features of Parallex

• Parallex simplifies the cluster setup, configuration and management process.
• It supports machines with hard disks as well as diskless machines running at the same time.
• It is flexible in design and easily adaptable.
• Parallex does not require any special kind of hardware.
• It is multi-platform compatible.
• It ensures efficient utilization of silicon waste (old, unused hardware).
• Parallex is scalable.

How these features are achieved, and the details of the design, will be discussed in subsequent chapters.

1.10 Why our design is an "Alternative" to parallel systems

Every technology needs to evolve after a particular time, as each new generation addresses the shortcomings of the technology that came before it. What we have achieved is a bare-bones semantic of a parallel system.

While studying parallel and distributed systems, our advantage was that we were working with the latest technology; the parallel systems designed by earlier researchers were no doubt far more sophisticated than ours. Our system is unique because we actually split up the task according to the processing power of the nodes instead of just load balancing. Hence a slow node gets a smaller task than a faster one, and all nodes deliver their output to the master node at the same calculated time.

The difficulty we faced was deciding how much of the task should be given to each machine in a heterogeneous system so that all results arrive at the same time. We worked on this problem and developed a mathematical distribution algorithm, which has been successfully implemented and is functional. This algorithm breaks up the task according to the speed of the CPUs by sending a test application to all nodes and storing the return time of each node in a file. We then worked on automating the entire system. We were using passwordless secure shell (SSH) logins and the network file system (NFS); we succeeded to some extent, but the SSH and NFS configuration could not be fully automated. Having to set up every new node manually is a drawback of SSH and NFS. To overcome this drawback we looked at an alternative, the Beowulf cluster, but after studying it we concluded that it treats all nodes as having the same configuration and sends tasks to all nodes equally.

To improve our system we decided to think differently from the Beowulf cluster and to make the system more cost effective. We adopted the diskless cluster concept in order to get rid of the hard disk, cutting cost and enhancing the reliability of the machines.
A storage device affects the performance of the entire system, increases cost (disks eventually need replacement) and wastes time in fault finding. So we studied and patched the Beowulf server and the Beowulf distributed process space according to the needs of our system. We built kernel images for running diskless cluster nodes using the RARP protocol. When a node runs the kernel image in its memory, it requests an IP address from the master node (which can also be called the server). The server assigns the node its IP address and node number. With this, our diskless cluster system stands ready for parallel computing. We then modified our various programs, including our own distribution algorithm, to fit the new design. The best part of this arrangement is that no authorization setup is needed; everything is now automatic.

Until now we had been working on CODE LEVEL PARALLELISM, in which the code is modified slightly to run on our system, much as MPI libraries are used to make code executable in parallel. The next challenge was: what if we do not get the source code, but only a binary file to execute on our parallel system? We therefore needed to enhance our system by adding BINARY LEVEL PARALLELISM. We studied OpenMosix. Once OpenMosix is installed and all the nodes are booted, the OpenMosix nodes see each other in the cluster and start exchanging information about their load levels and resource usage. Once the load increases beyond a defined level, a process migrates to another node on the network. When a process demands heavy resource usage, it may keep migrating from node to node without ever being serviced. This is the major design flaw of OpenMosix, and we are working to find a solution.

So our design is an ALTERNATIVE to these problems in the world of parallel computing.

1.11 Innovation

Firstly, our system does not require any additional hardware if the existing machines are well connected in a network. Secondly, even in a heterogeneous environment with a few fast CPUs and a few slower ones, the efficiency of the system does not drop by more than 1 to 5%, still maintaining an efficiency of around 80% for suitably adapted applications. This is because the mathematical distribution algorithm considers the relative processing power of each node, distributing only the amount of load that a node can process in the calculated optimal time of the system. All the nodes process their respective tasks and produce output at this calculated time. The most important point about our system is the ability to use diskless nodes in the cluster, thereby reducing hardware costs, space and the required maintenance. Also, in the case of binary executables (when source code is not available), our system exhibits almost 20% performance gains.
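To make the proportional split described above concrete, the following is a minimal illustrative sketch in C, not the project's actual code: it assumes the measured benchmark return time of each node (in seconds) has already been collected, one per line, into a file named node_times.txt (a hypothetical name), and that the job can be divided into a fixed number of independent work units.

/*
 * Sketch of the proportional split: each node's share of the work is
 * proportional to its measured speed (the inverse of its benchmark
 * time), so all nodes are expected to finish at roughly the same time.
 */
#include <stdio.h>

#define MAX_NODES   64
#define TOTAL_UNITS 10000   /* assumed size of the divisible job */

int main(void)
{
    double t[MAX_NODES], speed_sum = 0.0;
    int n = 0;

    FILE *fp = fopen("node_times.txt", "r");   /* hypothetical file */
    if (!fp) { perror("node_times.txt"); return 1; }

    while (n < MAX_NODES && fscanf(fp, "%lf", &t[n]) == 1) {
        speed_sum += 1.0 / t[n];    /* speed is the inverse of the time */
        n++;
    }
    fclose(fp);

    for (int i = 0; i < n; i++) {
        double share = (1.0 / t[i]) / speed_sum;
        printf("node %d: %.0f units (%.1f%% of the job)\n",
               i, share * TOTAL_UNITS, share * 100.0);
    }
    return 0;
}

Run against three nodes with times 2 s, 4 s and 8 s, such a split would give them roughly 57%, 29% and 14% of the work respectively, which is the behavior the distribution algorithm aims for.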
Chapter 2. Requirement Analysis

2.1 Determining the overall mission of Parallex

• User base: students, educational institutes, and small to medium business organizations.

• Cluster usage: one part of the cluster will be fully dedicated to solving the problem at hand, and an optional part will use computing resources from individual workstations. In the latter part, the parallel problems will have lower priority.

• Software to be run on the cluster: depends upon the user base. At the cluster management level, the system software will be Linux.

• Dedicated or shared cluster: as mentioned above, it will be both.

• Extent of the cluster: computers that are all on the same subnet.

2.2 Functional Requirements for the Parallex system

Functional Requirement 1
The PCs must be connected in a LAN so that the system can be used without any obstacles.

Functional Requirement 2
There will be one master or controlling node which will distribute the task according to the processing speed of the nodes.

Services
Three services are to be provided on the master:
1. A network monitoring tool for resource discovery (e.g. IP addresses, MAC addresses, up/down status, etc.).
2. The distribution algorithm, which will distribute the task according to the current processing speed of the nodes.
3. The Parallex master script, which will send the distributed tasks to the nodes, collect the results, integrate them and produce the output.
Functional Requirement 3
The final size of the executable code should be such that it fits within the limited memory available on the machine.

Functional Requirement 4
This product will only be used to speed up applications which already exist in the enterprise.

2.3 Non-Functional Requirements for the system

- Performance
Even in a heterogeneous environment with a few fast CPUs and a few slower ones, the efficiency of the system does not drop by more than 1 to 5%, still maintaining an efficiency of around 80% for suitably adapted applications. This is because the mathematical distribution algorithm considers the relative processing power of each node, distributing only the amount of load that a node can process in the calculated optimal time of the system. All the nodes process their respective tasks and produce output at this calculated time. The most important point about our system is the ability to use diskless nodes in the cluster, thereby reducing hardware costs, space and the required maintenance. Also, in the case of binary executables (when source code is not available), our system exhibits almost 20% performance gains.

- Cost
While a system of n parallel processors is less efficient than one processor that is n times faster, the parallel system is often cheaper to build. Parallel computation is used for tasks which require very large amounts of computation, take a lot of time, and can be divided into n independent subtasks. In recent years, most high-performance computing systems, also known as supercomputers, have had parallel architectures.
- Manufacturing costs
No extra hardware is required beyond the cost of setting up the LAN.

- Benchmarks
There are at least three reasons for running benchmarks. First, a benchmark provides us with a baseline: if we make changes to our cluster, or if we suspect problems with it, we can rerun the benchmark to see whether performance is really any different. Second, benchmarks are useful when comparing systems or cluster configurations; they can provide a reasonable basis for selecting between alternatives. Finally, benchmarks can be helpful with planning.
For benchmarking we will use a 3D rendering tool named POV-Ray (the Persistence of Vision Raytracer; please see the Appendix for more details).

- Hardware required
x686-class PCs (Linux, 2.6.x kernels, installed with an intranet connection)
Switch (10/100BASE-T)
Serial port connectors
100BASE-T LAN cable, RJ45 connectors

- Software resources required
Linux (2.6.x kernel)
Intel compiler suite (non-commercial)
LSB (Linux Standard Base) set of GNU kits with GNU CC/C++/F77/LD/AS
GNU Krell monitor

- Number of PCs connected in the LAN
8 nodes in the LAN.
Chapter 3. Project Plan

The plan of execution for the project was as follows (activity, software used, number of days):

1.  Project planning: (a) choosing the domain, (b) identifying key areas of work, (c) requirement analysis (10 days)
2.  Basic installation of Linux (Linux 2.6.x kernel; 3 days)
3.  Brushing up on C programming skills (5 days)
4.  Shell scripting (Linux 2.6.x kernel, GNU Bash; 12 days)
5.  C programming in the Linux environment (GNU C compiler suite; 5 days)
6.  A demo project, a universal Sudoku solver, to become familiar with the Linux programming environment (GNU C compiler suite, Intel compiler suite, non-commercial; 16 days)
7.  Study of advanced Linux tools and installation of packages and Red Hat RPMs (iptraf, mc, tar, rpm, awk, sed, gnuplot, strace, gdb, etc.; 10 days)
8.  Studying networking basics and network configuration in Linux (8 days)
9.  Recompiling, patching and analyzing the system kernel (Linux 2.6.x kernel, GNU C compiler; 3 days)
10. Study and implementation of advanced networking tools, SSH and NFS (ssh/OpenSSH, nfs; 7 days)
11. (a) Preparing the preliminary design of the total workflow of the project, (b) deciding the modules for overall execution and dividing the areas of concentration among the project group, (c) building the Stage I prototype (all of the above; 17 days)
12. Building the Stage II prototype, replacing ssh with a custom-made application (all of the above; 15 days)
13. Building the Stage III prototype, a diskless cluster (all of the above; 10 days)
14. Testing and building the final packages (all of the above; 10 days)

Table 1.1 Project Plan
Chapter 4. System Design

Generally speaking, the design process of a distributed system involves three main activities:
(1) designing the communication system that enables the distributed system resources and objects to exchange information,
(2) defining the system structure (architecture) and the system services that enable multiple computers to act as a system rather than as a collection of computers, and
(3) defining the distributed computing programming techniques used to develop parallel and distributed applications.

Based on this notion of the design process, the distributed system design framework can be described in terms of three layers:
(1) the network, protocol, and interface (NPI) layer,
(2) the system architecture and services (SAS) layer, and
(3) the distributed computing paradigms (DCP) layer.
In what follows, we describe the main design issues to be addressed in each layer.

Fig. 4.1 Design Framework
• Communication network, protocol, and interface layer. This layer describes the main components of the communication system that will be used for passing control and information among the distributed system resources. It is decomposed into three sublayers: network type, communication protocols, and network interfaces.

• Distributed system architecture and services layer. This layer represents the designer's and system manager's view of the system. The SAS layer defines the structure and architecture and the system services (distributed file system, concurrency control, redundancy management, load sharing and balancing, security service, etc.) that must be supported by the distributed system in order to provide a single-image computing system.

• Distributed computing paradigms layer. This layer represents the programmer's (user's) perception of the distributed system. It focuses on the programming paradigms that can be used to develop distributed applications. Distributed computing paradigms can be broadly characterized based on the computation and communication models. Parallel and distributed computations can be described in terms of two paradigms: functional parallel and data parallel. In the functional parallel paradigm, the computations are divided into distinct functions which are then assigned to different computers. In the data parallel paradigm, all the computers run the same program, a single program, multiple data (SPMD) stream, but each computer operates on a different data stream.

With reference to Fig. 4.1, Parallex can be described as follows:

Fig. 4.2 Parallex Design
  33. 33. The SupeThe SupeThe SupeThe Super Computerr Computerr Computerr ComputerAISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering” - 24 -Chapter 5. Implementation DetailsThe goal of the project is to provide an efficient system that will handle processparallelism with the help of Clusters. This parallelism will thereby reduce the time ofexecution. Currently we form a cluster of 8 nodes. Using a single computer forexecution of any heavy process takes lot of time in execution. So here we are forminga cluster and executing those processes in parallel by dividing the process into numberof sub processes. Depending on the nodes in cluster we migrate the process to thosenode and when the execution is over then it brings back the output produced by themto the Master node. By doing this we are reducing the process execution time andincreasing the CPU utilization.5.1 Hardware ArchitectureWe have implemented a Shared Nothing Architecture of parallel system bymaking use of Coarse Grain Cluster structure. The inter-connect is ordinary 8-portswitch and an optionally a Class-B or Class-C network. It is 3 level architecture:1. Master topology2. Slave Topology3. Network interconnect1. Master is a Linux running machine with a 2.6.x or 2.4.x (both under testing)kernel. It runs the parallel-server and contains the application interface to drive theremaining machines. The master runs a network scanning script to detect all the slavesthat are alive and retrieves all the necessary information about each slave. Todetermine the load on each slave just before the processing of the main application,the master sends a small diagnostic application to the slave to estimate the load it cantake at the present moment. Having collected all the relevant information, it does allthe scheduling, implementing of parallel algorithms (distributing the tasks accordingprocessing power and current load), making use of CPU extensions (MMX, SSE,3DNOW) depending upon the slave architecture, and everything except the executionof the program itself. It accepts the input/task to be executed. It allocates the tasks to
underlying slave nodes constituting the parallel system, which execute the tasks in parallel and return the output to the Master. The Master plays the role of a watchdog: it may or may not participate in the actual processing, but it manages the entire task.
2. A Slave is a single system cluster image (SSCI) node, basically dedicated to processing. It accepts a sub-task along with the necessary library modules, executes it and returns the output to the Master. In our case, the slaves are multi-boot capable systems: at one point in time they can be diskless cluster hosts, at another they can behave as general-purpose cluster nodes, and at yet another they can act as ordinary CPUs handling routine office and home tasks. In the case of diskless machines, the slave boots from a pre-created kernel image patched appropriately.
3. The network interconnection merges the Master and Slave topologies. It makes use of an 8-port switch, RJ-45 connectors and CAT 5 cables. It is a star topology in which the Master and the Slaves are interconnected through the switch.
Fig. 5.1 Parallel System H/W Architecture
Cluster Monitoring: Each slave runs a server that collects the kernel processing, I/O, memory and CPU details from the proc virtual file system and forwards them to the master node (which here acts as a client of the server running on each slave); a user-space program plots the data interactively on the master's screen, showing the CPU, memory and I/O details of each node separately.
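As a rough illustration of this monitoring, the sketch below reads a few figures from the proc filesystem on a slave and pushes them to a collector. The master address, port and message format are assumptions made for the example, not the actual Parallex protocol.

# Hedged sketch of a slave-side monitor: read load and memory figures from
# /proc and push them to the master. Address, port and message layout are
# assumptions for illustration only.
import json
import socket
import time

MASTER = ("10.0.4.1", 9100)   # assumed master address and port

def sample():
    with open("/proc/loadavg") as f:
        load1 = float(f.read().split()[0])
    meminfo = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            meminfo[key] = int(value.split()[0])   # values are reported in kB
    return {"load1": load1,
            "mem_total_kb": meminfo["MemTotal"],
            "mem_free_kb": meminfo["MemFree"]}

while True:
    try:
        with socket.create_connection(MASTER, timeout=2) as s:
            s.sendall(json.dumps(sample()).encode())
    except OSError:
        pass                  # master unreachable; try again next cycle
    time.sleep(5)             # sampling interval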
5.2 Software Architecture
This architecture consists of two parts:
1. Master architecture
2. Slave architecture
The Master consists of the following levels:
1. LinuxBIOS: LinuxBIOS usually loads a Linux kernel.
2. Linux: the platform on which the Master runs.
3. SSCI + Beoboot: this level builds the single system cluster image used by the slave nodes.
4. Fedora Core / Red Hat: the actual operating system running on the Master.
5. System services: the essential services running on the Master, e.g. the RARP resolver daemon.
The Slave inherits from the Master and has the following levels:
1. LinuxBIOS
2. Linux
3. SSCI
Fig 5.2 Parallel System S/W Architecture
Parallex is broadly divided into the following modules:
1. Scheduler: this is the heart of our system. With a radically new approach to data and instruction level distribution, we have implemented a completely optimal heterogeneous cluster technology. We perform task allocation based on the actual processing capability of each node and not on the GHz rating given in the system's manual (a sketch of this proportional split appears at the end of Section 5.3). The task allocation is dynamic and the scheduling policy is based on the POSIX scheduling implementation. We are also capable of implementing preemption, which we currently do not enable, since systems such as Linux and FreeBSD already provide industry-level preemption.
2. Job/instruction allocator: this is a set of remote-fork-like utilities that allocate jobs to the nodes. Unlike traditional cluster technology, this job allocator is capable of executing in disconnected mode, which means that the impact of network latency is substantially reduced during temporary disconnections.
3. Accounting: we have written a utility, the remote cluster monitor, which provides samples of results from all the nodes together with information about CPU load, temperature, and memory statistics. We propose that, with less than 0.2% of CPU power consumption, our network monitoring utility can sample over 1000 nodes in less than 3 seconds.
4. Authentication: all transactions between the nodes are 128-bit encrypted and do not require root privileges to run; only a common user must exist on all standalone nodes. For the diskless part, we remove even this restriction.
5. Resource discovery: we run our own socket-layered resource discovery utility, which discovers any additional nodes and also reports if a resource has been lost. Any additional hardware capable of being used as part of the parallel system, such as an extra processor added to a system or a processor replaced with a dual-core one, is also reported continually.
6. Synchronizer: the central balancing component of the cluster. Since the cluster is capable of simultaneously running both diskless and standalone nodes as part of the same cluster, the synchronizer keeps the results consistent: output is queued in real time so that data is not mixed up. It performs instruction dependency analysis and also uses pipelining across the network to make the interconnect more communicative.
5.3 Description of Software Behavior
The end user submits the process/application to the administrator if the application is source based, and the cluster administrator then owns the responsibility of explicitly parallelizing the application for maximum exploitation of the parallel architectures within the CPU and across the cluster nodes. If the application is binary (non-source), the user may submit the code directly to the Master node's program acceptor, which runs the application with somewhat lower efficiency than a source submission handled by the administrator. The system as a whole is responsible for minimizing processing time, which in turn increases throughput and speeds up processing.
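The proportional split performed by the scheduler (dividing a frame range among nodes according to their measured speed rather than their rated clock) can be sketched as follows. The speed scores and frame range are illustrative values, not measurements from the actual system.

# Sketch of the scheduler's range-splitting idea: divide a frame range among
# nodes in proportion to their measured speed. Speeds and range are made up.
def split_range(first, last, speeds):
    """Return per-node (start, end) frame intervals, inclusive."""
    total_frames = last - first + 1
    total_speed = sum(speeds.values())
    nodes = list(speeds)
    shares, assigned = {}, 0
    for i, node in enumerate(nodes):
        if i == len(nodes) - 1:                    # last node takes the remainder
            count = total_frames - assigned
        else:
            count = round(total_frames * speeds[node] / total_speed)
        shares[node] = (first + assigned, first + assigned + count - 1)
        assigned += count
    return shares

# Example with hypothetical benchmark scores for four nodes:
print(split_range(1, 200, {"node0": 2.8, "node1": 2.8, "node2": 2.0, "node3": 1.8}))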
5.3.1 Events
1. System installation
2. Network initialization
3. Server and host configuration
4. Take input
5. Parallel execution
6. Send response
5.3.2 States
1. System Ready
2. System Busy
3. System Idle
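The events and states listed above suggest a simple controller loop. The transitions in the sketch below are an assumption made for illustration; the report does not specify them explicitly.

# Assumed, simplified mapping between events and system states; the actual
# Parallex transitions may differ from this sketch.
def next_state(state, event):
    if event == "take_input" and state in ("READY", "IDLE"):
        return "BUSY"        # parallel execution starts
    if event == "send_response" and state == "BUSY":
        return "IDLE"        # results returned, waiting for new work
    return state             # other events leave the state unchanged

state = "READY"
for event in ["take_input", "send_response", "take_input"]:
    state = next_state(state, event)
    print(event, "->", state)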
Chapter 6. Technologies Used
6.1 General Terms
We will now briefly define the general terms that are used in further descriptions or are related to our system.
Cluster: an interconnection of a large number of computers working together in a closely synchronized manner to achieve higher performance, scalability and net computational power.
Master: the server machine which acts as the administrator of the entire parallel cluster and performs task scheduling.
Slave: a client node which executes the tasks given to it by the Master.
SSCI: Single System Cluster Image is the idea of presenting the cluster nodes as a single image, in which each cluster node behaves as if it were an additional processor, add-on RAM, etc. of the controlling Master computer. This is the basic theory behind cluster-level parallelism. Example implementations are multi-node NUMA (IBM/Sequent) multi-quad computers and SGI Altix servers. However, the idea of a true SSCI remains unimplemented for heterogeneous parallel-processing clusters, except in supercomputing clusters such as Thunder and the Earth Simulator.
RARP: Reverse Address Resolution Protocol is a network layer protocol used to resolve an IP address from a given hardware address (such as an Ethernet/MAC address).
BProc: The Beowulf Distributed Process Space (BProc) is a set of kernel modifications, utilities and libraries which allow a user to start processes on other machines in a Beowulf-style cluster. Remote processes started with this mechanism appear in the process table of the front-end machine of the cluster. This allows remote process management using the normal UNIX process control facilities: signals are transparently forwarded to remote processes and exit status is received using the usual wait() mechanisms.
Having discussed the basic concepts of parallel and distributed systems, the problems in this field, and an overview of Parallex, we now move on to the testing of our system.
Chapter 7. Testing
Logic Coverage / Decision Based Test Cases
1. Initial_frame_fail: test procedure: initial frame not defined; pre-condition: none; expected result: Parallex should give an error and exit; reference to detailed design: distribution algorithm.
2. Final_frame_fail: test procedure: final frame not defined; pre-condition: none; expected result: Parallex should give an error and exit; reference: distribution algorithm.
3. Initial_final_full: test procedure: initial and final frames given; pre-condition: none; expected result: Parallex should distribute according to speed; reference: distribution algorithm.
4. Input_file_name_blank: test procedure: no input file given; pre-condition: none; expected result: input file not found; reference: Parallex Master.
5. Input_parameters_blank: test procedure: no parameters defined at the command line; pre-condition: none; expected result: exit on error; reference: Parallex Master.
Table 7.1 Logic Coverage / Decision Testing
Initial Functional Test Cases for Parallex
• System Startup: function tested: Master is started when the switch is turned "on"; initial state: Master is off; input: activate the "on" switch; expected output: Master is ON.
• System Startup: function tested: nodes are started when the switch is turned "on"; initial state: nodes are on; input: activate the "on" switch; expected output: nodes are ON.
• System Startup: function tested: nodes are assigned IPs by the master; initial state: booting; input: get boot image from the Master; expected output: Master shows that the nodes are UP.
• System Shutdown: function tested: the system is shut down when the switch is turned "off"; initial state: the system is on and not servicing a customer; input: activate the "off" switch; expected output: system is off.
• System Shutdown: function tested: the connection to the Master is terminated when the system is shut down; initial state: the system has just been shut down; check: verify from the Master side that a connection to the Slave no longer exists.
• Session: function tested: the system reads a customer's program; initial state: the system is on and not servicing a customer; input: insert a readable code/program; expected output: program accepted.
• Session: function tested: the system rejects an unreadable program; initial state: the system is on and not servicing a customer; input: insert an unreadable code/program; expected output: program is rejected, the system displays an error screen and is ready to start a new session.
• Session: function tested: the system accepts the customer's program; initial state: the system is asking for entry of the RANGE of calculation; input: enter a RANGE; expected output: the system gets the RANGE.
• Session: function tested: the system breaks the task; initial state: the system is breaking the task according to the processing speed of the nodes; input: perform the distribution algorithm; expected output: the system breaks the task and writes it into a file.
• Session: function tested: the system feeds the task to the nodes for processing; initial state: the system feeds tasks to the nodes for execution; input: send tasks; expected output: the system displays a menu of the tasks running on the nodes.
• Session: function tested: the session ends when all nodes give their output; initial state: the system is collecting the output of all nodes, displays it and ends; input: get the output from all nodes; expected output: the system displays the output and quits.
Table 7.2 Functional Test Cases
Cyclomatic Complexity:
Control flow graph of the system:
Fig 7.1 Cyclomatic diagram for the system
Cyclomatic complexity is a software metric (measurement) in computational complexity theory. It was developed by Thomas McCabe and is used to measure the complexity of a program. It directly measures the number of linearly independent paths through a program's source code.
Computation of cyclomatic complexity for the above flow graph:
E = number of edges = 9
N = number of nodes = 7
M = E - N + 2 = 9 - 7 + 2 = 4
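The same computation can be expressed directly in code; the edge and node counts below are the ones read off the flow graph above.

# Cyclomatic complexity M = E - N + 2P for a graph with P connected components.
def cyclomatic_complexity(edges, nodes, components=1):
    return edges - nodes + 2 * components

# Figures from the control flow graph above: 9 edges, 7 nodes, one component.
print(cyclomatic_complexity(9, 7))   # prints 4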
Console and Black Box Testing:
CONSOLE TEST CASES
1. Testing in the Linux terminal: pre-condition: terminal variables have default values; expected result: xterm-related tools are disabled; actual result: no graphical information displayed.
2. Invalid number of arguments: pre-condition: all nodes are up; expected result: error message; actual result: proper usage given.
3. Pop-up terminals for different nodes: pre-condition: all nodes are up; expected result: number of pop-ups equals the number of cores in alive nodes; actual result: number of pop-ups equals the number of cores in alive nodes.
4. 3D rendering on a single machine: pre-condition: all necessary files in place; expected result: live 3D rendering; actual result: shows the frame being rendered.
5. 3D rendering on the Parallex system: pre-condition: all nodes are up; expected result: status of rendering; actual result: rendered video.
6. MPlayer testing: pre-condition: rendered frames; expected result: animation in .avi format; actual result: rendered video (.avi).
Table 7.3 Console Test Cases
BLACK BOX TEST CASES
1. New node up: pre-condition: node is down; expected result: status message displayed by the NetMon tool; actual result: message "Node UP".
2. Node goes down: pre-condition: node is up; expected result: status message displayed by the NetMon tool; actual result: message "Node DOWN".
3. Nodes information: pre-condition: nodes are up; expected result: internal information of the nodes; actual result: status, IP, MAC address, RAM, etc.
4. Main task submission: pre-condition: application is compiled; expected result: next module called (distribution algorithm); actual result: processing speed of the nodes.
5. Main task submission with faulty input: pre-condition: application is compiled; expected result: error; actual result: error displayed and exit.
6. Distribution algorithm: pre-condition: RANGE obtained; expected result: task broken according to the processing speed of the nodes; actual result: breaks the RANGE and generates scripts.
7. Cluster feed script: pre-condition: all nodes up; expected result: task sent to individual machines for execution; actual result: display shows the task executed on each machine.
8. Result assembly: pre-condition: all machines have returned results; expected result: final result calculation; actual result: final result displayed on screen.
9. Fault tolerance: pre-condition: machine(s) go down in between execution; expected result: error recovery script is executed; actual result: task resent to all alive machines (see the sketch below).
Table 7.4 Black Box Test Cases
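The fault-tolerance case in Table 7.4, resending work when a machine drops out mid-run, can be sketched as a simple retry loop. The run_on_node and alive helpers below are hypothetical stand-ins for the actual cluster feed and error-recovery scripts.

# Hedged sketch of the error-recovery idea: if a node dies mid-execution, its
# chunk is queued again for a node that is still alive. run_on_node() and
# alive() are placeholders, not the real Parallex utilities.
import random

def alive(node):
    return random.random() > 0.2            # pretend some attempts fail

def run_on_node(node, chunk):
    if not alive(node):
        raise RuntimeError(node + " went down")
    return "result of " + chunk + " on " + node

def run_with_recovery(chunks, nodes):
    results = {}
    pending = list(zip(chunks, nodes))
    while pending:
        chunk, node = pending.pop(0)
        try:
            results[chunk] = run_on_node(node, chunk)
        except RuntimeError:
            substitute = random.choice([n for n in nodes if n != node])
            pending.append((chunk, substitute))  # resend to an alive node
    return results

print(run_with_recovery(["frames 1-50", "frames 51-100"], ["node0", "node1", "node2"]))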
System Usage Specification Outline:
Fig 7.2 System usage pattern
Fig 7.3 Histogram
Runtime Benchmark:
Fig 7.4 One frame from a complex rendering on Parallex: simulation of an explosion
The following is a comparison of the same application, with the same parameters, run on a standalone machine, on an existing Beowulf parallel cluster, and on our cluster system, Parallex.
Application: POV-Ray
Hardware specifications:
NODE 0: P4, 2.8 GHz
NODE 1: Core 2 Duo, 2.8 GHz
NODE 2: AMD64, 2.01 GHz
NODE 3: AMD64, 1.80 GHz
NODE 4: Celeron D, 2.16 GHz
Benchmark Results:

            Single Machine   Existing Parallel System (4 nodes)   Parallex Cluster System (4 nodes)
Real time   14m 44.3 s       3m 41.61 s                           3m 1.62 s
User time   13m 33.2 s       10m 4.67 s                           9m 30.75 s
Sys time    2m 2.26 s        0m 2.26 s                            0m 2.31 s

Table 7.5 Benchmark Results
Note: the user time of the cluster is approximately the sum of the per-user system times on each node.
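From the real (wall-clock) times in Table 7.5, the speedup of each parallel run over the standalone machine can be worked out as below. Note that the Parallex figure comes out slightly above the nominal node count of four; the report does not state whether additional participating hardware or measurement granularity accounts for this.

# Speedup computed from the real (wall-clock) times reported in Table 7.5.
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds

single   = to_seconds(14, 44.30)   # standalone machine
beowulf  = to_seconds(3, 41.61)    # existing parallel system, 4 nodes
parallex = to_seconds(3, 1.62)     # Parallex, 4 nodes

print("Beowulf speedup:  %.2fx" % (single / beowulf))    # about 3.99x
print("Parallex speedup: %.2fx" % (single / parallex))   # about 4.87x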
Chapter 8. Cost Estimation
Since the growth of processing requirements is far greater than the growth of CPU power, and since the silicon chip is fast approaching its full capacity, the implementation of parallel processing at every level of computing becomes inevitable.
Therefore we propose that in the coming years parallel processing, and the algorithms that support it, like the ones we have designed and implemented, will form the heart of modern computing. Not surprisingly, parallel processing has already begun to penetrate the modern computing market directly in the form of multi-core processors such as Intel's dual-core and quad-core processors.
One of our primary aims, a simple implementation with the least administrative overhead, makes the deployment of Parallex straightforward and effective. Parallex can easily be deployed to all sectors of modern computing in which CPU-intensive applications are important to growth.
While a system of n parallel processors is less efficient than a single processor that is n times faster, the parallel system is often cheaper to build. Parallel computation is used for tasks which require very large amounts of computation, take a lot of time, and can be divided into n independent subtasks. In recent years, most high-performance computing systems, also known as supercomputers, have had parallel architectures.
Cost effectiveness is one of the major achievements of our Parallex system. We need no external or expensive hardware or software, so the system is inexpensive. Our system is based on heterogeneous clusters in which the power of an individual CPU is not an issue, thanks to our mathematical distribution algorithm; system efficiency will not drop by more than 5% because of a few slower machines.
We can therefore say that we treat silicon waste as a challenge for our system, reusing outdated, slower CPUs, which makes the design environment friendly. A further feature is the use of diskless nodes, which reduces the total cost of the system by approximately 20% since we do not use storage devices on the nodes; instead of separate storage devices we use a centralized storage solution. Last but not least, all our software tools are open source.
Hence, we conclude that our Parallex system is one of the most cost-effective systems in its genre.
Chapter 9. User Manual
9.1 Dedicated Cluster Setup
For a dedicated cluster with one master and many diskless slaves, all the user has to do is install the RPMs supplied on the installation disk on the master. The BProc configuration file will then be found at /etc/bproc/config.
9.1.1 BProc Configuration
Main configuration file: /etc/bproc/config
• Edit with your favorite text editor
• Lines consist of comments (starting with #)
• The rest are a keyword followed by arguments
• Specify the interface:
interface eth0 10.0.4.1 255.255.255.0
• eth0 is the interface connected to the nodes
• The IP of the master node is 10.0.4.1
• The netmask of the master node is 255.255.255.0
• The interface will be configured when BProc is started
Specify the range of IP addresses for the nodes:
iprange 0 10.0.4.10 10.0.4.14
• Start assigning IP addresses at node 0
• The first address is 10.0.4.10, the last is 10.0.4.14
• The size of this range determines the number of nodes in the cluster
• The next entries are the default libraries to be installed on the nodes
• Libraries can be specified explicitly, or library information can be extracted from an executable
• Add an entry to install extra libraries:
librariesfrombinary /bin/ls /usr/bin/gdb
• The bplib command can be used to see the libraries that will be loaded
The next line specifies the name of the phase 2 image:
bootfile /var/bproc/boot.img
• There should be no need to change this
• Add a line to specify the kernel command line:
kernelcommandline apm=off console=ttyS0,19200
• Turn APM support off (since these nodes don't have any)
• Set the console to use ttyS0 at a speed of 19200
• This is used by the beoboot command when building the phase 2 image
The final lines specify the Ethernet addresses of the nodes; examples are given:
#node 0 00:50:56:00:00:00
#node 00:50:56:00:00:01
• Needed so a node can learn its IP address from the master
• The first 0 is optional and assigns this address to node 0
• Ethernet addresses can be determined and added automatically using the nodeadd command
• We will use this command later, so there is no need to change anything now
• Save the file and exit from the editor
Other configuration files:
/etc/bproc/config.boot
• Specifies the PCI devices that are going to be used by the nodes at boot time
• Modules are included in the phase 1 and phase 2 boot images
• By default the node will try all network interfaces it can find
/etc/bproc/node_up.conf
• Specifies the actions to be taken in order to bring a node up:
• Load modules
• Configure network interfaces
• Probe for PCI devices
• Copy files and special devices out to the node
9.1.2 Bringing up BProc
Check that BProc will be started at boot time:
# chkconfig --list clustermatic
• Restart the master daemon and boot server:
# service bjs stop
# service clustermatic restart
# service bjs start
• Load the new configuration
• BJS uses BProc, so it needs to be stopped first
• Check that the interface has been configured correctly:
# ifconfig eth0
• It should have the IP address we specified in the config file
9.1.3 Build a Phase 2 Image
• Run the beoboot command on the master:
# beoboot -2 -n --plugin mon
• -2: this is a phase 2 image
• -n: the image will boot over the network
• --plugin: add a plugin to the boot image
• The following warning messages can be safely ignored:
WARNING: Didn't find a kernel module called gmac.o
WARNING: Didn't find a kernel module called bmac.o
• Check that the phase 2 image is available:
# ls -l /var/clustermatic/boot.img
9.1.4 Loading the Phase 2 Image
• Two Kernel Monte is a piece of software which will load a new Linux kernel, replacing one that is already running
• This allows you to use Linux as your boot loader
• Using Linux means you can use any network that Linux supports
• There is no PXE BIOS or Etherboot support for Myrinet, Quadrics or InfiniBand
• "Pink" network-boots over Myrinet, which allowed us to avoid buying a 1024-port Ethernet network
• Currently supports x86 (including AMD64) and Alpha
9.1.5 Using the Cluster
bpsh
• Migrates a process to one or more nodes
• The process is started on the front end, but is immediately migrated onto the nodes
• The effect is similar to the rsh command, but no login is performed and no shell is started
• I/O forwarding can be controlled
• Output can be prefixed with the node number
• Run the date command on all nodes which are up:
# bpsh -a -p date
• See the other arguments that are available:
# bpsh -h
bpcp
• Copies files to a node
• Files can come from the master node or from other nodes
• Note that a node only has a RAM disk by default
• Copy /etc/hosts from the master to /tmp/hosts on node 0:
# bpcp /etc/hosts 0:/tmp/hosts
# bpsh 0 cat /tmp/hosts
9.1.6 Managing the Cluster
bpstat
• Shows the status of the nodes:
• up: the node is up and available
• down: the node is down or can't be contacted by the master
• boot: the node is coming up (running node_up)
• error: an error occurred while the node was booting
• Shows the owner and group of each node
• Combined with the permissions, this determines who can start jobs on the node
• Shows the permissions of the node:
---x------ execute permission for the node owner
------x--- execute permission for users in the node group
---------x execute permission for other users
bpctl
• Controls a node's status
• Reboot node 1 (takes about a minute):
# bpctl -S 1 -R
• Set the state of node 0:
# bpctl -S 0 -s groovy
• Only up, down, boot and error have special meaning; everything else simply means "not down"
• Set the owner of node 0:
# bpctl -S 0 -u nobody
• Set the permissions of node 0 so anyone can execute a job:
# bpctl -S 0 -m 111
bplib
• Manages the libraries that are loaded on a node
• List the libraries to be loaded:
# bplib -l
• Add a library to the list:
# bplib -a /lib/libcrypt.so.1
• Remove a library from the list:
# bplib -d /lib/libcrypt.so.1
9.1.7 Troubleshooting Techniques
• The tcpdump command can be used to check for node activity during and after a node has booted
• Connect a cable to the serial port on a node to check the console output for errors in the boot process
• Once a node reaches node_up processing, messages will be logged in /var/log/bproc/node.N (where N is the node number)
9.2 Shared Cluster Setup
Once you have the basic installation completed, you'll need to configure the system. Many of the tasks are no different for machines in a cluster than for any other system. For other tasks, being part of a cluster affects what needs to be done. The following subsections describe the issues associated with several services that require special consideration.
9.2.1 DHCP
Dynamic Host Configuration Protocol (DHCP) is used to supply network configuration parameters, including IP addresses, host names, and other information, to clients as they boot. With clusters, the head node is often configured as a DHCP server and the compute nodes as DHCP clients. There are two reasons to do this. First, it simplifies the installation of compute nodes, since the information DHCP can supply is often the only thing that differs among the nodes. Since a DHCP server can handle these differences, the node installation can be standardized and automated. A second advantage of DHCP is that it is much easier to change the configuration of the network: you simply change the configuration file on the DHCP server, restart the server, and reboot each of the compute nodes.
The basic installation is rarely a problem. The DHCP system can be installed as part of the initial Linux installation or after Linux has been installed. The DHCP server configuration file, typically /etc/dhcpd.conf, controls the information distributed to the clients. If you are going to have problems, the configuration file is the most likely source.
The DHCP configuration file may be created or changed automatically when some cluster software is installed. Occasionally, the changes may not be done optimally or even correctly, so you should have at least a reading knowledge of DHCP configuration files. Here is a heavily commented sample configuration file that illustrates the basics. (Lines starting with "#" are comments.)
# A sample DHCP configuration file.
# The first commands in this file are global,
# i.e., they apply to all clients.
# Only answer requests from known machines,
# i.e., machines whose hardware addresses are given.
deny unknown-clients;
# Set the subnet mask, broadcast address, and router address.
option subnet-mask 255.255.255.0;
option broadcast-address 172.16.1.255;
option routers 172.16.1.254;
# This section defines individual cluster nodes.
# Each subnet in the network has its own section.
subnet 172.16.1.0 netmask 255.255.255.0 {
group {
# The first host, identified by the given MAC address,
# will be named node1.cluster.int, will be given the
# IP address 172.16.1.1, and will use the default router
# 172.16.1.254 (the head node in this case).
host node1{
hardware ethernet 00:08:c7:07:68:48;
fixed-address 172.16.1.1;
option routers 172.16.1.254;
option domain-name "cluster.int";
}
host node2{
hardware ethernet 00:08:c7:07:c1:73;
fixed-address 172.16.1.2;
option routers 172.16.1.254;
option domain-name "cluster.int";
}
# Additional node definitions go here.
}
}
# For servers with multiple interfaces, this entry says to ignore requests
# on specified subnets.
subnet 10.0.32.0 netmask 255.255.248.0 { not authoritative; }
As shown in this example, you should include a subnet section for each subnet on your network. If the head node has an interface for the cluster and a second interface connected to the Internet or your organization's network, the configuration file will have a group for each interface or subnet. Since the head node should answer DHCP requests for the cluster but not for the organization, DHCP should be configured so that it will respond only to DHCP requests from the compute nodes.
9.2.2 NFS
A network filesystem is a filesystem that physically resides on one computer (the file server), which in turn shares its files over the network with other computers on the network (the clients). The best-known and most common network filesystem is Network File System (NFS). In setting up a cluster, designate one computer as your NFS server. This is often the head node for the cluster, but there is no reason it has to
be. In fact, under some circumstances, you may get slightly better performance if you use different machines for the NFS server and head node. Since the server is where your user files will reside, make sure you have enough storage. This machine is a likely candidate for a second disk drive or RAID array and a fast I/O subsystem. You may even want to consider mirroring the filesystem using a small high-availability cluster.
Why use an NFS? It should come as no surprise that for parallel programming you'll need a copy of the compiled code or executable on each machine on which it will run. You could, of course, copy the executable over to the individual machines, but this quickly becomes tiresome. A shared filesystem solves this problem. Another advantage of an NFS is that all the files you will be working on will be on the same system. This greatly simplifies backups. (You do backups, don't you?) A shared filesystem also simplifies setting up SSH, as it eliminates the need to distribute keys. (SSH is described later in this chapter.) For this reason, you may want to set up NFS before setting up SSH. NFS can also play an essential role in some installation strategies.
If you have never used NFS before, setting up the client and the server are slightly different, but neither is particularly difficult. Most Linux distributions come with most of the work already done for you.
9.2.2.1 Running NFS
Begin with the server; you won't get anywhere with the client if the server isn't already running. Two things need to be done to get the server running. The file /etc/exports must be edited to specify which machines can mount which directories, and then the server software must be started. Here is a single line from the file /etc/exports on the server amy:
/home basil(rw) clara(rw) desmond(rw) ernest(rw) george(rw)
This line gives the clients basil, clara, desmond, ernest, and george read/write access to the directory /home on the server. Read access is the default. A number of other
options are available and could be included. For example, the no_root_squash option could be added if you want to edit root permission files from the nodes.
Had a space been inadvertently included between basil and (rw), read access would have been granted to basil and read/write access would have been granted to all other systems. (Once you have the systems set up, it is a good idea to use the command showmount -a to see who is mounting what.)
Once /etc/exports has been edited, you'll need to start NFS. For testing, you can use the service command as shown here:
[root@fanny init.d]# /sbin/service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
[root@fanny init.d]# /sbin/service nfs status
rpc.mountd (pid 1652) is running...
nfsd (pid 1666 1665 1664 1663 1662 1661 1660 1657) is running...
rpc.rquotad (pid 1647) is running...
(With some Linux distributions, when restarting NFS, you may find it necessary to explicitly stop and restart both nfslock and portmap as well.) You'll want to change the system configuration so that this starts automatically when the system is rebooted. For example, with Red Hat, you could use the serviceconf or chkconfig commands.
For the client, the software is probably already running on your system. You just need to tell the client to mount the remote filesystem. You can do this several ways, but in the long run the easiest approach is to edit the file /etc/fstab, adding an entry for the server. Basically, you'll add a line to the file that looks something like this:
amy:/home /home nfs rw,soft 0 0
In this example, the local system mounts the /home filesystem located on amy as the /home directory on the local machine. The filesystems may have different names. You can now manually mount the filesystem with the mount command:
[root@ida /]# mount /home
When the system reboots, this will be done automatically.
When using NFS, you should keep a couple of things in mind. The mount point, /home, must exist on the client prior to mounting. While the remote directory is mounted, any files that were stored on the local system in the /home directory will be inaccessible. They are still there; you just can't get to them while the remote directory is mounted. Next, if you are running a firewall, it will probably block NFS traffic. If you are having problems with NFS, this is one of the first things you should check. File ownership can also create some surprises. User and group IDs should be consistent among systems using NFS, i.e., each user will have identical IDs on all systems. Finally, be aware that root privileges don't extend across NFS shared systems (if you have configured your systems correctly). So if, as root, you change the directory (cd) to a remotely mounted filesystem, don't expect to be able to look at every file. (Of course, as root you can always use su to become the owner and do all the snooping you want.) Details of the syntax and options can be found in the nfs(5), exports(5), fstab(5), and mount(8) manpages.
9.2.3 SSH
To run software across a cluster, you'll need some mechanism to start processes on each machine. In practice, a prerequisite is the ability to log onto each machine within the cluster. If you need to enter a password for each machine each time you run a program, you won't get very much done. What is needed is a mechanism that allows logins without passwords.
This boils down to two choices: you can use remote shell (RSH) or secure shell (SSH). If you are a trusting soul, you may want to use RSH. It is simpler to set up, with less overhead. On the other hand, SSH network traffic is encrypted, so it is safe from snooping. Since SSH provides greater security, it is generally the preferred approach.
SSH provides mechanisms to log onto remote machines, run programs on remote machines, and copy files among machines. SSH is a replacement for ftp, telnet, rlogin, rsh, and rcp. A commercial version of SSH is available from SSH Communications Security (http://www.ssh.com), a company founded by Tatu Ylönen, an original developer of SSH. Or you can go with OpenSSH, an open source version from http://www.openssh.org.
OpenSSH is the easiest, since it is already included with most Linux distributions. It has other advantages as well. By default, OpenSSH automatically forwards the DISPLAY variable. This greatly simplifies using the X Window System across the cluster. If you are running an SSH connection under X on your local machine and execute an X program on the remote machine, the X window will automatically open on the local machine. This can be disabled on the server side, so if it isn't working, that is the first place to look.
There are two sets of SSH protocols, SSH-1 and SSH-2. Unfortunately, SSH-1 has a serious security vulnerability. SSH-2 is now the protocol of choice. This discussion will focus on using OpenSSH with SSH-2.
Before setting up SSH, check to see if it is already installed and running on your system. With Red Hat, you can check to see what packages are installed using the package manager.
[root@fanny root]# rpm -q -a | grep ssh
openssh-3.5p1-6
openssh-server-3.5p1-6
openssh-clients-3.5p1-6
openssh-askpass-gnome-3.5p1-6
openssh-askpass-3.5p1-6
This particular system has the SSH core package, both server and client software, as well as additional utilities. The SSH daemon is usually started as a service. As you can see, it is already running on this machine.
[root@fanny root]# /sbin/service sshd status
sshd (pid 28190 1658) is running...
Of course, it is possible that it wasn't started as a service but is still installed and running. You can use ps to double check.
[root@fanny root]# ps -aux | grep ssh
root 29133 0.0 0.2 3520 328 ? S Dec09 0:02 /usr/sbin/sshd
...
Again, this shows the server is running.
With some older Red Hat installations, e.g., the 7.3 workstation, only the client software is installed by default. You'll need to manually install the server software. If using Red Hat 7.3, go to the second install disk and copy over the file RedHat/RPMS/openssh-server-3.1p1-3.i386.rpm. (Better yet, download the latest
version of this software.) Install it with the package manager and then start the service.
[root@james root]# rpm -vih openssh-server-3.1p1-3.i386.rpm
Preparing... ########################################### [100%]
1:openssh-server ########################################### [100%]
[root@james root]# /sbin/service sshd start
Generating SSH1 RSA host key: [ OK ]
Generating SSH2 RSA host key: [ OK ]
Generating SSH2 DSA host key: [ OK ]
Starting sshd: [ OK ]
When SSH is started for the first time, encryption keys for the system are generated. Be sure to set this up so that it is done automatically when the system reboots.
Configuration files for both the server, sshd_config, and the client, ssh_config, can be found in /etc/ssh, but the default settings are usually quite reasonable. You shouldn't need to change these files.
9.2.3.1 Using SSH
To log onto a remote machine, use the command ssh with the name or IP address of the remote machine as an argument. The first time you connect to a remote machine, you will receive a message with the remote machine's fingerprint, a string that identifies the machine. You'll be asked whether to proceed or not. This is normal.
[root@fanny root]# ssh amy
The authenticity of host 'amy (10.0.32.139)' can't be established.
RSA key fingerprint is 98:42:51:3e:90:43:1c:32:e6:c4:cc:8f:4a:ee:cd:86.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'amy,10.0.32.139' (RSA) to the list of known hosts.
root@amy's password:
Last login: Tue Dec 9 11:24:09 2003
[root@amy root]#
The fingerprint will be recorded in a list of known hosts on the local machine. SSH will compare fingerprints on subsequent logins to ensure that nothing has changed. You won't see anything else about the fingerprint unless it changes. Then SSH will warn you and query whether you should continue. If the remote system has changed, e.g., if it has been rebuilt or if SSH has been reinstalled, it's OK to proceed. But if you think the remote system hasn't changed, you should investigate further before logging in.
Notice in the last example that SSH automatically uses the same identity when logging into a remote machine. If you want to log on as a different user, use the -l option with the appropriate account name.
You can also use SSH to execute commands on remote systems. Here is an example of using date remotely.
[root@fanny root]# ssh -l sloanjd hector date
sloanjd@hector's password:
Mon Dec 22 09:28:46 EST 2003
Notice that a different account, sloanjd, was used in this example.
To copy files, you use the scp command. For example,
[root@fanny root]# scp /etc/motd george:/root/
root@george's password:
motd 100% |*****************************| 0 00:00
Here the file /etc/motd was copied from fanny to the /root directory on george.
In the examples thus far, the system has asked for a password each time a command was run. If you want to avoid this, you'll need to do some extra work. You'll need to generate a pair of authorization keys that will be used to control access and then store these in the directory ~/.ssh. The ssh-keygen command is used to generate keys.
[sloanjd@fanny sloanjd]$ ssh-keygen -b1024 -trsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/sloanjd/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/sloanjd/.ssh/id_rsa.
Your public key has been saved in /home/sloanjd/.ssh/id_rsa.pub.
The key fingerprint is:
2d:c8:d1:e1:bc:90:b2:f6:6d:2e:a5:7f:db:26:60:3f sloanjd@fanny
[sloanjd@fanny sloanjd]$ cd .ssh
[sloanjd@fanny .ssh]$ ls -a
. .. id_rsa id_rsa.pub known_hosts
The options in this example are used to specify a 1,024-bit key and the RSA algorithm. (You can use DSA instead of RSA if you prefer.) Notice that SSH will prompt you for a passphrase, basically a multi-word password.
Two keys are generated, a public and a private key. The private key should never be shared and resides only on the client machine. The public key is distributed to remote machines. Copy the public key to each system you'll want to log onto, renaming it authorized_keys2.
[sloanjd@fanny .ssh]$ cp id_rsa.pub authorized_keys2
[sloanjd@fanny .ssh]$ chmod go-rwx authorized_keys2
[sloanjd@fanny .ssh]$ chmod 755 ~/.ssh
If you are using NFS, as shown here, all you need to do is copy and rename the file in the current directory. Since that directory is mounted on each system in the cluster, it is automatically available.
If you used the NFS setup described earlier, root's home directory, /root, is not shared. If you want to log in as root
without a password, manually copy the public keys to the target machines. You'll need to decide whether you feel secure setting up the root account like this.
You will use two utilities supplied with SSH to manage the login process. The first is an SSH agent program that caches private keys, ssh-agent. This program stores the keys locally and uses them to respond to authentication queries from SSH clients. The second utility, ssh-add, is used to manage the local key cache. Among other things, it can be used to add, list, or remove keys.
[sloanjd@fanny .ssh]$ ssh-agent $SHELL
[sloanjd@fanny .ssh]$ ssh-add
Enter passphrase for /home/sloanjd/.ssh/id_rsa:
Identity added: /home/sloanjd/.ssh/id_rsa (/home/sloanjd/.ssh/id_rsa)
(While this example uses the $SHELL variable, you can substitute the actual name of the shell you want to run if you wish.) Once this is done, you can log in to remote machines without a password.
This process can be automated to varying degrees. For example, you can add the call to ssh-agent as the last line of your login script so that it will be run before you make any changes to your shell's environment. Once you have done this, you'll need to run ssh-add only when you log in. But you should be aware that Red Hat console logins don't like this change.
You can find more information by looking at the ssh(1), ssh-agent(1), and ssh-add(1) manpages. If you want more details on how to set up ssh-agent, you might look at SSH, The Secure Shell by Barrett and Silverman, O'Reilly, 2001. You can also find
