Parallex - The Supercomputer

Bachelor thesis - Parallex - The Supercomputer. Memorable days.

Transcript

  • 1. The Super Computer
  • 2. PARALLEX – THE SUPER COMPUTER
    A PROJECT REPORT submitted by Mr. AMIT KUMAR, Mr. ANKIT SINGH, and Mr. SUSHANT BHADKAMKAR in partial fulfillment for the award of the degree of BACHELOR OF ENGINEERING IN COMPUTER SCIENCE.
    Guide: MR. ANIL KADAM
    AISSMS'S COLLEGE OF ENGINEERING, PUNE
    UNIVERSITY OF PUNE, 2007 - 2008
  • 3. CERTIFICATE
    Certified that this project report "Parallex - The Super Computer" is the bonafide work of Mr. AMIT KUMAR (Seat No.: B3*****7), Mr. ANKIT SINGH (Seat No.: B3*****8), and Mr. SUSHANT BHADKAMKAR (Seat No.: B3*****2), who carried out the project work under my supervision.
    Prof. M. A. Pradhan, HEAD OF DEPARTMENT
    Prof. Anil Kadam, GUIDE
  • 4. Acknowledgment
    The success of any project is never limited to the individual undertaking it; it is the collective effort of the people around that individual that spells success. Some key personalities played a vital role in paving the way for the success of this project, and we take the opportunity to express our sincere thanks and gratitude to them. We would like to thank all the faculty (teaching and non-teaching) of the Computer Engineering Department of AISSMS College of Engineering, Pune. Our project guide, Prof. Anil Kadam, was very generous with his time and knowledge. We are grateful to Mr. Shasikant Athavale, who was a source of constant motivation and inspiration for us. We are very thankful for the valuable suggestions constantly given by Prof. Nitin Talhar and Ms. Sonali Nalamwar, which proved to be very helpful for the success of our project. Our deepest gratitude goes to Prof. M. A. Pradhan for her thoughtful comments and gentle support during the academic year. We would also like to thank the college authorities for providing full support regarding the lab, network, and related software.
  • 5. Abstract
    Parallex is a parallel processing cluster consisting of control nodes and execution nodes. Our implementation removes all requirements for kernel-level modifications and kernel patches needed to run a Beowulf cluster system. A typical Parallex cluster can have many control nodes, and these control nodes no longer just monitor but also take part in execution if resources permit. We have removed the restrictions of kernel, architecture, and platform dependencies, making our cluster system work with completely different sets of CPU powers, operating systems, and architectures, without the use of any existing parallel libraries such as MPI or PVM.
    With a radically new perspective on how a parallel system is supposed to work, we have implemented our own distribution and parallel algorithms aimed at ease of administration and simplicity of usage, without compromising efficiency. With a fully modular 7-step design we attack the traditional complications and deficiencies of existing parallel systems, such as redundancy, scheduling, cluster accounting, and parallel monitoring.
    A typical Parallex cluster may consist of a few old 386s running NetBSD, some ultra-modern Intel Dual Core machines running Linux, and some server-class MIPS processors running IRIX, all working in parallel with full homogeneity.
  • 6. Table of Contents
    LIST OF FIGURES
    LIST OF TABLES
    1. A General Introduction
    1.1 Basic concepts
    1.2 Promises and challenges
    1.2.1 Processing technology
    1.2.2 Networking technology
    1.2.3 Software tools and technology
    1.3 Current scenario
    1.3.1 End user perspectives
    1.3.2 Industrial perspective
    1.3.3 Developers', researchers' & scientists' perspective
    1.4 Obstacles, and why we don't have 10 GHz today
    1.5 Myths and realities: 2 x 3 GHz < 6 GHz
    1.6 The problem statement
    1.7 About PARALLEX
    1.8 Motivation
    1.9 Features of PARALLEX
    1.10 Why our design is an "alternative" parallel system
    1.11 Innovation
    2. REQUIREMENT ANALYSIS
    2.1 Determining the overall mission of Parallex
    2.2 Functional requirements for the Parallex system
    2.3 Non-functional requirements for the system
    3. PROJECT PLAN
  • 7. 4. SYSTEM DESIGN
    5. IMPLEMENTATION DETAILS
    5.1 Hardware architecture
    5.2 Software architecture
    5.3 Description of software behavior
    5.3.1 Events
    5.3.2 States
    6. TECHNOLOGIES USED
    6.1 General terms
    7. TESTING
    8. COST ESTIMATION
    9. USER MANUAL
    9.1 Dedicated cluster setup
    9.1.1 BProc configuration
    9.1.2 Bringing up BProc
    9.1.3 Build phase 2 image
    9.1.4 Loading phase 2 image
    9.1.5 Using the cluster
    9.1.6 Managing the cluster
    9.1.7 Troubleshooting techniques
    9.2 Shared cluster setup
    9.2.1 DHCP
    9.2.2 NFS
    9.2.2.1 Running NFS
    9.2.3 SSH
    9.2.3.1 Using SSH
    9.2.4 Host file and name service
    9.3 Working with PARALLEX
  • 8. 10. CONCLUSION
    11. FUTURE ENHANCEMENTS
    12. REFERENCES
    APPENDIX A
    APPENDIX B
    GLOSSARY
    MEMORABLE JOURNEY (PHOTOS)
    PARALLEX ACHIEVEMENTS
  • 9. I. LIST OF FIGURES:
    1.1 High-performance distributed system
    1.2 Transistors vs. clock speed
    4.1 Design framework
    4.2 Parallex design
    5.1 Parallel system H/W architecture
    5.2 Parallel system S/W architecture
    7.1 Cyclomatic diagram for the system
    7.2 System usage pattern
    7.3 Histogram
    7.4 One frame from complex rendering on Parallex: simulation of an explosion
    II. LIST OF TABLES:
    1.1 Project plan
    7.1 Logic/coverage/decision testing
    7.2 Functional tests
    7.3 Console test cases
    7.4 Black-box testing
    7.5 Benchmark results
  • 10. Chapter 1. A General Introduction
    1.1 BASIC CONCEPTS
    The last two decades spawned a revolution in the world of computing: a move away from central mainframe-based computing to network-based computing. Today, servers are fast achieving the levels of CPU performance, memory capacity, and I/O bandwidth once available only in mainframes, at a cost orders of magnitude below that of a mainframe. Servers are being used to solve computationally intensive problems in science and engineering that once belonged exclusively to the domain of supercomputers. A distributed computing system is the system architecture that makes a collection of heterogeneous computers, workstations, or servers act and behave as a single computing system. In such a computing environment, users can uniformly access and name local or remote resources, and run processes from anywhere in the system, without being aware of which computers their processes are running on. Distributed computing systems have been studied extensively by researchers, and a great many claims and benefits have been made for using such systems. In fact, it is hard to rule out any desirable feature of a computing system that has not been claimed to be offered by a distributed system [24]. However, the current advances in processing and networking technology and software tools make it feasible to achieve the following advantages:
    • Increased performance. The existence of multiple computers in a distributed system allows applications to be processed in parallel and thus improves application and system performance. For example, the performance of a file system can be improved by replicating its functions over several computers; the file replication allows several applications to access that file system in parallel. Furthermore, file replication distributes network traffic associated with file access across the various sites and thus reduces network contention and queuing delays.
    • Sharing of resources. Distributed systems are cost-effective and enable efficient access to all system resources. Users can share special-purpose and sometimes
  • 11. expensive hardware and software resources such as database servers, compute servers, virtual reality servers, multimedia information servers, and printer servers, to name just a few.
    • Increased extendibility. Distributed systems can be designed to be modular and adaptive so that for certain computations the system will configure itself to include a large number of computers and resources, while in other instances it will just consist of a few resources. Furthermore, limitations in file system capacity and computing power can be overcome by adding more computers and file servers to the system incrementally.
    • Increased reliability, availability, and fault tolerance. The existence of multiple computing and storage resources in a system makes it attractive and cost-effective to introduce fault tolerance to distributed systems. The system can tolerate the failure of one computer by allocating its tasks to another available computer. Furthermore, by replicating system functions and/or resources, the system can tolerate one or more component failures.
    • Cost-effectiveness. The performance of computers has been approximately doubling every two years, while their cost has decreased by half every year during the last decade. Furthermore, emerging high-speed network technology [e.g., wave-division multiplexing, asynchronous transfer mode (ATM)] will make the development of distributed systems attractive in terms of the price/performance ratio compared to that of parallel computers. These advantages cannot be achieved easily, because designing a general-purpose distributed computing system is several orders of magnitude more difficult than designing centralized computing systems: designing a reliable general-purpose distributed system involves a large number of options and decisions, such as the physical system configuration, communication network and computing platform characteristics, task scheduling and resource allocation policies and mechanisms, consistency control, concurrency control, and security, to name just a few. The difficulties can be attributed to many factors related to the lack of maturity in the distributed computing field, the asynchronous and independent behavior of the
  • 12. systems, and the geographic dispersion of the system resources. These are summarized in the following points:
    • There is a lack of a proper understanding of distributed computing theory: the field is relatively new, and we need to design and experiment with a large number of general-purpose reliable distributed systems with different architectures before we can master the theory of designing such computing systems. One interesting explanation for the lack of understanding of the design process of distributed systems was given by Mullender, who compared the design of a distributed system to the design of a reliable national railway system that took a century and a half to be fully understood and mature. Similarly, distributed systems (which have been around for approximately two decades) need to evolve into several generations of different design architectures before their designs, structures, and programming techniques can be fully understood and mature.
    • The asynchronous and independent behavior of the system resources and/or (hardware and software) components complicates the control software that aims at making them operate as one centralized computing system. If the computers are structured in a master–slave relationship, the control software is easier to develop and system behavior is more predictable. However, this structure conflicts with the distributed system property that requires computers to operate independently and asynchronously.
    • The use of a communication network to interconnect the computers introduces another level of complexity. Distributed system designers not only have to master the design of the computing systems and system software and services, but also have to master the design of reliable communication networks, how to achieve synchronization and consistency, and how to handle faults in a system composed of geographically dispersed heterogeneous computers. The number of resources involved in a system can vary from a few to hundreds, thousands, or even hundreds of thousands of computing and storage resources.
    Despite these difficulties, there has been limited success in designing special-purpose distributed systems such as banking systems, online transaction systems, and point-of-sale systems. However, the design of a general-purpose reliable distributed system
  • 13. that has the advantages of both centralized systems (accessibility, management, and coherence) and networked systems (sharing, growth, cost, and autonomy) is still a challenging task. Kleinrock makes an interesting analogy between human-made computing systems and the brain. He points out that the brain is organized and structured very differently from our present computing machines. Nature has been extremely successful in implementing distributed systems that are far more intelligent and impressive than any computing machines humans have yet devised. We have succeeded in manufacturing highly complex devices capable of high-speed computation and massive accurate memory, but we have not gained sufficient understanding of distributed systems; our systems are still highly constrained and rigid in their construction and behavior. The gap between natural and man-made systems is huge, and more research is required to bridge this gap and to design better distributed systems. In the next section we present a design framework to better understand the architectural design issues involved in developing and implementing high-performance distributed computing systems. A high-performance distributed system (HPDS) (Figure 1.1) includes a wide range of computing resources, such as workstations, PCs, minicomputers, mainframes, supercomputers, and other special-purpose hardware units. The underlying network interconnecting the system resources can span LANs, MANs, and even WANs, can have different topologies (e.g., bus, ring, full connectivity, random interconnect), and can support a wide range of communication protocols.
  • 14. Fig. 1.1 High-performance distributed system.
    1.2 PROMISES AND CHALLENGES OF PARALLEL AND DISTRIBUTED SYSTEMS
    The proliferation of high-performance systems and the emergence of high-speed networks (terabit networks) have attracted a lot of interest in parallel and distributed computing. The driving forces toward this end are:
    (1) the advances in processing technology,
    (2) the availability of high-speed networks, and
    (3) the increasing research efforts directed toward the development of software support and programming environments for distributed computing.
    Further, with the increasing requirements for computing power and the diversity of computing requirements, it is apparent that no single computing platform will meet all these requirements. Consequently, future computing environments need to capitalize on and effectively utilize the existing heterogeneous computing resources. Only parallel and distributed systems provide the potential of achieving such an integration of resources and technologies in a feasible manner while retaining desired usability and flexibility. Realization of this potential, however, requires advances on a
  • 15. number of fronts: processing technology, network technology, and software tools and environments.
    1.2.1 Processing Technology
    Distributed computing relies to a large extent on the processing power of the individual nodes of the network. Microprocessor performance has been growing at a rate of 35 to 70 percent during the last decade, and this trend shows no indication of slowing down in the current decade. The enormous power of future generations of microprocessors, however, cannot be utilized without corresponding improvements in memory and I/O systems. Research in main-memory technologies, high-performance disk arrays, and high-speed I/O channels is, therefore, critical to efficiently utilize the advances in processing technology and the development of cost-effective high-performance distributed computing.
    1.2.2 Networking Technology
    The performance of distributed algorithms depends to a large extent on the bandwidth and latency of communication among work nodes. Achieving high bandwidth and low latency involves not only fast hardware, but also efficient communication protocols that minimize the software overhead. Developments in high-speed networks provide gigabit bandwidths over local area networks as well as wide area networks at moderate cost, thus increasing the geographical scope of high-performance distributed systems.
    The problem of providing the required communication bandwidth for distributed computational algorithms is now relatively easy to solve given the mature state of fiber-optic and optoelectronic device technologies. Achieving the necessary low latencies, however, remains a challenge. Reducing latency requires progress on a number of fronts. First, current communication protocols do not scale well to a high-speed environment. To keep latencies low, it is desirable to execute the entire protocol stack, up to the transport layer, in hardware. Second, the communication interface of the operating system must be streamlined to allow direct transfer of data from the network interface to the memory space of the application program. Finally, the speed
  • 16. of light (approximately 5 microseconds per kilometer) poses the ultimate limit to latency. In general, achieving low latency requires a two-pronged approach:
    1. Latency reduction. Minimize protocol-processing overhead by using streamlined protocols executed in hardware and by improving the network interface of the operating system.
    2. Latency hiding. Modify the computational algorithm to hide latency by pipelining communication and computation.
    These problems are now perhaps the most fundamental to the success of parallel and distributed computing, a fact that is increasingly being recognized by the research community.
    1.2.3 Software Tools and Environments
    The development of parallel and distributed applications is a nontrivial process and requires a thorough understanding of the application and the architecture. Although a parallel and distributed system provides the user with enormous computing power and a great deal of flexibility, this flexibility implies increased degrees of freedom which have to be optimized in order to fully exploit the benefits of the distributed system. For example, during software development, the developer is required to select the optimal hardware configuration for the particular application, the best decomposition of the problem on the hardware configuration selected, the best communication and synchronization strategy to be used, and so on. The set of reasonable alternatives that have to be evaluated in such an environment is very large, and selecting the best alternative among these is a nontrivial task. Consequently, there is a need for a set of simple and portable software development tools that can assist the developer in appropriately distributing the application computations to make efficient use of the underlying computing resources. Such a set of tools should span the software life cycle and must support the developer during each stage of application development, starting from the specification and design formulation stages, through the programming, mapping, distribution, scheduling, tuning, and debugging stages, up to the evaluation and maintenance stages.
  • 17. 1.3 Current Scenario
    The current scenario of parallel systems can be viewed from three perspectives. A common concept that applies to all of them is the idea of Total Ownership Cost (TOC). By far, TOC is the common scale on which the level of computer processing is assessed worldwide. TOC is defined as the ratio of the total cost of implementation and maintenance to the net throughput the parallel cluster delivers:
    TOC = (total cost of implementation and maintenance) / (net system throughput, in floating-point operations per second)
    1.3.1 End user perspectives
    Various activities such as rendering, Adobe Photoshop applications, and other processes come under this category. As the need for processing power increases day by day, hardware cost increases with it. From the end-user perspective, the parallel system aims to reduce expenses and avoid complexity. At this stage we are trying to implement a parallel system which is more cost effective and user friendly. However, for the end user, TOC is less important in most cases because parallel clusters are rarely owned by a single user; in that case the net throughput of the parallel system becomes the most crucial factor.
    1.3.2 Industrial Perspective
    In corporate sectors, parallel systems are extensively implemented. Such parallel systems consist of machines that, theoretically, have to handle millions of nodes. From the industrial point of view the parallel system aims at resource isolation, replacing large-scale dedicated commodity hardware and mainframes. Corporate sectors often place TOC as the primary criterion by which a parallel cluster is judged. With increasing scalability, the cost of owning parallel clusters shoots up to unmanageable heights, and our primary aim in this area is to bring down the TOC as much as possible.
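    The same ratio can be written compactly in LaTeX notation; the symbols below are chosen here for convenience and are not from the report:

        \mathrm{TOC} = \frac{C_{\text{implementation}} + C_{\text{maintenance}}}{T_{\text{net}}}

    where T_net is the net system throughput in floating-point operations per second, so a cluster that costs less to build and maintain, or that delivers more FLOP/s, has a lower TOC.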
  • 18. 1.3.3 Developers', Researchers' & Scientists' Perspective
    Scientific applications such as 3D simulations, high-scale scientific rendering, intense numerical calculations, complex programming logic, and large-scale implementations of algorithms (BLAS and FFT libraries) require levels of processing and calculation that no modern-day dedicated vector CPU could possibly meet. Consequently, parallel systems have proven to be the only, and the most efficient, alternative for keeping pace with modern scientific advancement and research. TOC is rarely a matter of concern here.
    1.4 Obstacles, and why we don't have 10 GHz today
    Fig 1.2 Transistors vs. Clock Speed
    CPU performance growth as we have known it hit a wall. The figure graphs the history of Intel chip introductions by clock speed and number of transistors. The number of transistors continues to climb, at least for now. Clock speed, however, is a different story.
  • 19. Around the beginning of 2003, you'll note a disturbing sharp turn in the previous trend toward ever-faster CPU clock speeds. We have added lines to show the limit trends in maximum clock speed; instead of continuing on the previous path, as indicated by the thin dotted line, there is a sharp flattening. It has become harder and harder to exploit higher clock speeds due to not just one but several physical issues, notably heat (too much of it and too hard to dissipate), power consumption (too high), and current leakage problems.
    Sure, Intel has samples of their chips running at even higher speeds in the lab, but only by heroic efforts, such as attaching hideously impractical quantities of cooling equipment. You won't have that kind of cooling hardware in your office any day soon, let alone on your lap while computing on the plane.
    1.5 Myths and Realities: 2 x 3 GHz < 6 GHz
    So a dual-core CPU that combines two 3 GHz cores practically offers 6 GHz of processing power. Right?
    Wrong. Even having two threads running on two physical processors doesn't mean getting two times the performance. Similarly, most multi-threaded applications won't run twice as fast on a dual-core box. They should run faster than on a single-core CPU; the performance gain just isn't linear, that's all.
    Why not? First, there is coordination overhead between the cores to ensure cache coherency (a consistent view of the cache, and of main memory) and to perform other handshaking. Today, a two- or four-processor machine isn't really two or four times as fast as a single CPU even for multi-threaded applications. The problem remains essentially the same even when the CPUs in question sit on the same die.
    Second, unless the two cores are running different processes, or different threads of a single process that are well written to run independently and almost never wait for each other, they won't be well utilized. (Despite this, we speculate that today's single-threaded applications, as actually used in the field, could see a performance boost for most users by going to a dual-core chip, not because the extra core is actually doing anything useful, but because it is running the adware and spyware that infest many users' systems and are otherwise slowing down the single CPU
  • 20. that the user has today. We leave it up to you to decide whether adding a CPU to run your spyware is the best solution to that problem.)
    If you're running a single-threaded application, then the application can only make use of one core. There should be some speedup as the operating system and the application can run on separate cores, but typically the OS isn't going to be maxing out the CPU anyway, so one of the cores will be mostly idle. (Again, the spyware can share the OS's core most of the time.)
    1.6 The problem statement
    Let us now summarize and define the problem statement:
    • Since the growth of processing requirements is far greater than the growth of CPU power, and since the silicon chip is fast approaching its full capacity, the implementation of parallel processing at every level of computing becomes inevitable.
    • There is a need for a single and complete clustering solution which requires minimum user interference but at the same time supports editing/modifications to suit the user's requirements.
    • There should be no need to modify existing applications.
    • The parallel system must be able to support different platforms.
    • The system should be able to fully utilize all the available hardware resources without the need to buy any extra or special kind of hardware.
    1.7 About PARALLEX
    While the term parallel is often used to describe clusters, they are more correctly described as a type of distributed computing. Typically, the term parallel computing refers to tightly coupled sets of computation. Distributed computing is usually used to describe computing that spans multiple machines or multiple locations. When several pieces of data are being processed simultaneously in the same CPU, this might be called a parallel computation, but would never be described as a distributed computation. Multiple CPUs within a single enclosure might be used for
  • 21. parallel computing, but would not be an example of distributed computing. When talking about systems of computers, the term parallel usually implies a homogeneous collection of computers, while distributed computing typically implies a more heterogeneous collection. Computations that are done asynchronously are more likely to be called distributed than parallel. Clearly, the terms parallel and distributed lie at either end of a continuum of possible meanings. In any given instance, the exact meanings depend upon the context. The distinction is more one of connotations than of clearly established usage.
    Parallex is both a parallel and a distributed cluster because it supports both multiple CPUs within a single enclosure and a heterogeneous collection of computers.
    1.8 Motivation
    The motivation behind this project is to provide a cheap and easy-to-use solution that caters to the high-performance computing requirements of organizations without the need to install any expensive hardware.
    In many organizations, including our college, we have observed that when old systems are replaced by newer ones, the older ones are generally dumped or sold at throwaway prices. We also wanted to find a solution to effectively use this "silicon waste". These wasted resources can easily be added to our system as the processing need increases, because the parallel system is linearly scalable and hardware independent. Thus the intent is to have an environment-friendly and effective solution that utilizes all the available CPU power to execute applications faster.
    1.9 Features of Parallex
    • Parallex simplifies the cluster setup, configuration, and management process.
    • It supports machines with hard disks as well as diskless machines running at the same time.
    • It is flexible in design and easily adaptable.
    • Parallex does not require any special kind of hardware.
  • 22. • It is multi-platform compatible.
    • It ensures efficient utilization of silicon waste (old unused hardware).
    • Parallex is scalable.
    How these features are achieved and the details of the design will be discussed in subsequent chapters.
    1.10 Why our design is an "alternative" parallel system
    Every established technology needs to evolve after a particular time, as each new generation addresses the shortcomings of the technology used earlier. What we achieved is a bare-bones semantic of a parallel system.
    While studying parallel and distributed systems, we had the advantage of working with the latest technology; the parallel systems designed by scientists before us were, no doubt, the work of people far more experienced than we are. Our system is unique because we actually split up the task according to the processing power of the nodes instead of just load balancing. Hence a slow node gets a smaller task compared to a faster one, and all nodes return their output at the same calculated time on the master node.
    The difficulty we faced was deciding how much of the task should be given to each machine in a heterogeneous system in order to get all results at the same time. We worked on this problem and developed a mathematical distribution algorithm, which was successfully implemented and is functional. This algorithm splits the task according to the speed of the CPUs by sending a test application to all nodes and storing the return time of each node in a file. We then worked on automating the entire system. We were using passwordless secure shell login and the network file system. We were successful to some extent, but full automation was not possible with the SSH and NFS configuration; having to set up every new node manually is a demerit of SSH and NFS. To overcome this demerit we looked for an alternative solution, the Beowulf cluster, but after studying it we concluded that it considers all nodes to be of the same configuration and sends tasks equally to all nodes.
    To improve our system we had to think differently from the Beowulf cluster. We tried to make the system more cost effective. We considered the diskless cluster concept in order to get rid of hard disks, to cut cost and enhance the reliability of the machines. The storage
  • 23. device affects the performance of the entire system, increases the cost (due to replacement of disks), and increases the time wasted in locating faults. So, we studied and patched the Beowulf server and the Beowulf distributed process space according to the needs of our system. We made kernel images for running diskless clusters using the RARP protocol. When a cluster node runs the kernel image in its memory, it requests an IP from the master node, which can also be called the server. The server assigns the IP and node number of the cluster node. With this, our diskless cluster system stands ready for use in parallel computing. We then modified our various codes, including our own distribution algorithm, according to the new design. The best part of our system is that there is no need for any authorization setup; everything is now automatic.
    Until now, we were working on CODE LEVEL PARALLELISM, in which we slightly modify the code to run on our system, much as MPI libraries are used to make code executable in parallel. The next challenge was: what if we do not get source code, but instead a binary file to execute on our parallel system? We then need to enhance our system by adding BINARY LEVEL PARALLELISM. We studied openMosix. Once openMosix is installed and all the nodes are booted, the openMosix nodes see each other in the cluster and start exchanging information about their load level and resource usage. Once the load increases beyond the defined level, a process migrates to another node on the network. There might be a situation where a process demands heavy resource usage; it may then happen that the process keeps migrating from node to node without being serviced. This is a major design flaw of openMosix, and we are working out a solution for it.
    So, our design is an ALTERNATIVE to all the problems in the world of parallel computing.
    1.11 Innovation
    First, our system does not require any additional hardware if the existing machines are well connected in a network. Second, even in a heterogeneous environment with a few fast CPUs and a few slower ones, the efficiency of the system does not drop by more than 1 to 5%, still maintaining an efficiency of around 80% for suitably adapted applications. This is because the mathematical distribution algorithm
  • 24. considers the relative processing powers of the nodes, distributing only the amount of load that a node can process in the calculated optimal time of the system. All the nodes process their respective tasks and produce output at this calculated time. The most important point about our system is the ability to use diskless nodes in the cluster, thereby reducing hardware costs, space, and the required maintenance. Also, in the case of binary executables (when source code is not available) our system exhibits almost 20% performance gains.
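    To make the distribution idea above concrete, here is a minimal C sketch (not the project's actual master script) of splitting a job in inverse proportion to each node's measured benchmark time, so that slower nodes get smaller shares and all nodes finish at roughly the same moment. The benchmark times, node count, and the total of 800 work units are hypothetical values chosen only for illustration.

        #include <stdio.h>

        /* Split total_units of work in inverse proportion to each node's
         * measured benchmark time: slower nodes get smaller shares, so all
         * nodes finish at about the same time. */
        void distribute(const double bench_time[], int shares[], int nodes, int total_units)
        {
            double speed_sum = 0.0;
            for (int i = 0; i < nodes; i++)
                speed_sum += 1.0 / bench_time[i];      /* speed is proportional to 1 / time */

            int assigned = 0;
            for (int i = 0; i < nodes; i++) {
                shares[i] = (int)(total_units * (1.0 / bench_time[i]) / speed_sum);
                assigned += shares[i];
            }
            shares[0] += total_units - assigned;        /* rounding remainder goes to node 0 */
        }

        int main(void)
        {
            /* Hypothetical return times (seconds) of the test application on 4 nodes. */
            double bench_time[] = { 2.0, 4.0, 4.0, 8.0 };
            int shares[4];

            distribute(bench_time, shares, 4, 800);     /* e.g. 800 animation frames */
            for (int i = 0; i < 4; i++)
                printf("node %d -> %d units\n", i, shares[i]);
            return 0;
        }

    With these example times, a node that returned the test application twice as fast receives twice the work, which matches the behavior described above: task size, not just load, is scaled to node speed.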
  • 25. Chapter 2. Requirement Analysis
    2.1 Determining the overall mission of Parallex
    • User base: students, educational institutes, small to medium business organizations.
    • Cluster usage: one part of the cluster is fully dedicated to solving the problem at hand, with an optional part where computing resources from individual workstations are used. In the latter part, the parallel problems have lower priorities.
    • Software to be run on the cluster: depends upon the user base. At the cluster management level, the system software will be Linux.
    • Dedicated or shared cluster: as mentioned above, it will be both.
    • Extent of the cluster: computers that are all on the same subnet.
    2.2 Functional Requirements for the Parallex system
    Functional Requirement 1
    The PCs must be connected in a LAN so as to enable the system to be used without any obstacles.
    Functional Requirement 2
    There will be one master or controlling node which will distribute the task according to the processing speed of the nodes.
    Services
    Three services are to be provided on the master:
    1. A network monitoring tool for resource discovery (e.g. IP addresses, MAC addresses, up/down status, etc.).
    2. The distribution algorithm, which distributes the task according to the current processing speed of the nodes.
    3. The Parallex master script, which sends the distributed tasks to the nodes, gets back the results, integrates them, and gives out the output.
  • 26. Functional Requirement 3
    The final size of the executable code should be such that it can reside within the limited memory constraints of the machine.
    Functional Requirement 4
    This product will only be used to speed up applications which already exist in the enterprise.
    2.3 Non-Functional Requirements for the system
    - Performance
    Even in a heterogeneous environment, with a few fast CPUs and a few slower ones, the efficiency of the system does not drop by more than 1 to 5%, still maintaining an efficiency of around 80% for suitably adapted applications. This is because the mathematical distribution algorithm considers the relative processing powers of the nodes, distributing only the amount of load that a node can process in the calculated optimal time of the system. All the nodes process their respective tasks and produce output at this calculated time. The most important point about our system is the ability to use diskless nodes in the cluster, thereby reducing hardware costs, space, and the required maintenance. Also, in the case of binary executables (when source code is not available) our system exhibits almost 20% performance gains.
    - Cost
    While a system of n parallel processors is less efficient than one n-times-faster processor, the parallel system is often cheaper to build. Parallel computation is used for tasks which require very large amounts of computation, take a lot of time, and can be divided into n independent subtasks. In recent years, most high-performance computing systems, also known as supercomputers, have had parallel architectures.
  • 27. - Manufacturing costs
    No extra hardware is required; the only cost is that of setting up the LAN.
    - Benchmarks
    There are at least three reasons for running benchmarks. First, a benchmark provides us with a baseline: if we make changes to our cluster, or if we suspect problems with it, we can rerun the benchmark to see if performance is really any different. Second, benchmarks are useful when comparing systems or cluster configurations; they can provide a reasonable basis for selecting between alternatives. Finally, benchmarks can be helpful with planning.
    For benchmarking we use a 3D rendering tool named POV-Ray (Persistence of Vision Raytracer; please see the Appendix for more details).
    - Hardware required
    x686-class PCs (Linux (2.6.x kernel) installed, with intranet connection)
    Switch (100/10T)
    Serial port connectors
    100BASE-T LAN cable, RJ45 connectors
    - Software Resources Required
    Linux (2.6.x kernel)
    Intel compiler suite (non-commercial)
    LSB (Linux Standard Base) set of GNU kits with GNU CC/C++/F77/LD/AS
    GNU Krell monitor
    Number of PCs connected in the LAN: 8 nodes.
  • 28. Chapter 3. Project Plan
    The plan of execution for the project was as follows (activity, software used, number of days):
    1. Project planning: (a) choosing the domain, (b) identifying key areas of work, (c) requirement analysis. Software: none. Days: 10.
    2. Basic installation of Linux. Software: Linux (2.6.x kernel). Days: 3.
    3. Brushing up on C programming skills. Software: none. Days: 5.
    4. Shell scripting. Software: Linux (2.6.x kernel), GNU Bash. Days: 12.
    5. C programming in the Linux environment. Software: GNU C compiler suite. Days: 5.
    6. A demo project (Universal Sudoku Solver) to become familiar with the Linux programming environment. Software: GNU C compiler suite, Intel compiler suite (non-commercial). Days: 16.
    7. Study of advanced Linux tools and installation of packages and Red Hat RPMs. Software: iptraf, mc, tar, rpm, awk, sed, gnuplot, strace, gdb, etc. Days: 10.
  • 29. 8. Studying networking basics and network configuration in Linux. Software: none. Days: 8.
    9. Recompiling, patching, and analyzing the system kernel. Software: Linux (kernel 2.6.x.x), GNU C compiler. Days: 3.
    10. Study and implementation of advanced networking tools: SSH and NFS. Software: ssh and OpenSSH, nfs. Days: 7.
    11. (a) Preparing the preliminary design of the total workflow of the project; (b) deciding the modules for overall execution and dividing the areas of concentration among the project group; (c) building the Stage I prototype. Software: all of the above. Days: 17.
    12. Building the Stage II prototype (replacing SSH with a custom-made application). Software: all of the above. Days: 15.
    13. Building the Stage III prototype (making the diskless cluster). Software: all of the above. Days: 10.
    14. Testing and building the final packages. Software: all of the above. Days: 10.
    Table 1.1 Project Plan
  • 30. Chapter 4. System Design
    Generally speaking, the design process of a distributed system involves three main activities:
    (1) designing the communication system that enables the distributed system resources and objects to exchange information,
    (2) defining the system structure (architecture) and the system services that enable multiple computers to act as a system rather than as a collection of computers, and
    (3) defining the distributed computing programming techniques used to develop parallel and distributed applications.
    Based on this notion of the design process, the distributed system design framework can be described in terms of three layers:
    (1) the network, protocol, and interface (NPI) layer,
    (2) the system architecture and services (SAS) layer, and
    (3) the distributed computing paradigms (DCP) layer.
    In what follows, we describe the main design issues to be addressed in each layer.
    Fig. 4.1 Design Framework
  • 31. • Communication network, protocol, and interface layer. This layer describes the main components of the communication system that will be used for passing control and information among the distributed system resources. This layer is decomposed into three sublayers: network type, communication protocols, and network interfaces.
    • Distributed system architecture and services layer. This layer represents the designer's and system manager's view of the system. The SAS layer defines the structure and architecture and the system services (distributed file system, concurrency control, redundancy management, load sharing and balancing, security service, etc.) that must be supported by the distributed system in order to provide a single-image computing system.
    • Distributed computing paradigms layer. This layer represents the programmer's (user's) perception of the distributed system. It focuses on the programming paradigms that can be used to develop distributed applications. Distributed computing paradigms can be broadly characterized based on the computation and communication models. Parallel and distributed computations can be described in terms of two paradigms: functional parallel and data parallel. In the functional parallel paradigm, the computations are divided into distinct functions which are then assigned to different computers. In the data parallel paradigm, all the computers run the same program, the same-program-multiple-data (SPMD) stream, but each computer operates on a different data stream.
    With reference to Fig. 4.1, Parallex can be described as follows:
  • 32. Fig. 4.2 Parallex Design
  • 33. Chapter 5. Implementation Details
    The goal of the project is to provide an efficient system that handles process parallelism with the help of clusters. This parallelism thereby reduces execution time. Currently we form a cluster of 8 nodes. Using a single computer for the execution of any heavy process takes a lot of time, so here we form a cluster and execute such processes in parallel by dividing each process into a number of subprocesses. Depending on the nodes in the cluster, we migrate the subprocesses to those nodes, and when execution is over, the output they produce is brought back to the master node. By doing this we reduce the process execution time and increase CPU utilization.
    5.1 Hardware Architecture
    We have implemented a shared-nothing architecture of the parallel system by making use of a coarse-grain cluster structure. The interconnect is an ordinary 8-port switch and, optionally, a Class B or Class C network. It is a 3-level architecture:
    1. Master topology
    2. Slave topology
    3. Network interconnect
    1. Master is a Linux machine running a 2.6.x or 2.4.x kernel (both under testing). It runs the parallel server and contains the application interface to drive the remaining machines. The master runs a network scanning script to detect all the slaves that are alive and retrieves all the necessary information about each slave. To determine the load on each slave just before the processing of the main application, the master sends a small diagnostic application to the slave to estimate the load it can take at the present moment. Having collected all the relevant information, it does all the scheduling, implements the parallel algorithms (distributing the tasks according to processing power and current load), and makes use of CPU extensions (MMX, SSE, 3DNOW) depending upon the slave architecture; it does everything except the execution of the program itself. It accepts the input/task to be executed. It allocates the tasks to
  • 34. the underlying slave nodes constituting the parallel system, which execute the tasks in parallel and return the output to the master. The master plays the role of a watchdog, which may or may not participate in actual processing but manages the entire task.
    2. Slave is a single system cluster image (SSCI). It is basically dedicated to processing. It accepts a subtask along with the necessary library modules, executes it, and returns the output to the master. In our case, the slaves are multi-boot-capable systems, which at one point in time could be diskless cluster hosts, at another time might behave as general-purpose cluster nodes, and at some other time could act as normal CPUs handling routine office and home tasks. In the case of diskless machines, the slave boots from a pre-created kernel image patched appropriately.
    3. Network interconnection merges the master and slave topologies. It makes use of an 8-port switch, RJ45 connectors, and serial CAT 5 cables. It is a star topology in which the master and the slaves are interconnected through the switch.
    Fig. 5.1 Parallel System H/W Architecture
    Cluster Monitoring: Each slave runs a server that collects the kernel processing / I/O / memory / CPU and all related details from the proc virtual file system and
  • 35. forwards them to the master node (which here acts as a client of the server running on each slave), and a user-space program plots them interactively on the master's screen, thus showing the CPU / memory / I/O details of each node separately.
    5.2 Software Architecture
    This architecture consists of two parts:
    1. Master architecture
    2. Slave architecture
    The master consists of the following levels:
    1. LinuxBIOS: LinuxBIOS usually loads a Linux kernel.
    2. Linux: the platform on which the master runs.
    3. SSCI + Beoboot: this level extracts a single system cluster image used by the slave nodes.
    4. Fedora Core / Red Hat: the actual operating system running on the master.
    5. System services: essential services running on the master, e.g. the RARP resolver daemon.
    The slave inherits from the master, with the following levels:
    1. LinuxBIOS
    2. Linux
    3. SSCI
    Fig 5.2 Parallel System S/W Architecture
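    A minimal sketch of the slave-side monitoring idea described above: read a few figures from the proc virtual file system and push them to the master over a TCP socket. The master address, port number, message format, and the choice of /proc/loadavg and /proc/meminfo are illustrative assumptions, not the project's actual protocol.

        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>
        #include <arpa/inet.h>
        #include <sys/socket.h>

        /* Read the first line of a /proc file into buf; returns 0 on success. */
        static int read_proc_line(const char *path, char *buf, size_t len)
        {
            FILE *f = fopen(path, "r");
            if (!f) return -1;
            int ok = fgets(buf, len, f) != NULL;
            fclose(f);
            return ok ? 0 : -1;
        }

        int main(void)
        {
            char load[128] = "", mem[128] = "", msg[512];
            read_proc_line("/proc/loadavg", load, sizeof load);   /* CPU load figures */
            read_proc_line("/proc/meminfo", mem, sizeof mem);     /* total memory line */

            /* Connect to the master node (address and port are placeholders). */
            int s = socket(AF_INET, SOCK_STREAM, 0);
            struct sockaddr_in master = { .sin_family = AF_INET, .sin_port = htons(9000) };
            inet_pton(AF_INET, "192.168.0.1", &master.sin_addr);
            if (connect(s, (struct sockaddr *)&master, sizeof master) == 0) {
                snprintf(msg, sizeof msg, "load: %smem: %s", load, mem);
                send(s, msg, strlen(msg), 0);
            }
            close(s);
            return 0;
        }

    In the actual system this would run periodically on every slave, with the master aggregating the samples and plotting them per node.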
  • 36. Parallex is broadly divided into the following modules:
    1. Scheduler: this is the heart of our system. With a radically new approach toward data- and instruction-level distribution, we have implemented a completely optimal heterogeneous cluster technology. We do task allocation based on the actual processing capability of each node and not on the quoted GHz rating in the system's manual. The task allocation is dynamic and the scheduling policy is based on the POSIX scheduling implementation. We are also capable of implementing preemption, which we currently do not do, given that systems such as Linux and FreeBSD are already capable of industry-level preemption.
    2. Job/instruction allocator: this is a set of remote-fork-like utilities that allocate jobs to the nodes. Unlike traditional cluster technology, this job allocator is capable of executing in disconnected mode, which means that network latency is substantially reduced during temporary disconnection.
    3. Accounting: we have written a utility, the "remote cluster monitor", which provides us with samples of results from all the nodes, along with information about CPU load, temperature, and memory statistics. We propose that, with less than 0.2% of CPU power consumption, our network monitoring utility can sample over 1000 nodes in less than 3 seconds.
    4. Authentication: all transactions between the nodes are 128-bit encrypted and do not require root privileges to run; only a common user must exist on all the standalone nodes. For the diskless part, we remove this restriction as well.
    5. Resource discovery: we run our own socket-layered resource discovery utility, which discovers any additional nodes and also reports if a resource has been lost. Any additional hardware capable of being used as part of the parallel system, such as an extra processor added to a system or the replacement of a processor with a dual-core processor, is also reported continually.
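    As a rough illustration of the resource-discovery module, the sketch below broadcasts a probe on the LAN and lists the nodes that reply within a timeout. The port, probe string, broadcast address, and reply convention are hypothetical; the report does not document the actual wire protocol of its utility.

        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>
        #include <arpa/inet.h>
        #include <sys/socket.h>

        int main(void)
        {
            int s = socket(AF_INET, SOCK_DGRAM, 0);
            int yes = 1;
            setsockopt(s, SOL_SOCKET, SO_BROADCAST, &yes, sizeof yes);

            /* Broadcast a probe; live slaves are assumed to echo a reply. */
            struct sockaddr_in bcast = { .sin_family = AF_INET, .sin_port = htons(9001) };
            inet_pton(AF_INET, "192.168.0.255", &bcast.sin_addr);
            const char probe[] = "PARALLEX_DISCOVER";
            sendto(s, probe, sizeof probe, 0, (struct sockaddr *)&bcast, sizeof bcast);

            /* Wait up to 2 seconds for replies and list the responding nodes. */
            struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
            setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

            char buf[64];
            struct sockaddr_in from;
            socklen_t flen = sizeof from;
            int count = 0;
            while (recvfrom(s, buf, sizeof buf, 0, (struct sockaddr *)&from, &flen) > 0) {
                char ip[INET_ADDRSTRLEN];
                inet_ntop(AF_INET, &from.sin_addr, ip, sizeof ip);
                printf("node %d up: %s\n", ++count, ip);
                flen = sizeof from;
            }
            close(s);
            return 0;
        }

    A node that stops answering between two sweeps would be reported as a lost resource, which is the behavior the module description above calls for.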
  • 37. 6. Synchronizer: the central balancing component of the cluster. Since the cluster is capable of simultaneously running both diskless and standalone nodes as part of the same cluster, the synchronizer keeps the results consistent; output is queued in real time so that data is not mixed up. It does instruction dependency analysis, and also uses pipelines in the network to make the interconnect more communicative.
    5.3 Description of software behavior
    The end user submits the process/application to the administrator if the application is source based, and the cluster administrator owns the responsibility of explicitly parallelizing the application for maximum exploitation of the parallel architectures within the CPU and across the cluster nodes. If the application is binary (no source), the user may submit the code directly to the master node's program acceptor, which in turn runs the application with somewhat lower efficiency compared to source submissions made to the administrator. The system as a whole is then responsible for minimizing processing time, which in turn increases throughput and speeds up processing.
  • 41. 5.3.1 Events
    1. System installation
    2. Network initialization
    3. Server and host configuration
    4. Take input
    5. Parallel execution
    6. Send response
    5.3.2 States
    1. System ready
    2. System busy
    3. System idle
  • 42. Chapter 6. Technologies Used
    6.1 General terms
    We now briefly define the general terms that will be used in further descriptions or are related to our system.
    Cluster: an interconnection of a large number of computers working together in a closely synchronized manner to achieve higher performance, scalability, and net computational power.
    Master: the server machine which acts as the administrator of the entire parallel cluster and executes task scheduling.
    Slave: a client node which executes the task given to it by the master.
    SSCI: Single System Cluster Image, the idea of implementing cluster nodes as a single image, where the cluster nodes behave as if they were additional processors, add-on RAM, etc. of the controlling master computer. This is the base theory of cluster-level parallelism. Example implementations are multi-node NUMA (IBM/Sequent) multi-quad computers and SGI Altix servers. However, the idea of a true SSCI remains unimplemented when it comes to heterogeneous clusters for parallel processing, except for supercomputing clusters such as Thunder and the Earth Simulator.
    RARP: Reverse Address Resolution Protocol, a network-layer protocol used to resolve an IP address from a given hardware address (such as an Ethernet / MAC address).
  • 43. BProc: The Beowulf Distributed Process Space (BProc) is a set of kernel modifications, utilities, and libraries which allow a user to start processes on other machines in a Beowulf-style cluster. Remote processes started with this mechanism appear in the process table of the front-end machine of the cluster. This allows remote process management using the normal UNIX process control facilities. Signals are transparently forwarded to remote processes, and exit status is received using the usual wait() mechanisms.
    Having discussed the basic concepts of parallel and distributed systems, the problems in this field, and an overview of Parallex, we now move forward with the requirement analysis and design details of our system.
Chapter 7. Testing

Logic Coverage / Decision Based: Test Cases

1. Initial_frame_fail
   Test procedure: initial frame not defined. Pre-condition: none.
   Expected result: Parallex should give an error and exit.
   Reference to detailed design: distribution algorithm.

2. Final_frame_fail
   Test procedure: final frame not defined. Pre-condition: none.
   Expected result: Parallex should give an error and exit.
   Reference to detailed design: distribution algorithm.

3. Initial_final_full
   Test procedure: initial and final frame given. Pre-condition: none.
   Expected result: Parallex should distribute according to node speed.
   Reference to detailed design: distribution algorithm.

4. Input_file_name_blank
   Test procedure: no input file given. Pre-condition: none.
   Expected result: "input file not found".
   Reference to detailed design: Parallex master.

5. Input_parameters_blank
   Test procedure: no parameters defined at the command line. Pre-condition: none.
   Expected result: exit on error.
   Reference to detailed design: Parallex master.

Table 7.1 Logic coverage / decision testing
Initial Functional Test Cases for Parallex

Use case: System Startup
- Function tested: the master is started when the switch is turned "on".
  Initial state: master is off. Input: activate the "on" switch. Expected output: master is ON.
- Function tested: the nodes are started when the switch is turned "on".
  Initial state: nodes are off. Input: activate the "on" switch. Expected output: nodes are ON.
- Function tested: nodes are assigned IP addresses by the master.
  Initial state: booting. Input: get the boot image from the master. Expected output: the master shows that the nodes are UP.

Use case: System Shutdown
- Function tested: the system is shut down when the switch is turned "off".
  Initial state: system is on and not servicing a customer. Input: activate the "off" switch. Expected output: system is off.
- Function tested: the connection to the master is terminated when the system is shut down.
  Initial state: system has just been shut down. Expected output: verified from the master side that a connection to the slave no longer exists.

Use case: Session
- Function tested: the system reads a customer's program.
  Initial state: system is on and not servicing a customer. Input: insert a readable code/program. Expected output: program accepted.
- Function tested: the system rejects an unreadable program.
  Initial state: system is on and not servicing a customer. Input: insert an unreadable code/program. Expected output: the program is rejected; the system displays an error screen and is ready to start a new session.
- Function tested: the system accepts the customer's program.
  Initial state: system is asking for entry of the RANGE of calculation. Input: enter a RANGE. Expected output: the system gets the RANGE.
- Function tested: the system breaks up the task.
  Initial state: system is breaking the task according to the processing speed of the nodes. Input: perform the distribution algorithm. Expected output: the system breaks the task and writes it into a file.
- Function tested: the system feeds the tasks to the nodes for processing.
  Initial state: system feeds tasks to the nodes for execution. Input: send tasks. Expected output: the system displays a menu of the tasks running on the nodes.
- Function tested: the session ends when all nodes return output.
  Initial state: system collects the output of all nodes, displays it and ends. Input: get the output from all nodes. Expected output: the system displays the output and quits.

Table 7.2 Functional test cases
Cyclomatic Complexity

Control flow graph of the system:

Fig 7.1 Cyclomatic diagram for the system

Cyclomatic complexity is a software metric developed by Thomas McCabe and used to measure the complexity of a program. It directly measures the number of linearly independent paths through a program's source code.

Computation of cyclomatic complexity for the above flow graph:

E = number of edges = 9
N = number of nodes = 7
M = E - N + 2 = 9 - 7 + 2 = 4
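Put differently (a standard reading of this metric, added here for clarity rather than taken from the original text): a cyclomatic complexity of M = 4 means the control flow graph of Fig 7.1 contains four linearly independent paths, so a test suite needs at least four cases to exercise every decision outcome of the system's main control flow.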
Console and Black Box Testing

Console test cases

1. Testing in a Linux terminal.
   Pre-condition: terminal variables have default values.
   Expected result: Xterm-related tools are disabled.
   Actual result: no graphical information displayed.

2. Invalid number of arguments.
   Pre-condition: all nodes are up.
   Expected result: error message.
   Actual result: proper usage given.

3. Pop-up terminals for different nodes.
   Pre-condition: all nodes are up.
   Expected result: number of pop-ups = number of cores in alive nodes.
   Actual result: number of pop-ups = number of cores in alive nodes.

4. 3D rendering on a single machine.
   Pre-condition: all necessary files in place.
   Expected result: live 3D rendering.
   Actual result: shows the frame being rendered.

5. 3D rendering on the Parallex system.
   Pre-condition: all nodes are up.
   Expected result: status of rendering.
   Actual result: rendered video.

6. MPlayer testing.
   Pre-condition: rendered frames.
   Expected result: animation in .avi format.
   Actual result: rendered video (.avi).

Table 7.3 Console test cases

Black box test cases

1. New node comes up.
   Pre-condition: node is down.
   Expected result: status message displayed by the NetMon tool.
   Actual result: message "Node UP".

2. Node goes down.
   Pre-condition: node is up.
   Expected result: status message displayed by the NetMon tool.
   Actual result: message "Node DOWN".

3. Node information.
   Pre-condition: nodes are up.
   Expected result: internal information of the nodes.
   Actual result: status, IP, MAC address, RAM, etc.

4. Main task submission.
   Pre-condition: application is compiled.
   Expected result: next module called (distribution algorithm).
   Actual result: processing speed of the nodes.

5. Main task submission with faulty input.
   Pre-condition: application is compiled.
   Expected result: error.
   Actual result: error displayed and exit.

6. Distribution algorithm.
   Pre-condition: RANGE obtained.
   Expected result: break the task according to the processing speed of the nodes.
   Actual result: breaks the RANGE and generates scripts.

7. Cluster feed script.
   Pre-condition: all nodes up.
   Expected result: task sent to individual machines for execution.
   Actual result: display shows the task executed on each machine.

8. Result assembly.
   Pre-condition: all machines have returned results.
   Expected result: final result calculation.
   Actual result: final result displayed on screen.

9. Fault tolerance.
   Pre-condition: machine(s) go down in between execution.
   Expected result: error recovery script is executed.
   Actual result: task resent to all alive machines.

Table 7.4 Black box test cases
System Usage Specification Outline

Fig 7.2 System usage pattern

Fig 7.3 Histogram
Runtime Benchmark

Fig 7.4 One frame from complex rendering on Parallex: simulation of an explosion

The following is a comparison of the same application, with the same parameters, run on a standalone machine, an existing Beowulf parallel cluster, and our cluster system Parallex.

Application: POV-Ray

Hardware specifications:
NODE 0: Pentium 4, 2.8 GHz
NODE 1: Core 2 Duo, 2.8 GHz
NODE 2: AMD64, 2.01 GHz
NODE 3: AMD64, 1.80 GHz
NODE 4: Celeron D, 2.16 GHz

Benchmark results:

Time        Single Machine   Existing Parallel System (4 nodes)   Parallex Cluster System (4 nodes)
Real time   14m 44.3s        3m 41.61s                            3m 1.62s
User time   13m 33.2s        10m 4.67s                            9m 30.75s
Sys time    2m 2.26s         0m 2.26s                             0m 2.31s

Table 7.5 Benchmark results

Note: the user time of a cluster run is the approximate sum of the per-node user times.
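For reference, converting the real-time figures above to seconds (our own arithmetic, not part of the original table): 14m 44.3s is about 884.3 s, 3m 41.61s is about 221.6 s, and 3m 1.62s is about 181.6 s, giving a wall-clock speedup over the single machine of roughly 884.3 / 221.6 ≈ 4.0 for the existing parallel system and 884.3 / 181.6 ≈ 4.9 for Parallex.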
Chapter 8. Cost Estimation

Since the growth in processing requirements is far greater than the growth of CPU power, and since the silicon chip is fast approaching its full capacity, the implementation of parallel processing at every level of computing becomes inevitable.

Therefore we propose that in the coming years parallel processing, and the algorithms that support it, like the ones we have designed and implemented, will form the heart of modern computing. Not surprisingly, parallel processing has already begun to penetrate the modern computing market directly in the form of multi-core processors such as Intel dual-core and quad-core processors.

Two of our primary aims, a simple implementation and the least possible administrative overhead, make the deployment of Parallex simple and effective. Parallex can easily be deployed to all sectors of modern computing where CPU-intensive applications are important to growth.

While a system of n parallel processors is less efficient than one processor that is n times faster, the parallel system is often cheaper to build. Parallel computation is used for tasks which require very large amounts of computation, take a lot of time, and can be divided into n independent subtasks. In recent years, most high-performance computing systems, also known as supercomputers, have parallel architectures.

Cost effectiveness is one of the major achievements of our Parallex system. We need no external or expensive hardware or software, so the system is inexpensive. Our system is based on heterogeneous clusters in which the power of any single CPU is not an issue, thanks to our mathematical distribution algorithm; system efficiency will not drop by more than 5% because of a few slower machines.

We can therefore say that we take silicon "waste" as a challenge: the system can use outdated, slower CPUs, which makes it an environment-friendly design. Another feature of our system is the use of diskless nodes, which reduces the total cost of the system by approximately 20% since the nodes need no storage devices; instead of separate storage devices we use a centralized storage solution. Last but not least, all of our software tools are open source.

Hence, we conclude that our Parallex system is one of the most cost-effective systems in its genre.
Chapter 9. User Manual

9.1 Dedicated cluster setup

For the dedicated cluster with one master and many diskless slaves, all the user has to do is install the RPMs supplied on the installation disk on the master. The BProc configuration file will then be found at /etc/bproc/config.

9.1.1 BProc configuration

Main configuration file: /etc/bproc/config

• Edit it with your favourite text editor.
• Lines consist of comments (starting with #); the rest are a keyword followed by arguments.
• Specify the interface:
  interface eth0 10.0.4.1 255.255.255.0
  • eth0 is the interface connected to the nodes.
  • The IP address of the master node is 10.0.4.1.
  • The netmask of the master node is 255.255.255.0.
  • The interface will be configured when BProc is started.
• Specify the range of IP addresses for the nodes:
  iprange 0 10.0.4.10 10.0.4.14
  • Start assigning IP addresses at node 0.
  • The first address is 10.0.4.10, the last is 10.0.4.14.
  • The size of this range determines the number of nodes in the cluster.
• The next entries are the default libraries to be installed on the nodes.
  • You can explicitly specify libraries or extract library information from an executable.
  • Add an entry to install extra libraries:
    librariesfrombinary /bin/ls /usr/bin/gdb
  • The bplib command can be used to see the libraries that will be loaded.
• The next line specifies the name of the phase 2 image:
  bootfile /var/bproc/boot.img
  • There should be no need to change this.
• Add a line to specify the kernel command line:
  kernelcommandline apm=off console=ttyS0,19200
  • Turn APM support off (since these nodes don't have any).
  • Set the console to ttyS0 at 19200 baud.
  • This is used by the beoboot command when building the phase 2 image.
• The final lines specify the Ethernet addresses of the nodes; examples are given in the file:
  #node 0 00:50:56:00:00:00
  #node 00:50:56:00:00:01
  • These are needed so a node can learn its IP address from the master.
  • The first 0 is optional and assigns that address to node 0.
  • Ethernet addresses can be determined and added automatically with the nodeadd command, which we will use later, so there is no need to add them now.
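Putting these directives together, a complete /etc/bproc/config for this example cluster might look like the following. This is only a sketch assembled from the values discussed above (the MAC addresses are the placeholder examples from the file); adjust it to your own hardware.

interface eth0 10.0.4.1 255.255.255.0
iprange 0 10.0.4.10 10.0.4.14
librariesfrombinary /bin/ls /usr/bin/gdb
bootfile /var/bproc/boot.img
kernelcommandline apm=off console=ttyS0,19200
node 0 00:50:56:00:00:00
node 00:50:56:00:00:01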
• Save the file and exit from the editor.

Other configuration files:

/etc/bproc/config.boot
• Specifies the PCI devices that are going to be used by the nodes at boot time.
• The modules are included in the phase 1 and phase 2 boot images.
• By default the node will try all network interfaces it can find.

/etc/bproc/node_up.conf
• Specifies the actions to be taken in order to bring a node up: load modules, configure network interfaces, probe for PCI devices, and copy files and special devices out to the node.

9.1.2 Bringing up BProc

• Check that BProc will be started at boot time:
  # chkconfig --list clustermatic
• Restart the master daemon and boot server to load the new configuration:
  # service bjs stop
  # service clustermatic restart
  # service bjs start
  BJS uses BProc, so it needs to be stopped first.
• Check that the interface has been configured correctly:
  # ifconfig eth0
  It should have the IP address we specified in the config file.

9.1.3 Build a phase 2 image

• Run the beoboot command on the master:
  # beoboot -2 -n --plugin mon
  • -2: this is a phase 2 image.
  • -n: the image will boot over the network.
  • --plugin: add a plugin to the boot image.
• The following warning messages can be safely ignored:
  WARNING: Didn't find a kernel module called gmac.o
  WARNING: Didn't find a kernel module called bmac.o
• Check that the phase 2 image is available:
  # ls -l /var/clustermatic/boot.img

9.1.4 Loading the phase 2 image

• Two Kernel Monte is a piece of software which will load a new Linux kernel, replacing the one that is already running.
• This allows you to use Linux as your boot loader.
• Using Linux means you can use any network that Linux supports; there is no PXE BIOS or Etherboot support for Myrinet, Quadrics or InfiniBand.
• The "Pink" cluster network-boots over Myrinet, which allowed its builders to avoid buying a 1024-port Ethernet network.
• Currently supports x86 (including AMD64) and Alpha.

9.1.5 Using the cluster

bpsh
• Migrates a process to one or more nodes.
• The process is started on the front end but is immediately migrated onto the nodes.
• The effect is similar to the rsh command, but no login is performed and no shell is started.
• I/O forwarding can be controlled, and output can be prefixed with the node number.
• Run the date command on all nodes which are up:
  # bpsh -a -p date
• See the other arguments that are available:
  # bpsh -h

bpcp
• Copies files to a node. Files can come from the master node or from other nodes.
• Note that a node only has a RAM disk by default.
• Copy /etc/hosts from the master to /tmp/hosts on node 0:
  # bpcp /etc/hosts 0:/tmp/hosts
  # bpsh 0 cat /tmp/hosts

9.1.6 Managing the cluster

bpstat
• Shows the status of the nodes:
  up    node is up and available
  down  node is down or can't be contacted by the master
  boot  node is coming up (running node_up)
  error an error occurred while the node was booting
• Shows the owner and group of each node; combined with the permissions, this determines who can start jobs on the node.
• Shows the permissions of the node:
  ---x------ execute permission for the node owner
  ------x--- execute permission for users in the node group
  ---------x execute permission for other users

bpctl
• Controls a node's status.
• Reboot node 1 (takes about a minute):
  # bpctl -S 1 -R
• Set the state of node 0:
  # bpctl -S 0 -s groovy
  Only up, down, boot and error have special meaning; everything else means "not down".
• Set the owner of node 0:
  # bpctl -S 0 -u nobody
• Set the permissions of node 0 so anyone can execute a job:
  # bpctl -S 0 -m 111

bplib
• Manages the libraries that are loaded on a node.
• List the libraries to be loaded:
  # bplib -l
• Add a library to the list:
  # bplib -a /lib/libcrypt.so.1
• Remove a library from the list:
  # bplib -d /lib/libcrypt.so.1

9.1.7 Troubleshooting techniques

• The tcpdump command can be used to check for node activity during and after a node has booted.
• Connect a cable to the serial port on a node to check the console output for errors in the boot process.
• Once a node reaches node_up processing, messages will be logged in /var/log/bproc/node.N (where N is the node number).
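Before moving on to the shared-cluster setup, the bpcp and bpsh invocations from section 9.1.5 can be tied together in a small script. This is only an illustrative sketch (it is not part of the Parallex or BProc distributions); the input file and application path are hypothetical, and the node numbers are assumed to match the iprange configured earlier.

#!/bin/sh
# Stage an input file onto nodes 0-4, then run a job on every node that is up.
INPUT=/root/input.dat                          # hypothetical input file
for n in 0 1 2 3 4; do
    bpcp "$INPUT" "$n:/tmp/input.dat" || echo "warning: could not copy to node $n"
done
# -a = all nodes that are up, -p = prefix each output line with the node number
bpsh -a -p /usr/local/bin/my_app /tmp/input.dat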
9.2 Shared cluster setup

Once you have the basic installation completed, you'll need to configure the system. Many of the tasks are no different for machines in a cluster than for any other system. For other tasks, being part of a cluster affects what needs to be done. The following subsections describe the issues associated with several services that require special consideration.

9.2.1 DHCP

Dynamic Host Configuration Protocol (DHCP) is used to supply network configuration parameters, including IP addresses, host names, and other information, to clients as they boot. With clusters, the head node is often configured as a DHCP server and the compute nodes as DHCP clients. There are two reasons to do this. First, it simplifies the installation of compute nodes, since the information DHCP can supply is often the only thing that differs among the nodes. Since a DHCP server can handle these differences, the node installation can be standardized and automated. A second advantage of DHCP is that it is much easier to change the configuration of the network: you simply change the configuration file on the DHCP server, restart the server, and reboot each of the compute nodes.

The basic installation is rarely a problem. The DHCP system can be installed as part of the initial Linux installation or after Linux has been installed. The DHCP server configuration file, typically /etc/dhcpd.conf, controls the information distributed to the clients. If you are going to have problems, the configuration file is the most likely source.

The DHCP configuration file may be created or changed automatically when some cluster software is installed. Occasionally, the changes may not be done optimally or even correctly, so you should have at least a reading knowledge of DHCP configuration files. Here is a heavily commented sample configuration file that illustrates the basics. (Lines starting with "#" are comments.)
# A sample DHCP configuration file.
# The first commands in this file are global,
# i.e., they apply to all clients.

# Only answer requests from known machines,
# i.e., machines whose hardware addresses are given.
deny unknown-clients;

# Set the subnet mask, broadcast address, and router address.
option subnet-mask 255.255.255.0;
option broadcast-address 172.16.1.255;
option routers 172.16.1.254;

# This section defines individual cluster nodes.
# Each subnet in the network has its own section.
subnet 172.16.1.0 netmask 255.255.255.0 {
    group {
        # The first host, identified by the given MAC address,
        # will be named node1.cluster.int, will be given the
        # IP address 172.16.1.1, and will use the default router
        # 172.16.1.254 (the head node in this case).
        host node1 {
            hardware ethernet 00:08:c7:07:68:48;
            fixed-address 172.16.1.1;
            option routers 172.16.1.254;
            option domain-name "cluster.int";
        }
        host node2 {
            hardware ethernet 00:08:c7:07:c1:73;
            fixed-address 172.16.1.2;
            option routers 172.16.1.254;
            option domain-name "cluster.int";
        }
        # Additional node definitions go here.
    }
}

# For servers with multiple interfaces, this entry says to ignore requests
# on specified subnets.
subnet 10.0.32.0 netmask 255.255.248.0 { not authoritative; }

As shown in this example, you should include a subnet section for each subnet on your network. If the head node has an interface for the cluster and a second interface connected to the Internet or your organization's network, the configuration file will have a group for each interface or subnet. Since the head node should answer DHCP requests for the cluster but not for the organization, DHCP should be configured so that it will respond only to DHCP requests from the compute nodes.

9.2.2 NFS

A network filesystem is a filesystem that physically resides on one computer (the file server), which in turn shares its files over the network with other computers (the clients). The best-known and most common network filesystem is Network File System (NFS). In setting up a cluster, designate one computer as your NFS server. This is often the head node for the cluster, but there is no reason it has to be.
In fact, under some circumstances, you may get slightly better performance if you use different machines for the NFS server and the head node. Since the server is where your user files will reside, make sure you have enough storage. This machine is a likely candidate for a second disk drive or RAID array and a fast I/O subsystem. You may even want to consider mirroring the filesystem using a small high-availability cluster.

Why use NFS? It should come as no surprise that for parallel programming you'll need a copy of the compiled code or executable on each machine on which it will run. You could, of course, copy the executable over to the individual machines, but this quickly becomes tiresome. A shared filesystem solves this problem. Another advantage of NFS is that all the files you will be working on will be on the same system, which greatly simplifies backups. (You do backups, don't you?) A shared filesystem also simplifies setting up SSH, as it eliminates the need to distribute keys. (SSH is described later in this chapter.) For this reason, you may want to set up NFS before setting up SSH. NFS can also play an essential role in some installation strategies.

If you have never used NFS before, setting up the client and the server are slightly different, but neither is particularly difficult. Most Linux distributions come with most of the work already done for you.

9.2.2.1 Running NFS

Begin with the server; you won't get anywhere with the client if the server isn't already running. Two things need to be done to get the server running: the file /etc/exports must be edited to specify which machines can mount which directories, and then the server software must be started. Here is a single line from the file /etc/exports on the server amy:

/home basil(rw) clara(rw) desmond(rw) ernest(rw) george(rw)

This line gives the clients basil, clara, desmond, ernest, and george read/write access to the directory /home on the server. Read access is the default. A number of other options are available and could be included. For example, the no_root_squash option could be added if you want to edit root-permission files from the nodes.

Had a space been inadvertently included between basil and (rw), read access would have been granted to basil and read/write access would have been granted to all other systems. (Once you have the systems set up, it is a good idea to use the command showmount -a to see who is mounting what.)

Once /etc/exports has been edited, you'll need to start NFS. For testing, you can use the service command as shown here:

[root@fanny init.d]# /sbin/service nfs start
Starting NFS services:  [ OK ]
Starting NFS quotas:    [ OK ]
Starting NFS mountd:    [ OK ]
Starting NFS daemon:    [ OK ]
[root@fanny init.d]# /sbin/service nfs status
rpc.mountd (pid 1652) is running...
nfsd (pid 1666 1665 1664 1663 1662 1661 1660 1657) is running...
rpc.rquotad (pid 1647) is running...

(With some Linux distributions, when restarting NFS, you may find it necessary to explicitly stop and restart both nfslock and portmap as well.) You'll want to change the system configuration so that this starts automatically when the system is rebooted. For example, with Red Hat, you could use the serviceconf or chkconfig commands.
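For instance, a typical chkconfig invocation on a Red Hat style system (our own example, not from the original text) would be:

# /sbin/chkconfig nfs on
# /sbin/chkconfig nfslock on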
For the client, the software is probably already running on your system; you just need to tell the client to mount the remote filesystem. You can do this in several ways, but in the long run the easiest approach is to edit the file /etc/fstab, adding an entry for the server. Basically, you'll add a line to the file that looks something like this:

amy:/home /home nfs rw,soft 0 0

In this example, the local system mounts the /home filesystem located on amy as the /home directory on the local machine. The filesystems may have different names. You can now manually mount the filesystem with the mount command:

[root@ida /]# mount /home

When the system reboots, this will be done automatically.

When using NFS, you should keep a couple of things in mind. The mount point, /home, must exist on the client prior to mounting. While the remote directory is mounted, any files that were stored on the local system in the /home directory will be inaccessible; they are still there, you just can't get to them while the remote directory is mounted. Next, if you are running a firewall, it will probably block NFS traffic. If you are having problems with NFS, this is one of the first things you should check. File ownership can also create some surprises: user and group IDs should be consistent among systems using NFS, i.e., each user must have identical IDs on all systems. Finally, be aware that root privileges don't extend across NFS-shared systems (if you have configured your systems correctly). So if, as root, you change the directory (cd) to a remotely mounted filesystem, don't expect to be able to look at every file. (Of course, as root you can always use su to become the owner and do all the snooping you want.) Details of the syntax and options can be found in the nfs(5), exports(5), fstab(5), and mount(8) manpages.

9.2.3 SSH
To run software across a cluster, you'll need some mechanism to start processes on each machine. In practice, a prerequisite is the ability to log onto each machine within the cluster. If you need to enter a password for each machine each time you run a program, you won't get very much done. What is needed is a mechanism that allows logins without passwords.

This boils down to two choices: you can use remote shell (RSH) or secure shell (SSH). If you are a trusting soul, you may want to use RSH; it is simpler to set up, with less overhead. On the other hand, SSH network traffic is encrypted, so it is safe from snooping. Since SSH provides greater security, it is generally the preferred approach.

SSH provides mechanisms to log onto remote machines, run programs on remote machines, and copy files among machines. SSH is a replacement for ftp, telnet, rlogin, rsh, and rcp. A commercial version of SSH is available from SSH Communications Security (http://www.ssh.com), a company founded by Tatu Ylönen, an original developer of SSH. Or you can go with OpenSSH, an open source version from http://www.openssh.org.

OpenSSH is the easiest choice, since it is already included with most Linux distributions. It has other advantages as well. By default, OpenSSH automatically forwards the DISPLAY variable. This greatly simplifies using the X Window System across the cluster: if you are running an SSH connection under X on your local machine and execute an X program on the remote machine, the X window will automatically open on the local machine. This can be disabled on the server side, so if it isn't working, that is the first place to look.

There are two sets of SSH protocols, SSH-1 and SSH-2. Unfortunately, SSH-1 has a serious security vulnerability; SSH-2 is now the protocol of choice. This discussion will focus on using OpenSSH with SSH-2.

Before setting up SSH, check to see if it is already installed and running on your system. With Red Hat, you can check which packages are installed using the package manager:

[root@fanny root]# rpm -q -a | grep ssh
openssh-3.5p1-6
openssh-server-3.5p1-6
openssh-clients-3.5p1-6
openssh-askpass-gnome-3.5p1-6
openssh-askpass-3.5p1-6

This particular system has the SSH core package and both server and client software, as well as additional utilities. The SSH daemon is usually started as a service. As you can see, it is already running on this machine:

[root@fanny root]# /sbin/service sshd status
sshd (pid 28190 1658) is running...

Of course, it is possible that it wasn't started as a service but is still installed and running. You can use ps to double-check:

[root@fanny root]# ps -aux | grep ssh
root 29133 0.0 0.2 3520 328 ? S Dec09 0:02 /usr/sbin/sshd
...

Again, this shows the server is running.

With some older Red Hat installations, e.g., the 7.3 workstation, only the client software is installed by default, and you'll need to manually install the server software. If using Red Hat 7.3, go to the second install disk and copy over the file RedHat/RPMS/openssh-server-3.1p1-3.i386.rpm. (Better yet, download the latest version of this software.) Install it with the package manager and then start the service:

[root@james root]# rpm -vih openssh-server-3.1p1-3.i386.rpm
Preparing...      ########################################### [100%]
1:openssh-server  ########################################### [100%]
[root@james root]# /sbin/service sshd start
Generating SSH1 RSA host key: [ OK ]
Generating SSH2 RSA host key: [ OK ]
Generating SSH2 DSA host key: [ OK ]
Starting sshd: [ OK ]

When SSH is started for the first time, encryption keys for the system are generated. Be sure to set this up so that it is done automatically when the system reboots.

Configuration files for both the server, sshd_config, and the client, ssh_config, can be found in /etc/ssh, but the default settings are usually quite reasonable. You shouldn't need to change these files.

9.2.3.1 Using SSH

To log onto a remote machine, use the command ssh with the name or IP address of the remote machine as an argument. The first time you connect to a remote machine, you will receive a message with the remote machine's fingerprint, a string that identifies the machine. You'll be asked whether to proceed or not. This is normal.

[root@fanny root]# ssh amy
The authenticity of host 'amy (10.0.32.139)' can't be established.
RSA key fingerprint is 98:42:51:3e:90:43:1c:32:e6:c4:cc:8f:4a:ee:cd:86.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'amy,10.0.32.139' (RSA) to the list of known hosts.
root@amy's password:
Last login: Tue Dec 9 11:24:09 2003
[root@amy root]#

The fingerprint will be recorded in a list of known hosts on the local machine. SSH will compare fingerprints on subsequent logins to ensure that nothing has changed. You won't see anything else about the fingerprint unless it changes; then SSH will warn you and ask whether you should continue. If the remote system has changed, e.g., if it has been rebuilt or if SSH has been reinstalled, it's OK to proceed. But if you think the remote system hasn't changed, you should investigate further before logging in.

Notice in the last example that SSH automatically uses the same identity when logging into a remote machine. If you want to log on as a different user, use the -l option with the appropriate account name.

You can also use SSH to execute commands on remote systems. Here is an example of using date remotely:

[root@fanny root]# ssh -l sloanjd hector date
sloanjd@hector's password:
Mon Dec 22 09:28:46 EST 2003

Notice that a different account, sloanjd, was used in this example.

To copy files, you use the scp command. For example,

[root@fanny root]# scp /etc/motd george:/root/
root@george's password:
motd 100% |*****************************| 0 00:00

Here the file /etc/motd was copied from fanny to the /root directory on george.

In the examples thus far, the system has asked for a password each time a command was run. If you want to avoid this, you'll need to do some extra work: generate a pair of authorization keys that will be used to control access and then store them in the directory ~/.ssh. The ssh-keygen command is used to generate keys:

[sloanjd@fanny sloanjd]$ ssh-keygen -b1024 -trsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/sloanjd/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/sloanjd/.ssh/id_rsa.
Your public key has been saved in /home/sloanjd/.ssh/id_rsa.pub.
The key fingerprint is:
2d:c8:d1:e1:bc:90:b2:f6:6d:2e:a5:7f:db:26:60:3f sloanjd@fanny
[sloanjd@fanny sloanjd]$ cd .ssh
[sloanjd@fanny .ssh]$ ls -a
. .. id_rsa id_rsa.pub known_hosts

The options in this example specify a 1,024-bit key and the RSA algorithm. (You can use DSA instead of RSA if you prefer.) Notice that SSH will prompt you for a passphrase, basically a multi-word password.

Two keys are generated, a public key and a private key. The private key should never be shared and resides only on the client machine. The public key is distributed to remote machines. Copy the public key to each system you'll want to log onto, renaming it authorized_keys2:

[sloanjd@fanny .ssh]$ cp id_rsa.pub authorized_keys2
[sloanjd@fanny .ssh]$ chmod go-rwx authorized_keys2
[sloanjd@fanny .ssh]$ chmod 755 ~/.ssh

If you are using NFS, as shown here, all you need to do is copy and rename the file in the current directory. Since that directory is mounted on each system in the cluster, it is automatically available.

If you used the NFS setup described earlier, root's home directory, /root, is not shared. If you want to log in as root without a password, manually copy the public keys to the target machines. You'll need to decide whether you feel secure setting up the root account like this.

You will use two utilities supplied with SSH to manage the login process. The first is an SSH agent program that caches private keys, ssh-agent. This program stores the keys locally and uses them to respond to authentication queries from SSH clients. The second utility, ssh-add, is used to manage the local key cache. Among other things, it can be used to add, list, or remove keys.

[sloanjd@fanny .ssh]$ ssh-agent $SHELL
[sloanjd@fanny .ssh]$ ssh-add
Enter passphrase for /home/sloanjd/.ssh/id_rsa:
Identity added: /home/sloanjd/.ssh/id_rsa (/home/sloanjd/.ssh/id_rsa)

(While this example uses the $SHELL variable, you can substitute the actual name of the shell you want to run if you wish.) Once this is done, you can log in to remote machines without a password.

This process can be automated to varying degrees. For example, you can add the call to ssh-agent as the last line of your login script so that it will be run before you make any changes to your shell's environment. Once you have done this, you'll need to run ssh-add only when you log in. But you should be aware that Red Hat console logins don't like this change.

You can find more information by looking at the ssh(1), ssh-agent(1), and ssh-add(1) manpages. If you want more details on how to set up ssh-agent, you might look at SSH, The Secure Shell by Barrett and Silverman, O'Reilly, 2001. You can also find scripts on the Internet that will set up a persistent agent so that you won't need to rerun ssh-add each time.
9.2.4 Hosts file and name services

Life will be much simpler in the long run if you provide appropriate name services. NIS is certainly one possibility. At a minimum, don't forget to edit /etc/hosts for your cluster. At the very least, this will reduce network traffic and speed up some software, and some packages assume it is correctly configured. Here are a few lines from the hosts file for amy:

127.0.0.1 localhost.localdomain localhost
10.0.32.139 amy.wofford.int amy
10.0.32.140 basil.wofford.int basil
...

Notice that amy is not included on the line with localhost. Specifying the host name as an alias for localhost can break some software.

9.3 Working with Parallex

Once the master has been configured and all nodes are up, working with Parallex to utilize all your available resources is very easy. Follow these simple steps to use the power of all nodes that are up.

• Compile your code and place it in $PARALLEX_DIR/bin/. You can use the Makefile to do this for you:
  # make main_app
• After the application compiles without any errors, start the network monitoring tool of Parallex:
  # netmon
• Parallex will now know which machines are up and running in your cluster. To read information about the machines, run the following command:
  # parastat
• To get a graphical representation of CPU usage and other statistics about your slave machines, run the Gkrellm configuration script:
  # gkrllm_config
• To run the main application on the Parallex engine, just run the master script followed by the full path of the executable binary that was compiled from your source application and a list of arguments that indicate the data set to be parallelized, as follows:
  # parallex ../bin/my_app 1 99999999
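As an end-to-end usage sketch (our own example; render_app is a hypothetical binary name, not one shipped with Parallex): to render frames 1 to 300 of an animation, the session would look something like

  # make render_app
  # netmon
  # parallex $PARALLEX_DIR/bin/render_app 1 300

where 1 and 300 are the first and last values of the data range; Parallex then splits this range across the live nodes according to the distribution algorithm described earlier.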
Chapter 10. Conclusion

There exist many solutions for running applications on distributed/parallel systems. Parallex, however, is a single complete solution that takes care of all issues related to high-performance computing, right from cluster boot-up to the management of processes on remote machines.

Parallex is also unique in the sense that it supports both dedicated and shared cluster architectures. The ability of Parallex to efficiently utilize the available computing resources means that the cluster does not require any special kind of hardware, nor does it have to be homogeneous, i.e. of the same kind, thus resulting in significant cost savings.

Parallex, in its current state, is intended for use in educational institutes and small to medium sized businesses. However, it can be easily adapted for a range of applications, from mathematical and scientific computing to 3D rendering.

Hence, because of its simplicity, adaptability, ease of use and relatively low cost of ownership, we can conclude that Parallex is a poor man's supercomputer.
Chapter 11. Future Enhancements

Handling binary-level parallelism: Given the source code, the master can successfully break the application up for processing in parallel; to handle binary executables, however, we plan to use the openMosix technology, a Linux kernel extension for single-system-image clustering. Processes originating from any one node can migrate to any other node if the originating node is too busy compared to the others, and openMosix continuously attempts to optimize the resource allocation. The distributed computing concept is implemented by openMosix by extending the kernel, and it is thus transparent to all applications. We are trying to include openMosix so that we can add load balancing to parallel processing.

Compatibility with non-Unix platforms: At present Parallex can run on multiple platforms, with the only restriction that all should be Unix based (Linux, FreeBSD, NetBSD, Plan 9, Darwin, etc.). Another restriction is that the applications to be run on Parallex should be compliant with all of the above systems. To be able to work with other platforms, one solution is to have a virtual machine running one of the supported platforms as a guest OS.
Chapter 12. References

[1] Parallel Computer Architecture: A Hardware/Software Approach. David Culler. Morgan Kaufmann Publishers, San Francisco, CA.
[2] High Performance Computing, 2nd Edition. Kevin Dowd and Charles Severance. O'Reilly and Associates, Sebastopol, CA.
[3] Sourcebook of Parallel Computing. Jack Dongarra et al. Morgan Kaufmann Publishers, San Francisco, CA.
[4] High Performance Linux Clusters. Joseph Sloan. O'Reilly Media Inc., Sebastopol, CA.
[5] Parallel Computing on Heterogeneous Networks. Alexey L. Lastovetsky.
[6] Designing and Building Parallel Programs. Ian Foster.
[7] Tools and Environments for Parallel and Distributed Computing. Salim Hariri, Manish Parashar.
[8] Performance Tuning Techniques for Clusters. Troy Baer.
[9] Introduction to Parallel Computing. Los Alamos National Laboratory.
[10] http://bproc.sourceforge.net/bproc.html: BProc homepage.
[11] http://www.beowulf.org: homepage of the Beowulf project.
[12] Beowulf Cluster Computing with Linux, Second Edition. William Gropp, Ewing Lusk and Thomas Sterling.
[13] Parallel I/O for High Performance Computing. John M. May.
[14] High Performance Computing and Beowulf Clusters. R.J. Allan, S.J. Andrews and M.F. Guest.
[15] http://www.kernel.org: kernel sources.
APPENDIX A. BProc

BProc (Beowulf Distributed Process Space)

The Beowulf Distributed Process Space (BProc) is a set of kernel modifications, utilities and libraries which allow a user to start processes on other machines in a Beowulf-style cluster. Remote processes started with this mechanism appear in the process table of the front-end machine in the cluster. This allows remote process management using the normal UNIX process control facilities. Signals are transparently forwarded to remote processes and exit status is received using the usual wait() mechanisms.

BProc:
• Manages a single process space across machines.
• Is responsible for process startup and management.
• Provides commands for starting processes, copying files to nodes, etc.

BProc is a Linux kernel modification which provides:
• A single system image for process control in a cluster.
• Process migration for creating processes in a cluster.

In a BProc cluster, there is a single master and many slaves:
• Users (including root) only log into the master.
• The master's process space is the process space for the cluster.
• All processes in the cluster are created from the master, visible on the master, and controlled from the master.
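To make the single-process-space idea concrete, here is a small illustrative session sketch (our own example, using only the bpsh command documented in the user manual): a job started on a slave node remains visible and controllable from the master with ordinary UNIX tools.

# bpsh 3 sleep 600 &
# ps aux | grep "sleep 600"
# kill %1

The sleep process actually runs on node 3, yet it shows up in the master's process table, and killing the job on the master terminates it on the node.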
A1.0 Motivation

rsh and rlogin are a lousy way to interact with the machines in a cluster. Being able to log into any machine in the cluster instantly necessitates a large amount of software and configuration on that machine. You will need things like shells for people to log in. You will need an up-to-date password database. You'll need all the little programs that people expect to see on a UNIX system for people to be comfortable using it. You'll probably also need all the setup scripts and associated configuration information to get the machines to the point where they're actually usable by the users. That is an awful lot of configuration, and with a large number of machines it is also very easy for the users to make a mess. Runaway processes are a problem.

The goal of BProc is to change the model of the cluster from a pile of PCs to a single machine with a collection of network-attached compute resources, and, of course, to do away with rsh and rlogin in the cluster environment.

Once we do away with interactive logins, we get two basic needs: a way to start processes on remote machines and, most importantly, a way to monitor and control what is going on on the remote machines.

BProc provides process migration mechanisms which allow a process to place copies of itself on remote machines via a remote fork system call. When creating remote processes via this mechanism, the child processes are all visible in the front end's process tree.

The central idea in BProc is the distributed process ID (PID) space. Every instance of Linux has a process space: a pool of process IDs and a process tree. BProc takes the process space of the front-end machine and allows portions of it to exist on the other machines in the cluster. The machine distributing pieces of its process space is the master machine, and the machines accepting pieces of it to run are the slave machines.
A2.0 Process Migration

• BProc provides a process migration system to place processes on other nodes in the cluster.
• Process migration in BProc is not transparent or preemptive: a process must call the migration system call in order to move.
• Process migration in BProc is very fast (1.9 s to place a 16 MB process on 1024 nodes) and scalable.
• It can create many copies of the same process (e.g. MPI startup) very efficiently, in O(log #copies).

A2.1 What process migration does and does not preserve

Process migration does preserve:
• The contents of memory and memory-related metadata.
• CPU state (registers).
• Signal handler state.

Process migration does not preserve:
• Shared memory regions.
• Open files.
• SysV IPC resources.
• Just about anything else that isn't "memory".

A3.0 Running on a Slave Node

• BProc is a process management system; all other system calls are handled locally on the slave node.
• BProc does not impose any extra overhead on non-process-related system calls.
• File and network I/O are always handled locally.
• Calling open() will not cause contact with the master node.
  • 82. The SupeThe SupeThe SupeThe Super Computerr Computerr Computerr ComputerAISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering” - 73 -• This means network and file I/O are as fast as they can beA4.0 Implementation:BProc consists of four basic pieces. On the master node, there are "ghost processes"which are place holders in the process tree that represent remote processes. There isalso the master daemon which is the message router for the system and is also thepiece which maintains state information about which processes exist where. On theslave nodes there is process ID masquerading which is a system of lying to processesthere so that they appear (to themselves) to be in the masters process space. There isalso a simple daemon on the slave side which is mostly just a message pipe betweenthe slaves kernel and the network.A4.1 Ghost ProcessesCode reuse is good. BProc tries to recycle of as much of the kernels existing processinfrastructure as possible. The UNIX process model is well thought out and certainlywell understood. All the details of the UNIX model have been hammered out and itworks well. Rather than try and change or simplify it for BProc, BProc tries to keep itentirely. Rather than creating some new kind of semi-bogus process tree, BProc usesthe existing tree and fills the places which represent remote processes with lightweight "ghost" processes.Ghost processes are normal processes except that they lack a memory space and openfiles. They resemble kernel threads like kswapd and kflushd. It is possible for ghoststo wake up and run on the front end. They have their own status (i.e. sleeping,running) which is independent of the remote processes they represent. Most of thetime, however, they sleep and wait for the remote process to request one of the fewoperations which are performed on their behalf.Ghost processes mirror portions of the status of the remote process. The status includeinformation such as the process state and the amount of CPU time that it has used sofar. This aternate status is what gets presented to user space in the procfs filesystem.This status gets updated on demand (via a request to the real process) and no moreoften than every 5 seconds.
  • 83. The SupeThe SupeThe SupeThe Super Computerr Computerr Computerr ComputerAISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering” - 74 -Ghosts catch and forward signals to the remote process. Since ghosts are kernelthreads (not running in user space), they can catch and forward SIGKILL andSIGSTOP. There is no way to get rid of ghost process without the remote processexiting.Ghosts perform certain operations on behalf of the real processes they represent. Inparticular they do fork() and wait(). If a process on a remote machine decides to fork,a new process ID must be allocated for it in the masters process space. Also, weshould see a new ghost on the front end when the remote process forks. Having theghost call fork() accomplishes both of these nicely. Likewise, the ghost process willalso clean up the process tree on the front end by performing wait()s when necessary.Finally, the ghost will exit() with the appropriate status when the remote process itrepresents exits. Since the ghost is a kernel thread, it can accurately reflect the exitstatus of the remote process including states such as killed by a signal and coredumped.A4.2 Process ID MasqueradingThe slave nodes accept pieces of the masters process space. The problem here isalthough a process might move to a different machine, it should not appear (to thatprocess) that its left the process space of the front end. That means things like theprocess ID cant change and system calls like kill() should function as if the processwas still on the front end. That is we shouldnt be able to send signals across processspaces to the other processes on the slave node.Since the slave doesnt control the process space of the processes its accepting, not alloperations can be handled entirely locally either. fork() is a good example.The solution that BProc uses is to ignore the process ID that a process gets when itscreated on the slave side. BProc attaches a second process ID to the process andmodifies the process ID related system calls to essentially lie to the process aboutwhat its ID is.
  • 84. The SupeThe SupeThe SupeThe Super Computerr Computerr Computerr ComputerAISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering” - 75 -Having this extra tag also allows the slave daemon to differentiate the process fromthe other processes on the system when performing process ID related system calls.A4.3 The DaemonsThe master and slave daemons are the glue connecting the ghosts and the realprocesses together.A4.4 Design PrinciplesBProcs design is based on the following basic principles.A4.5 Code reuse is goodBProc uses place holders called ghosts in the normal UNIX process tree on the frontend to represent remote processes. The parent child relationships are a no-brainer thatway and so is handling signals, wait, etc.A4.6 Code reuse is really goodCode reuse is even more important in user space since things seem to change soregularly. To avoid having to write our own set of process viewing utilities like ps andtop. BProc presents all the information about remote processes in the procfs filesystem just like the system does for normal processes. As long as we keep up withchanges in the procfs file system, all existing and future process viewing/controlutilities will continue work for all time.This is especially important in user space since user space programs seem to changevery often.A4.7 The System must be bullet proof! (from user space)Processes cant escape or confuse the management system. Ghosts need to properlyforward all signals including SIGKILL and SIGSTOP. There is no way for a ghost toexit without the process it represents also exiting.
  • 85. The SupeThe SupeThe SupeThe Super Computerr Computerr Computerr ComputerAISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering” - 76 -A4.8 Kernels shouldnt talk on the network.The kernel is a very very bad place to screw up. Try and keep as much as possibleoutside of kernel space. This includes message routing and all the information aboutthe current state of the machine.A4.9 Minimum knowledgeIf a piece of the system doesnt really need to know something dont let it know. Themaster daemon is the only piece that knows where the processes actually exist. Thekernel layers only have a notion of processes that are here or not here. Slaves dontknow what node number they are.In Brief :• All processes are started from the master with process migration• All processes remain visible on the master• No runaways• Normal UNIX process control works for ALL processes in theCluster• No need for direct interaction• There is no need to log into a node to control what is running there• No software is required on the nodes except the BProc slaveDaemon• ZERO software maintenance on the nodes!• Diskless nodes without NFS root• Reliable nodesA4.10 Screen ShotsEvery self respecting piece of software provides a screen shot of some kind. ForBProc we have a shot of top. Note the CPU states line. cpumunch is a stupid littleprogram that just eats up CPU time on remote nodes.3:08pm up 2:25, 7 users, load average: 0.13, 0.07, 0.07
  • 86. The SupeThe SupeThe SupeThe Super Computerr Computerr Computerr ComputerAISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering”AISSMS “College Of Engineering” - 77 -175 processes: 46 sleeping, 129 running, 0 zombie, 0 stoppedCPU states: 12798.7% user, 8.3% system, 0.0% nice, 0.0% idleMem: 128188K av, 57476K used, 70712K free, 23852K shrd, 17168K buffSwap: 130748K av, 0K used, 130748K freePID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIMECOMMAND1540 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:35 cpumunch1541 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:35 cpumunch1542 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:35 cpumunch1543 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:35 cpumunch1544 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:35 cpumunch1545 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:35 cpumunch1546 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:34 cpumunch1547 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:34 cpumunch1548 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:34 cpumunch1549 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:34 cpumunch1550 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:34 cpumunch1551 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:34 cpumunch1552 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:34 cpumunch1553 hendriks 0 0 0 0 0 RW 0 99.9 0.0 21:34 cpumunchThe processes here appear swapped because the ghosts dont have a memory spaceand procfs doesnt mirror remote memory sizes.
• 87. APPENDIX B. POV-Ray
B1.0 What is POV-Ray?
POV-Ray™ is short for the Persistence of Vision™ Raytracer, a tool for producing high-quality computer graphics. POV-Ray™ is copyrighted freeware; that is to say, we, the authors, retain all rights and copyright over the program, but we permit you to use it for no charge, subject to the conditions stated in our license, which should be in the documentation directory as povlegal.doc.
Without a doubt, POV-Ray is the world's most popular raytracer. From our website alone we see well over 100,000 downloads per year, and this of course doesn't count the copies that are obtained from our mirror sites, other internet sites, on CD-ROM, or from persons sharing their copies.
The fact that it is free helps a lot in this area, of course, but there's more to it than that. There are quite a few other free ray tracers and renderers available. What makes this program different?
The answers are too numerous to detail in full here. Suffice it to say that POV-Ray has the right balance of power and versatility to satisfy extremely experienced and competent users, while at the same time not being so intimidating as to completely scare new users off.
Of course, the most important factor is image quality, and in the right hands, POV-Ray has it. We, the developers, have seen images that were rendered using our software that we at first thought were photographs; they were that realistic. (Note that photo-realism is an advanced skill, one that takes some practice to develop.)
B1.1 What is POV-Ray for Unix?
• 88. POV-Ray for Unix is essentially a version of the POV-Ray rendering engine prepared for running on a Unix or Unix-like operating system (such as GNU/Linux). It contains all the features of POV-Ray described in chapters 2 and 3 of the documentation, plus a few others specific to Unix and GNU/Linux systems. These additional features do not affect the core rendering code. They only make the program suitable for running under a Unix-based system, and provide the user with Unix-specific displaying capabilities. For instance, POV-Ray for Unix can use the X Window System to display the image it is rendering. On GNU/Linux machines, it can also display the image directly on the console screen using the SVGA library.
POV-Ray for Unix uses the same scheme as the other supported platforms to create ray-traced images. The POV-Ray input is platform-independent, as it uses text files (POV-Ray scripts) to describe the scene: camera, lights, and various objects.
B2.0 Available distributions
There are two official distributions of POV-Ray for Unix available:
• Source package: this package contains all the source files and Makefiles required for building POV-Ray. Building the program from source should work on most Unix systems. The package uses a configuration mechanism to detect the adequate settings in order to build POV-Ray on your own platform. All required support libraries are included in the package. See the INSTALL file of the source package for details (a typical build sequence is sketched below).
• Linux binary package: this package contains a compiled version of POV-Ray for x86-compatible platforms running the GNU/Linux operating system. A shell script for easy installation is also included. Further details are given in the README file of this package.
Both distributions are available for download at the POV-Ray website and on the POV-Ray FTP server (ftp.povray.org).
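For the source package, the build typically follows the usual configure/make sequence of autoconf-based software. The commands below are only an illustrative sketch; the tarball name and the exact configure arguments (for example, whether a COMPILED_BY string is required) are described in the INSTALL file of the package:
tar xzf povray-3.6.tar.gz
cd povray-3.6
./configure COMPILED_BY="your name <you@example.com>"   # detects platform settings; this argument may be required by this version
make            # builds the povray binary
make install    # installs under /usr/local by default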
• 89. B3.0 Configuration
All official versions of POV-Ray for Unix come with procedures for correctly installing and configuring POV-Ray. These explanations are for reference.
B3.1.1 The I/O Restrictions configuration file
When POV-Ray starts, it reads the configuration for the I/O Restrictions feature from the povray.conf files. See the I/O Restrictions documentation for a description of these files.
B3.1.2 The main POV-Ray INI file
When starting, POV-Ray for Unix searches for an INI file containing default configuration options. The details can be found in the INI file documentation.
B3.1.3 Starting a Render Job
Starting POV-Ray rendering on any scene file is as simple as running povray from a command line with the scene file name as an argument. This will work with either a POV file or an INI file (as long as it has an associated POV file); see Understanding File Types. The scene is rendered with the current POV-Ray 3 options (see Understanding POV-Ray Options).
Note: one of the more common errors new users make is turning off the display option. The Display option (+d) is ON by default. If you turn this OFF in the INI file or on the command line, POV-Ray will not display the image as you render.
Please also note that POV-Ray for Unix will write the output file to a .png by default. There is no way to save the render window after rendering is completed. If you turned file output off before starting the render and change your mind, you will have to start the rendering all over again. We recommend that you just leave file output on all the time.
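As a simple illustration (the file names here are only examples), a render job can be started with either the scene file or an INI file as the argument, and common switches can be added on the command line:
povray scene.pov                      # render with the current default options
povray scene.ini                      # render using the options collected in scene.ini
povray scene.pov +W640 +H480 +A0.3    # override width, height and antialiasing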
• 90. B3.2.1 X Window display
When the X Window display is used, the rendered image is displayed in a graphics window. During rendering, the window will be updated after every scanline has been rendered, or sooner if the rendering is taking a long time. To update it sooner you can click any mouse button in the window or press (almost) any key. Pressing <CTRL-R> or <CTRL-L> during rendering will refresh the whole screen. If you have the Exit_Enable or +X flag set, pressing q or Q at any time during the rendering will stop POV-Ray rendering and exit. The rendering will pause when complete if the Pause_When_Done (or +P) flag is set. To exit at this point, press the q or Q key or click any mouse button in the window.
POV-Ray 3.6 includes a color icon in the program if it was compiled with libXpm (which is available on most platforms where the X Window System is installed). Whether this icon is used for the render view window depends on the window manager being used (KDE, Gnome, fvwm, ...). POV-Ray also comes with a separate color icon (xpovicon.xpm) for use with the window managers that can use external icons. For instance, to have fvwm use this icon, copy the icon file to one of the directories pointed to by PixmapPath (or ImagePath), which is defined in your $HOME/.fvwmrc. Then add the following line in $HOME/.fvwmrc:
Style "Povray" Icon xpovicon.xpm
and restart the X Window server (restarting fvwm will not be enough). Using this icon with another window manager may require a different procedure.
Documentation of the special command-line options to configure the X Window display can be found in Special Command-Line Options.
B3.2.2 SVGAlib display
For GNU/Linux systems that don't have the X Window System installed, or for those Linux users who prefer to run on the console, it is possible to use the SVGA library to display directly to the screen. For SVGAlib display, the povray binary must be
• 91. installed as a setuid root executable. If POV-Ray does not use the SVGAlib display, first try (as root):
chown root.root povray
chmod 4755 povray
Note: doing this may have serious security implications. Running POV-Ray as root or through sudo might be a better idea.
If it still doesn't work, make sure SVGAlib is installed on your machine and works properly. Anything that can at least use the 320x200x256 mode (i.e. regular VGA) should be fine, although modes up to 1280x1024x16M are possible. If you do not have root privileges or cannot have the system administrator install POV-Ray, then you must use the X Window or text display, which do not require any special system privileges to run. If you are using a display resolution that is lower than what you are rendering, the display will be scaled to fit as much of the viewing window as possible.
B3.3.0 Output file formats
The default output file format of POV-Ray for Unix is PNG (+fn). This can be changed at runtime by setting the Output_File_Type or +fx option. Alternatively, the default format can be changed at compile time by setting DEFAULT_FILE_FORMAT in the config.h file located in the unix/ directory. Other convenient formats on Unix systems might be PPM (+fp) and TGA (+ft). For more information about output file formats see File Output Options (a short example follows at the end of this section).
If you are generating histogram files (see CPU Utilization Histogram) in the CSV format (comma-separated values), then the units of time are in tens of microseconds (10 x 10^-6 s), and each grid block can store times up to 12 hours.
To interrupt a rendering in progress, you can use CTRL-C (SIGINT), which will allow POV-Ray to finish writing out any rendered data before it quits. When graphics display mode is used, you can also press the q or Q keys in the rendering preview window to interrupt the trace if the Test_Abort (or +X) flag is set.
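As a brief illustration of the file output options above (the file names are only examples), the format and the output file can be chosen per render on the command line:
povray scene.pov +FN +Oscene.png    # PNG output (the default)
povray scene.pov +FP +Oscene.ppm    # PPM output
povray scene.pov +FT +Oscene.tga    # TGA output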
• 92. B4.0 Rendering the Sample Scenes
POV-Ray for Unix comes with a set of shell scripts to automatically render the sample scenes shipped with POV-Ray. These shell scripts are usually installed in /usr/local/share/povray-3.6/scripts. They require a bash-compatible shell. There are three scripts that are meant to be called by the user.
• allscene.sh: renders all stills. The syntax is:
allscene.sh [log] [all] [-d scene_directory] [-o output_directory] [-h html_file]
If html_file is specified, an HTML listing of the rendered scenes is generated. If ImageMagick is installed, the listing will also contain thumbnails of the rendered images.
• allanim.sh: renders all animations. The syntax is:
allanim.sh [log] [-d scene_directory] [-o output_directory] [-h html_file]
If ffmpeg is installed, the script will compile MPEG files from the rendered animations.
• portfolio.sh: renders the portfolio. The syntax is:
portfolio.sh [log] [-d scene_directory] [-o output_directory]
The portfolio is a collection of images illustrating the POV-Ray features and include files coming with the package.
If the option log is specified, a log file with the complete text output from POV-Ray is written (filename log.txt).
• 93. If scene_directory is specified, the sample scenes in this directory are rendered; otherwise the scene directory is determined from the main povray INI file (usually /usr/local/share/povray-3.6/scenes).
If output_directory is specified, all images are written to this directory; if it is not specified, the images are written into the scene file directories. If the directories are not writable, the images are written in the current directory. All other files (HTML files, thumbnails) are written here as well.
To determine the correct render options, the scripts analyze the beginning of the scene files. They search for a comment of the form
// -w320 -h240 +a0.3
in the first 50 lines of the scene. The animation script possibly also uses an INI file with the same base name as the scene file. The allscene.sh script has the additional all option which, if specified, also renders scenes without such an options comment (using default options in that case).
B5.0 POV-Ray for Unix Tips
B5.1 Automated execution
POV-Ray for Unix is well suited for automated execution, for example for rendering diagrams displaying statistical data on a regular basis, or similar tasks.
POV-Ray can also write its image output directly to stdout, so the image data can be piped into another program for further processing. To do this, the special output filename - needs to be specified. For instance:
povray -iscene.pov +fp -o- | cjpeg > scene.jpg
will pass the image data to the cjpeg utility, which writes the image in JPEG format.
The text output of POV-Ray is always written to stderr; it can be redirected to a file with (using a Bourne-compatible shell):
• 94. povray [Options] 2> log.txt
For remote execution of POV-Ray, as for example in a rendering service on the web, make sure you read and comply with the POV-Ray Legal Document.
B6.0 Understanding File Types
B6.1 POV Files
POV-Ray for Unix works with two types of plain text files. The first is the standard POV-Ray scene description file. Although you may give files of this type any legitimate file name, it is easiest if you give them the .pov extension. In this documentation, scene description files are referred to as POV files.
The second type, the initialization file, is new to POV-Ray 3. Initialization files normally have .ini extensions and are referred to here as INI files.
B6.2 INI Files
An INI file is a text file containing settings for what used to be called POV-Ray command-line options. It replaces and expands on the functions of the DEF files associated with previous versions of POV-Ray. You can store a default set of options in the main POV-Ray INI file, which is searched for at the following locations:
• The place defined by the POVINI environment variable. When you want to use an INI file at a custom location, you can set this environment variable.
• ./povray.ini
• $HOME/.povray/3.6/povray.ini
• PREFIX/etc/povray/3.6/povray.ini (PREFIX by default is /usr/local)
For backwards compatibility with version 3.5, POV-Ray 3.6 also attempts to read the main INI file from the old locations when none is found at the places above:
• $HOME/.povrayrc
• 95. • PREFIX/etc/povray.ini (PREFIX by default is /usr/local)
Note: use of these locations is deprecated; they will not be available in future versions.
Any other INI file can be specified by passing the INI file name on the command line. One of the options you can set in the INI file is the name of an input file, so you can specify the name of a POV file there. This way you can customize POV-Ray settings for any individual scene file.
For instance, if you have a file called scene.pov, you can create a file scene.ini to contain settings specific to scene.pov. If you include the option Input_File_Name=scene.pov in scene.ini and then run povray scene.ini, POV-Ray will process scene.pov with the options specified in scene.ini.
Remember, though, that any options set at the command line when you activate an INI file override any corresponding options in the INI file (see Understanding POV-Ray Options). Also, any options you do not set in the INI file will be taken as last set by any other INI file, or as originally determined in povray.ini.
You can instruct POV-Ray to generate an INI file containing all the options active at the time of rendering. This way, you can pass a POV file and its associated INI file on to another person and be confident that they will be able to generate the scene exactly the same way you did. See the section titled Using INI Files for more information about INI files.
B6.2.1 INI File Sections
Sections are not files in themselves; they are portions of INI files. Sections are a means of grouping multiple sets of POV-Ray options together in a single INI file by introducing them with a section label. Consider the following INI file, taken from the POV-Ray 3 documentation:
; RES.INI
• 96. ; This sample INI file is used to set resolution.
+W120 +H100   ; This section has no label.
              ; Select it with "RES"
[Low]
+W80 +H60     ; This section has a label.
              ; Select it with "RES[Low]"
[Med]
+W320 +H200   ; This section has a label.
              ; Select it with "RES[Med]"
[High]
+W640 +H480   ; Labels are not case sensitive.
              ; "RES[high]" works
[Really High]
+W800 +H600   ; Labels may contain blanks
If you select this INI file, the default resolution setting will be 120 x 100. As soon as you select the [High] section, however, the resolution becomes 640 x 480.
B7.0 Special Command-Line Options
POV-Ray for Unix supports several special command-line options not recognized by other versions. They follow the standards for programs that run under the X Window System.
-display <display_name>
Display the preview on display_name rather than the default display. This is meant to be used to change the display to a remote host. The normal display option +d is still valid.
-geometry [WIDTHxHEIGHT][+XOFF+YOFF]
• 97. Render the image with WIDTH and HEIGHT as the dimensions, and locate the window XOFF from the left edge and YOFF from the top edge of the screen (or, if negative, from the right and bottom edges respectively). For instance, -geometry 640x480+10+20 creates a display for a 640x480 image placed at (10, 20) pixels from the top-left corner of the screen. The WIDTH and HEIGHT, if given, override any previous +Wn and +Hn settings.
-help
Display the X Window System-specific options. Use -H by itself on the command line to output the general POV-Ray options.
-icon
Start the preview window as an icon.
-title <window_title>
Override the default preview window title with window_title.
-visual <visual_type>
Use the deepest visual of visual_type, if available, instead of the automatically selected visual. Valid visuals are StaticGray, GrayScale, StaticColor, PseudoColor, TrueColor, or DirectColor.
Note: if you are supplying a filename with spaces in it, you will need to enclose the filename itself within quotes.
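To illustrate how these X Window options combine with the ordinary POV-Ray switches (the scene file and host name below are only examples):
povray scene.pov -display remotehost:0.0 -geometry 640x480+10+20 -title "Parallex render"
This renders scene.pov locally but shows the 640x480 preview window on the X display of remotehost, offset (10, 20) pixels from the top-left corner of that screen.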
• 98. Glossary of Terms and Acronyms
3D RENDERING
Creating 3D animations or 3D scenes.
BEOWULF CLUSTER
A high-performance cluster built with commodity off-the-shelf hardware.
BINARY-LEVEL PARALLELISM
Parallelism at the instruction level.
BLAS
Basic Linear Algebra Subprograms (BLAS) is a de facto application programming interface standard for publishing libraries that perform basic linear algebra operations such as vector and matrix multiplication.
BPROC
The Beowulf Distributed Process Space (BProc) is a set of kernel modifications, utilities and libraries which allow a user to start processes on other machines in a Beowulf-style cluster.
CAT 5 CABLES
Category 5 cable, commonly known as Cat 5, is a twisted-pair cable type designed for high signal integrity. This type of cable is often used in structured cabling for computer networks such as Ethernet, and is also used to carry many other signals such as basic voice services, token ring, and ATM.
DHCP
Dynamic Host Configuration Protocol. It is used to assign IP address leases to client machines.
DISTRIBUTED COMPUTING
Distributed computing is a form of computing in which a collection of independent machines appears to its users as a single coherent system.
• 99. FREEBSD
FreeBSD is a Unix-like free operating system descended from AT&T UNIX via the Berkeley Software Distribution (BSD) branch through the 386BSD and 4.4BSD operating systems. FreeBSD has been characterized as "the unknown giant among free operating systems." It is not a clone of UNIX, but works like UNIX, with UNIX-compliant internals and system APIs.
IRIX
IRIX is an operating system by Silicon Graphics Inc.
MIPS
MIPS (originally an acronym for Microprocessor without Interlocked Pipeline Stages) is a RISC microprocessor architecture developed by MIPS Technologies. MIPS designs are primarily used in embedded systems such as the Series2 TiVo, Windows CE devices, Cisco routers, Foneras, and video game consoles like the Nintendo 64 and the Sony PlayStation, PlayStation 2, and PlayStation Portable.
MPI
Message Passing Interface. MPI is a library specification for message passing, proposed as a standard by a broadly based committee of vendors, implementers, and users.
NETBSD
NetBSD is a freely redistributable, open-source version of the Unix-derivative BSD computer operating system. Noted for its portability and quality of design and implementation, it is often used in embedded systems and as a starting point for porting other operating systems to new computer architectures.
• 100. NFS
Network File System. A network filesystem is a filesystem that physically resides on one computer (the file server), which in turn shares its files over the network with other computers (the clients).
PLAN 9
Plan 9 from Bell Labs is a distributed operating system, primarily used as a research vehicle. It was developed as the research successor to Unix by the Computing Sciences Research Center at Bell Labs. Plan 9 is most notable for representing all system interfaces, including those required for networking and the user interface, through the filesystem rather than through specialized interfaces.
POVRAY
Persistence of Vision Raytracer. A 3D rendering tool.
PVM
Parallel Virtual Machine. A tool used to run applications in parallel.
RARP
Reverse Address Resolution Protocol. It is used to resolve an IP address from a MAC address.
RPM
RPM Package Manager (originally Red Hat Package Manager, abbreviated RPM) is a package management system. The name RPM refers to two things: a software package file format, and software packaged in this format. RPM was intended primarily for Linux distributions; the RPM file format is the baseline package format of the Linux Standard Base.
RSH
Remote shell protocol. It is used for remote login into client machines.
• 101. SSI
Single System Image (SSI) clustering. Presenting the collection of machines that make up a cluster as a single machine.
SSH
Secure Shell protocol. It is an encrypted replacement for RSH, used to connect to or log in to a remote machine on the network, authenticating with that machine's password.
TCP/IP
Transmission Control Protocol/Internet Protocol. This protocol suite is used to transmit messages reliably between computers on the same or different networks.
• 102. "Parallex – The Super Computer": A Memorable Journey
[Photograph] Parallex's first prototype with two machines
[Photograph] "Parallex – The Super Computer" with diskless machines
• 103. [Photograph] Display of the Parallex master
[Photograph] At the Parallex stall with our project guide Prof. Anil J. Kadam (representing the Computer Department)
• 104. [Photograph] All smiles: the Chief Guest and Guest of Honour of Engineering Today 2008 at the Parallex stall
[Photograph] Explaining our "Parallex – The Super Computer" (our HOD madam at extreme right)
• 105. "Parallex – The Super Computer": Achievements
FIRST in the project competition and exhibition of the intercollegiate national-level event "EXCELSIOR 08".
FIRST in the project competition of the national-level students' technical symposium and exposition "AISSMS Engineering Today 2008".
SECOND in the technical paper presentation of the national-level students' technical symposium and exposition "AISSMS Engineering Today 2008".
FIRST in the project competition of the national-level technical event "Zion 2008".
Finalist in many national-level project competitions.
Letter of recommendation from our Head of Department, together with support for setting up a "High Performance Computing" laboratory (letter attached on the next page).
• 106. Letter of Recommendation