
PeerToPeerComputing (1)



CHAPTER 1 INTRODUCTION

1. INTRODUCTION

The design of complex systems such as aircraft and space vehicles requires a very large amount of computational resources. The same is true of fields such as telecommunications, meteorology, weather forecasting, climate research, molecular modelling and physical simulation, which involve highly calculation-intensive tasks such as huge mathematical problems and the processing of large amounts of data or signals. The most popular solution is the supercomputer, which is composed of hundreds of thousands of processors connected by a local high-speed computer bus. However, supercomputers are very expensive and are found only in research laboratories and in organizations funded by governments and big industrial enterprises. For most users, supercomputers are therefore out of consideration.

Recently, peer-to-peer (P2P) applications have seen great development. These applications can combine the computational power of the individual CPUs in a network into one large pool of computational power. With P2P, we can think of building a virtual supercomputer from a group of ordinary computers in a network. This was practically very difficult in the past because of limits on processing power, the cost of computers and the bandwidth available at the time.

However, peer-to-peer is far from a new technology. The servers in many older technologies cooperate in a peer-to-peer manner to exchange the information they need. Usenet news, email and IRC all fall into this category. In fact, IRC takes it a step further: clients on the network can connect to each other directly to exchange resources.

Even when we run several tasks concurrently on a computer, for example listening to music while editing text and downloading files, we use only about 2-3% of the CPU's total capacity, and the CPU is almost idle. A typical user needs no more than this, so expensive CPU cycles are wasted.
To the best of our knowledge, peer-to-peer technologies fall into categories that range from completely centralized to completely decentralized. The protocols and topologies of the centralized peer-to-peer technologies are not remarkably exciting or complex. They operate on
the client-server design, with file transfers occurring at the client-to-client level. As a result, there is not much of interest to discuss. At the other extreme are the completely distributed architectures, which have very interesting and often complex topologies and protocols and are discussed in much greater detail.

Most existing P2P environments for high-performance computing are based on a centralized architecture, in which the central server is a single point of failure for the whole system. Few systems consider tightly coupled problems in which tasks communicate frequently, such as applications solved by parallel or distributed iterative algorithms [2].

1.1 Problem Statement

In traditional client-server applications, the server can become overloaded as the number of simultaneous client requests increases. The client-server paradigm lacks robustness: the system has a single point of failure, and after a critical server failure client requests cannot be fulfilled. In addition, a very large computational task takes a long time and demands heavy CPU usage when it runs on a single processor. Meanwhile, even when we run several tasks concurrently on a computer, we use only about 2-3% of the CPU's total capacity, so the CPU is almost idle and expensive CPU cycles are wasted. Peer-to-peer networks are an alternative to client-server applications, but those built on a centralized architecture suffer from the same bottleneck.

1.2 Objectives

▪ To design and develop a pure peer-to-peer computing system, using socket programming, that facilitates parallel computation of the specified problems.
▪ To develop a system that can perform heavy mathematical calculations in a short time using the processing capabilities of the peers available in the network.
▪ To design, analyse and test features that address scalability and security issues.
▪ To manage the stated tasks using the proposed components: the administrator group, the query manager group, and the task dispatcher and task processor groups.

1.3 Scope

The system is designed to minimize the time required to compute highly calculation-intensive tasks by providing tools that distribute a task among a number of peers; such a task would otherwise have to be performed on a single powerful computer or a supercomputer. By minimizing time and increasing efficiency, the system will meet end-user needs while remaining easy to understand and use. More specifically, the system is designed to allow an
end-user to manage and communicate with a number of peers. The software will facilitate communication through socket programming in Java.

1.4 Platform specification

1.4.1 Hardware

1.4.2 Software

Rational Rose

Rational Rose is a tool set produced and marketed by Rational Software Corporation (now owned by IBM). Rose is an operational tool set that uses the Unified Modelling Language (UML) to capture domain semantics and architectural and design intent. UML has a number of different notations, which allow design artefacts to be specified from many different perspectives and for different objectives during the computer engineering life cycle. Most of these notations are directly supported by the Rose tool set.

Eclipse

Eclipse is a multi-language software development environment comprising an integrated development environment (IDE) and an extensible plug-in system. It is written mostly in Java. It can be used to develop applications in Java and, by means of various plug-ins, in other programming languages including Ada, C, C++, COBOL, Fortran, Haskell, Perl, PHP, Python, Ruby (including the Ruby on Rails framework), Scala, Clojure, Groovy and Scheme. It can also be used to develop packages for the software Mathematica. Development environments include the Eclipse Java development tools (JDT) for Java, Eclipse CDT for C/C++ and Eclipse PDT for PHP, among others.
Socket Programming

The endpoint of an interprocess communication is called a socket, or a network socket for disambiguation. Since most communication between computers is based on the Internet Protocol, an almost equivalent term is Internet socket. Data transmission between two sockets is organised by communications protocols, usually implemented in the operating systems of the participating computers. Application programs write to and read from these sockets. Network programming is therefore essentially socket programming.

CHAPTER 2 SYSTEM ANALYSIS

2. SYSTEM ANALYSIS

2.1 Identification of need

P2P networks are an alternative to traditional client-server applications: they replace client-server interactions with peer interactions, in which peers can act as clients, servers and edge peers, giving a much more flexible architecture. In a P2P network, the overall bandwidth can be computed as the sum of the bandwidths of all nodes in the network under consideration. Resources are usually distributed among many nodes in the network [1].

The drawbacks of the traditional client-server paradigm are the basis for the evolution of P2P networks:
• In a client-server application, the server can become overloaded as the number of simultaneous clients increases, while in a P2P network the aggregate bandwidth actually increases as nodes are added.
• The client-server paradigm lacks robustness, while a P2P network provides good robustness.
• In a client-server application, a critical server failure prevents client requests from completing, but in a P2P network resources are distributed among many nodes, so the other nodes can handle such a situation.
• When a computational task is large, it takes a very long time and requires heavy CPU usage if it is done on a single processor.

The above scenario points to the need for a P2P system that can:
• Allow sharing of resources between peers without a central server.
• Use the idle cycles of desktop machines to solve complex problems.

2.2 Preliminary Investigation

The peer-to-peer application structure was popularised by file-sharing systems such as Napster. Napster was originally founded as a pioneering peer-to-peer file-sharing Internet service that emphasized sharing audio files, typically music encoded in MP3 format; it later became an online music store.

Peer-to-peer systems usually implement an abstract overlay network, built at the application layer, on top of the physical network topology. Such overlays are used for peer indexing and make the peer-to-peer system independent of the physical network topology.

A similar project that uses the CPU cycles of a number of peers is SETI@home. The Search for Extraterrestrial Intelligence (SETI) is the collective name for a number of activities undertaken to search for intelligent extraterrestrial life; one of the best-known projects is run by the SETI Institute. The problem that still persists is that SETI@home is a hybrid P2P application, meaning that a centralized server is still required.

There are also a number of applications that are decentralized peer-to-peer systems. Some of them are Freenet, GNUtella and FastTrack - KaZaA.

Freenet

Freenet's architecture is completely decentralised and distributed, meaning that there are no central servers and that all computations and interactions happen between clients.
GNUtella

GNUtella's architecture is similar to Freenet's in that it is completely decentralised and distributed, meaning that there are no central servers and that all computations and interactions happen between clients.
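
Because neither Freenet nor GNUtella has a central server, every node has to play both roles: it accepts connections like a server and opens connections like a client. The following is a minimal sketch of such a symmetric peer using Java sockets. The class name SymmetricPeer, the one-line text protocol and the use of the loopback address are assumptions made purely for illustration; they are not taken from Freenet, GNUtella or this project's code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch: one object that is both a server and a client.
class SymmetricPeer implements AutoCloseable {
    private final String name;
    private final ServerSocket listener;

    SymmetricPeer(String name) {
        this.name = name;
        try {
            // Port 0 asks the operating system for any free port.
            this.listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread thread = new Thread(this::serve, name + "-listener");
        thread.setDaemon(true); // do not keep the JVM alive for the listener
        thread.start();
    }

    int port() {
        return listener.getLocalPort();
    }

    // Server role: answer each incoming one-line request with one line.
    private void serve() {
        while (!listener.isClosed()) {
            try (Socket socket = listener.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()));
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
                out.println(name + " received: " + in.readLine());
            } catch (IOException e) {
                // Listener closed or connection failed; the loop re-checks isClosed().
            }
        }
    }

    // Client role: connect directly to another peer and wait for its reply.
    String ask(int peerPort, String message) {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), peerPort);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println(message);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            listener.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try (SymmetricPeer a = new SymmetricPeer("A");
             SymmetricPeer b = new SymmetricPeer("B")) {
            System.out.println(a.ask(b.port(), "hello")); // A acts as client, B as server
            System.out.println(b.ask(a.port(), "hi"));    // roles reversed
        }
    }
}
```

Neither peer is privileged: each one can be asked and can ask, which is the property that distinguishes this design from a client-server one. A real peer would also need peer discovery and concurrent handling of connections, both of which are omitted here.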
FastTrack - KaZaA

FastTrack is a more recent arrival on the peer-to-peer scene, and it brings a new, more scalable architecture that still follows a decentralised design.

The limitations of the systems described above provide the basis for the development of a pure peer-to-peer system [3].

CHAPTER 3 FEASIBILITY STUDY

3. FEASIBILITY STUDY

After the requirements and the specification of the proposed system have been analysed, a feasibility study of the proposed system is conducted. The feasibility study determines whether the system is beneficial to the user and the organization, and it is carried out to select the best system that meets the performance requirements. It includes an investigation of the information needs of the end user and of the objectives, constraints, basic resource requirements and cost benefits. The prime objective of a feasibility study is not to solve the problem but to acquire a sense of its scope.
On this basis, the feasibility of the proposed system can be evaluated in the following major categories.

3.1 Technical feasibility

Technical feasibility deals with identifying viable technology options for implementing the functionality within the scope of the project. We need the following resources:
• Pentium IV or above.
• 1 GB RAM or above.
• Windows XP or any OS that supports Java.
• Eclipse.
• Rational Rose 7.0 Enterprise.

We can say with confidence that the project is technically feasible, since there will be little difficulty in obtaining the resources required to develop and maintain the system. All the resources needed for development and maintenance of the software are available in the organization, so we are using resources that are already at hand.

3.2 Economical Feasibility

The system can be developed technically, and it will be a good investment for the organization and the user. Financial benefits must equal or exceed the costs. The following points show that the proposed project is financially feasible:
• On the developer side, all the hardware and software requirements are already present, so no financial investment is required to undertake development.
• The target audience of the proposed system is expected to have all the hardware and software required to run the system, so the system is also financially feasible for the target users.
• The proposed system will increase the work efficiency of the target audience, which will in turn increase their profit and reduce their investment of time and energy.

Development of this application is highly economically feasible. The organization need not spend much money on development, because the required resources are already available. The only thing to be done is to set up a development environment with effective supervision. If we do so, we can make maximum use of the corresponding resources.
Even after development, the organization will not need to invest more in the system. Therefore, the system is economically feasible.

3.3 Operational feasibility
Operational feasibility is the process of assessing the degree to which a proposed system solves business problems or takes advantage of business opportunities. The project functions correctly under the stated conditions: provided that the hardware and software requirements are met, it runs smoothly. The operational feasibility of our application is as follows:
• User-friendly interface.
• All the important operations are supported.
• Easy to add new modules.
• Growing needs of the user can be easily taken care of.

Once developed, the system can be easily deployed in any organization where sharing of resources between peers is required. It is intended to remain fully functional, with its benefits intact, wherever the stated requirements are met.

CHAPTER 4 LITERATURE SURVEY
4. Literature Survey

4.1 Work done by others

Decentralized Peer-to-Peer

Freenet

Freenet's architecture is completely decentralized and distributed, meaning that there are no central servers and that all computations and interactions happen between clients. On Freenet, all connections to the network are equal. Clients connecting to Freenet connect randomly to any available clients, which produces an unorganized, scattered topology. Communication on Freenet occurs by sending a request to a client you are connected to, which in turn passes it on to another client it is connected to, and so on. A client that receives a packet from another client does not know whether the packet originated from that client or from elsewhere, which lends itself to anonymity on Freenet. Freenet allows resources to be inserted into the network, searched for and retrieved.

GNUtella

GNUtella's architecture is similar to Freenet's in that it is completely decentralized and distributed, meaning that there are no central servers and that all computations and interactions happen between clients. All connections on the network are equal. A client that wishes to connect to the network runs through a list of nodes that are most likely to be up, or takes a list from a website, and then connects to as many nodes as it wants. This produces a random, unstructured network topology. Routing in the network is accomplished through broadcasting. When a search request arrives at a client, that client searches itself for the file and broadcasts the request to all its other connections. Broadcasts are cut off by a time to live (TTL) that specifies how many hops they may cover before clients should drop them rather than broadcast them. This packet routing technique provides a small degree of anonymity on GNUtella networks.
A client that receives a packet does not know whether the client it received the packet from is the original sender or just another link in the chain. This is somewhat undermined, however, by the fact that nearly all packets on the network start with a TTL of 7: if you receive a packet with a TTL of 7, you can be nearly certain that it originated from your immediate upstream neighbour. GNUtella provides the ability to search for files. To download a file, the client creates a direct connection to the client that holds the file and sends an HTTP request for it. The client with the file interprets this request and sends a standard HTTP response. However, this removes any anonymity in the system, as there is no way to publish or consume resources anonymously.

FastTrack - KaZaA

FastTrack is a more recent arrival on the peer-to-peer scene, and it brings a new, more scalable architecture that still follows a decentralized design. The FastTrack protocol
is currently used by two file-sharing applications, KaZaA and Morpheus. The FastTrack architecture follows a two-tier system in which the first tier consists of fast connections to the network (cable/DSL and up) and the second tier consists of slower connections (modem and slower). Clients on the first tier are known as Super-Nodes and clients on the second tier are known as Nodes. Upon connection to the network, the client software decides whether you are suitable to become a Super-Node. If you can become a Super-Node, you connect to other Super-Nodes and start accepting connections from ordinary Nodes. If you become a Node, you find a Super-Node that will allow you to connect and connect to it. This produces a two-tier topology in which the nodes at the centre of the network are faster and therefore form a more reliable and stable backbone. This allows more messages to be routed than a slower backbone would, and therefore allows greater scalability. Routing on FastTrack is accomplished by broadcasting between the Super-Nodes. Downloading on FastTrack works the same way as on GNUtella [2].

4.2 Benefits

o CPU cycles are provided to those who need them, so CPU cycles are used efficiently.
o With the sharing of tasks and parallel computation, problems are solved quickly.
o The chances of failure or inefficient operation are lower because of the decentralized architecture and the presence of more than one peer in each group.
o The system provides better performance, increased robustness and cost effectiveness.
o The system is easy to expand and makes better use of bandwidth, resources and processing power.

4.3 Proposed solution

The aim of our project is to design an environment for implementing peer-to-peer networks that facilitate high-performance computing. We focus on applications in the "Distributed Computing" domain, where a task is split into chunks that are performed by a number of computers forming a network.
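
As a rough illustration of splitting a task into chunks, the sketch below divides a range of numbers among several workers, computes a partial result for each chunk in parallel and merges the partial results. Threads in a local pool stand in for peers here, which is a simplification: in the proposed system the chunks would travel to other machines over sockets. The class name ChunkedSum, the method sumOfSquares and the choice of problem are hypothetical and are not taken from the project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: worker threads stand in for task-processor peers.
class ChunkedSum {

    // Sums i*i for i in [from, to] by splitting the range into at most
    // `peers` chunks, computing the chunks concurrently and merging them.
    static long sumOfSquares(long from, long to, int peers) {
        ExecutorService pool = Executors.newFixedThreadPool(peers);
        try {
            long total = to - from + 1;
            long chunk = (total + peers - 1) / peers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (long start = from; start <= to; start += chunk) {
                final long lo = start;
                final long hi = Math.min(to, start + chunk - 1);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (long i = lo; i <= hi; i++) {
                        sum += i * i;
                    }
                    return sum;
                };
                parts.add(pool.submit(task)); // "dispatch" one chunk
            }
            long result = 0;
            for (Future<Long> part : parts) {
                result += part.get(); // wait for each partial result and merge
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a chunk", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a chunk failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1, 1000, 4)); // prints 333833500
    }
}
```

This only works so simply because the chunks are independent: no chunk needs another chunk's result. Tasks whose parts must communicate frequently, such as iterative algorithms, need a more elaborate scheme.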
Our environment is built on a decentralized architecture in which peers can communicate directly with each other. Several less visible aspects, such as scalability, robustness, self-organization and resource collection, are taken into account. We have followed a classical approach to the design of a distributed computing environment; in particular, we use a self-adaptive communication protocol, JXTA. We develop in the Java language, which we consider well suited to high-performance computing applications.

Users who are willing to provide their CPUs will be allowed to participate in the computation of a large task that has been divided into a number of parts. Such users will
be termed peers. This type of computation therefore leads to the parallel processing that is required.

Initially, a peer sends a request stating its willingness to work as an administrator, a task dispatcher or a task processor. If an administrator already exists, it may allow the peer to work as an administrator, dispatcher or processor; otherwise the peer becomes the first administrator. In this way, the administrator assigns every peer a particular role. Every peer is a member of a particular group of dispatchers or processors.

When a peer submits a task request, the administrator receives it first. Its job is to select a task dispatcher that is not currently loaded with any other task. The chosen task dispatcher then scans the task processors working under it. After this, the chosen group of task processors divides the task and starts the computational work. Thus all the available peers are managed in an efficient manner.

CHAPTER 5 CONCLUSION & DISCUSSION

7. CONCLUSION AND DISCUSSION

7.1 Limitation of Project

1. Limitation on use: The system can only be used for a limited number of specified tasks and only with a limited number of peers.
2. Security problems: If a malicious peer is involved in the system, it can harm the data belonging to a particular job, because the peer has access to that data.
3. Implicit assumptions: The assumption is that there will be a number of peers who are willing to donate their CPU cycles. If no such peer comes forward, the task is performed in the usual way.
4. Parallel operation: The job is performed by dividing it and distributing it to a number of peers. Therefore the computation speed of a given job depends directly on the number of
peers available in the system at that time. Fewer peers mean slower computation and thus a longer time to completion.

7.2 Difficulties Encountered

During the analysis phase we faced some issues that arise from the fact that peer-to-peer technology is new and still evolving. It is quite complex, and understanding the work done by others well enough to implement the same concepts with the same precision requires focus.

7.3 Future Enhancement Suggestions

• This project may be scaled to work with any sort of linearly solvable task.
• We may introduce security features.
• We may look for ways to deal with scalability problems.
• We may introduce features to stop free riders.

CHAPTER 6 BIBLIOGRAPHY & REFERENCES

8. BIBLIOGRAPHY & REFERENCES

8.1 Reference Books

[1] Mastering JXTA: Building Java Peer-to-Peer Applications, by Joseph D. Gradecki, John Wiley and Sons, 528 pages, September 2002.
[2] JXTA Java Standard Edition v2.5 Programmers Guide, Sun Microsystems, Inc., 210 pages, 2007.
[3] The JXTA Java Standard Edition Implementation v2.7 Programmers Guide, by Jerome Verstrynge, Sun Microsystems, Inc., 171 pages, March 2011.
8.2 Other documents & Resources

[1] Distributed Computing
[2] P2P
[3] SETI@home
[4] JXTA
[5] JXSE
[6] Rational Rose
[7] Eclipse