Java Abs Peer To Peer Design & Implementation Of A Tuple S



  • 1. PEER TO PEER DESIGN & IMPLEMENTATION OF A TUPLE SPACE

Coordination between nodes within distributed systems is a complex problem and a current focus of research. It needs to take into account issues of performance, scalability, dependability and heterogeneity. One interesting method of coordination is a decentralised tuple space layer built on top of a peer to peer network. Potentially this solution could be more efficient, flexible, robust and scalable than other coordination implementations.

The goal of this project is to investigate this assertion by implementing a distributed and fully decentralised tuple space coordination layer on top of a peer to peer network. It will provide a virtual shared space that can be accessed by any node within the peer to peer network regardless of its physical location. Nodes within the network should be able to post data to the shared space and retrieve data from it based on the content of the data.

The overall goal of this project is to create a distributed system which implements a tuple space over a peer to peer network. Tuple spaces are currently a major area of research within the field of distributed computing. Their primary concern is the coordination of multiple heterogeneous computers in geographically remote locations in order to achieve a common task. Communication is achieved through the exchange of tuples in the tuple space, rather than through direct communication between nodes. This is known as asynchronous and decoupled communication. It could be useful, for example, in a mobile environment, where there are no guarantees of an ‘always-on’ service. The tuple space paradigm is also concerned with achieving these interactions in a scalable, robust and efficient way.

PROJECT TERMINOLOGY

A tuple space is an example of shared associative memory that provides a repository for bags of tuples (a tuple is a typed set of values - see figure 1.1). Unlike physical memory, where data is stored by its address, a tuple space is associative in that tuples are stored and retrieved by their content or by their type. An important distinction is that it is logically shared memory rather than physically shared memory. This means that the tuples could be distributed over a set of nodes. The tuple space simply provides the necessary abstraction for higher level applications.

A tuple space provides decoupled asynchronous communication between nodes in a network, i.e. for a node to communicate data to another node, it does not have to establish a permanent connection.

Tanenbaum (2002) states that a “distributed system is a collection of independent computers that appears to its users as a single coherent system”[1].
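To make the associative lookup described under PROJECT TERMINOLOGY concrete, the following is a minimal single-process sketch in Java. It is not the project's implementation: the class name `TupleSpaceSketch`, the method names `write`, `read` and `take`, and the matching rule (a `null` template field is a wildcard, a `Class` field matches by type, anything else matches by value) are all assumptions chosen for illustration. Unlike a Linda-style space, the lookups here return `null` rather than blocking when nothing matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustrative, single-process tuple space; all names here are hypothetical. */
class TupleSpaceSketch {
    // A bag of tuples: duplicates are allowed and there is no addressing by position.
    private final List<Object[]> bag = new ArrayList<>();

    /** Post a tuple into the shared space. */
    public synchronized void write(Object... tuple) {
        bag.add(tuple.clone());
    }

    /** Non-destructive lookup: returns a copy of the first matching tuple, or null. */
    public synchronized Object[] read(Object... template) {
        for (Object[] tuple : bag) {
            if (matches(template, tuple)) {
                return tuple.clone();
            }
        }
        return null;
    }

    /** Destructive lookup: removes and returns the first matching tuple, or null. */
    public synchronized Object[] take(Object... template) {
        Iterator<Object[]> it = bag.iterator();
        while (it.hasNext()) {
            Object[] tuple = it.next();
            if (matches(template, tuple)) {
                it.remove();
                return tuple;
            }
        }
        return null;
    }

    // Assumed matching rule: null = wildcard, Class = match by type, otherwise match by value.
    private static boolean matches(Object[] template, Object[] tuple) {
        if (template.length != tuple.length) {
            return false;
        }
        for (int i = 0; i < template.length; i++) {
            Object field = template[i];
            if (field == null) {
                continue;
            }
            if (field instanceof Class) {
                if (!((Class<?>) field).isInstance(tuple[i])) {
                    return false;
                }
            } else if (!field.equals(tuple[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TupleSpaceSketch space = new TupleSpaceSketch();
        space.write("temperature", "room-1", 21);
        space.write("temperature", "room-2", 19);
        // Retrieve by content: the third field is left as a wildcard.
        System.out.println(Arrays.toString(space.read("temperature", "room-2", null)));
        // Retrieve by type and remove the tuple from the space.
        System.out.println(Arrays.toString(space.take("temperature", null, Integer.class)));
    }
}
```

The writer and the reader never refer to each other, only to the space, which is the decoupling the text describes. A distributed version would additionally have to decide which node stores each tuple.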
  • 2. Tanenbaum's definition works well within the context of the tuple space paradigm, as a tuple space provides a single entry point into a distributed system; higher level applications do not need to concern themselves with the implementation of the distributed system underneath.

Problem definition

Developing a tuple space over an underlying peer to peer network presents a number of interesting challenges, namely:
• How does the tuple space decide where tuples are stored within the system so that they can be retrieved reliably and efficiently?
• How can a flexible solution be provided that will adapt well to many different application level problems?
• How can the different concerns of the system be separated into components?

AIMS OF PROJECT

The primary aim of this project is to develop a tuple space coordination layer and to investigate the potential robustness of this solution. The scalability of the tuple space implementation could also be investigated; however, this is considered to be outside the scope of the project as it would need considerable time and resources. A secondary aim is to investigate how this implementation can be mapped on top of a peer to peer network, more specifically a Chord open network overlay. The final aim is to investigate how this implementation can be integrated into the Gridkit architecture as a plug-in for the interaction framework.

Primary aims
• To investigate the role of peer to peer technology in supporting decentralised tuple space operations.
• To determine an efficient mapping between tuple spaces and Chord-like distributed hash table data structures.
• To investigate issues of flexibility within the system, i.e. how to provide a flexible solution to the application level without sacrificing other factors.
• To investigate issues of performance within the system.
• To investigate how multi-dimensional data can be efficiently retrieved from the tuple space.

Secondary aims

Consideration will be given to these aims during the design, implementation and evaluation of the system. However, they may not necessarily be covered in depth due to the time constraints of the project.
• To investigate a component based approach that fits within the Gridkit middleware architecture.
• To determine what this approach can provide in terms of configurability and re-configurability, i.e. how the system can be adapted to the needs of the application level.
• To investigate the scalability and robustness of the solution.
  • 3. This section covers existing peer to peer based systems, the various tuple space implementations (both client-server based and peer to peer based) currently available, the different methods available for constructing a peer to peer tuple space implementation, and the Gridkit and OpenCOM architecture.

Peer to Peer systems

To understand the requirements of this project, a look into existing peer to peer networks and their properties is needed first. The motivation for using peer to peer technology as a method of developing this system will also be considered. A peer to peer network is one in which all nodes (known as ‘peers’) in the network are equal; there is no single point of failure. Research into peer to peer networks is focused on how to both store and find data within the network. There are two main schools of thought: structured and unstructured peer to peer networks.

Gnutella

Gnutella is an open source, fully decentralised peer to peer network, originally developed by Nullsoft. It is an example of an unstructured peer to peer network providing methods for distributed searches. Gnutella used a method called query flooding which, although it provided scalability, was inefficient and did not guarantee lookup results. Gnutella also provided high fault tolerance, because a node sends each query out to every active node that it is connected to. This ensures that the query propagates its way through the network even if some connected nodes have failed.

Distributed Hash Tables (DHTs)

Distributed hash tables have been designed to provide efficient and guaranteed lookups and reliable resource discovery whilst providing the scalability of solutions such as Gnutella. They work by partitioning a set of keys and their respective values over a number of nodes within a network. DHTs can efficiently route messages to the unique owner of a particular key. Most DHTs use consistent hashing to map keys to nodes. For example, a key is mapped by a hash function to an ID, and a routing mechanism then delivers that key to the node that is responsible for it. The value for that key can later be retrieved by hashing the key again, as this will produce exactly the same ID. To route messages in a DHT, each node uses a routing table which contains a set of links to nodes that are close to it; these links in turn form an overlay network. There are many different DHT implementations, examples being Chord[3] and CAN[4].

Chord

Stoica, Morris et al (2001) presented a DHT implementation called Chord. Chord envisages the nodes in the overlay network as being conceptually joined in a circle using a type of doubly-linked list. Chord provides lookups in the network using only O(log n) messages, n being the number of nodes in the network. Chord introduced the notion of successors and predecessors. The first node whose ID is equal to or succeeds the key is responsible for providing storage for that key. If that node were to leave the network, the key would be moved to the next successor. This method ensures a high level of robustness whilst at the same time minimising the load placed on nodes: the network adapts itself and redistributes the keys as the topology of the network changes (i.e. as nodes join and leave). Each node maintains a routing table with details of nodes logically close to it, for routing Chord messages. This makes it practical to scale to many nodes. Figure 2.1 shows an example Chord identifier circle with 3 nodes. Key 1 is located at node 1, key 2 is located at node 2 and key 6 is located at node 0.
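The successor rule in the Chord example can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `ChordRingSketch` is hypothetical, the identifier circle is assumed to have 8 positions (3 bits), the node IDs 0, 1 and 2 are read off the key placements quoted in the text, and keys are used directly as IDs instead of being produced by a hash function. The sketch also keeps the whole node set in one sorted set, so it shows which node owns a key but not the O(log n) routing that real Chord nodes perform with their routing tables.

```java
import java.util.TreeSet;

/** Illustrative successor lookup on a Chord-style identifier circle (hypothetical class). */
class ChordRingSketch {
    private final int ringSize;
    private final TreeSet<Integer> nodes = new TreeSet<>();

    /** bits = size of the identifier space in bits (assumed to be 3 in the example). */
    ChordRingSketch(int bits, int... nodeIds) {
        this.ringSize = 1 << bits;
        for (int id : nodeIds) {
            nodes.add(Math.floorMod(id, ringSize));
        }
    }

    /** The owner of a key is the first node whose ID is equal to or follows the key. */
    int successor(int key) {
        Integer owner = nodes.ceiling(Math.floorMod(key, ringSize));
        // No node at or after the key: wrap around to the smallest node ID.
        // (first() throws NoSuchElementException if the ring has no nodes at all.)
        return owner != null ? owner : nodes.first();
    }

    /** A node leaves; its keys implicitly fall to the next successor. */
    void leave(int nodeId) {
        nodes.remove(nodeId);
    }

    public static void main(String[] args) {
        ChordRingSketch ring = new ChordRingSketch(3, 0, 1, 2);
        for (int key : new int[] {1, 2, 6}) {
            System.out.println("key " + key + " -> node " + ring.successor(key));
        }
        ring.leave(2);
        System.out.println("after node 2 leaves, key 2 -> node " + ring.successor(2));
    }
}
```

With these assumed IDs the lookups reproduce the placements given for Figure 2.1, and removing node 2 shows the key being handed on around the circle.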
  • 4. Motivation for using DHT peer to peer technology

Peer to peer networks have a number of interesting properties compared with traditional communication models such as client-server. They can be more scalable than their client-server counterparts as there is no single bottleneck, i.e. no central server that every peer in the system must communicate with. They can also be more robust in terms of both searching for data and storing data. This is due to the decentralised operation of servers and the possibility of distributed replication of files across the network. These factors potentially provide a system with greater availability than existing approaches. The DHT variant can provide this functionality combined with efficient guaranteed lookups.

Existing Tuple Space Systems

Existing tuple space systems can be classed into two different types: client-server based and peer to peer based. This section will present the motivations behind investigating a peer to peer approach.

Linda Spaces[5]

The tuple space concept was first introduced by Gelernter (1985). It was developed with coordination within parallel programming in mind and was designed as an extension to existing programming languages. It pioneered the concept of using a logical shared associative memory space to store operations, and the use of three tuple operations to write, read and destructively read tuples from the tuple space. It also developed the concept of using ‘template tuples’ to provide lookups within the tuple space. Template tuples can provide all or some of the values required to retrieve tuples from the tuple space. They can also specify wildcard and range searches to provide flexibility when retrieving tuples. More recently the concept has been adapted for use in coordination within distributed environments.

Client-Server based Systems

Many of the tuple space systems currently available are based on the client-server model. JavaSpaces[6] within the JINI technology platform and TSpaces[7] from IBM are examples of this approach. The advantage of this model is its simplicity: it does not have the problems of coordinating the system over a set of distributed nodes. The primary disadvantage of the client-server model is that it introduces a single point of failure and may place a high load on the server. These two problems limit the scalability of such systems, something that decentralised tuple space systems are designed to address.

Motivation for developing a decentralised peer to peer tuple space

The previous section detailed some of the reasons for using peer to peer technology in this project, notably the potential for greater scalability, performance and robustness. What remains to be considered is the motivation for developing a tuple space over peer to peer technology. The tuple space paradigm is at its most useful in environments with a large number of geographically dispersed nodes which have intermittent availability. The traditional client-server paradigm is a poor fit here, as it would be difficult to support this sort of functionality. Peer to peer technology lends itself well to this functionality, and combined with its other characteristics it makes for an interesting platform.

EXISTING PEER TO PEER TUPLE SPACE SYSTEMS

More recently researchers have been looking to decentralise the operation of tuple spaces as a way of improving availability and scalability. This section will detail some of the approaches that have been considered.
  • 5. Comet - Li and Parashar (2005) present a system called Comet. Comet makes use of a Hilbert space filling curve to map tuples onto underlying Chord nodes. Space filling curves are described in more detail following this section; in essence they provide a multi-dimensional to one-dimensional mapping. The Hilbert space filling curve is a locality preserving function, in that contextually similar tuples are grouped together. This improves the performance of the system when looking up data and performing range or wildcard searches, as similar tuples should be grouped together on similar peers.

PeerGameSpace - Wang, Hsiao et al (2005) present a method of using ‘shortcuts’ within a Chord network to point to various applicable tuples. However, it is not made clear how many shortcuts would be needed to retrieve tuples in a multi-dimensional context, or how efficient and flexible range queries could be implemented. They have developed a simple peer to peer game to run on top of this implementation.

Panda - Christian, Durate et al (2004) present an entirely different method in the Panda system. Panda uses a two-tier approach to storing tuples within a tuple space: a tuple is stored on the underlying node that is responsible for the ‘signature’ of that tuple, the signature being the complete type of the tuple, i.e. an ordered list of the types of its fields. Inside the hash table on that node, tuples are stored under a key obtained by hashing the ‘content’ of the data.

Pier - Although not a tuple space in the traditional sense, Pier presents a method of using ‘distributed joins’ and database querying techniques to look up tuples in a layer implemented on top of a DHT. It currently uses CAN as the underlying DHT overlay. The method is designed to scale to massively distributed systems.

SPACE FILLING CURVES

The previous section detailed an approach of using space filling curves to map tuple attributes to the Chord nodes. Space filling curves[12] were first described by the 19th-century mathematician Peano; examples are Hilbert space filling curves and Z-Order curves. They are ‘curves whose ranges contain the entire 2-dimensional unit square’ or, in the case of many dimensions, the ‘n-dimensional hypercube’. Space filling curves have been presented as a method for mapping multi-dimensional data (i.e. n-dimensional tuples) into one dimension. This makes them very useful in a distributed context, as they provide a method of mapping tuples onto the one-dimensional identifier space of Chord nodes.

Z-Order curves

Further research indicated the use of Z-Order curves for indexing and querying multi-dimensional data in a database. This approach was pioneered by Tropf and Herzog (1981)[13]. Binary search trees were presented as an efficient method of looking up data, supporting both exact and range searches. More recently Chawathe, LaMarca et al (2005)[14] have proposed a method of using Z-Order curves to map 2-dimensional geographical information tuples onto a distributed hash table. Binary prefix tries (a type of binary tree where each subsequent node searched leads to a certain piece of data) are used to store and look up data; tuples are only stored in leaf nodes.

GRIDKIT / OPENCOM ARCHITECTURE

Gridkit

Coulson, Grace, Blair et al (2004)[16] describe a middleware called Gridkit that is being developed for use within grid computing. Grid middleware acts as an intermediary between the different components within a distributed system, allowing potential grid applications to make use of this functionality without having to understand the underlying complexity. Gridkit is a component based architecture that supports a number of middleware services such as interaction types, resource discovery, resource management and security. These middleware services are built on top of an ‘open overlays’ layer, which in turn abstracts over the underlying communications layer to provide various forms of support, e.g. peer to peer communication.
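Returning to the Z-Order curves described earlier on this slide, the multi-dimensional to one-dimensional mapping can be sketched by interleaving the bits of the coordinates. The Java below is a generic illustration of bit interleaving and is not taken from any of the cited systems; the class name `ZOrderSketch`, the restriction to two dimensions, and the assumption that each coordinate is a non-negative value of at most 16 bits are all choices made for this example.

```java
/** Illustrative 2-dimensional Z-order (bit interleaving) encoding; hypothetical class. */
class ZOrderSketch {
    // Assumption: each coordinate fits in 16 bits, so the key fits in 32 bits.
    static final int BITS = 16;

    /** Interleave the bits of x and y: x fills the even positions, y the odd ones. */
    static long interleave(int x, int y) {
        long z = 0L;
        for (int i = 0; i < BITS; i++) {
            z |= (long) ((x >> i) & 1) << (2 * i);
            z |= (long) ((y >> i) & 1) << (2 * i + 1);
        }
        return z;
    }

    /** Inverse mapping: recover {x, y} from a one-dimensional key. */
    static int[] deinterleave(long z) {
        int x = 0;
        int y = 0;
        for (int i = 0; i < BITS; i++) {
            x |= (int) ((z >> (2 * i)) & 1L) << i;
            y |= (int) ((z >> (2 * i + 1)) & 1L) << i;
        }
        return new int[] {x, y};
    }

    public static void main(String[] args) {
        // Walk a 2x2 grid row by row; the printed keys trace the 'Z' shape.
        for (int y = 0; y < 2; y++) {
            for (int x = 0; x < 2; x++) {
                System.out.println("(" + x + "," + y + ") -> " + interleave(x, y));
            }
        }
    }
}
```

The resulting one-dimensional key could then be used as the identifier that a DHT places on its ring. Points that share their high-order coordinate bits receive numerically close keys, which is the kind of locality that makes range searches over such a mapping attractive.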
  • 6. The main concerns for this project are interaction types and overlays. Overlay networks are ‘virtual’ networks layered over an underlying physical network. An example of an overlay network is the Chord distributed hash table described previously. Interaction types are built on top of the overlay networks to provide a particular service desirable to higher level applications. Publish-Subscribe and Tuple Spaces are examples of interaction types that could be used within the Gridkit framework.

Component architecture

Gridkit makes use of OpenCOM v2 (2004)[17] as its component object model. The main concepts that need to be understood are interfaces, receptacles and connections. Interfaces describe a unit of service provision, receptacles describe a unit of service requirement, and connections bind the receptacles of one component to the interfaces of another. This component model will be used within the development of this system, as it provides a method of integrating the tuple space system within Gridkit and the existing Chord/DHT components. OpenCOM, however, does not take into account the distribution of the system (i.e. different components situated on different nodes); the Chord layer therefore makes use of Java RMI to provide a method of invoking methods on components situated on different nodes.

Motivation for using component architecture

There are many advantages to using a Gridkit/component based architecture as described above. Chiefly, it creates the possibility of a configurable and reconfigurable solution, in line with the aims. The tuple space does not have to be dependent on a single overlay network (such as Chord); a different overlay component could be selected depending on the needs of the system. Similarly, different components could be created to represent a variety of tuple space algorithms, which could be adapted to the needs of the application level.
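The interface, receptacle and connection idea can be illustrated with plain Java. This sketch does not use the OpenCOM API and should not be read as a description of it: `KeyRouter`, `Receptacle`, `TupleSpaceComponent`, `RingRouter` and `SingleNodeRouter` are hypothetical names, and the two routers are deliberately trivial stand-ins for real overlay components.

```java
/** Illustrative interface/receptacle binding in plain Java; not the OpenCOM API. */
class ComponentSketch {
    /** Interface: a unit of service provision (here, choosing a node for a key). */
    interface KeyRouter {
        int nodeFor(int key);
    }

    /** Receptacle: a unit of service requirement that is connected at run time. */
    static final class Receptacle<T> {
        private T bound;

        void connect(T provider) {
            bound = provider;
        }

        T get() {
            if (bound == null) {
                throw new IllegalStateException("receptacle not connected");
            }
            return bound;
        }
    }

    /** A component that requires some KeyRouter but does not know which one. */
    static final class TupleSpaceComponent {
        final Receptacle<KeyRouter> router = new Receptacle<>();

        String place(int tupleKey) {
            return "tuple " + tupleKey + " -> node " + router.get().nodeFor(tupleKey);
        }
    }

    /** Stand-in overlay: spreads keys over a fixed number of nodes. */
    static final class RingRouter implements KeyRouter {
        private final int nodeCount;

        RingRouter(int nodeCount) {
            this.nodeCount = nodeCount;
        }

        @Override
        public int nodeFor(int key) {
            return Math.floorMod(key, nodeCount);
        }
    }

    /** Stand-in overlay: sends every key to one node. */
    static final class SingleNodeRouter implements KeyRouter {
        @Override
        public int nodeFor(int key) {
            return 0;
        }
    }

    public static void main(String[] args) {
        TupleSpaceComponent space = new TupleSpaceComponent();
        space.router.connect(new RingRouter(4));
        System.out.println(space.place(6));
        // Reconfigure: swap the overlay without changing the tuple space component.
        space.router.connect(new SingleNodeRouter());
        System.out.println(space.place(6));
    }
}
```

Because the tuple space component only depends on the receptacle, the overlay behind it can be replaced without touching the component itself, which is the configurability argument made in the text.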