General-Purpose, Internet-Scale Distributed Computing with Linked Process

There are many distributed computing protocols in existence today. Some serve as a solution for scientific computing, some as a middleware solution for large-scale systems engineering, and others as an “easy-to-use” service solution on the Web. What most of these protocols have in common is that they require a strong “handshake” between the machines utilizing each other’s resources. This coupling has left many distributed protocols useful only for a collection of machines owned and operated by a single organization (e.g. MPI/PVM computing) or for use by foreign machines with a very specific use case (e.g. RPC/Web Services computing). The former allows for general-purpose distributed computing and the latter allows for Internet-scale distributed computing. What if both types of functionality were to be merged? What does a general-purpose, Internet-scale distributed computing protocol look like? Linked Process [ http://linkedprocess.org ]

Presentation Transcript

  • General-Purpose, Internet-Scale Distributed Computing with Linked Process
    Marko A. Rodriguez, T-5/Center for Nonlinear Studies, Los Alamos National Laboratory
    http://markorodriguez.com
    September 10, 2009
  • Abstract
    There are many distributed computing protocols in existence today. Some serve as a solution for scientific computing, some as a middleware solution for large-scale systems engineering, and others as an “easy-to-use” service solution on the Web. What most of these protocols have in common is that they require a strong “handshake” between the machines utilizing each other’s resources. This coupling has left many distributed protocols useful only for a collection of machines owned and operated by a single organization (e.g. MPI/PVM computing) or for use by foreign machines with a very specific use case (e.g. RPC/Web Services computing). The former allows for general-purpose distributed computing and the latter allows for Internet-scale distributed computing. What if both types of functionality were to be merged? What does a general-purpose, Internet-scale distributed computing protocol look like? Linked Process [ http://linkedprocess.org ]
    Center for Nonlinear Studies – Los Alamos, New Mexico – September 10, 2009
  • A General-Purpose Requirement
      • General-purpose: it is required that the code executed is not necessarily defined by the executing device, but instead can be defined by the requesting device.
          • Language-agnostic: it is required that distributed code can, in principle, be written in any computer language.
          • Safe: it is required that the execution of code be confined by clearly specified permissions on the executing device.
          • Accessible: it is required that various types of computing resources be accessible when permissions allow.
    The notion of “general-purpose” is not defined according to a single dimension, as there are various general-purpose approaches which each attain certain types of generality. Please be generous in your interpretation of this term for the time being.
  • An Internet-Scale Requirement
      • Internet-scale: it is required that any device with an Internet connection (from a cell phone to a supercomputer) be able to contribute and leverage computing resources.
          • Decentralized: it is required that the computing resources are not centralized or controlled by any one party.
          • Discoverable: it is required that devices be discoverable by other devices needing to leverage their resources.
          • Transient: it is required that devices coming online and going offline are easily incorporated and removed.
    The extreme notion of “Internet-scale” goes beyond the 32-bit addresses of the IP protocol. There are more than 4,294,967,296 devices on the Internet. Thus, at the extreme, “Internet-scale” refers to all devices that can communicate and be communicated with through the Internet.
  • Outline
      • An Introduction to Other Distributed Computing Protocols
          • General-Purpose Distributed Computing with MPI
          • Internet-Scale Distributed Computing with Web Services
      • An Introduction to the Linked Process Protocol
          • Internet-Scale Distributed Computing with Linked Process
          • General-Purpose Distributed Computing with Linked Process
      • An Introduction to the Linked Process Protocol Implementation
      • Current and Future State of Linked Process
  • A Short Note on Other Protocol Discussions
      • The symbol “ ” means that this feature is good with respect to the two previous requirements.
      • The symbol “ ” means that this feature is bad with respect to the two previous requirements.
    There are always tradeoffs in computing. These are not “objective” valuations of the protocols discussed next. Valuations are in terms of the requirements set forth for the design of Linked Process. Linked Process won’t solve all problems: it is “yet another distributed computing protocol” with a collection of unique features that make it useful for problems with the aforementioned requirements.
  • General-Purpose: Distributed Computing with MPI
      • The Message Passing Interface (MPI) is a language-agnostic protocol for inter-process communication.
      • Processes (i.e. threads of execution) communicate by passing data (i.e. messages) between each other.
          • send(&x, p2): send the data pointed to by x to process p2.
          • recv(&y, p1): receive data from process p1 and store it at y.
    Over time, the two processes run side by side:
        process 1                process 2
        int x[100];              int y[100];
        ...                      ...
        send(&x, 2);             recv(&y, 1);
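The send/recv pairing above can be imitated with Python's standard library. This is only a sketch: two threads and a `queue.Queue` stand in for two MPI processes and their channel, and the names `process_1`, `process_2`, and `channel` are illustrative assumptions, not MPI calls.

```python
import queue
import threading

def process_1(channel):
    """Plays the role of process 1: fill x, then send it."""
    x = list(range(100))
    channel.put(x)            # analogous to send(&x, 2)

def process_2(channel, results):
    """Plays the role of process 2: block until data arrives, then use it."""
    y = channel.get()         # analogous to recv(&y, 1)
    results.append(sum(y))

channel = queue.Queue()       # stand-in for the message channel
results = []
receiver = threading.Thread(target=process_2, args=(channel, results))
sender = threading.Thread(target=process_1, args=(channel,))
receiver.start()
sender.start()
sender.join()
receiver.join()
print(results[0])
```

The receiver is started first on purpose: `get()` blocks until the sender has put its message on the channel, just as a blocking receive waits for its matching send.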
  • General-Purpose: Distributed Computing with MPI
        marko> more hosts.txt
        machine1
        machine2
        marko> mpirun -machinefile hosts.txt -np 3 myProgram
        spawning myProgram on machine1...
        spawning myProgram on machine2...
        spawning myProgram on machine1...
        Executing...
        Done. Thank you, compute again.
        marko>
    Within myProgram the code branches depending on the “rank” of its process (MPI numbers ranks from zero, so in the above example the rank is 0, 1, or 2). This way, each process does a task particular to its rank.
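The session above implies two ideas: processes are placed on the listed machines in turn, and each process picks its task from its rank. The sketch below simulates both in plain Python. The round-robin placement is an assumption read off the sample output, and `assign_hosts` and `task_for_rank` are hypothetical helpers, not part of any MPI implementation.

```python
from itertools import cycle, islice

def assign_hosts(hosts, num_processes):
    """Place processes on hosts in turn (assumed round-robin policy)."""
    return list(islice(cycle(hosts), num_processes))

def task_for_rank(rank, size):
    """Branch on rank: rank 0 coordinates, every other rank works."""
    if rank == 0:
        return "coordinate %d workers" % (size - 1)
    return "work on partition %d of %d" % (rank, size - 1)

hosts = ["machine1", "machine2"]
placement = assign_hosts(hosts, 3)
for rank, host in enumerate(placement):
    # Ranks are numbered from zero, as in MPI.
    print(rank, host, task_for_rank(rank, len(placement)))
```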
  • General-Purpose: Distributed Computing with MPI
      • MPI has been around since the early 1990s and is a thoroughly applied protocol with various language ports (however, MPI tends to be more “C/Fortran”-ish as its intended use is high-performance computing). [Language-agnostic]
      • MPI developers have access to all machine resources; the limiting factor is the operating system. [Accessible]
      • MPI implementations have large libraries of useful distributed computing patterns (e.g. scatter/gather, broadcast, reduce, etc.). What you can think of is what you can do. [General-purpose]
  • General-Purpose: Distributed Computing with MPI
      • MPI requires the MPI agent mpirun to have “ssh” access to the physical devices that processes will be spawned on (i.e. the operating system becomes the security manager). [Safe]
      • MPI processes have low-level access to the computing resources of the underlying machine and thus introduce a security risk when running foreign/unknown code. In short, you most likely own all your machines. [Decentralized]
      • MPI requires a set of machines and the compilation of all code before execution. [Transient]
  • An Artist’s Interpretation of MPI
    [figure: machines 127.0.0.1, 127.0.0.2, and 127.0.0.3 passing data messages such as [1,2,7,9], ['c','b'], [1], and [42,31] between each other]
  • Internet-Scale: Distributed Computing with Web Services
    “Web services are frequently just Internet Application Programming Interfaces (API) that can be accessed over a network, such as the Internet, and executed on a remote system hosting the requested services.” (from Wikipedia’s Web Services article)
      • A Web Service is like an API.
      • A Web Service is hosted by a remote device and can be accessed by anyone over the network.
  • Internet-Scale: Distributed Computing with Web Services
      • REST-based (REpresentational State Transfer) Web Services make use of simple HTTP-based APIs. REST “verbs” are GET, PUT, POST, and DELETE.
    resource:   http://chart.apis.google.com/chart
    parameters: cht=p3&chd=t:60,40&chs=250x100&chl=LANL|Sandia
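The split between resource and parameters in the chart URL can be reproduced with Python's `urllib.parse`. This sketch only assembles the URL string and makes no network request. The `chart_url` helper and its argument names are assumptions for illustration; the parameter keys (`cht`, `chd`, `chs`, `chl`) are the ones shown on the slide.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

RESOURCE = "http://chart.apis.google.com/chart"

def chart_url(chart_type, data, size, labels):
    """Join the resource with its query parameters."""
    params = {
        "cht": chart_type,                             # chart type
        "chd": "t:" + ",".join(str(v) for v in data),  # data series
        "chs": "%dx%d" % size,                         # width x height
        "chl": "|".join(labels),                       # slice labels
    }
    # Keep ':', ',' and '|' readable instead of percent-encoding them.
    return RESOURCE + "?" + urlencode(params, safe=":,|")

url = chart_url("p3", [60, 40], (250, 100), ["LANL", "Sandia"])
print(url)
print(parse_qs(urlsplit(url).query))
```

The request carries everything the service needs in the URL itself, which is why a plain GET is enough to use such a service.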
  • Internet-Scale: Distributed Computing with Web Services
      • RPC-based (Remote Procedure Call) Web Services perform a set of functions with a specification for sending process requests and receiving process results (e.g. Web Services Description Language, WSDL).
    Web Service (127.0.0.2):
        boolean aMethod(String x, int y);
        double bMethod(double z);
    Service Requestor (127.0.0.1):
        void cMethod() {
            Object[] params = {"marko", 29};
            Stub s = new Stub("aMethod", params);
            boolean b = s.execute();
        }
  • Internet-Scale: Distributed Computing with Web Services
      • Most modern languages have libraries to support the various Web Services models. They usually make use of “Web” protocols (e.g. HTTP) and encodings (e.g. XML, SOAP, JSON). [Language-agnostic]
      • Limited functionality and strict interfaces ensure that underlying devices cannot be compromised. [Safe]
      • Web Service models have discovery mechanisms (e.g. UDDI) to locate services that perform a particular function, take particular types of inputs, and produce particular types of outputs. [Discoverable]
      • Web Services are web addressable and intended for use by anyone (not just the developer of the service). [Internet-scale]
  • Internet-Scale: Distributed Computing with Web Services
      • Web Services are defined for particular use cases and thus the computing resources offered by a Web Service are defined by the developer of the Web Service. [General-purpose]
          • e.g. the Google Charts codebase is defined and can be used, but only for what it was created for (namely, to make graphical charts).
      • Web Services are tied to the Internet Protocol for device addressing, which reduces the types of devices that can offer services. [Internet-scale]
          • i.e. it’s difficult to run a typical HTTP-based Web Service off my cell phone without some intermediary gateway mechanism.
  • An Artist’s Interpretation of Web Services
    [figure: machine 127.0.0.1 sends an object to the service f(x) on 127.0.0.2 and to the service g(x) on 127.0.0.3, which compute f(object) and g(object) and return an object]
  • Core Features of Linked Process
      • Linked Process entities are not addressed by IP addresses. Their addressing scheme is location independent.
          • Implication: Any device with an Internet connection can support or leverage a Linked Process cloud. Linked Process clouds support Internet-scale distributed computing.
      • Linked Process allows users to execute any code on a remote device as long as the code does not violate set security permissions.
          • Implication: Code is migrated to remote devices for execution. Linked Process clouds support general-purpose distributed computing.
  • An Artist’s Interpretation of Linked Process
    [figure: machines 127.0.0.1, 127.0.0.2, and 127.0.0.3]
  • Internet-Scale: XMPP Communication Protocol
      • Linked Process rides atop the eXtensible Messaging and Presence Protocol (XMPP). This is what gives Linked Process its Internet-scale quality.
      • XMPP was developed as an open protocol for Instant Messaging (GTalk, iChat, Jabber, etc.). Everything from servers to cell phones can send and receive chat messages.
      • Interesting aspects of XMPP that make it useful for Internet-scale distributed computing:
          • XMPP creates a communication layer of abstraction above IP.
          • XMPP servers are XML packet routers between XMPP clients.
          • XMPP is an asynchronous message passing protocol.
  • Internet-Scale: XMPP creates a communication layer of abstraction above IP
      • XMPP clients are identified by Jabber IDs (JIDs).
          • An example XMPP client JID is marko@lanl.gov
          • XMPP clients are IP independent.
      • XMPP clients log into XMPP servers.
          • An example XMPP server JID is lanl.gov
          • XMPP servers are IP dependent.
      • XMPP clients maintain the same JID irrespective of their physical location (i.e. IP address).
    Think of how your IM chat client operates. marko@lanl.gov is my JID irrespective of whether it is logged into the XMPP server from a Los Alamos IP, a New York IP, or a Swedish IP.
  • Internet-Scale: XMPP servers are XML packet routers between XMPP clients
    [figure: client marko@lanl.gov/1234 (127.0.0.1), server lanl.gov (127.0.0.2), server rpi.edu (127.0.0.3), client josh@rpi.edu/5678 (127.0.0.4)]
        t=1, client to lanl.gov server:      <packet to="josh@rpi.edu" from="marko@lanl.gov" />
        t=2, lanl.gov server to rpi.edu:     <packet to="josh@rpi.edu" from="marko@lanl.gov" />
        t=3, rpi.edu server to client:       <packet to="josh@rpi.edu" from="marko@lanl.gov" />
    marko@lanl.gov/1234 and josh@rpi.edu/5678 are fully-qualified client JIDs. Many clients (i.e. applications) can exist off the same bare JID (e.g. marko@lanl.gov). Also, addresses can be fully-qualified to route the packet to a particular client.
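The relationship between bare JIDs, fully-qualified JIDs, and server-to-server routing can be sketched in a few lines of Python. This is a simplified illustration only: `JID`, `parse_jid`, and `route` are hypothetical names, and real JID handling involves normalization and validation rules that are skipped here.

```python
from typing import NamedTuple, Optional

class JID(NamedTuple):
    node: Optional[str]      # e.g. "marko"
    domain: str              # e.g. "lanl.gov" (the server)
    resource: Optional[str]  # e.g. "1234" (one particular client)

    @property
    def bare(self):
        """The JID without its resource part."""
        return f"{self.node}@{self.domain}" if self.node else self.domain

def parse_jid(text):
    """Split node@domain/resource; the resource itself may contain '/'."""
    rest, _, resource = text.partition("/")
    node, at, domain = rest.partition("@")
    if not at:               # a server JID such as "lanl.gov"
        node, domain = None, rest
    return JID(node, domain, resource or None)

def route(sender, recipient):
    """List the stops of a packet: client, server(s), client."""
    s, r = parse_jid(sender), parse_jid(recipient)
    stops = [sender, s.domain]
    if r.domain != s.domain:
        stops.append(r.domain)
    stops.append(recipient)
    return stops

print(parse_jid("marko@lanl.gov/1234").bare)
print(route("marko@lanl.gov/1234", "josh@rpi.edu/5678"))
```

Nothing in the route depends on an IP address: the domain part of the JID names the server, and the server knows which of its connected clients owns the resource.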
  • Internet-Scale: XMPP is an asynchronous message passing protocol
    [figure: client marko@lanl.gov/1234 (127.0.0.1), server lanl.gov (127.0.0.2), server rpi.edu (127.0.0.3), and client josh@rpi.edu/5678 (127.0.0.4), connected to one another by open <stream> ... elements]
  • Internet-Scale: XMPP is an asynchronous message passing protocol
      • My outgoing stream from my marko@lanl.gov XMPP client to the lanl.gov XMPP server. Note that any XML can be sent between clients. This is what makes XMPP “extensible.”
        <stream>
          <!-- here is a packet -->
          <message from="marko@lanl.gov" to="josh@rpi.edu">
            <body>It is a near must that you read my blog.</body>
          </message>
          <!-- here is a packet -->
          <iq from="marko@lanl.gov" to="farm@rpi.edu">
            <spawn_vm vm_species="groovy" vm_id="ABCD" />
          </iq>
          <!-- here is a packet -->
          <message from="marko@lanl.gov" to="vadas@lanl.gov">
            <body>What is up with that Mike guy?</body>
          </message>
        </stream>
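Packets like the ones in this stream are ordinary XML, so they can be assembled with any XML library. The sketch below uses Python's `xml.etree.ElementTree`; the helper names `message` and `iq` are assumptions. A real XMPP stream stays open while packets are written into it one at a time, so building the whole `<stream>` element at once is a simplification for illustration.

```python
import xml.etree.ElementTree as ET

def message(sender, recipient, text):
    """Build a <message/> packet carrying a chat body."""
    stanza = ET.Element("message", {"from": sender, "to": recipient})
    ET.SubElement(stanza, "body").text = text
    return stanza

def iq(sender, recipient, payload):
    """Build an <iq/> packet that wraps an arbitrary XML payload."""
    stanza = ET.Element("iq", {"from": sender, "to": recipient})
    stanza.append(payload)
    return stanza

stream = ET.Element("stream")
stream.append(message("marko@lanl.gov", "josh@rpi.edu",
                      "It is a near must that you read my blog."))
stream.append(iq("marko@lanl.gov", "farm@rpi.edu",
                 ET.Element("spawn_vm", {"vm_species": "groovy", "vm_id": "ABCD"})))
xml_text = ET.tostring(stream, encoding="unicode")
print(xml_text)
```

The `<iq/>` wrapper does not care what its child element is, which is the extensibility that Linked Process relies on.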
  • General-Purpose: Language-Agnostic Code Migration
      • Linked Process supports the migration of code (i.e. software, computing instructions, etc.) between devices.
      • Migrated code is intended to make use of the computing resources of the device (e.g. clock cycles, software APIs, hardware components, data sets, etc.).
      • Migrated code can be in any computer language as long as the executing device maintains an appropriate virtual machine to execute code in that language.
      • Devices in Linked Process serve as “computing sandboxes” that can be leveraged for the execution of any code as long as the code does not violate set security permissions.
  • General-Purpose: Code Permissions
        permission          description                                           type
        job timeout         milliseconds for which a job may execute              text-single
        vm time to live     milliseconds for which a virtual machine may exist    text-single
        shutdown farm       exit the farm process                                 boolean
        execute program     execute a program                                     boolean
        read file           read from a file                                      list-multi
        write file          write to a file                                       list-multi
        delete file         delete from a file                                    list-multi
        open connection     open a socket connection                              boolean
        listen connection   wait for a connection request                         boolean
        access print job    initiate a print job request                          boolean
        ...                 ...                                                   ...
    This set of permissions is not exhaustive. The Linked Process specification has a collection of REQUIRED and RECOMMENDED permissions. Moreover, deployers may wish to extend the collection to support environment-specific conditions (e.g. database access). Finally, these permissions are made available through disco#info service discovery (XEP-0030).
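One way to read the three value types in the table is: a text-single value is a limit, a boolean is a yes/no grant, and a list-multi value names the specific targets that are granted. The sketch below checks requests against such a table under that reading. The permission values and file path are made up for illustration, and `is_allowed` is a hypothetical helper, not part of the Linked Process specification or of LoPSideD.

```python
# Hypothetical permission settings for one farm (values are invented).
PERMISSIONS = {
    "job timeout": 5000,              # text-single: milliseconds
    "vm time to live": 600000,        # text-single: milliseconds
    "execute program": False,         # boolean
    "open connection": False,         # boolean
    "read file": ["/tmp/data.txt"],   # list-multi: readable paths
    "write file": [],                 # list-multi: nothing writable
}

def is_allowed(permissions, name, target=None):
    """Return True only when the named action is explicitly granted."""
    value = permissions.get(name)
    if isinstance(value, bool):       # boolean grant
        return value
    if isinstance(value, list):       # grant limited to listed targets
        return target in value
    return False                      # unknown names and numeric limits

print(is_allowed(PERMISSIONS, "read file", "/tmp/data.txt"))
print(is_allowed(PERMISSIONS, "write file", "/tmp/data.txt"))
print(is_allowed(PERMISSIONS, "execute program"))
```

Denying by default keeps an unlisted or misspelled permission from silently granting access.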
  • General-Purpose: Linked Process Hierarchy
    [figure: hierarchy of the entities Registry, Cloud, Countryside, Farm, Virtual Machine, Job, and Villein]
    Linked Process entities contain/maintain/manage/etc. other Linked Process entities.
  • General-Purpose: Linked Process Entities
      • Cloud: A top-level construct which groups all farms, registries, and virtual machines to which a villein has access.
      • Countryside: Many entities can exist on a single countryside (a bare JID).
      • Farm: A farm is the gateway to the device’s resources and exists on a countryside. In general, there is one farm for each device. [SUPPORTS A CLOUD]
      • Virtual Machine: A virtual machine is spawned from a farm and is the primary engine of computation in a cloud. [SUPPORTS A CLOUD]
      • Villein: A villein is an application that leverages a cloud for computational resources (e.g. clock cycles, software, data sets, etc.). [LEVERAGES A CLOUD]
      • Registry: A registry is responsible for maintaining a roster of countrysides and publishing only those countrysides that have active farms on them (based on <presence/>).
  • General-Purpose: Linked Process Packets
      • <spawn_vm/>: create a virtual machine of a particular species (i.e. language such as JavaScript, Ruby, Python, Groovy, etc.).
      • <submit_job/>: execute the provided instructions/expressions.
      • <ping_job/>: determine the status of a previously submitted job.
      • <abort_job/>: cancel a previously submitted job.
      • <manage_bindings/>: set or get virtual machine variables.
      • <terminate_vm/>: destroy the virtual machine process.
    NOTE: This presentation does not discuss interactions with registries, just farms and virtual machines.
  • General-Purpose: A Villein and Farm
    [screenshot: villein marko@lanl.gov/1234 and farm test_countryside@lanl.linkedprocess.org/LoPFarm/60KES71Y]
    This is a screenshot from the LoPSideD GUI package. In practice, villeins and farms usually don’t have this GUI front-end. This package was developed to make it easier for developers to debug their Linked Process code. However, for farm providers, it’s a way to see villeins communicating with their farm and to inspect the flow of packets and the state of existing virtual machines.
  • General-Purpose: A Villein and Farm
    [screenshot: villein marko@lanl.gov/1234 and farm test_countryside@lanl.linkedprocess.org/LoPFarm/60KES71Y]
    This villein and farm are on different physical devices. The villein is made aware of the farm because the villein is subscribed to the farm’s countryside. Thus, all <presence/> packets coming from the countryside are delivered to the villein. The subscriptions of the villein are available in its roster.
  • General-Purpose: Basic Communication Sequence
    [sequence diagram: villein marko@lanl.gov/1234 (127.0.0.1) exchanging packets with farm test_countryside@...60KES71Y and virtual machine f472fb16... (127.0.0.2)]
        get    <spawn_vm/>
        result <spawn_vm/>
        get    <submit_job/>
        result <submit_job/>
        get    <terminate_vm/>
        result <terminate_vm/>
  • General-Purpose: Spawning a Virtual Machine
      • GET sent from a villein to farm...
        <iq from="marko@lanl.gov/1234" to="test_countryside@...60KES71Y" type="get" id="xxxx">
          <spawn_vm xmlns="http://linkedprocess.org/2009/06/Farm#" vm_species="javascript" />
        </iq>
      • RESULT sent from the farm to the villein...
        <iq from="test_countryside@...60KES71Y" to="marko@lanl.gov/1234" type="result" id="xxxx">
          <spawn_vm xmlns="http://linkedprocess.org/2009/06/Farm#" vm_id="f472fb16..." vm_species="javascript" />
        </iq>
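The GET/RESULT pair above follows a simple rule: the result swaps the from and to addresses, repeats the packet id, and adds the identifier of the new virtual machine. The sketch below builds both packets with `xml.etree.ElementTree`. The helper names, the packet id `"xxxx"`, and the `vm_id` value `"vm-0001"` are placeholders chosen for illustration; only the namespace and attribute names come from the slide.

```python
import xml.etree.ElementTree as ET

FARM_NS = "http://linkedprocess.org/2009/06/Farm#"
SPAWN_VM = "{%s}spawn_vm" % FARM_NS   # namespace-qualified tag name

def spawn_request(villein, farm, species, packet_id):
    """Build the GET a villein sends to a farm."""
    iq = ET.Element("iq", {"from": villein, "to": farm,
                           "type": "get", "id": packet_id})
    ET.SubElement(iq, SPAWN_VM, {"vm_species": species})
    return iq

def spawn_result(request, vm_id):
    """Build the RESULT a farm sends back for a given GET."""
    species = request.find(SPAWN_VM).get("vm_species")
    iq = ET.Element("iq", {"from": request.get("to"), "to": request.get("from"),
                           "type": "result", "id": request.get("id")})
    ET.SubElement(iq, SPAWN_VM, {"vm_id": vm_id, "vm_species": species})
    return iq

request = spawn_request("marko@lanl.gov/1234",
                        "test_countryside@lanl.linkedprocess.org/LoPFarm/60KES71Y",
                        "javascript", "xxxx")
result = spawn_result(request, "vm-0001")   # hypothetical vm_id
print(ET.tostring(request, encoding="unicode"))
print(ET.tostring(result, encoding="unicode"))
```

Repeating the id is what lets the villein match a result to its request, since packets arrive asynchronously.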
  • General-Purpose: Spawning a Virtual Machine
    [screenshot: villein marko@lanl.gov/1234 and farm test_countryside@lanl.linkedprocess.org/LoPFarm/60KES71Y]
    The use of disco#info (XEP-0030) allows a villein to discover what features and other information a farm supports. This is how the villein knows that the farm allows for the spawning of Python, JavaScript, Groovy, and Ruby virtual machines.
  • General-Purpose: Spawning a Virtual Machine
    [screenshot: villein marko@lanl.gov/1234 and farm test_countryside@lanl.linkedprocess.org/LoPFarm/60KES71Y]
    The spawned virtual machine has an identifier that is unique to its parent farm. The virtual machine maintains a state that is altered through job submissions and binding updates. The virtual machine’s state is destroyed when the virtual machine is terminated.
  • General-Purpose: On the Nature of Virtual Machines
      • Virtual machines are controlled by a farm: the farm serves as the “operating system” that controls the resource consumption and permissions of a virtual machine.
      • Virtual machines maintain their state throughout their lifetime. In other words, in general, the order in which jobs are executed matters.
      • Virtual machines are specific to a particular computer language and can be naturally thought of as an “XMPP-wrapped” runtime terminal (e.g. groovy> 1 + 2;).
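The point that state persists between jobs, so job order matters, can be shown with a toy stand-in for a virtual machine. The class below is an illustration only: it keeps one dictionary of variables alive across `exec` calls, which is not a sandbox and is not how LoPSideD is implemented. All names are hypothetical.

```python
class ToyVirtualMachine:
    """Keeps variable bindings alive between submitted jobs."""

    def __init__(self):
        self.bindings = {}

    def submit_job(self, code):
        # Each job runs against the same bindings, like lines typed
        # one after another into an interactive terminal.
        exec(code, self.bindings)

    def get_binding(self, name):
        return self.bindings[name]

    def terminate(self):
        # Terminating the machine destroys its state.
        self.bindings = {}

vm = ToyVirtualMachine()
vm.submit_job("temp = 0")
vm.submit_job("temp = temp + 1")
print(vm.get_binding("temp"))

fresh = ToyVirtualMachine()
try:
    fresh.submit_job("temp = temp + 1")   # same job, submitted too early
except NameError as error:
    print("job failed:", error)
```

The second job succeeds only because the first one ran before it on the same machine; on a fresh machine it fails.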
  • General-Purpose: Submitting a Job
      • GET sent from a villein to virtual machine (indirectly through the farm)...
        <iq from="marko@lanl.gov/1234" to="test_countryside@...60KES71Y" type="get" id="xxxx">
          <submit_job xmlns="http://linkedprocess.org/2009/06/Farm#" vm_id="f472fb16...">
            var temp=0;
            for(i=0; i&lt;10; i++) { temp = temp + 1; }
            temp;
          </submit_job>
        </iq>
  • General-Purpose: Submitting a Job
      • RESULT sent from the virtual machine to the villein (indirectly through the farm)...
        <iq from="test_countryside@...60KES71Y" to="marko@lanl.gov/1234" type="result" id="xxxx">
          <submit_job xmlns="http://linkedprocess.org/2009/06/Farm#" vm_id="f472fb16...">
            10.0
          </submit_job>
        </iq>
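Because the job travels as XML text, characters such as `<` in the code must be escaped on the wire (the `&lt;` in the GET packet) and restored on arrival. An XML library handles both directions, as the sketch below shows with `xml.etree.ElementTree`. The helper name `submit_job` and the `vm_id` value `"vm-0001"` are placeholders for illustration.

```python
import xml.etree.ElementTree as ET

FARM_NS = "http://linkedprocess.org/2009/06/Farm#"
SUBMIT_JOB = "{%s}submit_job" % FARM_NS

# The job source as the developer writes it, with a raw '<'.
JOB = "var temp=0; for(i=0; i<10; i++) { temp = temp + 1; } temp;"

def submit_job(villein, farm, vm_id, code, packet_id):
    """Wrap job source code in a submit_job packet."""
    iq = ET.Element("iq", {"from": villein, "to": farm,
                           "type": "get", "id": packet_id})
    job = ET.SubElement(iq, SUBMIT_JOB, {"vm_id": vm_id})
    job.text = code
    return iq

packet = submit_job("marko@lanl.gov/1234",
                    "test_countryside@lanl.linkedprocess.org/LoPFarm/60KES71Y",
                    "vm-0001", JOB, "xxxx")   # hypothetical vm_id
wire_text = ET.tostring(packet, encoding="unicode")
received = ET.fromstring(wire_text).find(SUBMIT_JOB).text

print("i&lt;10" in wire_text)   # escaped while in transit
print(received == JOB)          # restored for the virtual machine
```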
  • General-Purpose: Submitting a Job
    [screenshots: villein marko@lanl.gov/1234 and farm test_countryside@lanl.linkedprocess.org/LoPFarm/60KES71Y]
  • General-Purpose: On the Nature of Jobs
      • Jobs are executed synchronously: they are processed one at a time from a FIFO (first in, first out) queue.
      • A job is a “chunk” of code in the language of the virtual machine. Jobs can be as simple as setting a variable (e.g. i = 1 + 2;) or as complex as a class definition or full program.
      • Jobs can make use of the software packages (APIs) existing on the device. For example, Groovy code can import Java classes made available by the farm and instantiate them.
      • If the expressions/code in a job violates the permissions of the virtual machine, that job is rejected with a permission denied error.
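The FIFO processing and the permission denied outcome can be sketched with a small queue. This is a deliberate simplification: here each job declares up front which permissions it needs, whereas a real farm would detect a violation while the code runs. `ToyJobQueue` and every name in it are hypothetical, and `exec` on a shared dictionary is not a sandbox.

```python
from collections import deque

class ToyJobQueue:
    """Runs jobs one at a time, in submission order."""

    def __init__(self, granted):
        self.granted = set(granted)   # permissions this machine holds
        self.pending = deque()
        self.bindings = {}            # state shared by all jobs

    def submit(self, job_id, code, needs=()):
        self.pending.append((job_id, code, set(needs)))

    def run_all(self):
        outcomes = []
        while self.pending:
            job_id, code, needs = self.pending.popleft()   # first in, first out
            if not needs <= self.granted:
                outcomes.append((job_id, "permission denied"))
                continue              # a rejected job never executes
            exec(code, self.bindings)
            outcomes.append((job_id, "ok"))
        return outcomes

jobs = ToyJobQueue(granted={"read file"})
jobs.submit("a", "i = 1 + 2")
jobs.submit("b", "i = i * 2", needs={"open connection"})
jobs.submit("c", "i = i + 1")
outcomes = jobs.run_all()
print(outcomes)
print(jobs.bindings["i"])
```

Job "b" is rejected without touching the state, so job "c" sees the value left by job "a".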
  • General-Purpose: Terminating a Virtual Machine
      • GET sent from a villein to virtual machine (indirectly through the farm)...
        <iq from="marko@lanl.gov/1234" to="test_countryside@...60KES71Y" type="get" id="xxxx">
          <terminate_vm xmlns="http://linkedprocess.org/2009/06/Farm#" vm_id="f472fb16..." />
        </iq>
      • RESULT sent from the virtual machine to the villein (indirectly through the farm)...
        <iq from="test_countryside@...60KES71Y" to="marko@lanl.gov/1234" type="result" id="xxxx">
          <terminate_vm xmlns="http://linkedprocess.org/2009/06/Farm#" vm_id="f472fb16..." />
        </iq>
  • General-Purpose: Terminating a Virtual Machine
    [screenshots: villein marko@lanl.gov/1234 and farm test_countryside@lanl.linkedprocess.org/LoPFarm/60KES71Y]
    A terminated virtual machine releases all of its resources.
  • LoPSideD: A Java Implementation of the Linked Process Protocol
      • LoPSideD Farm: A farm that currently supports JavaScript, Ruby, Python, and Groovy virtual machines.
      • LoPSideD Registry: A registry for advertising and locating farms.
      • LoPSideD Villein API (Application Programming Interface): Classes to build villeins that leverage a Linked Process cloud.
      • LoPSideD Farm/Villein GUI (Graphical User Interface): A user interface for managing a farm and for communicating with a farm, and generally useful for debugging (e.g. XMPP packet sniffing mechanisms).
  • LoPSideD Villein API
      • Commands: Provides support to spawn/terminate virtual machines, submit/ping/abort jobs, and manage bindings.
      • Proxies: Provides a collection of proxy data structures that make the underlying XMPP protocol relatively invisible to the developer.
      • Patterns: Provides support for various distributed computing patterns such as asynchronous, synchronous, scatter/gather, etc.
      • Demos: Provides a collection of simple demos such as distributed prime finding and distributed Web of Data analysis.
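The scatter/gather pattern and the prime-finding demo mentioned above can be illustrated without any Linked Process machinery. In the sketch below, local worker threads stand in for remote virtual machines; this is not the LoPSideD Villein API (which is Java), and every function name is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def is_prime(n):
    """Trial division; enough for a small demonstration."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def primes_in(chunk):
    """The job each worker runs on its share of the numbers."""
    return [n for n in chunk if is_prime(n)]

def scatter(values, parts):
    """Deal the values out into `parts` interleaved chunks."""
    return [values[i::parts] for i in range(parts)]

def scatter_gather(values, workers):
    chunks = scatter(values, workers)                    # scatter
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(primes_in, chunks))     # one job per worker
    return sorted(n for part in partials for n in part)  # gather

print(scatter_gather(list(range(1, 31)), 3))
```

In a Linked Process setting, each chunk would instead be submitted as a job to a virtual machine on a different farm, with results gathered as the result packets arrive.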
  • The Current State of Linked Process
      • The Linked Process protocol specification is nearly complete for submission to the standards track of the XMPP Standards Foundation. This means that the protocol that has been presented is still in a relatively volatile state and various mechanics of the protocol may change through this standards process.
      • The LoPSideD implementation is nearly ready for a first version release.
      • An experiment demonstrating the use of Linked Process for distributed computing on the Web of Data is currently being conducted.
  • The Future State of Linked Process
      • Move the Linked Process specification through the standards process from experimental, to draft, and ultimately to standard status.
      • Develop implementations of the Linked Process Villein API in other languages (currently there are plans for a Ruby and Python port).
      • Add more virtual machine species to the LoPSideD Farm implementation: Scheme/Lisp, Tcl, PHP, Smalltalk, etc.
      • Work with projects that are in need of the distributed computing solution offered by Linked Process.
      • Work with more developers to expand the implementation base.
  • Acknowledgements
      • Joshua Shinavier (Rensselaer Polytechnic Institute): codesigner of the protocol and codeveloper of LoPSideD.
      • Peter Neubauer (Neo Technology): evangelist and tester of the LoPSideD codebase.
      • Mick Thompson (Santa Fe Complex): provided the machines for the deployment of the first Linked Process cloud.
      • Jack Moffitt and Peter Saint-Andre (XMPP Standards Foundation): for support through the standards process.
    Please visit Linked Process at http://linkedprocess.org.