CN/UML at IPDPS JavaPDC 2007

Computational Neighborhood/UML Presentation at International Parallel and Distributed Processing Symposium, Java PDC Workshop.


  1. A Model-Driven Approach to Job/Task Composition in Cluster Computing
     George K. Thiruvathukal (presenter), Neeraj Mehta, Yogesh Kanitkar, and Konstantin Läufer
     Loyola University Chicago, Computer Science / Emerging Technologies Laboratory
     http://www.etl.luc.edu
     http://code.google.com/p/neighborhood
  2. An “Outsider’s” View of Grid/Cluster/HPC Research
     "My personal reservation is that most of the Grid [and cluster] researchers are operating at the level of threads, sockets, and files. I believe the Grid will be a federation of applications and data servers – the DataGrid in the taxonomy above rather than a number cruncher ComputerGrid. This data-centric view means that files (and FTP) are the wrong metaphor. Classes, methods, and objects (encapsulated data) are the right metaphor."
     Jim Gray, Microsoft and Grid Computing, 2002
     http://research.microsoft.com/~Gray/papers/Microsoft_and_Grid_Computing.doc
  3. Outline
     • About Model-Driven Development
     • CN in a Nutshell
     • UML Overview
     • UML Activity Diagrams for Task Composition
     • CN/XML Job Descriptors
     • XML Metadata Interchange (XMI)
     • Model Transformation (XMI to CNX)
     • Tagged Values
     • Conclusions
     • Future Plans
  4. Model-Driven Development
     • Approach: Express computational tasks in the form of a high-level domain model and generate executable code from the model automatically.
     • Goals/motivation:
       • Make it much easier to adapt the high-level model to changing requirements.
       • Hide the low-level details of mapping tasks to computational resources and of inter-task communication.
       • Stay independent of any specific cluster configuration.
  5. CN in a Nutshell
     • CN Server: CN Servers run on the various nodes of the cluster.
     • CN API: Client programs use the CN API to execute on and exploit the various resources of the cluster.
     • CN Intelligent Object Editor: The user can specify the details required to generate the client program using this graphical user interface.
     • CNX: CNX (XML) is a compositional language that captures the details of the client program.
     • CNX2Java: CNX2Java is an XSLT document that translates CNX to compilable Java code.
     • XMI2CNX: XMI2CNX is an XSLT document that translates a UML model in XMI format to CNX.
  6. Why UML?
     • Standard modeling notation
     • Wide tool support (e.g. Eclipse, NetBeans, Poseidon, Rational, and others)
     • Tool interoperability through XMI (XML Metadata Interchange)
     • XMI itself is transformable to other formats via XSLT or ad hoc transformers.
  7. UML Diagrams
     • UML supports many dynamic diagrams (beyond sequence diagrams, the most common case).
     • An activity diagram is a visual representation of an activity graph.
     • An activity graph is a state machine whose states represent actions or subactivities and where transitions out of states are triggered by the completion of the corresponding actions or subactivities.
     • Activity diagrams are thus intended for modeling computations whose control flow is driven by internal processing.
  8. Activity Diagrams and Parallel Computing
     • In parallel computing, a computational job typically consists of one or more concurrent tasks whose dependencies form a directed acyclic graph.
     • In CN, a client is composed from one or more such tasks. While the details of jobs and tasks are implemented modularly in the target programming language, the jobs and tasks need to be “glued” together outside of the implementation itself.
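The directed acyclic graph on this slide can be made concrete with a short sketch. It uses Python's standard `graphlib` module rather than anything from CN, and the task names are borrowed from the descriptor shown later in the deck. The `execution_waves` helper is hypothetical: it only illustrates how a scheduler might group tasks that are free to run concurrently.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on (its predecessors).
# One split task fans out to three worker tasks, as in the guiding example.
dependencies = {
    "tctask1": set(),
    "tctask2": {"tctask1"},
    "tctask3": {"tctask1"},
    "tctask4": {"tctask1"},
}


def execution_waves(deps):
    """Group tasks into waves; tasks within one wave do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose predecessors are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(dependencies)
print(waves)
```

Each inner list is a set of tasks that could be dispatched to cluster nodes at the same time, and a cyclic dependency is rejected before anything runs.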
  9. CN/XML Job Descriptors (Nutshell)
     • CN/XML (CNX) is a job/task composition language inspired by CSP.
     • Task attributes:
       • name: user-friendly name
       • jar/archive: where to find the code (most CN jobs are Java, so you can pack the job into a .jar file)
       • class: entry point
       • dependencies: predecessors
     • Task requirements:
       • memory: memory needed to run the task
       • run model: e.g. create a new VM or run threads within the current VM
     • Task parameters: properties (key/value pairs)
     • Tasks may be composed recursively using parallel or sequential CSP-like constructs. Task dependencies (and IDs) are generated automatically.
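The slide does not show how dependencies and IDs are derived from the parallel and sequential constructs, so the following is only a sketch of one plausible scheme. The `Task`, `Seq`, `Par`, and `assign` names are hypothetical and not part of CNX, and the final join task (`example.Join`, `join.jar`) is invented to show a fan-in.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Task:
    cls: str  # entry-point class
    jar: str  # archive that holds the code


@dataclass
class Seq:
    parts: list  # parts run one after another


@dataclass
class Par:
    parts: list  # parts run concurrently


def assign(node, preds, out, ids):
    """Record every task under `node` in `out`; return the task names
    that whatever comes next must wait for."""
    if isinstance(node, Task):
        name = f"tctask{next(ids)}"  # IDs are generated, not written by hand
        out[name] = {"class": node.cls, "jar": node.jar, "depends": sorted(preds)}
        return {name}
    if isinstance(node, Seq):
        for part in node.parts:
            preds = assign(part, preds, out, ids)  # each part waits for the previous one
        return set(preds)
    if isinstance(node, Par):
        exits = set()
        for part in node.parts:
            exits |= assign(part, preds, out, ids)  # all branches share the same predecessors
        return exits or set(preds)
    raise TypeError(f"unknown node: {node!r}")


def worker():
    return Task("org.jhpc.cn2.trnsclsrtask.TCTask", "tctask.jar")


job = Seq([
    Task("org.jhpc.cn2.transcloser.TaskSplit", "tasksplit.jar"),
    Par([worker(), worker(), worker()]),
    Task("example.Join", "join.jar"),  # hypothetical join task, not in the slides
])

tasks = {}
assign(job, set(), tasks, count(1))
for name, spec in tasks.items():
    print(name, "depends on", spec["depends"])
```

The author of the job only nests constructs; the `depends` lists fall out of the nesting, which is the property the last bullet describes.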
  10. XML Metadata Interchange
      • XMI is an XML-based format for persisting a UML model for possible exchange among different UML tools.
      • CNX is a rich descriptor for CN Jobs and Tasks (transformable to XMI).
      • UML's XML Metadata Interchange is a bare-bones descriptor framework. Annotations can be made using tagged values.
        • <UML:ActionState>: Each CN task is represented as an ActionState.
        • <UML:StateVertex.outgoing>: The tasks to be notified when a particular task has completed.
        • <UML:StateVertex.incoming>: The tasks that must be completed before this task can run.
      • Jobs and tasks are nameless in CN. In XMI, tasks are translated into nodes, which have document-wide unique identifiers.
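A minimal sketch of how the incoming and outgoing transition references could be turned into task dependencies. It uses Python's `xml.etree.ElementTree` in place of the XSLT the authors use. The namespace URI, the first ActionState (`a70`, `TCTask1`), and the helper names are assumptions made for illustration; the second ActionState reuses the IDs from the XMI fragment in the deck.

```python
import xml.etree.ElementTree as ET

UML_NS = "org.omg.xmi.namespace.UML"  # assumed namespace URI for the UML prefix

XMI = """\
<XMI xmlns:UML="org.omg.xmi.namespace.UML">
  <UML:ActionState xmi.id="a70" name="TCTask1">
    <UML:StateVertex.outgoing>
      <UML:Transition xmi.idref="a78"/>
    </UML:StateVertex.outgoing>
  </UML:ActionState>
  <UML:ActionState xmi.id="a89" name="TCTask2">
    <UML:StateVertex.outgoing>
      <UML:Transition xmi.idref="a95"/>
      <UML:Transition xmi.idref="a96"/>
    </UML:StateVertex.outgoing>
    <UML:StateVertex.incoming>
      <UML:Transition xmi.idref="a78"/>
    </UML:StateVertex.incoming>
  </UML:ActionState>
</XMI>
"""


def q(tag):
    """Qualify a UML tag name in the form ElementTree expects."""
    return f"{{{UML_NS}}}{tag}"


def transition_ids(state, direction):
    """Return the transition IDs listed under StateVertex.incoming or .outgoing."""
    holder = state.find(q(f"StateVertex.{direction}"))
    if holder is None:
        return set()
    return {t.get("xmi.idref") for t in holder.findall(q("Transition"))}


def dependencies_from_xmi(xmi_text):
    """Map each task name to the tasks whose outgoing transition it receives."""
    root = ET.fromstring(xmi_text)
    states = list(root.iter(q("ActionState")))
    source_of = {}
    for state in states:
        for tid in transition_ids(state, "outgoing"):
            source_of[tid] = state.get("name")
    return {
        state.get("name"): sorted(
            source_of[tid]
            for tid in transition_ids(state, "incoming")
            if tid in source_of
        )
        for state in states
    }


deps = dependencies_from_xmi(XMI)
print(deps)
```

This matches only direct edges between two ActionStates. A diagram that routes transitions through fork or join nodes would need those nodes resolved first.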
  11. Tagged Values
      • Tagged values allow metadata (model or domain-specific information) to be included in the UML activity diagram.
      • Example of CN-specific metadata included as UML “tagged values”:
        • jar: tctask.jar
        • class: org.jhpc.cn2.trnsclsrtask.TCTask
        • memory: 1000
        • runmodel: RUN_AS_THREAD_IN_TM
        • ptype0: java.lang.Integer
        • pvalue0: 2
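To show where these tagged values end up, here is a sketch that builds a CNX `<task>` element from them. It is written in Python for illustration and is not the authors' XMI2CNX stylesheet. The element and attribute names follow the generated descriptor in the deck. The `task_element` helper, the comma separator for multiple dependencies, and the shortening of `java.lang.Integer` to `Integer` are assumptions.

```python
import xml.etree.ElementTree as ET

# Tagged values exactly as listed on the slide.
tagged = {
    "jar": "tctask.jar",
    "class": "org.jhpc.cn2.trnsclsrtask.TCTask",
    "memory": "1000",
    "runmodel": "RUN_AS_THREAD_IN_TM",
    "ptype0": "java.lang.Integer",
    "pvalue0": "2",
}


def task_element(name, depends, tags):
    """Build a CNX-style <task> element from one activity's tagged values."""
    task = ET.Element("task", {
        "name": name,
        "jar": tags["jar"],
        "class": tags["class"],
        "depends": ",".join(depends),  # separator is an assumption
    })
    req = ET.SubElement(task, "task-req")
    ET.SubElement(req, "memory").text = tags["memory"]
    ET.SubElement(req, "runmodel").text = tags["runmodel"]
    index = 0
    while f"ptype{index}" in tags:  # ptypeN/pvalueN pairs become <param> elements
        short_type = tags[f"ptype{index}"].rsplit(".", 1)[-1]
        param = ET.SubElement(task, "param", {"type": short_type})
        param.text = tags[f"pvalue{index}"]
        index += 1
    return task


elem = task_element("tctask3", ["tctask1"], tagged)
print(ET.tostring(elem, encoding="unicode"))
```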
  12. Activity Diagram for Guiding Example using Explicit Concurrency (Static Composition)
  13. UML for Guiding Example (Dynamic Configuration Based on Runtime Parameter)
  14. Model Transformation
      • The UML model for the CN computation is created in the form of an activity diagram.
      • The UML model is exported as an XMI document.
      • The XMI document is transformed, using XSL Transformations, to a CNX client descriptor.
      • The CNX client descriptor is transformed, using XSL Transformations, to a client program in the target language (currently Java).
  15. Transformation Pipeline
  16. XMI Fragment (from UML Activity Diagram)
      <UML:ActionState xmi.id = 'a89' name = 'TCTask2'
          isSpecification = 'false' isDynamic = 'false'>
        <UML:ModelElement.taggedValue>
          <UML:TaggedValue xmi.id = 'a91' isSpecification = 'false'
              dataValue = '1000'>
            <UML:TaggedValue.type>
              <UML:TagDefinition xmi.idref = 'a13'/>
            </UML:TaggedValue.type>
          </UML:TaggedValue>
          <UML:TaggedValue xmi.id = 'a92' isSpecification = 'false'
              dataValue = 'RUN_AS_THREAD_IN_TM'>
            <UML:TaggedValue.type>
              <UML:TagDefinition xmi.idref = 'a16'/>
            </UML:TaggedValue.type>
          </UML:TaggedValue>
          <UML:TaggedValue xmi.id = 'a93' isSpecification = 'false'
              dataValue = 'tctask.jar'>
            <UML:TaggedValue.type>
              <UML:TagDefinition xmi.idref = 'a7'/>
            </UML:TaggedValue.type>
          </UML:TaggedValue>
          <UML:TaggedValue xmi.id = 'a94' isSpecification = 'false'
              dataValue = 'org.jhpc.cn2.trnsclsrtask.TCTask'>
            <UML:TaggedValue.type>
              <UML:TagDefinition xmi.idref = 'a10'/>
            </UML:TaggedValue.type>
          </UML:TaggedValue>
        </UML:ModelElement.taggedValue>
        <UML:StateVertex.outgoing>
          <UML:Transition xmi.idref = 'a95'/>
          <UML:Transition xmi.idref = 'a96'/>
        </UML:StateVertex.outgoing>
        <UML:StateVertex.incoming>
          <UML:Transition xmi.idref = 'a78'/>
        </UML:StateVertex.incoming>
      </UML:ActionState>
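In the fragment, each TaggedValue carries only a `dataValue` and a reference (`xmi.idref`) to a TagDefinition declared elsewhere in the model, so a transformer has to resolve that reference to learn which tag it is reading. The sketch below illustrates the lookup in Python. The TagDefinition declarations and their `name` attributes are assumptions inferred from the data values (for example, `a13` is taken to be `memory` because its value is `1000`), and the namespace URI is likewise assumed.

```python
import xml.etree.ElementTree as ET

UML = "org.omg.xmi.namespace.UML"  # assumed namespace URI for the UML prefix

XMI = """\
<XMI xmlns:UML="org.omg.xmi.namespace.UML">
  <UML:TagDefinition xmi.id="a7" name="jar"/>
  <UML:TagDefinition xmi.id="a10" name="class"/>
  <UML:TagDefinition xmi.id="a13" name="memory"/>
  <UML:TagDefinition xmi.id="a16" name="runmodel"/>
  <UML:ActionState xmi.id="a89" name="TCTask2">
    <UML:ModelElement.taggedValue>
      <UML:TaggedValue xmi.id="a91" dataValue="1000">
        <UML:TaggedValue.type>
          <UML:TagDefinition xmi.idref="a13"/>
        </UML:TaggedValue.type>
      </UML:TaggedValue>
      <UML:TaggedValue xmi.id="a93" dataValue="tctask.jar">
        <UML:TaggedValue.type>
          <UML:TagDefinition xmi.idref="a7"/>
        </UML:TaggedValue.type>
      </UML:TaggedValue>
    </UML:ModelElement.taggedValue>
  </UML:ActionState>
</XMI>
"""


def tagged_values(xmi_text, state_name):
    """Return {tag name: value} for the ActionState called `state_name`."""
    root = ET.fromstring(xmi_text)
    # Declarations carry xmi.id; references to them carry xmi.idref instead.
    names = {
        d.get("xmi.id"): d.get("name")
        for d in root.iter(f"{{{UML}}}TagDefinition")
        if d.get("xmi.id") is not None
    }
    for state in root.iter(f"{{{UML}}}ActionState"):
        if state.get("name") != state_name:
            continue
        result = {}
        for tv in state.iter(f"{{{UML}}}TaggedValue"):
            ref = tv.find(f"{{{UML}}}TaggedValue.type/{{{UML}}}TagDefinition")
            result[names[ref.get("xmi.idref")]] = tv.get("dataValue")
        return result
    raise KeyError(state_name)


values = tagged_values(XMI, "TCTask2")
print(values)
```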
  17. CN/XML Descriptor Generated from XMI (Activity Diagram)
      <?xml version="1.0"?>
      <cn2>
        <client class="TransCloser" log="CN_Client1047909210005.log" port="5666">
          <job>
            <task name="tctask1" jar="tasksplit.jar" class="org.jhpc.cn2.transcloser.TaskSplit" depends="">
              <task-req>
                <memory>1000</memory>
                <runmodel>RUN_AS_THREAD_IN_TM</runmodel>
              </task-req>
              <param type="String">matrix.txt</param>
            </task>
            <task name="tctask2" jar="tctask.jar" class="org.jhpc.cn2.trnsclsrtask.TCTask" depends="tctask1">
              <param type="Integer">1</param>
              <task-req>
                <memory>1000</memory>
                <runmodel>RUN_AS_THREAD_IN_TM</runmodel>
              </task-req>
            </task>
            <task name="tctask3" jar="tctask.jar" class="org.jhpc.cn2.trnsclsrtask.TCTask" depends="tctask1">
              <param type="Integer">2</param>
              <task-req>
                <memory>1000</memory>
                <runmodel>RUN_AS_THREAD_IN_TM</runmodel>
              </task-req>
              <param type="String">HELLO_WORLD</param>
            </task>
            <task name="tctask4" jar="tctask.jar" class="org.jhpc.cn2.trnsclsrtask.TCTask" depends="tctask1">
              <param type="Integer">3</param>
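For readers who want to inspect such a descriptor programmatically, here is a sketch that loads the tasks into plain records. The slide cuts the descriptor off partway through, so the embedded sample keeps only the first two tasks and adds the closing tags needed to make it well-formed. `TaskSpec` and `read_tasks` are hypothetical names, and treating `depends` as a comma-separated list is an assumption.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

CNX = """<?xml version="1.0"?>
<cn2>
  <client class="TransCloser" log="CN_Client1047909210005.log" port="5666">
    <job>
      <task name="tctask1" jar="tasksplit.jar" class="org.jhpc.cn2.transcloser.TaskSplit" depends="">
        <task-req>
          <memory>1000</memory>
          <runmodel>RUN_AS_THREAD_IN_TM</runmodel>
        </task-req>
        <param type="String">matrix.txt</param>
      </task>
      <task name="tctask2" jar="tctask.jar" class="org.jhpc.cn2.trnsclsrtask.TCTask" depends="tctask1">
        <param type="Integer">1</param>
        <task-req>
          <memory>1000</memory>
          <runmodel>RUN_AS_THREAD_IN_TM</runmodel>
        </task-req>
      </task>
    </job>
  </client>
</cn2>
"""


@dataclass
class TaskSpec:
    name: str
    jar: str
    cls: str
    depends: list
    memory: int
    runmodel: str
    params: list = field(default_factory=list)  # (type, value) pairs


def read_tasks(cnx_text):
    """Parse every <task> in a CNX descriptor into a TaskSpec."""
    root = ET.fromstring(cnx_text)
    specs = []
    for task in root.iter("task"):
        depends = [d for d in task.get("depends", "").split(",") if d]
        specs.append(TaskSpec(
            name=task.get("name"),
            jar=task.get("jar"),
            cls=task.get("class"),
            depends=depends,
            memory=int(task.findtext("task-req/memory")),
            runmodel=task.findtext("task-req/runmodel"),
            params=[(p.get("type"), p.text) for p in task.findall("param")],
        ))
    return specs


specs = read_tasks(CNX)
for spec in specs:
    print(spec.name, spec.depends, spec.params)
```

Note that `<param>` may appear before or after `<task-req>` in the generated descriptor, so the reader looks elements up by name rather than by position.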
  18. Conclusions
      • This work is at a preliminary stage. CN has a mature implementation but is still not a prevalent approach.
      • We have successfully gone from UML, to CN's XML descriptors, to running programs.
      • Not all parallel programs have a simple decomposition. (Breaking news!)
      • Other UML functionality will need to be introduced to model complex (internal) task structure.
      • CN has much in common with workflow systems but is not aimed directly at supporting SOA. CN computations, of course, can be exposed as proper services.
      • That said, UML is aimed at large-scale software engineering. A component modeled with UML can be embedded in a larger UML diagram and/or stand on its own.
  19. Futures
      • CN is an active and ongoing research effort.
      • Explore other aspects of UML, especially for modeling internal task behavior.
      • Integrate CN (a P2P clustering framework) with Hydra (a P2P storage framework).
      • Replace CN task messaging/signaling with Trull, an event-based framework for concurrent (and distributed) systems, led by Konstantin Läufer (also at Loyola University Chicago).
      • Redo the Intelligent Object Editor (not covered here) using Eclipse, or use an Eclipse UML plugin.
      • Extend the work to other grid/clustering projects.
