2012 07-jvm-summit-batches

110 views
108 views

Published on

Published in: Education, Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
110
On SlideShare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
2
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

2012 07-jvm-summit-batches

  1. 1. JVM Summit 2012 Batches: Unifying RPC, WS and Database accessThe University of Texas at Austin William R. Cook University of Texas at Austin with Eli Tilevich, Yang Jiao, Virginia Tech Ali Ibrahim, Ben Wiedermann, UT Austin
  2. 2. JVM Summit 2012 Typical Approach to Distribution/DB 1. Design a programming language Finalize specification, then... 2. Implement distribution/queries as library RPC library, stub generator SQL libraryThe University of Texas at Austin Web Service library, wrapper generator 2
  3. 3. JVM Summit 2012 Typical Approach to Distribution/DB 1. Design a programming language Finalize specification, then... 2. Implement distribution/queries as library RPC library, stub generator SQL libraryThe University of Texas at Austin Web Service library, wrapper generator 3. Rejected and ignored by community CORBA, DCOM, RMI are complete disaster 3
  4. 4. JVM Summit 2012 Using a Mail Service int limit = 500; for ( Message m : mailer.Messages ) if ( m.Size > limit ) { print( m.Subject + “: “ + m.Sender.Name); m.delete();The University of Texas at Austin } 4
  5. 5. JVM Summit 2012 Using a Mail Service int limit = 500; for ( Message m : mailer.Messages ) if ( m.Size > limit ) { print( m.Subject + “: “ + m.Sender.Name); m.delete();The University of Texas at Austin } Works great if mailer is  a local object, but is terrible if  mailer is remote 5
  6. 6. JVM Summit 2012 Goals: Why not Have it All? A Note on  Low latency One round trip Distributed  Computing Stateless No Proxies Platform independent No Serialization Clean Server APIs No Data Transfer ObjectThe University of Texas at Austin No Server Facade No Superclass/type     (e.g. java.rmi.Remote) 6
  7. 7. JVM Summit 2012 Using a Mail Service int limit = 500; for ( Message m : mailer.Messages ) if ( m.Size > limit ) { print( m.Subject + “: “ + m.Sender.Name); m.delete(); Separate localThe University of Texas at Austin } and remote code! 7
  8. 8. JVM Summit 2012 for (m : ROOT.Messages) {               if (m.Size > In(“A”)) {                      Run remote OUT(“B”, m.Subject);                   script first OUT(“C”, m.Sender.Name); m.delete(); int limit = 500; } Remote<Mailer> connection = ....; B C batch (Mailer mailer : connection) { RE: testing Will Cook for ( Message m : mailer.Messages ) JVM Summit Dan SmithThe University of Texas at Austin ... ... if ( m.Size > limit ) { print( m.Subject + “: “ + m.Sender.Name); m.delete(); Remote<Mailer> connection = ....; } Forest in = new Forest(“A”, limit); Forest mailer = connection.execute(script, in); } for (m : mailer.getIteration(“m”)) print(m.getStr(“B”) + “: ” + m.getStr(“C”) ); 8
  9. 9. JVM Summit 2012 Batch Pattern Execution model: Batch Command Pattern 1. Client sends script to the server (Creates Remote Façade on the fly) 2. Server executes the script 3. Server returns results in bulk (name, size)The University of Texas at Austin (Creates Data Transfer Objects on the fly) 4. Client runs the local code (print statements) 9
  10. 10. JVM Summit 2012 Batch Script to SQL for (m : ROOT.Messages) { if (m.Size > In(“A”)) { out(“B”, m.Subject); out(“C”, m.Sender.Name); m.delete(); }The University of Texas at Austin SELECT m.Subject as B, u.Name as C FROM Message m INER JOIN User u     ON m.Sender = User.ID WHERE m.Size > ? Always constant  DELETE FROM Message number of queries WHERE m.Size > ?  10
  11. 11. JVM Summit 2012 Batch Script (Subset of JavaScript) s ::= <literal> variables, constants | e.x fields | e.m(e, ..., e) method call | e=e assignment | e ⊕ e | !e | e ? e : e primitive operatorsThe University of Texas at Austin | if (e) { e } [else { e }] conditionals | for (x in e) { e } loops | var x = e; e binding | OUTPUT(label, e) outputs | INPUT(label) inputs | function(x) { e } functions 11 ⊕= + - * / % < <= == => > && || ;
  12. 12. JVM Summit 2012 Batch Providers Execution: Forwarder: Execute A TCP TCPThe University of Texas at Austin Send Receive A B ? B SQL Gen DB A 12
  13. 13. JVM Summit 2012 Batch Providers Execution: 568 LOC in Java Forwarder: Execute A TCP TCPThe University of Texas at Austin Send Receive A B ? B XML, JSON, ... SQL (no fetish about Gen DB transport) A 13
  14. 14. JVM Summit 2012 Combining Batch Providers Traditional RPC: TCP TCP Send Receive (Python) (Java) Execute (Java) Database Connection:The University of Texas at Austin TCP TCP Send Receive (JS) (Java) SQL Gen DB (Java) 14
  15. 15. JVM Summit 2012 The Hard Part Add Batch Statement to your favorite language JavaScript, Python, ML, F#, Scala, etc (More difficult in dynamic languages) Batch (x) {...} LangToScript Script + OtherThe University of Texas at Austin                       Partition(x) I = Local1 Remote Local2 ScriptToLang Local1 O = Execute(Remote, I) (no other) Local2(O) This is the part that  needs to be written 15
  16. 16. JVM Summit 2012 Batch Summary Client Batch statement: compiles to Local/Remote/Local Works in any language (e.g. Java, Python, JavaScript) Completely cross-language and cross-platform ServerThe University of Texas at Austin Small engine to execute scripts Call only public methods/fields (safe as RPC) Stateless, no remote pointers (aka proxies) Communication Forests (trees) of primitive values (no serialization) Efficient and portable 16
  17. 17. JVM Summit 2012 Batch = One Round Trip Clean, simple performance model Some batches would require more round trips batch (..) { if (AskUser(“Delete ” + msg.Subject + “?”)The University of Texas at Austin msg.delete(); } Pattern of execution OK: Local → Remote → Local Error: Remote → Local → Remote Cant just mark everything as a batch! 17
  18. 18. JVM Summit 2012 What about Object Serialization? Batch only transfers primitive values, not objects But they work with any object, not just remotable ones Send a local set to the server? java.util.Set<String> local = … ;The University of Texas at Austin batch ( mail : server ) { mail.sendMessage( local, subject, body); // compiler error sending local to remote method } 18
  19. 19. JVM Summit 2012 Serialization by Public Interfaces java.util.Set<String> local = … ; batch ( mail : server ) { service.Set recipients = mail.makeSet(); for (String addr : local ) recipients.add( addr );The University of Texas at Austin mail.sendMessage( recipients, subject, body); } Sends list of addresses with the batch Constructors set on server and populates it Works between different languages 19
  20. 20. JVM Summit 2012 Interprocedural Batches Reusable serialization function @Batch service.Set send(Mail server, local.Set<String> local) { service.Set remote = server.makeSet(); for (String addr : local )The University of Texas at Austin remote.add( addr ); return remote; } Main program batch ( mail : server ) { remote.Set recipients = send( localNames ); 20
  21. 21. JVM Summit 2012 Exceptions Server Exceptions Terminate the batch Return exception in forest Exception is raised in client at same point as on server Client ExceptionsThe University of Texas at Austin Be careful! Batch has already been fully processed on server Client may terminate without handling all results locally 21
  22. 22. JVM Summit 2012 Transactions and Partial Failure Batches are not necessarily transactional But they do facilitate transactions Server can execute transactionally Batches reduce the chances for partial failureThe University of Texas at Austin Fewer round trips Server operations are sent in groups 22
  23. 23. JVM Summit 2012 Order of Execution Preserved All local and remote code runs in correct order batch ( remote : service ) { print( remote.updateA( local.getA() )); // getA, print print( remote.updateB( local.getB() )); // getB, print }The University of Texas at Austin Partitions to: input.put(“X”, local.getA() ); // getA input.put(“Y”, local.getB() ); // getB .... execute updates on server print( result.get(“A) ); // print print( result.get(“B”) ); // print 23
  24. 24. JVM Summit 2012 Lambdas for (InfoSchema db : connection) for (Group<Category, Product> g : db.Products.groupBy(Product.byCategory)) print("Category={0}t ProductCount={1}", g.Key.CategoryName, g.Items.count()); Fun<Product, Category> byCategory =The University of Texas at Austin new Fun<Product, Category>() { public Category apply(Product p) { return p.Category; } }; 24
  25. 25. JVM Summit 2012 Available Now... Batch Java 100% compatible with Java 1.7 (OpenJDK) Transport: XML, JSON, easy to add more Available now (alpha) Batch statement as “for”The University of Texas at Austin for (RootInterface r : serviceConenction) { ... } Full SQL generation and ORM Select/Insert/Delete/Update, aggregate, group, sorting 25

×