Unifying Remote Data, Remote Procedure, and Service Clients
1. ECOOP 11 Summer School
Unifying Remote Data, Remote Procedure, and Service Clients
William R. Cook
University of Texas at Austin
with
Eli Tilevich, Yang Jiao, Virginia Tech
Ali Ibrahim, Ben Wiedermann, UT Austin
2. ECOOP 11 Summer School
Note
The actual summer school lecture was interactive, with code and examples written out on a large blackboard with chalk. These slides are an alternative presentation of the same material.
3. ECOOP 11 Summer School
Three kinds of “remoteness”
RPC
Web Services
SQL
4. ECOOP 11 Summer School
[Diagram: three corners around a central “?”]
SQL: Transactional, Efficient
RPC: Ease of Programming
Web Services: Cross-platform
5. ECOOP 11 Summer School
Using a Mail Service
int limit = 500;
Mailer mailer = ...connect to mail server...;
for ( Message m : mailer.Messages )
    if ( m.Size > limit ) {
        print( m.Subject & m.Sender.Name );
        m.delete();
    }
6. ECOOP 11 Summer School
Using a Mail Service
int limit = 500;
Mailer mailer = ...connect to mail server...;
for ( Message m : mailer.Messages )
    if ( m.Size > limit ) {
        print( m.Subject & m.Sender.Name );
        m.delete();
    }
Works great if mailer is a local object,
but is terrible if the mail server is remote
7. ECOOP 11 Summer School
Typical Approach to Distribution/DB
1. Design a programming language
(don't think at all about distributed computing)
2. Implement distribution as a library
8. ECOOP 11 Summer School
Typical Approach to Distribution/DB
1. Design a programming language
(don't think at all about distributed computing)
2. Implement distribution as a library
    RPC library, stub generator
    SQL library
    Web Service library, wrapper generator
9. ECOOP 11 Summer School
Call Level Interface (e.g. JDBC)
// create a remote query/action
String q1 = "select Subject, Name
             from Messages join Person on ...
             where Size > " + limit;
String q2 = "delete from Messages
             where Size > " + limit;
// execute on server (separate statements, so the update does not close rs)
Statement query = conn.createStatement();
ResultSet rs = query.executeQuery(q1);
Statement update = conn.createStatement();
update.executeUpdate(q2);
// use the results
while ( rs.next() )
    print( rs.getString("Subject") & " Deleted" );
10. ECOOP 11 Summer School
LINQ
// create the remote script/query
var results = from m in Messages
              where m.Size > limit
              select new { m.Subject, m.Sender.Name };
// execute and use the results
foreach (var rs in results)
    print( rs.Subject & rs.Name );
// delete the items
var big = from m in Messages
          where m.Size > limit
          select m;
db.Messages.DeleteAllOnSubmit(big);
db.SubmitChanges();
// interesting issues about order of operations...
Note: the programmer creates the result set (the select clause)
11. ECOOP 11 Summer School
Why can't we just say this?
int limit = 500;
Mailer mailer = ...connect to mail server...;
for ( Message m : mailer.Messages )
    if ( m.Size > limit ) {
        print( m.Subject & m.Sender.Name );
        m.delete();
    }
12. ECOOP 11 Summer School
Libraries are NOT good enough!
int limit = 500;
Mailer mailer = ...connect to mail server...;
for ( Message m : mailer.Messages )
    if ( m.Size > limit ) {
        print( m.Subject & m.Sender.Name );
        m.delete();
    }
Nice, but slow!
13. ECOOP 11 Summer School
Original definition of “impedance mismatch”:
“Whatever the database programming model, it must allow complex, data-intensive operations to be picked out of programs for execution by the storage manager...”
David Maier, DBPL 1987
It's not impedance of data formats,
it's impedance in processing models!!
14. ECOOP 11 Summer School
Where did we go wrong?
15. ECOOP 11 Summer School
Remote Procedure Call
proc(data)
16. ECOOP 11 Summer School
Distributed Objects
proc(data)
[Diagram: CORBA, DCOM, RMI; 20 years]
17. ECOOP 11 Summer School
Benefits
    Use existing languages
    Elegant OO model
Problems
    Latency (many round trips)
    Stateful communication
    Platform-specific
Solutions?
    Remote Façade
    Data Transfer Object
Solutions are as bad as the problem!
18. ECOOP 11 Summer School
Another Start
print( r.getName() );
print( r.getSize() );
19. ECOOP 11 Summer School
Another start...
Motivating example:
print( r.getName() );
print( r.getSize() );
20. ECOOP 11 Summer School
Another start...
Starting point:
    print( r.getName() );
    print( r.getSize() );
Notes:
    print is local
    r is remote
21. ECOOP 11 Summer School
Language Design Problem
Starting point:
    print( r.getName() );
    print( r.getSize() );
Notes:
    print is local
    r is remote
Goals
    Fast: one round trip
    Stateless communication
        do not require persistent connection
    Platform independent
        no serialization of complex user-defined objects
    Clean programming model
22. ECOOP 11 Summer School
Language Design Problem
Starting point:
    print( r.getName() );
    print( r.getSize() );
Notes:
    print is local
    r is remote
Goals
    Fast: one round trip
    Stateless communication
        do not require persistent connection
    Platform independent
        no serialization of complex user-defined objects
    Clean programming model
Can change language and/or compiler
23. ECOOP 11 Summer School
A New Language Construct: Batches
batch ( Item r : service ) {
    print( r.getName() );
    print( r.getSize() );
}
24. ECOOP 11 Summer School
A Novel Solution: Batches
batch ( Item r : service ) {
    print( r.getName() );
    print( r.getSize() );
}
Execution model: Batch Execution Pattern
1. Client sends script to the server
   (Creates Remote Façade on the fly)
2. Server executes two calls
3. Server returns results in bulk (name, size)
   (Creates Data Transfer Objects on the fly)
4. Client runs the local code (print statements)
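The four steps above can be imitated in plain Java without any language extension. This is a minimal sketch under stated assumptions: the `Item` stand-in, the `Server.execute` method, and the use of a list of method names as the "script" are all hypothetical and are not the API of the actual batch implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Batch Execution Pattern; all names are illustrative.
final class BatchPatternSketch {

    // Stand-in for the remote root object.
    static final class Item {
        String getName() { return "report.pdf"; }
        int getSize() { return 700; }
    }

    // Stand-in for the server: each execute() call models one round trip.
    static final class Server {
        private final Item root = new Item();
        int roundTrips = 0;

        // Steps 2 and 3: run every call in the script, return all results in bulk.
        Map<String, Object> execute(List<String> script) {
            roundTrips++;
            Map<String, Object> results = new LinkedHashMap<>();
            for (String call : script) {
                switch (call) {
                    case "getName": results.put(call, root.getName()); break;
                    case "getSize": results.put(call, root.getSize()); break;
                    default: throw new IllegalArgumentException("unknown call: " + call);
                }
            }
            return results;
        }
    }

    // Step 1 sends the script; step 4 runs the local code over the bulk results.
    static List<String> runBatch(Server server) {
        List<String> script = Arrays.asList("getName", "getSize");
        Map<String, Object> results = server.execute(script);
        List<String> printed = new ArrayList<>();
        printed.add(String.valueOf(results.get("getName")));
        printed.add(String.valueOf(results.get("getSize")));
        return printed;
    }

    public static void main(String[] args) {
        Server server = new Server();
        for (String line : runBatch(server)) {
            System.out.println(line);
        }
        System.out.println("round trips: " + server.roundTrips);
    }
}
```

Both remote calls travel in a single request, so the round-trip counter stays at one however many calls the script contains.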
25. ECOOP 11 Summer School
Compiled Client Code (generated)
// create remote script
script = <[
    outA( *.getName() )
    outB( *.getSize() )
]>;
// execute on the server
Forest x = service.execute( script );
// Client uses the results
print( x.get("A") );
print( x.getInt("B") );
Batch Execution Pattern
26. ECOOP 11 Summer School
A larger example
int limit = ...;
Service<Mailer> serviceConnection =...;
batch ( Mailer mail : serviceConnection ) {
    for ( Message msg : mail.Messages )
        if ( msg.Size > limit ) {
            print( msg.Subject & " Deleted" );
            msg.delete();
        }
        else
            print( msg.Subject & ":" & msg.Date );
}
27. ECOOP 11 Summer School
A larger example
int limit = ...;
Service<Mailer> serviceConnection =...;
batch ( Mailer mail : serviceConnection ) {
    for ( Message msg : mail.Messages )
        if ( msg.Size > limit ) {                     // local to remote
            print( msg.Subject & " Deleted" );
            msg.delete();                             // method call
        }
        else
            print( msg.Subject & ":" & msg.Date );    // remote to local
}
28. ECOOP 11 Summer School
Remote part as Batch Script
script = <[
    for ( msg : *.Messages ) {
        outA( msg.Subject );
        if ( outB( msg.Size > inX ) ) {
            msg.delete();
        } else {
            outC( msg.Date );
        }
    }
]>
29. ECOOP 11 Summer School
Remote part as Batch Script
script = <[
    for ( msg : *.Messages ) {
        outA( msg.Subject );
        if ( outB( msg.Size > inX ) ) {
            msg.delete();
        } else {
            outC( msg.Date );
        }
    }
]>
Notes:
    * is the root object of the service
    inX is the input named “X”
    outC is the output named “C”
    Not Java: it's “batch script”
30. ECOOP 11 Summer School
Forest: Trees of Basic Values
Input Forest
    A
    10k
Output Forest (iteration “msg”)
    A                    B        C
    “Product Update”     true
    “Status Report”      false    12/2/09
    “Happy Valentine”    true
    “Happy Hour”         false    2/26/10
    “Promotion”          false    3/1/10
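A forest like the one above can be modeled as named basic values plus named lists of child forests, one child per loop iteration. The following is only a sketch: the class name `ForestSketch` and the `addIteration` method are assumptions made for illustration, while `get`, `getBoolean`, and `getIteration` echo the accessor names used in the compiled client code shown in this deck.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of a forest: named basic values plus named lists of child forests.
// The class shape is an assumption for illustration, not the real transport format.
final class ForestSketch {
    private final Map<String, Object> values = new LinkedHashMap<>();
    private final Map<String, List<ForestSketch>> iterations = new LinkedHashMap<>();

    // Only basic values are accepted; arbitrary objects are never serialized.
    ForestSketch put(String name, Object value) {
        if (!(value instanceof String || value instanceof Number || value instanceof Boolean)) {
            throw new IllegalArgumentException("not a basic value: " + value);
        }
        values.put(name, value);
        return this;
    }

    Object get(String name) { return values.get(name); }

    boolean getBoolean(String name) { return (Boolean) values.get(name); }

    // Adds one child forest for one pass through the named loop.
    ForestSketch addIteration(String loop) {
        ForestSketch child = new ForestSketch();
        iterations.computeIfAbsent(loop, k -> new ArrayList<>()).add(child);
        return child;
    }

    List<ForestSketch> getIteration(String loop) {
        return iterations.getOrDefault(loop, Collections.<ForestSketch>emptyList());
    }

    // Builds the first two rows of the output forest from the slide.
    static ForestSketch sampleOutput() {
        ForestSketch out = new ForestSketch();
        out.addIteration("msg").put("A", "Product Update").put("B", true);
        out.addIteration("msg").put("A", "Status Report").put("B", false).put("C", "12/2/09");
        return out;
    }

    public static void main(String[] args) {
        for (ForestSketch r : sampleOutput().getIteration("msg")) {
            if (r.getBoolean("B")) {
                System.out.println(r.get("A") + " Deleted");
            } else {
                System.out.println(r.get("A") + ":" + r.get("C"));
            }
        }
    }
}
```

Rows whose message was deleted carry no "C" value, which matches the empty cells in the table.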
31. ECOOP 11 Summer School
Compiled code (auto generated)
Service<Mailer> serviceConnection =...;
Forest in = new Forest();
in.put("X", limit);
Forest result =
    serviceConnection.execute(script, in);
for ( Forest r : result.getIteration("msg") )
    if ( r.getBoolean("B") )
        print( r.get("A") & " Deleted" );
    else
        print( r.get("A") & ":" & r.get("C") );
32. ECOOP 11 Summer School
Forest Structure == Control flow
for (x : r.Items) {
    print( x.Name );
    for (y : x.Parts)
        print( y.Name );
}
Forest (nesting follows the loops):
    Items
        Name: “Engine”    Parts: Name “Cylinder”
        Name: “Hood”
        Name: “Wheel”     Parts: Name “Tire”, Name “Rim”
33. ECOOP 11 Summer School
Batch Summary
Client
    Batch statement: compiles to Local/Remote/Local
    Works in any language (e.g. Java, Python, JavaScript)
    Completely cross-language and cross-platform
Server
    Small engine to execute scripts
    Call only public methods/fields (safe as RPC)
    Stateless, no remote pointers (aka proxies)
Communication
    Forests (trees) of primitive values (no serialization)
    Efficient and portable
34. ECOOP 11 Summer School
Batch Script Language
e ::= x | c                    variables, constants
    | if e then e else e       conditionals
    | for x : e do e           loops
    | let x = e in e           binding
    | x = e | e.x = e          assignment
    | e.x                      fields
    | e.m(e, ..., e)           method call
    | e … e                    primitive operators
    | inX e | outX e           parameters and results
    | fun(x) e                 functions
Primitive operators: = + - * / % ; < <= == >= > & | not
Agree on script semantics, not object representation (just like SQL)
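To make the grammar concrete, here is a toy evaluator for a handful of its forms (constants, fields of the root object, `>`, `if`, `inX`, `outX`). It is a sketch only: the Java encoding, the `Env` class, and the use of maps for the root object and for inputs and outputs are assumptions, not the real server engine.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy evaluator for a few forms of the batch script grammar.
// The encoding below is an assumption made for this sketch.
final class ScriptSketch {

    // Evaluation context: the root object (*), the inputs, and the collected outputs.
    static final class Env {
        final Map<String, Object> root;
        final Map<String, Object> in;
        final Map<String, Object> out = new LinkedHashMap<>();
        Env(Map<String, Object> root, Map<String, Object> in) {
            this.root = root;
            this.in = in;
        }
    }

    interface Expr { Object eval(Env env); }

    static Expr constant(Object c) { return env -> c; }                 // c
    static Expr field(String x) { return env -> env.root.get(x); }      // *.x
    static Expr input(String x) { return env -> env.in.get(x); }        // inX
    static Expr output(String x, Expr e) {                              // outX e
        return env -> { Object v = e.eval(env); env.out.put(x, v); return v; };
    }
    static Expr greater(Expr a, Expr b) {                               // e > e
        return env -> ((Number) a.eval(env)).doubleValue() > ((Number) b.eval(env)).doubleValue();
    }
    static Expr ifThenElse(Expr c, Expr t, Expr f) {                    // if e then e else e
        return env -> ((Boolean) c.eval(env)) ? t.eval(env) : f.eval(env);
    }

    // Evaluates: outA(*.Subject); if outB(*.Size > inX) then "deleted" else outC(*.Date)
    static Map<String, Object> run(Map<String, Object> message, int limit) {
        Map<String, Object> in = new LinkedHashMap<>();
        in.put("X", limit);
        Env env = new Env(message, in);
        output("A", field("Subject")).eval(env);
        ifThenElse(output("B", greater(field("Size"), input("X"))),
                   constant("deleted"),
                   output("C", field("Date"))).eval(env);
        return env.out;
    }

    public static void main(String[] args) {
        Map<String, Object> msg = new LinkedHashMap<>();
        msg.put("Subject", "Status Report");
        msg.put("Size", 120);
        msg.put("Date", "12/2/09");
        System.out.println(run(msg, 500));
    }
}
```

Because client and server only need to agree on what each form means, the same script could be evaluated by an engine written in any language.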
35. ECOOP 11 Summer School
Database Clients
SQL
[Diagram: ODBC, JDBC, EJB, O/R mapping, LINQ; 30 years]
36. ECOOP 11 Summer School
Call Level Interface (e.g. JDBC)
// create a remote script/query
String q = "select name, size
            from files
            where size > 90";
// execute on server
Statement st = conn.createStatement();
ResultSet rs = st.executeQuery(q);
// use the results
while ( rs.next() ) {
    print( rs.getString("name") );
    print( rs.getInt("size") );
}
37. ECOOP 11 Summer School
Call Level Interface (e.g. JDBC)
// create a remote script/query
String q = "select name, size
            from files
            where size > 90";
// execute on server
Statement st = conn.createStatement();
ResultSet rs = st.executeQuery(q);
// use the results
while ( rs.next() ) {
    print( rs.getString("name") );
    print( rs.getInt("size") );
}
Batch execution pattern!
38. ECOOP 11 Summer School
Batches ==> SQL
batch ( FileSystem directory : service ) {
    for ( File f : directory.Files )
        if ( f.Size > 90 ) {
            print( f.Name );
            print( f.Size );
        }
}
Batch Script:
    for (f : *.Files)
        if (f.Size > 90) { outA(f.Name); outB(f.Size) }
SQL:
    SELECT f.Name, f.Size
    FROM Files f
    WHERE f.Size > 90
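The mapping from this one script shape to SQL can be sketched in a few lines. This is a hypothetical illustration, not the batch compiler: `toSql` and its parameters are invented names, it handles only a single loop with one `>` filter, and it adds the table alias `f` that the loop variable implies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: turns "for (f : *.<table>) if (f.<field> > <bound>) { out...(f.<col>) }"
// into one SQL query. A real batch compiler handles far more than this single shape.
final class BatchToSqlSketch {

    static String toSql(String table, String filterField, int bound, List<String> outputs) {
        List<String> columns = new ArrayList<>();
        for (String field : outputs) {
            columns.add("f." + field);              // each outX(f.field) becomes a selected column
        }
        return "SELECT " + String.join(", ", columns)
             + " FROM " + table + " f"              // the loop variable becomes the table alias
             + " WHERE f." + filterField + " > " + bound;   // the if-condition becomes WHERE
    }

    public static void main(String[] args) {
        System.out.println(toSql("Files", "Size", 90, Arrays.asList("Name", "Size")));
    }
}
```

The bound is an `int` here, so concatenating it is harmless; with string inputs, a real translator would bind parameters instead of building the query text directly.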
39. ECOOP 11 Summer School
Batches for SQL
Batch compiler creates SQL automatically
    Efficient handling of nested loops
    Always a constant number of queries for a batch
Supports all aspects of SQL
    Queries, updates, sorting, grouping, aggregations
Needs LINQ-style lambdas for some operations
    (aggregation, grouping, sorting), but not select/project
Summary
    Clean fine-grained object-oriented programming model
    Efficient SQL batch execution
40. ECOOP 11 Summer School
LINQ
// create the remote script/query
var results = from f in files
              where f.size > 90
              select new { f.name, f.size };
// execute and use the results
foreach (var rs in results) {
    print( rs.name );
    print( rs.size );
}
Note: the programmer creates the result set (the select clause)
41. ECOOP 11 Summer School
Analysis of LINQ
Virtual Collections
    Operations on collection are delayed
    Where, Select, Take, Join, GroupBy, OrderBy, etc.
    Most of the work is in lambdas (first-class functions)
Expression Trees
    Convert lambdas into Abstract Syntax Trees (ASTs)
    ASTs are then converted to SQL
42. ECOOP 11 Summer School
Dynamic Queries in LINQ
IQueryable<Course> matches = db.Course;
// add a test if the condition is given
if (Test.Length > 0)
    matches = matches.Where(
        c => c.Title == Test);
// select the desired values
var titles = matches.Select(c => c.Title);
// iterate over the result set
foreach (string title in titles.ToList())
    print(title);
43. ECOOP 11 Summer School
Dynamic Queries in Batches
batch (db : Database) {
    for (Course c : db.Course)
        if (Test.Length == 0 || c.Title == Test)
            print(c.Title);
}
44. ECOOP 11 Summer School
Dynamic Queries in Batches
batch (db : Database) {
    for (Course c : db.Course)
        if (Test.Length == 0 || c.Title == Test)
            print(c.Title);
}
Left side of condition is client-only: pre-evaluated
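One reading of "pre-evaluated" is that the client computes the client-only operand once, before the batch is sent, and ships the resulting boolean along with the script. The sketch below imitates that with ordinary Java; the method `titles` and the in-memory list standing in for the server-side collection are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of pre-evaluating the client-only part of a mixed condition.
// The list argument stands in for the remote db.Course collection.
final class PreEvalSketch {

    static List<String> titles(List<String> courseTitles, String test) {
        // Client-only operand: evaluated once, before the batch is sent.
        boolean skipFilter = test.length() == 0;

        List<String> out = new ArrayList<>();
        for (String title : courseTitles) {          // stands in for the server-side loop
            if (skipFilter || title.equals(test)) {  // server sees only a boolean and a string
                out.add(title);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> courses = Arrays.asList("Algebra", "Logic");
        System.out.println(titles(courses, ""));
        System.out.println(titles(courses, "Logic"));
    }
}
```

The programmer writes one condition; no separate query-building branch is needed for the empty-filter case.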
45. ECOOP 11 Summer School
Analysis of LINQ
Fundamental differences
LINQ is based on virtual collections
    Composition of lambda ASTs
    Based on Functional Programming
Batches are based on program partitioning
    User mixes remote code and local
    System partitions them automatically
    Communication is managed automatically
    Based on SQL execution model
46. ECOOP 11 Summer School
Analysis of LINQ
LINQ Issues
    Programmer must create query result structure
    Difficult to compose programs
    Compilation happens at runtime
    Dynamic queries are messy
    Good for querying, less so for updates
Batch Issues
    Don't have any of the above problems
    Batches don't naturally create client-side objects
        they just return all needed fields (similar to SQL)
    Uses lambdas (a la LINQ) for aggregation/sorting
47. ECOOP 11 Summer School
Batch = One Round Trip
Clean, simple performance model
Some batches would require more round trips
    batch (..) {
        if (AskUser("Delete " + msg.Subject + "?"))
            msg.delete();
    }
Pattern of execution
    OK: Local → Remote → Local
    Error: Remote → Local → Remote
Can't just mark everything as a batch!
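The rule above can be stated as a simple check over the stages a batch needs, listed in dependency order. This is a sketch with invented names (`Stage`, `isSingleRoundTrip`); it assumes the stages have already been determined and says nothing about how a compiler derives them.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the one-round-trip rule: stages are listed in dependency order.
// The Stage enum and isSingleRoundTrip are illustrative names, not compiler internals.
final class RoundTripSketch {
    enum Stage { LOCAL, REMOTE }

    // True when all remote work forms one contiguous phase (at most Local -> Remote -> Local).
    static boolean isSingleRoundTrip(List<Stage> stages) {
        int remotePhases = 0;
        Stage previous = null;
        for (Stage s : stages) {
            if (s == Stage.REMOTE && previous != Stage.REMOTE) {
                remotePhases++;   // a new remote phase starts here
            }
            previous = s;
        }
        return remotePhases <= 1;
    }

    public static void main(String[] args) {
        // read limit locally, test msg.Size remotely, print locally
        System.out.println(isSingleRoundTrip(Arrays.asList(Stage.LOCAL, Stage.REMOTE, Stage.LOCAL)));
        // read msg.Subject remotely, ask the user locally, delete remotely
        System.out.println(isSingleRoundTrip(Arrays.asList(Stage.REMOTE, Stage.LOCAL, Stage.REMOTE)));
    }
}
```

The `AskUser` example fails the check because the delete depends on a local answer that itself depends on remote data.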
48. ECOOP 11 Summer School
What about Object Serialization?
Batch only transfers primitive values, not objects
    But they work with any object, not just remotable ones
Send a local set to the server?
    java.util.Set<String> local = … ;
    batch ( mail : server ) {
        mail.sendMessage( local, subject, body);
        // compiler error sending local to remote method
    }
49. ECOOP 11 Summer School
Serialization by Public Interfaces
java.util.Set<String> local = … ;
batch ( mail : server ) {
    service.Set recipients = mail.makeSet();
    for (String addr : local )
        recipients.add( addr );
    mail.sendMessage( recipients, subject, body);
}
Sends list of addresses with the batch
Constructs set on server and populates it
Works between different languages
50. ECOOP 11 Summer School
Interprocedural Batches
Reusable serialization function
    @Batch
    service.Set send(Mail server, local.Set<String> local) {
        service.Set remote = server.makeSet();
        for (String addr : local )
            remote.add( addr );
        return remote;
    }
Main program
    batch ( mail : server ) {
        service.Set recipients = send( mail, localNames );
        ...
    }
51. ECOOP 11 Summer School
Exceptions
Server Exceptions
    Terminate the batch
    Return exception in forest
    Exception is raised in client at same point as on server
Client Exceptions
    Be careful!
    Batch has already been fully processed on server
    Client may terminate without handling all results locally
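The server-exception behavior can be imitated as follows. This is a sketch under assumptions: the `Result` layout (a list of values plus an error message) and both method names are invented for illustration, and the real system carries the exception inside the forest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Sketch of server-side failure handling; the result layout is an assumption for illustration.
final class BatchExceptionSketch {

    // Bulk result: values produced before the failure, plus the failure message (null if none).
    static final class Result {
        final List<Object> values = new ArrayList<>();
        String error;
    }

    // Server: run operations in order and terminate the batch at the first exception.
    static Result executeOnServer(List<Callable<Object>> operations) {
        Result result = new Result();
        for (Callable<Object> op : operations) {
            try {
                result.values.add(op.call());
            } catch (Exception e) {
                result.error = e.getMessage();   // returned with the results, not thrown remotely
                break;
            }
        }
        return result;
    }

    // Client: run the local code for completed steps, then raise where the server failed.
    static void replayOnClient(Result result, List<String> printed) {
        for (Object v : result.values) {
            printed.add(String.valueOf(v));
        }
        if (result.error != null) {
            throw new IllegalStateException(result.error);
        }
    }

    public static void main(String[] args) {
        List<Callable<Object>> ops = new ArrayList<>();
        ops.add(() -> "first");
        ops.add(() -> { throw new IllegalArgumentException("no such message"); });
        ops.add(() -> "never runs");
        List<String> printed = new ArrayList<>();
        try {
            replayOnClient(executeOnServer(ops), printed);
        } catch (IllegalStateException e) {
            System.out.println("printed " + printed + ", then failed: " + e.getMessage());
        }
    }
}
```

Local code for the steps that completed still runs before the exception surfaces, so the client observes the failure at the same logical point as the server did.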
52. ECOOP 11 Summer School
Transactions and Partial Failure
Batches are not necessarily transactional
    But they do facilitate transactions
    Server can execute transactionally
Batches reduce the chances for partial failure
    Fewer round trips
    Server operations are sent in groups
53. ECOOP 11 Summer School
Order of Execution Preserved
All local and remote code runs in correct order
    batch ( remote : service ) {
        print( remote.updateA( local.getA() )); // getA, print
        print( remote.updateB( local.getB() )); // getB, print
    }
Partitions to:
    input.put("X", local.getA() ); // getA
    input.put("Y", local.getB() ); // getB
    .... execute updates on server
    print( result.get("A") ); // print
    print( result.get("B") ); // print
Compiler Error!
54. ECOOP 11 Summer School
[Diagram labels: one call, two calls, Batch, SQL; RPC, CORBA, RMI; RDBMS (SQL)]
Where do Services fit?
55. ECOOP 11 Summer School
Web Services...
[Diagram: Web Services, HTML, XML; 10 years]
56. ECOOP 11 Summer School
[Diagram labels: one call, two calls, Batch, SQL; RPC, CORBA, RMI, XML-RPC; RDBMS (SQL); HTML, XML]
57. ECOOP 11 Summer School
[Diagram labels: one call, two calls, Batch, SQL; RPC, CORBA, RMI, XML-RPC; RDBMS (SQL); HTML, XML; Web Service]
58. ECOOP 11 Summer School
Amazon Web Service
<ItemLookup>
    <AWSAccessKeyId>XYZ</AWSAccessKeyId>
    <Request>
        <ItemIds>
            <ItemId>1</ItemId>
            <ItemId>2</ItemId>
        </ItemIds>
        <IdType>ASIN</IdType>
        <ResponseGroup>SalesRank</ResponseGroup>
        <ResponseGroup>Images</ResponseGroup>
    </Request>
</ItemLookup>
59. ECOOP 11 Summer School
Amazon Web Service
<ItemLookup>
    <AWSAccessKeyId>XYZ</AWSAccessKeyId>
    <Request>
        <ItemIds>
            <ItemId>1</ItemId>
            <ItemId>2</ItemId>
        </ItemIds>
        <IdType>ASIN</IdType>
        <ResponseGroup>SalesRank</ResponseGroup>
        <ResponseGroup>Images</ResponseGroup>
    </Request>
</ItemLookup>
Custom-defined language for each service operation
60. ECOOP 11 Summer School
Amazon Web Service
<ItemLookup>
    <AWSAccessKeyId>XYZ</AWSAccessKeyId>
    <Request>
        <ItemIds>
            <ItemId>1</ItemId>
            <ItemId>2</ItemId>
        </ItemIds>
        <IdType>ASIN</IdType>
        <ResponseGroup>SalesRank</ResponseGroup>
        <ResponseGroup>Images</ResponseGroup>
    </Request>
</ItemLookup>
As a batch script:
for (item : Items) {
    outA( item.SalesRank )
    outB( item.Images )
}
61. ECOOP 11 Summer School
Web Service Client Invocation
// create request
ItemLookupRequest request = new ItemLookupRequest();
request.setIdType("ASIN");
request.getItemId().add(1);
request.getItemId().add(2);
request.getResponseGroup().add("SalesRank");   // method names in strings
request.getResponseGroup().add("Images");
// execute request
items = amazonService.itemLookup(null, awsAccessKey,
            associateTag, null, null, request, null,
            operationRequest);
// use results
for (item : items.values)
    display( item.SalesRank, item.SmallImage );
Batch execution pattern (again!)
62. ECOOP 11 Summer School
Batch Version of Web Service
// calls specified in document
batch (Amazon aws : awsConnection) {
    aws.login("XYZ");
    Item a = aws.getItem("1");
    display( a.SalesRank, a.SmallImage );
    Item b = aws.getItem("2");
    display( b.SalesRank, b.SmallImage );
}
Fine-grained logical operations
Coarse-grained execution model
63. ECOOP 11 Summer School
[Diagram labels: one call, two calls, Batch, SQL; RPC, CORBA, RMI, XML-RPC; RDBMS (SQL); HTML, XML; Web Service = Batch]
64. ECOOP 11 Summer School
Web Service calls are Batches
“Document” = collection of operations
Custom-defined language for each service operation
Custom-written interpreter on server
Batches factor out all the boilerplate
Multiple calls, binding, conditionals, loops, primitives...
65. ECOOP 11 Summer School
Available Now...
Jaba: Batch Java
    100% compatible with Java 1.5
    Transport: XML, JSON, easy to add more
Batch statement as “for”
    for (RootInterface r : serviceConnection) { ... }
Full SQL generation and ORM
    Select/Insert/Delete/Update, aggregate, group, sorting
Future work
    Security models, JavaScript/Python clients
Edit and debug in Eclipse or other tools
Available now!
66. ECOOP 11 Summer School
Opportunities
Add batch statement to your favorite language
    Easy with reusable partitioning library
    C#, Python, JavaScript, COBOL, Ruby, etc...
    Monads?
What about multiple servers in batch?
    Client → Server*
    Client → Server → Server
    Client ↔ Server
Generalize “remoteness”: MPI, GPU, ...
Concurrency, Asynchrony and Streaming
67. ECOOP 11 Summer School
Related work
Microsoft LINQ
    Batches are different and more general than LINQ
Mobile code / Remote evaluation
    Does not manage returning values to client
Implicit batching
    Performance model is not transparent
Asynchronous remote invocations
    Asynchrony is orthogonal to batching
Automatic program partitioning
    binding time analysis, program slicing
Deforestation
    Introduce inverse: reforestation for bulk data transfer
68. ECOOP 11 Summer School
[Diagram: three corners around a central “batch”]
SQL: Transactional, Efficient (DBPL 2011)
RPC: Ease of Programming (ECOOP 2009)
Web Services: Cross-platform (ECOWS 2009)
69. ECOOP 11 Summer School
Conclusion
Batch Statement
    General mechanism for remote service clients
    An opportunity to leapfrog LINQ
Unifies
    Distributed objects (RPC)
    Relational (SQL) database access
    Service invocation (Web services)
Benefits:
    Efficient distributed execution
    Clean programming model
    Stateless, no proxies, no explicit queries
    No serialization: Language & transport neutral
Requires adding batch statement to language!