Java parallel programming made simple

With Ateji PX there is no need to be a specialist in Java threads: parallel programming on multicore, GPU, cloud and grid can be as simple as inserting a || operator in the source code.

1. Ateji PX: Java Parallel Programming made Simple
   © Ateji – All rights reserved.
2. Ateji – the Company
   Specialized in parallelism & language technologies
   Founded by Patrick Viry in 2005
   Java extensions for optimization (OptimJ, 2008) and parallelism (Ateji PX, 2010)
   January 2010: first round of investment
   Ateji PX selected as a Disruptive Technology during SC10
   Member of HiPEAC, OpenGPU
3. The Grand Challenge: Parallel Programming for All Application Developers
   [Chart: core counts in enterprise servers, from 4 cores (2008) to 100 cores (2010)]
4. Why Java?
   Increasingly used for HPC because:
   - Most popular language today
   - Good runtime performance
   - Much better productivity and code quality
   - Faster time-to-market, fewer bugs, less maintenance
   - Much easier staffing
   Used in aerospace, bioinformatics, physics, finance, data mining, statistics, ...
   Details and references in our latest blog post: ateji.blogspot.com
5. How to parallelize Java code?

   The sequential matrix multiplication:

      for (int i : I) {
          for (int j : J) {
              for (int k : K) {
                  C[i][j] += A[i][k] * B[k][j];
              }
          }
      }

   With threads:

      final int nThreads = Runtime.getRuntime().availableProcessors();
      final int blockSize = I / nThreads;
      Thread[] threads = new Thread[nThreads];
      for (int n = 0; n < nThreads; n++) {
          final int finalN = n;
          threads[n] = new Thread() {
              public void run() {
                  final int beginIndex = finalN * blockSize;
                  final int endIndex =
                      (finalN == nThreads - 1) ? I : (finalN + 1) * blockSize;
                  for (int i = beginIndex; i < endIndex; i++) {
                      for (int j = 0; j < J; j++) {
                          for (int k = 0; k < K; k++) {
                              C[i][j] += A[i][k] * B[k][j];
                          }
                      }
                  }
              }
          };
          threads[n].start();
      }
      for (int n = 0; n < nThreads; n++) {
          try {
              threads[n].join();
          } catch (InterruptedException e) {
              System.exit(-1);
          }
      }

   With Ateji PX, just add for||:

      for|| (int i : I) {
          for (int j : J) {
              for (int k : K) {
                  C[i][j] += A[i][k] * B[k][j];
              }
          }
      }
6. It's easy AND efficient: 12.5x speedup on 16 cores
   See the whitepaper on www.ateji.com/px

      for|| (int i : I) {
          for (int j : J) {
              for (int k : K) {
                  C[i][j] += A[i][k] * B[k][j];
              }
          }
      }
7. "The Problem with Threads" [Technical Report, Edward A. Lee, EECS Berkeley]
   - Threads are a hardware-level concept, not a practical abstraction for programming
   - Threads do not compose
   - Code correctness requires intricate thinking and inspection of the whole program
   - Most multi-threaded programs are buggy ... and debuggers do not help
   - Not an option for most application programmers!
8. Introducing Parallelism at the Language Level
   Sequential composition operator: ";"
   Parallel composition operator: "||"

   A parallel "Hello World!":

      [
          || System.out.println("Hello");
          || System.out.println("World");
      ]

   Runs the two branches in parallel and waits for their termination;
   prints either "Hello" then "World", or "World" then "Hello".
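   For comparison (not from the slides), a minimal plain-Java sketch of the same
   two-branch composition written by hand with java.lang.Thread; the class name is
   made up for the example:

      public class HelloParallel {
          public static void main(String[] args) throws InterruptedException {
              // Start both branches, then wait for both to terminate.
              Thread hello = new Thread(() -> System.out.println("Hello"));
              Thread world = new Thread(() -> System.out.println("World"));
              hello.start();
              world.start();
              hello.join();
              world.join();
          }
      }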
9. Data Parallelism
   Same operation on all elements:

      [
          // quantified branches
          || (int i : N) array[i]++;
      ]

   Multiple dimensions and filters, e.g. update the upper-left triangle of a matrix:

      [
          || (int i : N, int j : N, i + j < N) m[i][j]++;
      ]
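   For readers without the Ateji PX compiler, a rough plain-Java approximation of the
   two quantified branches above using java.util.stream (illustrative only, not the
   code Ateji PX actually generates):

      import java.util.stream.IntStream;

      public class DataParallelSketch {
          public static void main(String[] args) {
              int N = 1000;
              int[] array = new int[N];
              int[][] m = new int[N][N];

              // Same operation on all elements: one parallel task per index i.
              IntStream.range(0, N).parallel().forEach(i -> array[i]++);

              // Upper-left triangle (i + j < N): parallel over i, sequential over j.
              IntStream.range(0, N).parallel().forEach(i -> {
                  for (int j = 0; i + j < N; j++) {
                      m[i][j]++;
                  }
              });
          }
      }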
10. Task Parallelism

      int fib(int n) {
          if (n <= 1) return 1;
          int fib1, fib2;
          [
              || fib1 = fib(n-1);
              || fib2 = fib(n-2);
          ];
          return fib1 + fib2;
      }

   Note the recursion: || is compatible with all language constructs.
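   For comparison (not from the slides), a sketch of the same recursive task
   parallelism written by hand with Java's fork/join framework, roughly the kind of
   boilerplate the || block hides:

      import java.util.concurrent.ForkJoinPool;
      import java.util.concurrent.RecursiveTask;

      public class FibTask extends RecursiveTask<Integer> {
          private final int n;
          FibTask(int n) { this.n = n; }

          @Override
          protected Integer compute() {
              if (n <= 1) return 1;
              FibTask f1 = new FibTask(n - 1);
              FibTask f2 = new FibTask(n - 2);
              f1.fork();                 // run one branch asynchronously
              int fib2 = f2.compute();   // compute the other branch in this thread
              int fib1 = f1.join();      // wait for the forked branch
              return fib1 + fib2;
          }

          public static void main(String[] args) {
              System.out.println(new ForkJoinPool().invoke(new FibTask(20)));
          }
      }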
11. Speculative Parallelism
   Stop when the fastest algorithm succeeds:

      [ || return algorithm1(); || return algorithm2(); ]

   The first branch to return stops its sister branches, then returns.
   Same behaviour for break, continue, throw.
   Non-local exit is very difficult to get right with threads.
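   A rough plain-Java analogue (not the Ateji PX translation): ExecutorService.invokeAny
   returns the result of the first task that completes successfully and cancels the
   others; algorithm1 and algorithm2 are the placeholder names from the slide.

      import java.util.List;
      import java.util.concurrent.Callable;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;

      public class SpeculativeSketch {
          static int algorithm1() { return 42; }   // placeholder algorithm
          static int algorithm2() { return 42; }   // placeholder algorithm

          public static void main(String[] args) throws Exception {
              ExecutorService pool = Executors.newFixedThreadPool(2);
              try {
                  // First successful result wins; the slower branch is cancelled.
                  int result = pool.invokeAny(List.<Callable<Integer>>of(
                          SpeculativeSketch::algorithm1,
                          SpeculativeSketch::algorithm2));
                  System.out.println(result);
              } finally {
                  pool.shutdownNow();
              }
          }
      }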
12. Parallel Reductions
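   As an illustration of the idea (plain Java with java.util.stream, not Ateji PX
   syntax; the array below is made up for the example):

      import java.util.Arrays;
      import java.util.stream.IntStream;

      public class ReductionSketch {
          public static void main(String[] args) {
              int[] array = IntStream.rangeClosed(1, 1_000_000).toArray();

              // Parallel reduction: partial sums are computed concurrently and combined.
              long sum = Arrays.stream(array).parallel().asLongStream().sum();

              System.out.println(sum);   // 500000500000
          }
      }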
13. Message Passing
   An essential aspect of parallelism; must be part of the language.
   Send a message:    chan ! value
   Receive a message: chan ? value
   Typed channels:
   - Chan<T>: synchronous (rendez-vous)
   - AsyncChan<T>: asynchronous (buffered)
   - User-defined serialization (Java, XML, ASN.1, ...)
   - Can be mapped to I/O devices (files, sockets, MPI)
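   A loose plain-Java analogue of the two channel flavours (illustrative only, not the
   Ateji PX runtime): a SynchronousQueue behaves like a rendez-vous channel and a
   LinkedBlockingQueue like a buffered one.

      import java.util.concurrent.BlockingQueue;
      import java.util.concurrent.LinkedBlockingQueue;
      import java.util.concurrent.SynchronousQueue;

      public class ChannelSketch {
          public static void main(String[] args) throws InterruptedException {
              // Rendez-vous: put() blocks until another thread take()s, like Chan<T>.
              BlockingQueue<Integer> sync = new SynchronousQueue<>();
              // Buffered: put() returns immediately, like AsyncChan<T>.
              BlockingQueue<Integer> async = new LinkedBlockingQueue<>();

              Thread receiver = new Thread(() -> {
                  try {
                      System.out.println("received " + sync.take());   // chan ? value
                  } catch (InterruptedException e) {
                      Thread.currentThread().interrupt();
                  }
              });
              receiver.start();

              sync.put(1);    // chan ! value (blocks until the receiver is ready)
              async.put(2);   // buffered send, no receiver needed yet
              receiver.join();
          }
      }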
14. Data Flow and Stream Parallelism
   [Diagram: an adder process with input channels in1, in2 and output channel out]

   An adder:

      void adder(Chan<Integer> in1, Chan<Integer> in2, Chan<Integer> out) {
          for (;;) {
              int value1, value2;
              [
                  in1 ? value1;
                  || in2 ? value2;
              ];
              out ! (value1 + value2);
          }
      }
15. Data Flow and Stream Parallelism
   [Diagram: two source processes feeding c1 and c2, an adder writing c3, a sink reading c3]

   Compose processes:

      [
          || source(c1);          // generates values on c1
          || source(c2);          // generates values on c2
          || adder(c1, c2, c3);
          || sink(c3);            // reads values from c3
      ]

   Numeric values + sync = "data flow"
   Strings or tuples + async = "stream programming", e.g. the MapReduce algorithm
16. Expressing Non-Determinism
   Note the parallel reads:

      [ in1 ? value1 || in2 ? value2 ]

   Impossible to express in a sequential language.
   || is there for performance, but also for expressivity.
   See also the select construct.
17. Distributing Branches
   Use indications:

      [
          || #Remote("192.168.20.1") source(c1);
          || #Remote("Amazon EC2")   source(c2);
          || #Remote("GPU")          adder(c1, c2, c3);
          || sink(c3);
      ]

   Targets: multicore desktop/server, multicore CPU/GPU cluster
18. The compiler handles the boring stuff:
   - Passing parameters
   - Returning results
   - Throwing exceptions
   - Accessing non-final fields
   - Performing non-local exits
   - Stopping branches properly
19. Making it easy is also about tools: Eclipse integration
20. Ateji PX Summary
   Parallelism at the language level is simple and intuitive, efficient, and compatible with existing source code and tools.
   Most patterns in a single language:
   - data, task, recursive and speculative parallelism
   - shared memory and distributed memory
   - covers OpenMP, Cilk, MPI, Occam, Erlang, etc.
   Most hardware architectures from a single language: manycore, grid, cloud, GPU
21. Roadmap as of February 2011
   - Ateji PX 1.1 (multicore version) available today; free evaluation version on www.ateji.com
   - GPU version coming soon (OpenGPU project)
   - Distributed version coming soon (grid / cluster / cloud)
   - Interactive correctness proofs
   - Integration of profiling tools
22. Call to Action
   - Free download on www.ateji.com/px
   - Read the whitepapers
   - Play with the online demo
   - Look at the samples library
   - Benchmark your || code
   - Contact: info@ateji.com
   - Blog: ateji.blogspot.com
23. © Ateji – All rights reserved.
