Parallel Sorting on a Spatial Computer

607 views
469 views

Published on

Presentation by Max Orhai.
Paper and more information: http://soft.vub.ac.be/races/paper/parallel-sorting-on-a-spatial-computer/

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
607
On SlideShare
0
From Embeds
0
Number of Embeds
209
Actions
Shares
0
Downloads
8
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Parallel Sorting on a Spatial Computer

  1. 1. Parallel Sorting ona Spatial Computer21 October 2012RACES workshop at SPLASH’12Max Orhai and Andrew P. BlackPortland State UniversityMaseeh College ofEngineering andComputer ScienceUndergraduate Researchand Mentoring ProgramSupported bya gift fromIBM Research
  2. 2. what is a spatial computer?a networked system of computing devicesdistributed through a physical space, in which:1. the difficulty of moving information betweenany two devices depends on the physicaldistance between them, and2. the functional goals of the system are definedin terms of the system’s spatial structure.(definition based on 2012 Spatial Computing Workshop)the ‘problem’the ‘solution’2
  3. 3. are these spatial computers?Inmos transputer array(early 1980s)XMOS XMP-64dev kit (2010)Tilera Tile64 (2008)↝asynchronous logic automatacell (Chen 2009)cell matrix tile(Macias & Durbeck 2009)GreenArrays GA144asynchronous stackmachine array(2010)fine-grained coarse-grained3
  4. 4. a spatial computer isa networked system of computing devicesdistributed through a physical space, in which:1. the difficulty of moving information betweenany two devices depends on the physicaldistance between them, and2. the functional goals of the system are definedin terms of the system’s spatial structure.the ‘problem’the ‘solution’4
  5. 5. spatial computing offers real insights into1. the costs and constraints ofcommunication in large parallelcomputer arrays2. how to design algorithms that respectthese costs and constraintsmatch the communicationstructure of the program tothe physical structure of thecomputer networkmessage of this talk5
  6. 6. our example: collision sortspace ∼particles ∼collisions ∼orderdatacomparisonssimulation physics computing machinerygoal:arrange particles by color➡distributed computation of aglobal solution using onlylocal information andminimal communicationapproach:an entropy-reducingparticle system6darker lighter
  7. 7. collision sort: mechanismspace ∼particles ∼collisions ∼orderdatacomparisonsphysics computation7multi-way collisions quantize time‘boundarycollisions’always causerebounda b c din a single step, the color ofparticle b is compared withthe average color ofoverlapping particles a and c(a + c) / 2 b
  8. 8. collision sort: mechanismspace ∼particles ∼collisions ∼patches of space ∼motion between patches ∼orderdatacomparisonsprocessorsmessage-passingphysics computationloop (one step):• accept incoming particlesfrom neighbors• detect collisions and altervelocities accordingly• increment positions,sending outgoing particlesto neighbors8patches quantize spacecollisions across patchesare not detected!
  9. 9. 0360stepstime100particlesconstantspeed:0.1patch-widthsper stepcollision sort: low speed9
  10. 10. 040stepstime100particlesconstantspeed:0.9patch-widthsper stepcollision sort: high speed10
  11. 11. 040stepstime100particlesspeedsproportionalto colordifferencesmaximumspeed:0.9patch-widthsper stepcollision sort: variable speed11
  12. 12. 040stepstime200particlesspeedsproportionalto colordifferencesmaximumspeed:0.9patch-widthsper stepcollision sort: more data12
  13. 13. 040stepstimecollision sort: worst case?200particlesin reverseorderproportionalspeedsmaximumspeed:0.9patch-widthsper step13
  14. 14. 040stepstimecollision sort: worst case?200particlesin two sortedsubsequencesproportionalspeedsmaximumspeed:0.9patch-widthsper step14
  15. 15. collision sort: in 2D1000particleseach withboth blue andgreen valuesproportionalspeedsmaximumspeed:0.9patch-widthsper step per axis15darker lighterdarkerlighter
  16. 16. the total cost of any parallelalgorithm is a combination oflocal computation costs andcommunication costsfor simplicity, we assume thatlocal computation is cheap,and focus on thecost of communicationbetween system componentscollision sort: performance16
  17. 17. because each processor need only• receive at most one and• send at most one messageper neighbor during each loop cycle,communication cost is the minimum possible,under these assumptions:• a one-patch-per-step information speed limit• message cost independent of message size• local computation is freeABmessages mayconvey multipleparticlescollision sort: performance17
  18. 18. what about component failure?if messages are acknowledged, thenany ‘undeliverable’ particles cansimply have their velocities reversed,just as in a boundary collision:particle bounce ↜ message bouncefault tolerance is an ‘emergentproperty’ of this rule!collision sort: performance18
  19. 19. collision sort: fault-tolerance19
  20. 20. research efforts:• MIT CSAIL amorphous computingproject, c. 2000↝ BBN Proto language• Laboratoire IBISC (ComputerScience, Systems Biology andComplex Systems) at U. d’Evry↝ MGS topologicalsimulation language• abstract geometric computation(continuous signal machines)D. Duchier, J. Durand-Lose,M. Senot at U. d’Orleansapplication domains:• large-scale sensor networks• distributed control systems• mobile robot swarms• biological modeling• computability theoryrelated work: spatial computing20
  21. 21. conclusionlarge-scale parallel computer arrayscan be spatial computer systems, ifprogrammed in a way that respectsphysical distances.spatial computing gives us a cost modelthat approximates the physics ofinformation transmission, and thatleverages our intuition about physicalobjects to design concurrent algorithms.Tilera processor arrays21references:• http://www.spatial-computing.org• http://www.greenarraychips.com/• http://www.tilera.com/• http://www.xmos.com/• Abelson et al. Amorphous computing. Communications of the ACM 2000.• Gershernfeld et al. Reconfigurable asynchronous logic automata (RALA).POPL 2010.• Macias & Durbeck. The Cell Matrix: an architecture for nanocomputing.Nanotechnology 2001.• Duchier, Durand-Lose, and Senot. Solving Q-SAT in bounded space and timeby geometrical computation. 7th Intl. Conf. Computability in Europe 2011.

×