Off the Grid

Grid computing is a form of distributed computing that is increasing in popularity in fields that have high computation and/or data storage requirements. In the presentation we give an overview of grid computing, describe our experiences using grid tools on a real project and develop a working grid across a cluster of two nodes using GridGain, an open source grid toolkit.

Published in: Technology

Transcript

  • 1. Off the Grid: Introduction to Grid Computing with GridGain. QJUG, February 2007. Tom Adams, Nick Partridge. Workingmouse, Veitch Lister Consulting.
  • 2. Why are we here?
  • 3. Large distributed application
  • 4. Grid-based solution worked
  • 5. Flow
  • 6. Grid? • Multiple independent computing clusters which act like a "grid" (Wikipedia) • Many nodes, each node is indistinguishable from other nodes • Complete machines over co-located CPUs? • Multiple processes? • Commodity hardware? • Homogeneous machines?
  • 7. A tale of two grids
  • 8. Partition data across grid
  • 9. Partition processing across grid
  • 10. http://www.jroller.com/nivanov/entry/grid_computing_compute_grid_data
  • 11. Selection
  • 12. Requirements • Callable from a Rails webapp • Real-time: synchronous responses in less than 30 seconds • Large dataset: 100 GB (computation runs across all data)
  • 13. Rails webapp • Simple document-literal web service • Ruby: soap4r • Java: GlassFish, Spring-WS • Not really interesting for this talk... see Brisbane.rb
  • 14. Data • Read-only • Full control • 45 TB (became 100 GB with pre-processing) • SQL? 3 tables, one query w/ 2 joins
  • 15. Don’t want to roll our own
  • 16. (Row) database good enough
  • 17. And we can federate them
  • 18. Result?
  • 19. http://battellemedia.com/archives/2007_01.php
  • 20. What about BigTable?
  • 21. Column database
  • 22. Result?
  • 23. http://failblog.wordpress.com/2008/01/29/satellite/
  • 24. Where are we?
  • 25. Progress • Don’t need to distribute data, so no data grid • No off-the-shelf solutions that scale/go fast • Understand data better, so happy to roll our own as fallback
  • 26. Data solution
  • 27. Data • CSV files on filesystem (now binary) • Directories form indices • Data files broken up into chunks
  • 28. What about the code? http://giapet.net/wp-content/uploads/2007/05/luluwtf.gif
  • 29. Need to distribute the computation
  • 30. Options?
  • 31. Erlang
  • 32. Scala
  • 33. Java
  • 34. Java frameworks • Hadoop • GridGain • Oracle Coherence • GigaSpaces • Terracotta • JavaSpaces/Jini • Shoal
  • 35. GridGain
  • 36. GridGain • “fully open source full-stack grid computing platform for Java” • Map/reduce-based computation • Easy to set up and use • Can be extended via SPI implementations • Just works • “Scalable” (we’ve had it up to 32 nodes)
  • 37. Map/reduce
  • 38. When does it work • When data is independent (pure/referentially transparent) • When data can be combined (reduce) based solely on input
  • 39. [Diagram: word count by map/reduce. The input lines "foo bar", "bar baz", "quux bar", "baz bar" are split into words, each word is mapped to a word:1 pair, and the pairs are reduced to foo: 1, bar: 4, baz: 2, quux: 1.]
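The split, map and reduce steps of the word count on slide 39 can be sketched in plain Java as below. This is a minimal single-machine illustration, not GridGain code; the class and method names (`WordCount`, `split`, `map`, `reduce`, `count`) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, single-machine sketch of the word count on slide 39.
class WordCount {
    // Split: break one input line into words.
    static List<String> split(String line) {
        List<String> words = new ArrayList<>();
        for (String w : line.trim().split("\\s+")) {
            if (!w.isEmpty()) words.add(w);
        }
        return words;
    }

    // Map: turn each word into a (word, 1) pair. Each pair depends only on its word.
    static List<Map.Entry<String, Integer>> map(List<String> words) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String w : words) pairs.add(Map.entry(w, 1));
        return pairs;
    }

    // Reduce: sum the counts per word, using nothing but the pairs passed in.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            totals.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return totals;
    }

    // Runs split, map and reduce over all input lines.
    static Map<String, Integer> count(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) pairs.addAll(map(split(line)));
        return reduce(pairs);
    }

    public static void main(String[] args) {
        // prints {foo=1, bar=4, baz=2, quux=1}
        System.out.println(count(List.of("foo bar", "bar baz", "quux bar", "baz bar")));
    }
}
```

Because `map` looks at one word at a time and `reduce` uses only its input, either step could be moved to another machine without changing the result, which is the condition slide 38 describes.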
  • 40. GridGain grid
  • 41. [Diagram: the input lines "foo bar", "bar baz", "quux bar", "baz bar" go into a box labelled Grid, and the counts foo: 1, bar: 4, baz: 2, quux: 1 come out.]
  • 42. [Diagram: the same input and output with two nodes in between. One node takes "foo bar", "bar baz" and produces foo: 1, bar: 2, baz: 1; the other takes "quux bar", "baz bar" and produces quux: 1, bar: 2, baz: 1. A question mark stands where the partial counts are combined.]
  • 43. [Diagram: as slide 42, with a Master Node in place of the question mark, holding the input and the combined counts.]
  • 44. [Diagram: two Master Nodes. One holds "foo bar", "bar baz" and the result foo: 1, bar: 2, baz: 1; the other holds "quux bar", "baz bar" and the result bar: 2, baz: 1, quux: 1. The four input lines appear again on the remaining nodes.]
  • 45. Did you say map/reduce?
  • 46. [Diagram: a Master Node and two nodes. Each node is labelled map and produces partial counts (foo: 1, bar: 2, baz: 1 and quux: 1, bar: 2, baz: 1); the Master Node is labelled reduce and produces foo: 1, bar: 4, baz: 2, quux: 1.]
  • 47. Show me the types!
  • 48. [Diagram: the same Master Node and two nodes, annotated with types. On the nodes: map[A, B](List[A], A → B) → List[B]. On the Master Node: reduce[B, C](List[B], C, (C, B) → C) → List[C].]
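The two signatures on slide 48 translate fairly directly into Java generics. The sketch below is an illustration only: the class name `MapReduceTypes` is hypothetical, and `reduce` is written as a plain left fold that returns a single `C`, which is an assumption that differs from the `List[C]` shown on the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical Java rendering of the map and reduce types on slide 48.
class MapReduceTypes {
    // map[A, B](List[A], A → B) → List[B]
    static <A, B> List<B> map(List<A> input, Function<A, B> f) {
        List<B> out = new ArrayList<>();
        for (A a : input) out.add(f.apply(a));
        return out;
    }

    // reduce[B, C](List[B], C, (C, B) → C), written here as a left fold
    // that returns one C (an assumption; the slide shows List[C]).
    static <B, C> C reduce(List<B> input, C zero, BiFunction<C, B, C> f) {
        C acc = zero;
        for (B b : input) acc = f.apply(acc, b);
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> lengths = map(List.of("foo", "bar", "quux"), String::length);
        int total = reduce(lengths, 0, Integer::sum);
        System.out.println(lengths + " -> " + total); // prints [3, 3, 4] -> 10
    }
}
```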
  • 49. Terminology
  • 50. [Diagram: terminology. The input at the Master Node is labelled Task, the combined counts foo: 1, bar: 4, baz: 2, quux: 1 are labelled Result, and the work on each of the two nodes is labelled Job. One Job covers "foo bar", "bar baz"; the other covers "quux bar", "baz bar".]
  • 51. [Diagram: the same Task and Result with two nodes, each running two Jobs of one input line each.]
  • 52. [Diagram: the same Task and Result with four nodes, each running one Job of one input line.]
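The Task, Job and Result vocabulary from slides 50 to 52 can be mimicked on one machine with a thread pool standing in for the nodes. This is a rough sketch under that assumption and does not use GridGain's API; `TaskSketch`, `job` and `runTask` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: threads play the role of grid nodes.
class TaskSketch {
    // A Job: count the words in one input line.
    static Callable<Integer> job(String line) {
        return () -> line.trim().isEmpty() ? 0 : line.trim().split("\\s+").length;
    }

    // The Task: split the input into one Job per line, run the Jobs on
    // the "nodes", then reduce the Job results into a single Result.
    static int runTask(List<String> lines, int nodes) {
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (String line : lines) pending.add(pool.submit(job(line)));
            int result = 0;
            for (Future<Integer> f : pending) result += f.get();
            return result;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Four Jobs on two "nodes", as on slide 51; prints 8
        System.out.println(runTask(List.of("foo bar", "bar baz", "quux bar", "baz bar"), 2));
    }
}
```

Changing the `nodes` argument from 2 to 4 corresponds to the move from slide 51 to slide 52: the Jobs stay the same and only their placement changes.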
  • 53. What defines a grid?
  • 54. [Diagram: two groups of three nodes each. One group is labelled IP MCast: 228.1.2.4, the other IP MCast: 228.1.2.5.]
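Slide 54 suggests that nodes sharing an IP multicast address form one grid. The sketch below only illustrates that grouping idea with the two addresses from the slide; it opens no sockets and is not GridGain's discovery mechanism. The class name `GridMembership` and the node names are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: nodes configured with the same multicast group form one grid.
class GridMembership {
    // Groups node names by the multicast address each node is configured with.
    static Map<String, List<String>> grids(Map<String, String> nodeToGroup) {
        Map<String, List<String>> grids = new TreeMap<>();
        for (Map.Entry<String, String> e : nodeToGroup.entrySet()) {
            try {
                // A literal IP address is parsed locally, without a DNS lookup.
                InetAddress group = InetAddress.getByName(e.getValue());
                if (!group.isMulticastAddress()) {
                    throw new IllegalArgumentException(e.getValue() + " is not a multicast address");
                }
            } catch (UnknownHostException ex) {
                throw new IllegalArgumentException("bad address: " + e.getValue(), ex);
            }
            grids.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return grids;
    }

    public static void main(String[] args) {
        Map<String, String> config = new TreeMap<>();
        config.put("node-a", "228.1.2.4");
        config.put("node-b", "228.1.2.4");
        config.put("node-c", "228.1.2.5");
        // prints {228.1.2.4=[node-a, node-b], 228.1.2.5=[node-c]}
        System.out.println(grids(config));
    }
}
```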
  • 55. Failover
  • 56. [Diagram: a Master Node with the Task and four nodes, each running one Job ("foo bar", "bar baz", "quux bar", "baz bar").]
  • 57. [Diagram: the same grid with one node marked X (failed).]
  • 58. [Diagram: the Job "foo bar" moves from the failed node to another node.]
  • 59. [Diagram: two nodes are marked X; the Jobs "foo bar" and "bar baz" are shown on the remaining nodes alongside "quux bar" and "baz bar".]
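The failover behaviour pictured on slides 56 to 59, where a Job from a failed node is rerun elsewhere, can be sketched as a simple "try the next node" loop. This is an illustration under stated assumptions, not how GridGain implements failover; `FailoverSketch` and `runWithFailover` are hypothetical names, and a node is modelled as a function that throws when it is down.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: a node is a function that may throw if the node is down.
class FailoverSketch {
    // Tries the job on each node in turn and returns the first successful result.
    static <T, R> R runWithFailover(T job, List<Function<T, R>> nodes) {
        RuntimeException last = null;
        for (Function<T, R> node : nodes) {
            try {
                return node.apply(job);
            } catch (RuntimeException e) {
                last = e; // this node failed; move the job to the next node
            }
        }
        throw new IllegalStateException("no node could run the job", last);
    }

    public static void main(String[] args) {
        Function<String, Integer> dead = line -> { throw new RuntimeException("node down"); };
        Function<String, Integer> healthy = line -> line.split(" ").length;
        // The first node fails, so the job runs on the second; prints 2
        System.out.println(runWithFailover("foo bar", List.of(dead, healthy)));
    }
}
```

Rerunning a Job this way is only safe when the Job is independent of other Jobs and has no side effects, which matches the conditions listed on slide 38.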
  • 60. Task execution
  • 61. http://www.gridgain.com/javadoc/org/gridgain/grid/GridTask.html
  • 62. GridGain demo
  • 63. The good, the bad, the ugly
  • 64. Just works, fast, easy, extensible, scalable
  • 65. Error messages, doco, code quality, coupling, odd APIs, management overview
  • 66. Nomenclature, JMS?
  • 67. References • http://wiki.workingmouse.com/ • http://www.gridgain.com/ • http://labs.google.com/papers/mapreduce.html
