
Off the Grid

From tomjadams, 5 months ago

Grid computing is a form of distributed computing that is increasi…

CC Attribution-NonCommercial-ShareAlike License
Slideshow transcript

Slide 1: Off the Grid: Introduction to Grid Computing with GridGain. QJUG, February 2007. Tom Adams, Nick Partridge. Workingmouse, Veitch Lister Consulting.

Slide 2: Why are we here?

Slide 3: Large distributed application

Slide 4: Grid-based solution worked

Slide 5: Flow

Slide 6: Grid?
• Multiple independent computing clusters which act like a "grid" (Wikipedia)
• Many nodes, each node is indistinguishable from other nodes
• Complete machines over co-located CPUs?
• Multiple processes?
• Commodity hardware?
• Homogeneous machines?

Slide 7: A tale of two grids

Slide 8: Partition data across grid

Slide 9: Partition processing across grid

Slide 10: http://www.jroller.com/nivanov/entry/grid_computing_compute_grid_data

Slide 11: Selection

Slide 12: Requirements
• Callable from a Rails webapp
• Real-time: synchronous responses in under 30 seconds
• Large dataset: 100 GB (computation runs across all data)

Slide 13: Rails webapp
• Simple document-literal web service
• Ruby: soap4r
• Java: GlassFish, Spring-WS
• Not really interesting for this talk... see Brisbane.rb

Slide 14: Data
• Read-only
• Full control
• 45 TB (became 100 GB with pre-processing)
• SQL? 3 tables, one query with 2 joins

Slide 15: Don’t want to roll our own

Slide 16: (Row) database good enough

Slide 17: And we can federate them

Slide 18: Result?

Slide 19: http://battellemedia.com/archives/2007_01.php

Slide 20: What about BigTable?

Slide 21: Column database

Slide 22: Result?

Slide 23: http://failblog.wordpress.com/2008/01/29/satellite/

Slide 24: Where are we?

Slide 25: Progress
• Don’t need to distribute data, so no data grid
• No off-the-shelf solutions that scale/go fast
• Understand data better, so happy to roll our own as fallback

Slide 26: Data solution

Slide 27: Data
• CSV files on filesystem (now binary)
• Directories form indices
• Data files broken up into chunks
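One way to picture "directories form indices" with chunked data files is a lookup that turns index keys into directory levels and a chunk number into a file name. This is only a sketch: the key values, the `chunk-NNNN.bin` naming, and the class name `ChunkLayoutSketch` are assumptions for illustration, not the layout the talk actually used.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

final class ChunkLayoutSketch {
    // Each index key becomes one directory level below the data root,
    // so "looking up the index" is just resolving a path.
    static Path chunkPath(Path root, List<String> keys, int chunk) {
        Path dir = root;
        for (String key : keys) {
            dir = dir.resolve(key);
        }
        // Hypothetical file naming: one binary file per chunk.
        return dir.resolve(String.format("chunk-%04d.bin", chunk));
    }

    public static void main(String[] args) {
        // Hypothetical keys; the real index keys are not given in the slides.
        System.out.println(chunkPath(Paths.get("data"), List.of("2007", "brisbane"), 3));
    }
}
```

Because each chunk is a separate file, independent jobs can each read a different chunk without coordinating with one another.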

Slide 28: What about the code? http://giapet.net/wp-content/uploads/2007/05/luluwtf.gif

Slide 29: Need to distribute the computation

Slide 30: Options?

Slide 31: Erlang

Slide 32: Scala

Slide 33: Java

Slide 34: Java frameworks
• Hadoop
• GridGain
• Oracle Coherence
• GigaSpaces
• Terracotta
• JavaSpaces/Jini
• Shoal

Slide 35: GridGain

Slide 36: GridGain
• “fully open source full-stack grid computing platform for Java”
• Map/reduce-based computation
• Easy to set up and use
• Can be extended via SPI implementations
• Just works
• “Scalable” (we’ve had it up to 32 nodes)

Slide 37: Map/reduce

Slide 38: When does it work
• When data is independent (pure/referentially transparent)
• When data can be combined (reduce) based solely on input

Slide 39: [Diagram: word count. The input lines "foo bar", "bar baz", "quux bar", "baz bar" are split into words; map turns each word into word:1; reduce sums them to foo: 1, bar: 4, baz: 2, quux: 1]
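The word-count example on slide 39 can be sketched in plain Java on a single machine. The class and method names here are made up for illustration; nothing in this block uses GridGain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class WordCountSketch {
    // Split: break one input line into words.
    static List<String> split(String line) {
        return Arrays.asList(line.trim().split("\\s+"));
    }

    // Map: turn one word into a (word, 1) pair.
    static Map.Entry<String, Integer> map(String word) {
        return Map.entry(word, 1);
    }

    // Reduce: sum the 1s per word, keeping first-seen order.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Full pipeline: split every line, map every word, reduce all pairs.
    static Map<String, Integer> count(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : split(line)) {
                pairs.add(map(word));
            }
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        // prints {foo=1, bar=4, baz=2, quux=1}
        System.out.println(count(List.of("foo bar", "bar baz", "quux bar", "baz bar")));
    }
}
```

The map step looks at one word at a time and the reduce step depends only on the pairs it is handed, which is what makes the work easy to spread across machines.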

Slide 40: GridGain grid

Slide 41: [Diagram: the input lines "foo bar", "bar baz", "quux bar", "baz bar" go into the Grid, which returns foo: 1, bar: 4, baz: 2, quux: 1]

Slide 42: [Diagram: same input and result, with the inside of the grid marked "?". Two Nodes each count half of the input: "foo bar", "bar baz" gives foo: 1, bar: 2, baz: 1; "quux bar", "baz bar" gives quux: 1, bar: 2, baz: 1]

Slide 43: [Diagram: as on slide 42, with a Master Node in place of the "?", sitting between the input/result and the two Nodes]

Slide 44: [Diagram: two Master Nodes, each with its own input and its own partial result (foo: 1, bar: 2, baz: 1 and bar: 2, baz: 1, quux: 1), sharing the same two worker Nodes]

Slide 45: Did you say map/reduce?

Slide 46: [Diagram: as on slide 43, with "map" labelled on the two worker Nodes and "reduce" labelled on the Master Node]

Slide 47: Show me the types!

Slide 48: [Diagram: as on slide 46, annotated with types. On the worker Nodes: map[A, B](List[A], A → B) → List[B]. On the Master Node: reduce[B, C](List[B], C, (C, B) → C) → List[C]]
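The two signatures on slide 48 translate fairly directly into Java generics. This is a sketch under one stated assumption: `reduce` is written here as a conventional left fold that returns a single `C`, whereas the slide writes its result as `List[C]`. The class name is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

final class MapReduceTypes {
    // map[A, B](List[A], A -> B) -> List[B]
    static <A, B> List<B> map(List<A> input, Function<A, B> f) {
        List<B> out = new ArrayList<>(input.size());
        for (A a : input) {
            out.add(f.apply(a));
        }
        return out;
    }

    // reduce[B, C](List[B], C, (C, B) -> C), written as a left fold
    // that returns one C (an assumption; the slide shows List[C]).
    static <B, C> C reduce(List<B> input, C zero, BiFunction<C, B, C> combine) {
        C acc = zero;
        for (B b : input) {
            acc = combine.apply(acc, b);
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> lengths = map(List.of("foo", "bar", "quux"), String::length);
        Integer total = reduce(lengths, 0, Integer::sum);
        System.out.println(lengths + " -> " + total); // prints [3, 3, 4] -> 10
    }
}
```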

Slide 49: Terminology

Slide 50: [Diagram: terminology. The Master Node holds the Task and its Result. The Task is split into two Jobs, one per worker Node; each Job processes two input lines and returns partial counts]

Slide 51: [Diagram: the Task is split into four Jobs, one per input line, with two Jobs on each of the two Nodes]

Slide 52: [Diagram: the same four Jobs spread across four Nodes, one Job per Node]
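The Task/Job/Result terminology can be illustrated without a grid at all. In the sketch below a fixed thread pool from `java.util.concurrent` stands in for the worker nodes, each job counts the words in its slice of the input, and the task merges the partial results. This is not GridGain's API; the class and method names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class TaskJobSketch {
    // A job counts the words in its own slice of the input (the "map" side).
    static Callable<Map<String, Integer>> job(List<String> lines) {
        return () -> {
            Map<String, Integer> counts = new HashMap<>();
            for (String line : lines) {
                for (String word : line.trim().split("\\s+")) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
            return counts;
        };
    }

    // The task splits the input into jobs, runs them on a pool of threads
    // standing in for nodes, and merges the partial results (the "reduce" side).
    static Map<String, Integer> runTask(List<String> lines, int nodes, int linesPerJob) {
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        try {
            List<Future<Map<String, Integer>>> partials = new ArrayList<>();
            for (int i = 0; i < lines.size(); i += linesPerJob) {
                List<String> slice = lines.subList(i, Math.min(i + linesPerJob, lines.size()));
                partials.add(pool.submit(job(slice)));
            }
            Map<String, Integer> result = new HashMap<>();
            for (Future<Map<String, Integer>> partial : partials) {
                partial.get().forEach((word, n) -> result.merge(word, n, Integer::sum));
            }
            return result;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("job failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `runTask(lines, 2, 2)` corresponds to slide 50 (two jobs of two lines each), while `runTask(lines, 4, 1)` corresponds to slide 52 (four jobs on four nodes); the merged result is the same either way.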

Slide 53: What defines a grid?

Slide 54: [Diagram: six Nodes split between two IP multicast groups, 228.1.2.4 and 228.1.2.5; each multicast group forms its own grid]
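Slide 54 suggests that the multicast group a node is configured with decides which grid it belongs to. A minimal sketch of that idea follows, with no networking involved: it only checks that each address is a multicast address and groups node names by it. The node names and the class name are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class GridMembershipSketch {
    // Groups node names by their configured multicast group address.
    // Nodes that share a group are treated as members of the same grid.
    static Map<String, List<String>> gridsByGroup(Map<String, String> nodeToGroup) {
        Map<String, List<String>> grids = new TreeMap<>();
        // Iterate in sorted node order so the output is deterministic.
        for (Map.Entry<String, String> e : new TreeMap<>(nodeToGroup).entrySet()) {
            String group = e.getValue();
            try {
                // For a literal IP address this parses it without a DNS lookup.
                if (!InetAddress.getByName(group).isMulticastAddress()) {
                    throw new IllegalArgumentException(group + " is not a multicast address");
                }
            } catch (UnknownHostException ex) {
                throw new IllegalArgumentException("bad address: " + group, ex);
            }
            grids.computeIfAbsent(group, g -> new ArrayList<>()).add(e.getKey());
        }
        return grids;
    }
}
```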

Slide 55: Failover

Slide 56: [Diagram: the Task on the Master Node runs as four Jobs, one per input line, on four Nodes]

Slide 57: [Diagram: one of the four Nodes fails (marked X)]

Slide 58: [Diagram: the Job from the failed Node ("foo bar") moves to another Node]

Slide 59: [Diagram: a second Node fails (two Nodes marked X); all four Jobs now run on the two remaining Nodes]
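The failover sequence on slides 56 to 59 amounts to "if the node running a job dies, run the job on another node". The sketch below simulates that with a hypothetical `Node` class that either runs a job or throws because it is down. It is not how GridGain implements failover, just the idea in miniature.

```java
import java.util.List;

final class FailoverSketch {
    // Hypothetical stand-in for a grid node: it either runs a job or is down.
    static final class Node {
        final String name;
        final boolean alive;

        Node(String name, boolean alive) {
            this.name = name;
            this.alive = alive;
        }

        // The "job" here just counts the words in one input line.
        int run(String line) {
            if (!alive) {
                throw new IllegalStateException(name + " is down");
            }
            return line.trim().split("\\s+").length;
        }
    }

    // Try the job on each node in turn, starting at the preferred one,
    // and move on to the next node whenever the current one fails.
    static int runWithFailover(String line, List<Node> nodes, int preferred) {
        for (int i = 0; i < nodes.size(); i++) {
            Node node = nodes.get((preferred + i) % nodes.size());
            try {
                return node.run(line);
            } catch (IllegalStateException e) {
                // Node failed; fall through and try the next one.
            }
        }
        throw new IllegalStateException("no live node left for job: " + line);
    }
}
```

With two of four nodes down, as on slide 59, a job preferred on a failed node still completes on one of the survivors; only when every node is down does the job itself fail.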

Slide 60: Task execution

Slide 61: http://www.gridgain.com/javadoc/org/gridgain/grid/GridTask.html

Slide 62: GridGain demo

Slide 63: The good, the bad, the ugly

Slide 64: Just works, fast, easy, extensible, scalable

Slide 65: Error messages, doco, code quality, coupling, odd APIs, management overview

Slide 66: Nomenclature, JMS?

Slide 67: References
• http://wiki.workingmouse.com/
• http://www.gridgain.com/
• http://labs.google.com/papers/mapreduce.html