IndicThreads Pune 2012: Grammar of Graphics, a New Approach to Visualization (Karan)

The 7th Annual IndicThreads Pune Conference was held on 14-15 December 2012. http://pune12.indicthreads.com/


  • Computer and network technology has made it trivial to capture, store, manipulate and disseminate ever increasing amounts of data. The amount of information in the world has been growing more than exponentially.
  • The data set is taken from a public source released by the FAA. It contains every commercial flight in the US over several decades (the last year of data is 2008), with many details. Here, we have rolled it up to get monthly counts of CANCELLED flights for two years, a decade apart. These are the standard recommended tables and charts from Excel.
  • The same FAA cancelled-flight data as in the previous note. Note the “cloud” of outliers, with 2008 delays being much worse than 1998 in winter.
  • Don’t describe charts by type (bar chart, line chart, histogram, etc.) but by mapping:
      • “bar chart” = basic 2D coordinates, categorical x numeric, displayed with intervals dropped from locations
      • “line chart” = basic 2D coordinates, any x numeric, displayed with lines connecting locations
      • Statistical operations (sum, count), styling (color), etc.
    A grammar-based approach means flexibility: new charts or chart attributes can be added without a new product binary.
      • The field team will be able to build a customer-specific chart.
      • Customers will be able to add that one extra customization they need.
      • Research will be able to rapidly build cutting-edge visualizations.
    Declarative language for visualizations (charts, interactivity, events, etc.); x-IBM standard.
  • Draw an analogy: writing individual connectors for each database versus working with SQL. Discuss the reduced cost of working with SQL, standardization, future-proofing and so on. Consider each type of visualization to be the equivalent of a database, and the GoG approach to be the abstraction on which each chart can be based.
  • Charts in the top line are ones that you could get in a traditional graphing package, but they would look different in different environments. Charts in the middle line are ones where you would probably need a specialist solution; each chart would require a different OEM solution. Charts in the bottom line are ones that may not be possible anywhere else without a dedicated graphics programmer.
  • AW chart (top right)
  • Extensibility
  • Show examples of each feature in RAVE

Transcript

  • 1. Grammar of Graphics: A New Approach to Visualization. Karanbir Singh Gujral, IBM, India Software Labs
  • 2. Why visualization?
    The “Visualization & Data Discovery” market is the fastest growing segment of Business Analytics. Our customers need a solution that provides:
      • High definition (HD) visualizations
      • In-market flexibility: new novel visualizations without a new release
      • Portable (across the full mobile landscape)
      • Scalable
      • Extensible
      • Interactive
  • 3. Data Rich & Information Poor (DRIP)
  • 4. Dealing with DRIP: Visualization
      • The human visual system has evolved over time to spot patterns, outliers and trends.
      • Gain insight by visually assessing data first; perform deeper analysis afterward.
      • Visualization is not just about reporting and “business graphics”.
      • Visualization is the ‘face’ of analysis & knowledge.
      • Visualization is a force multiplier, not a stand-alone technology.
      • “A great visualization is worth a million data points.”
    Figure: Anscombe's Quartet. Diagram labels: Analytics, Visualization.
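Slide 4 cites Anscombe's Quartet as the reason to look at data before analyzing it. A minimal sketch of that point, assuming the standard published quartet values (Anscombe, 1973); the helper names `mean`, `pearson` and `summarize` are illustrative and not taken from the slides:

```python
from math import sqrt

# Anscombe's quartet: four small data sets with near-identical summary statistics.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (X4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize():
    """Return (mean x, mean y, correlation) per data set, rounded to 2 places."""
    return {
        name: (round(mean(xs), 2), round(mean(ys), 2), round(pearson(xs, ys), 2))
        for name, (xs, ys) in QUARTET.items()
    }


if __name__ == "__main__":
    for name, stats in summarize().items():
        print(name, stats)
```

All four sets report the same rounded statistics, yet their scatter plots look nothing alike, which is why the slide recommends visual assessment first.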
  • 5. Visualization by example
      • DATA: Basic functionality of any system for analyzing data is to filter, slice and dice to create the view of the data you want
      • TABLE: Presenting the data in the simplest form
      • CHART: Standard recommendation: to compare two categories of counts, use a clustered bar chart

        Month   Y1998   Y2008
        0       13880   17308
        1       10484   20596
        2        9847   16183
        3        6952   10355
        4        9393    6229
        5       12870   10931
        6        9330   10598
        7       14726    9835
        8       11893    9913
        9        7815    3249
        10       6419    4458
        11       9900   17779
  • 6. Visualization by example
      • Adapt the layout to the data: Months are cyclical; use a polar axis. This allows the user to spot seasonal effects more easily.
      • Bars are not good for comparisons: Change to aligned points. This allows the years to be compared directly.
      • Engage the user: Use a custom symbol appropriate for the domain.
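Slide 6's advice to put cyclical months on a polar axis can be sketched as a coordinate mapping. This is an illustration only, not how any particular toolkit implements it: the counts are copied from the slide 5 table, while the function names and the choice of month 0 at the top with clockwise rotation are assumptions.

```python
import math

# Monthly cancelled-flight counts copied from the slide 5 table (months 0..11).
Y1998 = [13880, 10484, 9847, 6952, 9393, 12870, 9330, 14726, 11893, 7815, 6419, 9900]
Y2008 = [17308, 20596, 16183, 10355, 6229, 10931, 10598, 9835, 9913, 3249, 4458, 17779]


def month_to_angle(month: int) -> float:
    """Angle in radians, measured clockwise from the top; month 12 wraps to month 0."""
    return 2.0 * math.pi * (month % 12) / 12


def polar_point(month: int, count: int, max_count: int) -> tuple:
    """Map (month, count) to x/y: angle encodes the month, radius the scaled count."""
    theta = month_to_angle(month)
    radius = count / max_count
    return (radius * math.sin(theta), radius * math.cos(theta))


# Both years share one radial scale, so their points are directly comparable.
MAX_COUNT = max(Y1998 + Y2008)
points_1998 = [polar_point(m, c, MAX_COUNT) for m, c in enumerate(Y1998)]
points_2008 = [polar_point(m, c, MAX_COUNT) for m, c in enumerate(Y2008)]
```

On a linear axis months 11 and 0 sit at opposite ends; here they are 30 degrees apart, which is what makes a seasonal cluster visible as one group.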
  • 7. Grammar of Graphics
      • Grammar, not types
      • Not a prescribed “Library of Charts”
      • A highly adaptive framework that allows each integrator to quickly create and customize their own library of interactive visualizations
      • Language is flexible enough to describe our known chart types and to describe unknown chart types
    Diagram labels: Visualization “Description” (twice), Common Visualization Framework, Platform native visualizations.
  • 8. Old Way: Charts are Types
      • Fixed set of “supported charts”: if it isn’t in the list, you can’t have it
      • Expensive and slow to innovate: each new chart is a new development effort
      • “Ad hoc” features tightly coupled to type, e.g. “Animation only implemented for Hans Rosling-style bubble charts, not for all charts”
      • Adding a new feature to 20 charts is a large effort
      • Kills creativity
  • 9. New Way: Grammar of Graphics
    A language-based specification of a chart, in terms of features, not “types”, e.g.
      • “bar chart” = basic 2D coordinates, categorical x numeric, displayed with intervals dropped from locations
      • “line chart” = basic 2D coordinates, any x numeric, displayed with lines connecting locations
      • “histogram” = basic 2D coordinates, numeric x statistic binned counts, displayed with
    An orthogonal set of features describes all common charts, virtually all uncommon charts, and most cutting-edge research charts.
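The "features, not types" idea on slide 9 can be illustrated with plain data structures. This is a sketch under assumptions: `ChartSpec`, its field names and `differing_features` are hypothetical and do not come from any toolkit named in the talk; only the bar and line descriptions are taken from the slide.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class ChartSpec:
    """A chart described by independent features rather than a type name."""
    coordinates: str  # coordinate system, e.g. "basic 2D"
    x: str            # kind of variable on the first dimension
    y: str            # kind of variable on the second dimension
    element: str      # how each data location is drawn


# The two descriptions from slide 9, written as feature values.
BAR = ChartSpec("basic 2D", "categorical", "numeric", "interval")
LINE = ChartSpec("basic 2D", "any", "numeric", "line")


def differing_features(a: ChartSpec, b: ChartSpec) -> dict:
    """Return {feature: (a_value, b_value)} for every feature that differs."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(ChartSpec)
        if getattr(a, f.name) != getattr(b, f.name)
    }


# A "new chart" is one changed feature away; no new chart type has to be defined.
DOT_PLOT = replace(BAR, element="point")
```

Comparing `BAR` and `LINE` shows they share coordinates and the numeric dimension and differ only in the x kind and the element, which is the orthogonality the slide describes.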
  • 10. Art of the possible
  • 11. Visual Analytics
  • 12. Switching chart types
  • 13. Workload Analytics
  • 14. Live Social Analytics Extensible
  • 15. Where is it available?
      • Books: Grammar of Graphics by Leland Wilkinson
      • Open Source: JavaScript libraries Protovis and D3; ggplot2 in R (statistical computing); Bokeh in Python
      • Commercial: IBM RAVE (Rapidly Adaptive Visualization Engine); Tableau software
  • 16. GoG: Composable Set of Chart Features
      • Element Type: point, line, area, interval (bar), polygon, schema, text. Each element can be used with any data (numeric, category, time …). As many elements on a chart as you like.
      • Guides: simple axis, nested axis, facet axis, legend.
      • Aesthetics: map data to graphic attributes. Works on all elements. Color (exterior, interior, works with gradients); size (width/height/both); symbol; dashing, general styles; label, tooltip, meta.
      • Faceting: chart-in-chart, paneling.
      • Coordinates: any number of dimensions can be defined, with a chain of transformations. Clustering and stacking, polar, transpose, map projections.
      • Layouts: graph layouts (network, treelike), treemaps, custom layouts.
    This CFO Dashboard Visualization uses chart-in-chart faceting:
      • The outer chart uses a graph layout, with an integrator-designed schema element for the nodes and standard edge element links. The schema element has multiple parts, and five different aesthetics set symbol type and color for each part.
      • The inner chart uses an interval element with 2D coordinates and two axes: a standard bar chart.
  • 17. The Grammar of a Bar Chart
    This is the complete grammar (VizJSON):

        "grammar": [ {
          "coordinates": {
            "dimensions": [ {"axis": {}}, {"axis": {}} ],
            "transforms": [ {"type": "transpose"} ]
          },
          "elements": [ {
            "type": "interval",
            "position": [ {"field": {"$ref": "pop2010"}},
                          {"field": {"$ref": "name"}} ],
            "color": [ {"field": {"$ref": "pop1960"}} ],
            "style": {"stroke": {"width": 0.25}}
          } ],
          "style": {"fill": "#bbf", "padding": 5}
        } ]

      • Coordinates: 2D, chart flipped (transpose), so bars run horizontally
      • Position (how we place elements in the coordinate system): shows state names by current population
      • Guides: "axis" for both x and y dimensions
      • Aesthetics (how to color it): color uses the population data for 1960
      • Element Types (go inside the data area): uses a single interval (e.g. bar)
      • Layouts, Faceting: none
      • Style for a thin border (e.g. 0.25 width)
      • pop2010, pop1960, name are references to parts of the data
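Slide 17 notes that pop2010, pop1960 and name are references to parts of the data. One way to picture how such `$ref` entries might be bound to data rows is sketched below. The binding logic, the function names and the two data rows are assumptions for illustration (the row values are made-up placeholders, not real population figures); this is not RAVE's actual implementation.

```python
# The interval element from slide 17, as a Python dict.
ELEMENT = {
    "type": "interval",
    "position": [
        {"field": {"$ref": "pop2010"}},
        {"field": {"$ref": "name"}},
    ],
    "color": [{"field": {"$ref": "pop1960"}}],
    "style": {"stroke": {"width": 0.25}},
}

# Placeholder rows: hypothetical values chosen only to make the example run.
ROWS = [
    {"name": "State A", "pop1960": 100, "pop2010": 250},
    {"name": "State B", "pop1960": 300, "pop2010": 200},
]


def resolve(field: dict, row: dict):
    """Look up the column named by a {"$ref": ...} entry in one data row."""
    return row[field["$ref"]]


def bind(element: dict, rows: list) -> list:
    """Produce one drawable mark per row, with every reference replaced by data."""
    marks = []
    for row in rows:
        marks.append({
            "type": element["type"],
            "position": [resolve(p["field"], row) for p in element["position"]],
            "color": [resolve(c["field"], row) for c in element.get("color", [])],
        })
    return marks


if __name__ == "__main__":
    for mark in bind(ELEMENT, ROWS):
        print(mark)
```

Because the element only names columns, the same specification works unchanged on any data set that has those three fields.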
  • 18. Simple Changes: Power of Composition
    Add a position field to make it a range chart with start at 1960, end at 2010.
    Before:

        {
          "type": "interval",
          "position": [
            {"field": {"$ref": "pop2010"}},
            {"field": {"$ref": "name"}} ],

    After:

        {
          "type": "interval",
          "position": [
            {"field": {"$ref": "pop1960"}},
            {"field": {"$ref": "pop2010"}},
            {"field": {"$ref":
  • 19. Simple Changes: Power of Composition
    Add a point element for 1980 populations.

        {
          "type": "interval",
          "position": [
            {"field": {"$ref": "pop1960"}},
            {"field": {"$ref": "pop2010"}},
            {"field": {"$ref": "name"}} ],
          "color": [ {"field": {"$ref": "pop1960"}} ],
          "style": {"stroke": {"width": 0.25}}
        },
        {
          "type": "point",
          "position": [
            {"field": {"$ref":
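The two composition steps on slides 18 and 19 are small edits to the specification, which can be sketched as functions over the element dicts. The helper names are hypothetical, and because the point element is cut off on the slide, the field name `pop1980` and its position layout are assumptions.

```python
import copy

# The bar element from slide 18 ("Before").
BAR = {
    "type": "interval",
    "position": [
        {"field": {"$ref": "pop2010"}},
        {"field": {"$ref": "name"}},
    ],
}


def add_range_start(element: dict, start_field: str) -> dict:
    """Return a copy with an extra leading position field, turning a bar into a range."""
    changed = copy.deepcopy(element)
    changed["position"].insert(0, {"field": {"$ref": start_field}})
    return changed


def add_point_layer(elements: list, position_fields: list) -> list:
    """Return a new element list with a point element appended as another layer."""
    point = {
        "type": "point",
        "position": [{"field": {"$ref": name}} for name in position_fields],
    }
    return list(elements) + [point]


def refs(element: dict) -> list:
    """List the field names an element's position refers to, in order."""
    return [p["field"]["$ref"] for p in element["position"]]


# Slide 18: range chart from 1960 to 2010.
RANGE = add_range_start(BAR, "pop1960")
# Slide 19: overlay points; "pop1980" is an assumed field name.
LAYERS = add_point_layer([RANGE], ["pop1980", "name"])
```

Neither step touches coordinates, guides or styles, so each change stays local to the one feature it concerns.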
  • 20. Maps are just another element

        {
          "coordinates": {
            "dimensions": [ {}, {} ],
            "transforms": [ {
              "type": "projection",
              "projectionParams": {"name": "mercator"}
            } ]
          },
          "elements": [ {
            "type": "polygon",
            "label": [ {"content": [ {"$ref": "abbr"} ]} ],
            "color": [ {"field": {"$ref": "pop2010"}} ],
            "style": {"stroke": {"width": 0.25}}
          } ],
          "style": {"fill": "#bbf", "padding": 5}
        }

      • Use a projection type of transform, specifically with a Mercator coordinate system.
      • Data set already has the geographic
      • Aesthetics, labels, style are all as usual.
      • Map layers correspond well to elements.
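The "projection" transform on slide 20 names a Mercator coordinate system. As a reminder of what that transform computes, here is a minimal sketch of the standard spherical Mercator formula; the function name and the unit-sphere default are illustrative choices, and the slides do not show how the engine itself implements the projection.

```python
import math


def mercator(lon_deg: float, lat_deg: float, radius: float = 1.0) -> tuple:
    """Project longitude/latitude in degrees to planar x/y (spherical Mercator).

    x = R * lambda
    y = R * ln(tan(pi/4 + phi/2))
    Latitudes at the poles (+/-90 degrees) are not representable.
    """
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return (x, y)


if __name__ == "__main__":
    for lat in (0, 30, 60):
        print(lat, mercator(0, lat))
```

Because the projection is just one step in the coordinate transform chain, the polygon element, labels and color aesthetic in the specification stay exactly as they would for any other chart.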
  • 21. Demo: Showcase each feature: Guides, Aesthetics, Element Type, Faceting, Layouts, Coordinates
  • 22. Anything is possible. Rangoli built with a GoG toolkit - Nitin Chaturvedi
  • 23. Thank You. Slide authors: Greg Adams (IBM), Graham Wills (IBM), Karan Gujral (IBM)