
# Visual Intelligence @ IBS, Hyderabad

An introduction to the basics of Visual Intelligence with an illustration of Industry case studies.

Published in: Technology

• Let’s take a small test. We’ll show a table of numbers on the screen and ask 3 questions about those numbers. You have 30 seconds to answer them. You can write down the answers or just remember them – there’s no need to say them aloud.

• What answers did you get?

How many numbers were above 100?
How many were below 10?

[Typically, there will be a lot of variance in these answers]
So there’s considerable variation in the answers you get.

Now, let’s do the same exercise again, but with some extremely simple highlighting. The questions are the same. You have 30 seconds. This time, you can say the answers aloud if you like.
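The two counting questions can also be checked mechanically. The following is a minimal sketch, assuming the 160 numbers as transcribed from the slide table in this deck; the helper names (`count_where`, `highlight`) and the 16-per-row layout are hypothetical, since the slide's actual grid is not recoverable from the transcript (which is also why the quadrant question is left out). The bracket markers are only a plain-text stand-in for the kind of simple highlighting described above.

```python
# Numbers transcribed from the slide table (160 values).
NUMBERS = [
    23, 32, 71, 72, 58, 87, 11, 77, 70, 16, 17, 21, 56, 44, 68, 51,
    84, 20, 60, 40, 37, 8, 107, 14, 12, 41, 69, 14, 18, 71, 62, 55,
    59, 64, 33, 55, 71, 58, 103, 92, 101, 56, 45, 34, 43, 15, 73, 78,
    6, 93, 39, 53, 22, 26, 26, 94, 60, 82, 99, 74, 11, 12, 36, 67,
    70, 71, 97, 59, 73, 99, 75, 74, 69, 69, 51, 48, 2, 66, 92, 98,
    15, 10, 41, 58, 104, 94, 92, 84, 74, 82, 12, 52, 10, 57, 33, 77,
    88, 81, 81, 91, 15, 56, 25, 30, 21, 7, 66, 66, 78, 87, 29, 23,
    5, 34, 11, 96, 74, 99, 99, 88, 37, 10, 43, 15, 50, 71, 65, 60,
    101, 98, 46, 34, 19, 102, 57, 70, 95, 84, 63, 91, 3, 34, 39, 37,
    60, 81, 65, 63, 9, 71, 48, 46, 25, 50, 22, 64, 91, 76, 71, 79,
]


def count_where(values, predicate):
    """Count the values for which predicate(value) is true."""
    return sum(1 for v in values if predicate(v))


def highlight(values, predicate, per_row=16):
    """Render values as text rows, wrapping matches in [brackets].

    per_row is an assumed layout, not the slide's real grid.
    """
    cells = [f"[{v}]" if predicate(v) else str(v) for v in values]
    rows = [cells[i:i + per_row] for i in range(0, len(cells), per_row)]
    return "\n".join(" ".join(f"{c:>5}" for c in row) for row in rows)


above_100 = count_where(NUMBERS, lambda v: v > 100)
below_10 = count_where(NUMBERS, lambda v: v < 10)
print(above_100, below_10)  # prints: 6 7
print(highlight(NUMBERS, lambda v: v > 100))
```

Even in plain text, the bracketed values stand out from the rest of the grid, which is the point of the second round of the exercise.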

• You’ll now see a series of examples of the work Gramener has done with its clients.

All of these examples, every visual that you will see from now on (including the ones in PowerPoint) were directly created as output from the Gramener visualisation server.
• We were also interested in applying these rich visualisations to sports. One question we had was, for example, “Who’s the fastest one day international player?”

The trouble with that is, depending on when you measure it and how you measure it, the results could be very different. For example, if we take strike rate as a metric, it turned out (when we did it) that it was a South African who had the highest strike rate – of 200%. He played one match, hit a four, and got out the next ball.

Clearly, that’s not what we’re looking for. We could, perhaps, take a minimum number of runs as a cut-off. But the question is, what should that be? 100? 1000? 5000? Where does one draw the line, and why is that the right one? If you don’t know the domain, answering this is difficult.

Like with the contract farming example before, we need a way of looking at performance combined with scale or importance.
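To make the cut-off problem concrete, here is a minimal sketch of the strike-rate calculation (runs per 100 balls faced) with an optional minimum-runs filter. The two players and their figures are made up for illustration, apart from the "one four, out next ball" case described above; the names `Batsman`, `strike_rate` and `fastest` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Batsman:
    name: str
    runs: int
    balls: int


def strike_rate(runs, balls):
    """Runs scored per 100 balls faced (0.0 if no balls were faced)."""
    if balls == 0:
        return 0.0
    return 100.0 * runs / balls


def fastest(players, min_runs=0):
    """Highest strike rate among players with at least min_runs runs."""
    eligible = [p for p in players if p.runs >= min_runs]
    return max(eligible, key=lambda p: strike_rate(p.runs, p.balls), default=None)


# Illustrative data only: one match, a four, then out the next ball,
# versus a made-up long career.
players = [
    Batsman("One-match player", runs=4, balls=2),
    Batsman("Established player", runs=8000, balls=9000),
]

print(strike_rate(4, 2))                     # prints: 200.0
print(fastest(players).name)                 # One-match player
print(fastest(players, min_runs=1000).name)  # Established player
```

The answer flips depending on `min_runs`, and nothing in the data says which threshold is the right one. That is the motivation for showing performance and scale together instead of filtering.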
• We said, let’s take all of the players who’ve ever played one day internationals. Each box is one player. The size of the rectangle is proportional to the number of runs they’ve scored.

So you can see that Tendulkar has scored the most runs, followed by Ganguly, then Dravid, and so on. The size of the entire visual is representative of the total runs ever scored by India in one-day internationals.

Colour is based on the strike rate. The greener the rectangle, the faster the scoring. You can see that Sehwag has done a fairly good job. So has Yusuf Pathan, one of the smaller green boxes. But given that his box is just about 1/10th the size of Sehwag’s, you could say that Yusuf Pathan has a long way to go before he can be considered on par.

You’ll also find that many of the players who have a lower strike rate – like Ravi Shastri, Dilip Vengsarkar, Sunil Gavaskar, Mohinder Amarnath, etc. – were playing in a different era, a time when a score of 200 was considered a rather good score. Today, 300 would be a respectable target.

It turns out that strike rate increases by around 3.5% every decade. If we adjust for that and re-plot the strike rates, it emerges that Kapil Dev’s adjusted strike rate is almost exactly the same as Sehwag’s, and between them, we have the two fastest players India has had.
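The notes do not say how the era adjustment was done. One plausible reading, sketched below purely as an assumption, is to scale each raw strike rate up to a common reference year by compounding the roughly 3.5% per-decade growth. The function name, the reference year of 2010, the use of a career mid-point year, and the compounding form are all hypothetical choices, and the inputs are made-up numbers, not real career figures.

```python
def era_adjusted_strike_rate(raw, career_mid_year,
                             reference_year=2010, growth_per_decade=0.035):
    """Scale a raw strike rate to reference_year terms.

    Assumes strike rates grow by growth_per_decade (compounded) per decade;
    this is one possible reading of the 3.5% figure, not a confirmed method.
    """
    decades = (reference_year - career_mid_year) / 10.0
    return raw * (1.0 + growth_per_decade) ** decades


# Made-up inputs: the same raw rate is worth more the earlier it was achieved.
print(round(era_adjusted_strike_rate(100.0, 2010), 2))  # prints: 100.0
print(round(era_adjusted_strike_rate(100.0, 2000), 2))  # prints: 103.5
print(round(era_adjusted_strike_rate(100.0, 1990), 2))
```

A linear adjustment, or one expressed in percentage points, would give somewhat different numbers; the qualitative effect (earlier eras are marked up) is the same.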
• Gramener is a data analytics and visualisation company.

We have the ability to process data at a small and a large scale.
We analyse the data to find non-intuitive insights that lie hidden behind it and present it as a visual story that makes those insights obvious in real time.
• ### Visual Intelligence @ IBS, Hyderabad

1. GANES KESARI, VP DELIVERY. VISUAL INTELLIGENCE. IBS CONFERENCE, HYDERABAD (13 AUG ‘14)
2. AGENDA. Industry Case Studies: Some interesting applications of Visual Intelligence. Why Visualize: What is Data Visualization and what’s the need for it. Q&A: Open discussion
3. A DATA VISUALISATION CHALLENGE… You will see 3 questions. You have 30 seconds. Try it! Your timer starts now
4. HOW MANY NUMBERS ARE ABOVE 100? (Question 1) 23 32 71 72 58 87 11 77 70 16 17 21 56 44 68 51 84 20 60 40 37 8 107 14 12 41 69 14 18 71 62 55 59 64 33 55 71 58 103 92 101 56 45 34 43 15 73 78 6 93 39 53 22 26 26 94 60 82 99 74 11 12 36 67 70 71 97 59 73 99 75 74 69 69 51 48 2 66 92 98 15 10 41 58 104 94 92 84 74 82 12 52 10 57 33 77 88 81 81 91 15 56 25 30 21 7 66 66 78 87 29 23 5 34 11 96 74 99 99 88 37 10 43 15 50 71 65 60 101 98 46 34 19 102 57 70 95 84 63 91 3 34 39 37 60 81 65 63 9 71 48 46 25 50 22 64 91 76 71 79
5. HOW MANY NUMBERS ARE BELOW 10? (Question 2) [Same table of numbers as slide 4]
6. WHICH QUADRANT HAS THE HIGHEST TOTAL? (Question 3) [Same table of numbers as slide 4]
7. A DATA VISUALISATION CHALLENGE… We’ll answer the same questions again. But with simple visual cues. See how long it takes. Your timer starts now
8. HOW MANY NUMBERS ARE ABOVE 100? (Question 1) [Same table of numbers as slide 4]
9. HOW MANY NUMBERS ARE BELOW 10? (Question 2) [Same table of numbers as slide 4]
10. WHICH QUADRANT HAS THE HIGHEST TOTAL? (Question 3) [Same table of numbers as slide 4]
11. “Most discussions of decision-making assume that only senior executives make decisions or that only senior executives’ decisions matter. This is a dangerous mistake…” - Peter F. Drucker. “It's clearly a budget! Has a lot of numbers in it!” - George W. Bush
12. Information is the oil of the 21st century, and analytics is the combustion engine -- Gartner. Business analytics software grew 14% in 2011 and will hit \$50.7 bn by 2016 -- IDC. … the #1 trend is applying information & analytics to solve business problems -- Deloitte. MARKET. [Six trend charts, each with a 2000–2011 axis.] Transaction data: increasing data being churned out by systems in the information highway; Social network data: consumers embracing Web 2.0 and the social media lifestyle; M2M data: portable devices generating data for consumption by systems; Storage cost: material science research has led to a significant increase in data density; Bandwidth cost: driven by massive investments in fibre capacity; Processing cost: Moore’s law has doubled the processing power per \$ every 1.5 yrs. The volume of available data, and the potential for exploiting it, have grown exponentially in the last 10 years. This changing data landscape heralds a radical shift in the approach to business decision-making, even for mere survival in this new age. Data growth to 7.9 ZB by 2015, posing a real ‘Data Tsunami’. Gartner’s BI Magic Quadrant Trends: • Emergence of data discovery/visualization • Increased willingness for new low-cost options • Embedded low-cost purpose-built analytic apps • Need for intuitive BI tools on mobile platforms
13. THE VISUAL INTELLIGENCE MARKET. Business Analytics software market is slated to grow at 10% annually and hit \$50.7 Bn by 2016. By 2015: • Big Data market would be \$16.9 Bn • Data Discovery market would be \$1.6 Bn. “Today, only strategic decisions are made at a rate slower than the speed of business. Tactical and operational decisions must increasingly be at a rate faster than humans are capable of.” “…challenge of mastering big data analytics is the hardest, because big data technologies are not ready for enterprise demands, & skills to work with big data are scarce” “Data discovery and visualization vendor initiatives, and the rapid adoption from end users, have the potential to shake the BI market to its foundations.” “Visualization would be the platform on which Big Data consumption would happen”
14. WHO USES DATA VISUALISATION? Internet giants: Google, LinkedIn, …; Newspapers: New York Times, The Guardian, …; TV Channels: CNN, BBC, …; Sports: NFL, NBA, …; Banks: World Bank, Citibank, …; Retailers: Amazon, Tesco, …; Manufacturers: Coke, Airbus, …; Government: US Government, UK Government, …; Telecom: Airtel, BT, …; Science: Molecule imaging, Star formation, …; Healthcare: GE Healthcare, Detroit Medical C., …; and pretty much anyone, today…
15. CASE STUDIES
16. EDUCATION: PREDICTING MARKS. What determines a child’s marks? Do girls score better than boys? Does the choice of subject matter? Does the medium of instruction matter? Does community or religion matter? Does their birthday matter? Does the first letter of their name matter?
17. TN CLASS X: ENGLISH [Chart; axis ticks run 0–40,000 in steps of 5,000 and 0–100 in steps of 5]
18. TN CLASS X: SOCIAL SCIENCE [Chart; axis ticks run 0–40,000 in steps of 5,000 and 0–100 in steps of 5]
19. TN CLASS X: LANGUAGE [Chart; axis ticks run 0–40,000 in steps of 5,000 and 0–100 in steps of 5]
20. TN CLASS X: SCIENCE [Chart; axis ticks run 0–40,000 in steps of 5,000 and 0–100 in steps of 5]
21. TN CLASS X: MATHEMATICS [Chart; axis ticks run 0–40,000 in steps of 5,000 and 0–100 in steps of 5]
22. CBSE 2013 CLASS XII: ENGLISH MARKS
23. Based on the results of the 20 lakh students taking the Class XII exams in Tamil Nadu over the last 3 years, it appears that the month you were born in can make a difference of as much as 120 marks out of 1,200. June-borns score the lowest. The marks shoot up for Aug-borns … and peak for Sep-borns. 120 marks out of 1,200 explainable by month of birth. An identical pattern was observed in 2009 and 2010 … and across districts, gender, subjects, and class X & XII. “It’s simply that in Canada the eligibility cutoff for age-class hockey is January 1. A boy who turns ten on January 2, then, could be playing alongside someone who doesn’t turn ten until the end of the year – and at that age, in preadolescence, a twelve-month gap in age represents an enormous difference in physical maturity.” -- Malcolm Gladwell, Outliers
24. BIRTHDAYS IN THE US AND IN INDIA https://gramener.com/posters/Birthdays.pdf
25. CRICKET: FASTEST SCORERS. “I’ve always been curious… who among India’s prolific one-day run-getters had the best strike rate? Sachin? Sehwag? What about the rest of the world?”
26. INDIA ODI BATTING https://gramener.com/cricket/
27. https://gramener.com/cricket/
28. CARGO DELAY. [Chart legend: Shift: Evening, Morning, Night; Weekday: Fri, Mon, Sat, Sun, Thu, Tue, Wed; Product category: FAH, N70, RPP, TDS, ZDH; Part shipment: <20%, 20-40%, 40-60%, 60-80%, Full.] This visualisation measures the recovery time (time from arrival of the flight until delivery), and identifies which factors most influence the recovery time. Recovery times are neutral during the evening and morning shifts (mornings are slightly worse); night times are the best. Recovery times are worst on Fridays, and best on Saturdays & Wednesdays. Specifically, Friday mornings are particularly bad. So are Thursday mornings. The FAH product category has the best recovery time, while ZDH is much worse. However, RPP on Sundays is unusually slow. Part-shipped products tend to perform worse than full shipments, specifically the <20% and 40-60% part-shipments. This is especially problematic for ZDH. This visualisation is part of a suite of analytical techniques we call “grouped means” that allows us to measure the impact of every parameter (shifts, weekdays, etc.) on any measure of interest – recovery time in this case, but this could be extended to revenue, operational efficiency, or ability to cross-sell. It allows automatic detection of statistically significant flows and highlights only the relevant ones to users. The system therefore analyses all possible patterns, but users only see the insights that matter.
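A rough sketch of the core of a “grouped means” calculation, under stated assumptions: for each parameter, group the records by that parameter's value and compare each group's mean of the measure with the overall mean. The records below are made-up illustrative data, the field names (`shift`, `weekday`, `hours`) and the function `grouped_means` are hypothetical, and the statistical significance filtering mentioned on the slide is not implemented here.

```python
from collections import defaultdict
from statistics import mean


def grouped_means(records, factors, measure):
    """For each factor level, return (group mean - overall mean) of measure."""
    overall = mean(r[measure] for r in records)
    impact = {}
    for factor in factors:
        groups = defaultdict(list)
        for r in records:
            groups[r[factor]].append(r[measure])
        impact[factor] = {
            level: mean(values) - overall for level, values in groups.items()
        }
    return overall, impact


# Made-up records: recovery time in hours per shipment.
records = [
    {"shift": "Night", "weekday": "Fri", "hours": 4.0},
    {"shift": "Morning", "weekday": "Fri", "hours": 8.0},
    {"shift": "Night", "weekday": "Sat", "hours": 2.0},
    {"shift": "Morning", "weekday": "Sat", "hours": 6.0},
]

overall, impact = grouped_means(records, ["shift", "weekday"], "hours")
print(overall)            # prints: 5.0
print(impact["shift"])    # Night is 2.0 below the overall mean, Morning 2.0 above
print(impact["weekday"])  # Fri is 1.0 above the overall mean, Sat 1.0 below
```

A fuller version would also test whether each difference is statistically significant before showing it, and would look at combinations of parameters (for example, shift and weekday together), as the Friday-morning finding suggests.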
29. THE SOCIAL TALE OF TWO CITIES: BANGALORE & SINGAPORE. Recruiting top quality developers is always a problem. We decided to use an algorithmic approach and pulled out the social network of developers on GitHub (a social network for open source code). In this visualisation, each circle is a person. The size of the circle represents the number of followers. Larger circles have more followers (but not in proportion – it’s a log scale). The circle’s colour represents the city the programmer lives in. This visual is a slice showing the tale of two cities: Bangalore and Singapore. Two people are connected if one follows the other. This leads to a clustering of people in the form of a network. Here, you can see that Bangalore and Singapore are each reasonably well connected cities. Bangalore has more developers, but Singapore has more popular ones (larger circles). However, interactions between Bangalore and Singapore are few and far between, but for a few people across both cities, like: Sudar, Yahoo!; Anand C, Consultant; Kiran, Hasgeek; Anand S, Gramener; Mugunth, Steinlogic; Honcheng, buUuk; Sau Sheong, HP Labs; Lim Chee Aung; … etc. [Legend: colour: Bangalore, Singapore; size: 1 follower to 100 followers; link: A follows B (or) B follows A; markers: most followed in Bangalore, most followed in Singapore. Other labelled names: Ciju Cherian, Lin Junjie, Amudhi Sebastian.] There are, of course, a number of smaller independent circles – people who are not connected to others in the same city. (They may be connected to people in other cities.) Apart from this, there are a few small networks of connected people – often people within the same company or start-up – who form a community of their own. https://gramener.com/codersearch/
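As an illustration of the log-scale sizing described on this slide, the sketch below maps a follower count to a circle radius that grows with the logarithm of the count, not in direct proportion. The function name and the `min_radius` and `scale` values are hypothetical; the slide does not state the actual mapping used.

```python
import math


def circle_radius(followers, min_radius=2.0, scale=4.0):
    """Radius that grows with log10(followers); counts below 1 are treated as 1."""
    return min_radius + scale * math.log10(max(followers, 1))


print(circle_radius(1))    # prints: 2.0
print(circle_radius(100))  # prints: 10.0
```

With these assumed parameters, a developer with 100 times the followers gets a circle only 5 times the radius, so very popular accounts stay visible without swamping everyone else.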
30. Can we visualize the results of every single Lok Sabha election since Independence?
31. https://gramener.com/election/parliament
32. On the other hand, which party has the most losses in Lok Sabha elections?
33. https://gramener.com/election/parliament#story.ddp
34. Which constituency had the most number of candidates ever?
35. https://gramener.com/election/parliament#cartogram?YEAR=1996&METRIC=CANDIDATES
36. We handle terabyte-size data via non-traditional analytics and visualise it in real-time. Gramener visualises your data. Gramener transforms your data into concise dashboards that make your business problem & solution visually obvious. We help you find insights quickly, based on cognitive research, and our visualisations guide you towards actionable decisions. A data analytics and visualisation company. www.gramener.com blog.gramener.com
37. FURTHER READING. EDWARD TUFTE: CLASSICS ON VISUALIZATION. STEPHEN FEW: DASHBOARD DESIGN