Choosing a data visualization tool is like being a barista serving coffee: everyone wants their data their way: personalized, fast, and perfect. Many organizations have a cottage industry of data visualization tools, and it's difficult to know which tool to use, and when. Different tools exist in different departments, and if a tool doesn't meet the user requirements, the default position is to go back to Excel and move the data around there.
This session will examine data visualization tools such as SSRS, Excel, Tableau, QlikView, Datazen, Kibana and Power BI, so that you can craft and blend your data visualization tools to serve your data customers better.
23. Why R?
• Most widely used data analysis software - used by 2M+ data scientists, statisticians and analysts
• Most powerful statistical programming language
• Flexible, extensible and comprehensive for productivity
• Create beautiful and unique data visualisations - as seen in the New York Times, Twitter and FlowingData
• Thriving open-source community - leading edge of analytics research
• Fills the talent gap - new graduates prefer R.
24. Growth in Demand
• Rexer Data Mining survey, 2007 - 2013
• R is the highest-paid IT skill (Dice.com, Jan 2014)
• R is the most-used data science language after SQL (O'Reilly, Jan 2014)
• R is used by 70% of data miners (Rexer, Sept 2013)
25. Growth in Demand
• R is #15 of all programming languages (RedMonk, Jan 2014)
• R is growing faster than any other data science language (KDnuggets)
• R is in-memory, so the size of the data that you can process is limited
26. What do I need to install?
• Install R – www.r-project.org
• Install RStudio – www.rstudio.com
• Handy shortcuts:
– Tab – autocomplete of available functions
– Control and Up Arrow – command history
– Control and Enter – executes the current line of code
27. What tools do we have in R?
• 80% of your time will be spent preparing and wrangling data
• The remainder of your time will be spent complaining about it.
• dplyr: the essential data manipulation toolset
• In data wrangling, what are the main tasks?
– Filtering rows
– Selecting columns of data
– Adding new variables
– Sorting
– Aggregating
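To make the five tasks above concrete, here is a minimal sketch in plain Python. It is not dplyr itself (dplyr is an R package); the comments only name the dplyr verb that corresponds to each step. The `sales` rows, column names and the threshold of 7 units are hypothetical, invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: one dict per row.
sales = [
    {"region": "North", "product": "tea", "units": 10, "price": 2.5},
    {"region": "South", "product": "coffee", "units": 4, "price": 3.0},
    {"region": "North", "product": "coffee", "units": 7, "price": 3.0},
    {"region": "South", "product": "tea", "units": 12, "price": 2.5},
]

# 1. Filtering rows (dplyr verb: filter)
big_orders = [row for row in sales if row["units"] >= 7]

# 2. Selecting columns of data (dplyr verb: select)
slim = [{"region": row["region"], "units": row["units"]} for row in big_orders]

# 3. Adding new variables (dplyr verb: mutate)
with_revenue = [{**row, "revenue": row["units"] * row["price"]} for row in sales]

# 4. Sorting (dplyr verb: arrange), largest revenue first
by_revenue = sorted(with_revenue, key=lambda row: row["revenue"], reverse=True)

# 5. Aggregating (dplyr verbs: group_by + summarise)
totals = defaultdict(float)
for row in with_revenue:
    totals[row["region"]] += row["revenue"]
revenue_by_region = dict(totals)

print(revenue_by_region)
```

Each step takes a table and returns a new table, which is the same idea that lets the dplyr verbs be chained together in R.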
36. Kibana
• Highly customizable dashboarding
• Dashboards are made up of panels:
– Time picker / Query / Filtering
– Charts / Table / Text
37. Flexible analytics and visualization platform
• Real-time summary and charting of streaming data
• Intuitive interface for a variety of users
• Instant sharing and embedding of dashboards
38. To better understand large volumes of data, easily create:
• Bar charts
• Line and scatter plots
• Histograms
• Pie charts
• Maps
We will look at: introductory R and why it's useful, and where to go for more information. We will learn statistics and R by looking at: independent events, dependent probability, combinatorics, hypothesis testing, descriptive statistics, random variables, probability distributions, regression, and inferential statistics. We will loosely base the curriculum on the Khan Academy statistics course, but we aim to help the curious, the scared, and the rookie.
Effective: the viewer gets it (ease of interpretation)
Accurate: sufficient for correct quantitative evaluation. Lie factor = size of visual effect/size of data effect
Efficient: minimize chart-junk, show the data, maximize the data-ink ratio, erase non-data-ink, erase redundant data-ink
Aesthetics: must not offend viewer's senses (e.g. moire patterns)
Adaptable: can adjust to serve multiple needs
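The two measures named in the list above can be computed directly. The sketch below assumes that "size of effect" means relative change, (second value - first value) / first value, and that the data-ink ratio is data-ink divided by the total ink used in the graphic. The list does not spell out either definition, so both are assumptions, and the function names and sample numbers are hypothetical.

```python
def relative_change(first, second):
    """Relative change from first to second, e.g. 100 -> 150 gives 0.5."""
    return (second - first) / first


def lie_factor(data_first, data_second, visual_first, visual_second):
    """Lie factor = size of visual effect / size of data effect.

    A value of 1.0 means the graphic shows the change in the data faithfully;
    values far from 1.0 mean the graphic distorts it.
    """
    visual_effect = relative_change(visual_first, visual_second)
    data_effect = relative_change(data_first, data_second)
    return visual_effect / data_effect


def data_ink_ratio(data_ink, total_ink):
    """Share of the graphic's ink that presents data (assumed definition)."""
    return data_ink / total_ink


# Hypothetical example: the data grows from 100 to 150 (+50%),
# but the bars are drawn 20 px and 60 px tall (+200%).
exaggeration = lie_factor(100, 150, 20, 60)

# Hypothetical example: 30 units of ink show data out of 120 units in total.
ratio = data_ink_ratio(30, 120)

print(exaggeration, ratio)
```

In this example the lie factor is 4.0, so the chart overstates the change fourfold, and the data-ink ratio is 0.25, so three quarters of the ink is a candidate for erasing.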