VDD Tools


  1. 1. Visual Data Discovery Tools Database Warehouse Environment (COMP10002) B00272662 – Gary Brogan
  2. 2. Contents
     Introduction to Visual Data Discovery Tools
     Definition
     Key Features
     Data Connectivity
        SQL Database
        Hadoop
        Data warehouse
        Excel Spreadsheet files
        OLAP Cube
     The Visual Data Discovery Market
        Choosing the Best in the Segment
        Best in Segment Table
        Qlik Sense
        Power BI
        Tableau
     Self-service BI
     Tableau
     Using Tableau
     Evaluating Tableau
     Dashboards
     References
     Appendix
        Appendix 1 - Worksheets & Related Question Number
        Appendix 2 - Questions & Related Dashboards
  3. 3. Introduction to Visual Data Discovery Tools
     The recent data explosion has seen the volume, variety and velocity of data increase. This has created the need for more powerful BI tools that can explore and explain data in more interesting and user-friendly ways. This is where Visual Data Discovery (VDD) tools come in. VDD tools not only help users explore and understand data but are also becoming the fastest way for business users to unlock its potential. Data is represented in a highly visual, interactive form that can provide valuable insights to the user at a glance.
     Definition
     Visual data discovery tools are a relatively new addition to the business intelligence market. As with any new tool, there is a lack of sources that provide a clear definition, which has led to a fair amount of confusion about what a VDD tool is and what it is not. This report reviews and evaluates academic sources, BI industry books, journals and reports, as well as vendor-specific websites, to identify the most accurate definition. It was initially agreed that ease of use, time to insight and the ability to view data through highly visual reports are some of the key features of VDD tools. The three definitions below were chosen as they represent two of the main industry players and an independent researcher.
     Tableau, a business intelligence software provider that is currently the top performer for VDD tools in Gartner's Magic Quadrant for Business Intelligence and Analytics Platforms, defines VDD tools as tools that "can enable diverse types of users from data scientists working with big data to nontechnical business managers and frontline users to see significant trends and patterns in data that they would have struggled to see in voluminous tabular reports and spreadsheets".
     David Stodder, Director of TDWI Research for Business Intelligence, added during his recent webinar with Tableau that the "users’ ability to comprehend information quickly and put it to productive use hinges on data visualization."
     Datawatch, who provide the only Managed Analytics Platform that brings together self-service data preparation with VDD, define VDD tools as interactive and exploratory: tools that build on your intuitive ability to find the nuggets of truly important information – the stuff that will transform your business. You can ask questions of the data and see results instantly, then ask follow up questions in just seconds. Drill down to get more detail about an outlier, or zoom out to discover trends you never saw before. Interactive filters let you get rid of irrelevant data and noise so you can spot the underlying patterns and see what others can’t. It’s a highly effective and iterative exploration that gives you the tools to make better and faster decisions, no matter how much data you have to sort through or how quickly it changes.
     Wayne Eckerson, Director of BI Leadership Research, defines VDD tools as self-service, in-memory analysis tools that enable business users to access and analyse data visually at the speed of thought with minimal or no IT assistance and then share the results of their discoveries with colleagues, usually in the form of an interactive dashboard. VDD tools are used by (1) power users, to explore and analyse data in a variety of systems; (2) super users and BI specialists, to create interactive dashboards for colleagues; and (3) casual users, to view and work with those dashboards.
  4. 4. Cindi Howson, a BI industry expert and author of Successful Business Intelligence: Unlock the Value of BI & Big Data, defines a Visual Data Discovery tool as a "tool that speeds the time to insight through the use of visualizations, best practices in visual perception, and easy exploration. Such tools support business agility and self-service BI through a variety of innovations that may include in-memory processing and mashing of multiple data sources".
     After careful consideration, a decision was made to compare Cindi Howson's definition of a VDD tool against the other three definitions to find which of the three most closely matched hers. It was agreed that, as she is an industry expert, her definition gives a broader overview of the VDD tool and appears to be the most impartial viewpoint. It captures the essence of what Visual Data Discovery is in a concise manner and is therefore the best definition to compare the others against. Given this, the definition from Wayne Eckerson captures some key features and explains how the user can manipulate the data in such a way that the business can make better and faster business decisions. The other two definitions were considered more technical and contain more industry jargon, which could confuse readers who are non-technical and unfamiliar with BI. They also say little about the key features and focus more on the patterns and trends associated with the tool. As both of these definitions come from specific vendors of VDD tools, they can be viewed as biased towards their own products and therefore do not capture the true essence of what VDD tools are.
     Key Features
     Through the research carried out to determine a definition for a VDD tool, it was agreed that ease of use, time to insight and the ability to view data through highly visual reports are some of the key features of VDD tools.
     This section of the report expands on these features and adds to them, to give a better understanding of what they are and why they are key for any VDD tool to be successful in today’s market. The features are listed below in the agreed order of importance.
     1. Highly integrated graphics that incorporate data visualisation best practices take the number one spot on the list, as VDD tools should by nature be highly visual. This allows end users to see graphical elements that mean something to them. The tool should also be able to automatically represent the data with the most appropriate visual for the type of data selected (i.e. provide geographical map views for a quick understanding of geospatial data, identify and explain the relationships between variables, and offer a variety of analytic visuals such as box plots, heat maps and correlations).
     2. Data connectivity, meaning the ability to connect to multiple data sources, takes the second spot on the list, as modern businesses tend not to store data in a single place. Multi-platform data storage solutions are the rule and not the exception. The ability to analyse data no matter where it is stored, to get the answers needed, is essential to any business. Figure 1 below shows a visual representation of the different types of data sources that can be connected to a VDD tool.
  5. 5. Figure 1 http://www.qlik.com/products/qlik-sense
     3. In-memory processing capabilities add to the speed of analysis and enable very fast interaction by storing the data in memory, so the tool does not have to continually access data from the mechanical disk drive. There is no predetermined structure in memory, so analysis can be completely ad hoc and can be run against any element of the data.
     4. Integrated, intuitive, approachable analytic capabilities should remove the complexity of data structures for nontechnical users so that they can explore data sets and seek correlations in them. Users should be able to slice and dice multidimensional data by applying filters at any level of a hierarchy, drill up and down through hierarchies, or expand and collapse entire levels. They should also be able to calculate new measures, add them to any view, and save views as report packages to share with others.
     5. Reporting is the number one BI process and has been for a number of years. As VDD tools are aimed at nontechnical as well as technical users, easy report building is an important feature, giving nontechnical users the means to share their findings with colleagues through dashboards and storyboards.
     6. Ability to easily distribute insights via mobile devices and web portals. As we are now in the “mobile age”, more and more businesses want to access their data while on the move. Giving users the ability to easily distribute insights via mobile devices and web portals will encourage more users to engage with VDD tools.
     Data Connectivity
     Throughout the investigation conducted to select a definition and find the key features of a VDD tool, it was noted that most vendors listed the same data sources when talking about data connectivity.
     There were five data sources that every vendor spoke about: Excel spreadsheets, SQL databases, data warehouses, Hadoop and OLAP cubes. Due to these findings, a decision was made to investigate further what these sources are and how they are used. The five explanations below outline how each data source stores and manages data.
     SQL Database
     SQL databases, more commonly referred to as relational databases, are used to store structured and dynamic data. A relational database can be defined as a set of tables containing data fitted into
  6. 6. predefined categories. Each table (which is sometimes called a relation) contains one or more data categories in columns. Each row contains a unique instance of data for the categories defined by the columns. In the example in Figure 2, an appointment booking database for a vet’s surgery includes a table describing the customer, with columns for name, address, phone number and so forth. Another table describes the pet, and a final table records the appointment details. Different views of the data can be created to suit the needs of the database's users. This data source was chosen as it is the most commonly used database type.
     Figure 2
     Hadoop
     Hadoop, formally called Apache Hadoop, is an Apache Software Foundation project and open source software platform for scalable, distributed computing. Hadoop can provide fast and reliable analysis of both structured and unstructured data. Given its ability to handle large data sets, it is often associated with the phrase big data. The Apache Hadoop software library is essentially a framework that allows for the distributed processing of large datasets across clusters of computers using a simple programming model. Hadoop can scale up from single servers to thousands of machines, each offering local computation and storage. The library can detect and handle failures at the application layer, so it can deliver a highly available service on top of a cluster of computers, each of which may be prone to failures. Unlike a relational database, Hadoop can handle the storage of both structured and unstructured data. The diagram in Figure 3 shows how the data in Hadoop is stored and queried.
     Figure 3 https://www.linkedin.com/pulse/big-data-answer-hadoop-spark-jawad-yaqub.
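     The vet's surgery example from the SQL Database section above can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the table names, column names, sample rows and the appointment_overview view are assumptions made for the example and are not taken from Figure 2.

```python
import sqlite3


def build_vet_db():
    """Create an in-memory database with hypothetical customer, pet and appointment tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            address     TEXT,
            phone       TEXT
        );
        CREATE TABLE pet (
            pet_id      INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            name        TEXT NOT NULL,
            species     TEXT
        );
        CREATE TABLE appointment (
            appointment_id INTEGER PRIMARY KEY,
            pet_id         INTEGER NOT NULL REFERENCES pet(pet_id),
            booked_for     TEXT NOT NULL,
            reason         TEXT
        );
        -- A view gives one group of users a tailored look at the same tables.
        CREATE VIEW appointment_overview AS
            SELECT a.booked_for, c.name AS owner, p.name AS pet, a.reason
            FROM appointment AS a
            JOIN pet AS p ON p.pet_id = a.pet_id
            JOIN customer AS c ON c.customer_id = p.customer_id;
        """
    )
    # Sample rows (invented) so the view has something to return.
    conn.execute("INSERT INTO customer VALUES (1, 'A. Owner', '1 Example Street', '555-0100')")
    conn.execute("INSERT INTO pet VALUES (1, 1, 'Rex', 'dog')")
    conn.execute("INSERT INTO appointment VALUES (1, 1, '2015-11-02 09:30', 'vaccination')")
    conn.commit()
    return conn


conn = build_vet_db()
rows = conn.execute("SELECT * FROM appointment_overview").fetchall()
print(rows)
```

     Each row of a table is one instance (one customer, one pet, one appointment), and the view shows how a different presentation of the same data can be offered without duplicating it.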
  7. 7. Data warehouse
     The term data warehouse was coined by William H. Inmon, who is known as the Father of Data Warehousing. Inmon described a data warehouse as being a subject-oriented, integrated, time-variant and nonvolatile collection of data that supports management's decision-making process (Inmon, 1993). A data warehouse can store these types of data:
     · Historical data. A data warehouse typically contains several years of historical data. The amount of data that you decide to make available depends on available disk space and the types of analysis that you want to support. This data can come from transactional database archives or other sources.
     · Derived data is generated from existing data using a mathematical operation or a data transformation. It can be created as part of a database maintenance operation or generated at run-time in response to a query.
     · Metadata is data that describes the data and schema objects, and is used by applications to fetch and compute the data correctly.
     A data warehouse is made up of data marts, each of which often holds only one subject area, for example Finance or Sales. The diagram in Figure 4 shows an example of how many data marts are joined to create a warehouse.
     Figure 4- http://www.dataonfocus.com/data-mart-vs-data-warehouse/
     Excel Spreadsheet files
     Microsoft Excel has been the industry standard spreadsheet program since 1993. Excel has the same basic features as all spreadsheets, using a grid of cells arranged in numbered rows and letter-named columns to organize data manipulations such as arithmetic operations. It has a battery of supplied functions to answer statistical, engineering and financial needs. In addition, it can display data as line graphs, histograms and charts, with a very limited three-dimensional graphical display. Together these features allow users to organize, format and calculate numeric data with formulas. Figure 5 shows a basic Excel report.
  8. 8. Figure 5
     Figure 6 http://clasificadosde.com/microsoft-excel-spreadsheet-templates.html
     OLAP Cube
     OLAP (online analytical processing) is computer processing that enables a user to easily and selectively extract and view data from different points of view by slicing and dicing the cube of data. To facilitate this kind of analysis, OLAP data is stored in a multidimensional database. Whereas a relational database
  9. 9. can be thought of as two-dimensional, a multidimensional database considers each data attribute (such as product, sales region, and time period) as a separate "dimension." OLAP software can locate the intersection of dimensions (all products sold in the Eastern region above a certain price during a certain time period) and display them. Attributes such as time periods can be broken down into sub-attributes. An example of this is found in Figure 7.
     Figure 7 http://www.oracle.com/technetwork/articles/sql/11g-dw-olap-100058.html
     The Visual Data Discovery Market
     Since around 2013, VDD tools have begun to receive widespread attention and deployment. Specialty vendors such as Tableau, Qlik, Birst and Pentaho (to name just a few) have emerged to challenge industry giants such as Microsoft, SAP, IBM, Oracle and SAS. As the specialty vendors are proving to be more “agile” in deploying visual solutions, they have gained large chunks of the marketplace, with the more traditional global multinational vendors seeing their market share stagnate. More recent trends suggest this gap may be closing. Microsoft, for example, has released Power BI, a free VDD solution, and has regained ground on specialty vendors such as Tableau and Qlik. Gartner's Magic Quadrant for BI and Analytics Platforms shows how vendors within the marketplace are currently performing.
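     To make the OLAP Cube description earlier in this slide more concrete, the sketch below models a tiny cube in plain Python. It is a minimal illustration, not how an OLAP server is implemented: the dimension names follow the text (product, sales region, time period), while the sales figures and the helper names slice_cube and roll_up are invented for the example.

```python
from collections import defaultdict

# Order of the dimensions inside each cell key.
DIMENSIONS = ("product", "region", "period")

# Each cell sits at the intersection of one member from every dimension.
# The sales values are made-up sample data.
sales = {
    ("Widget", "East", "2015-Q1"): 120.0,
    ("Widget", "West", "2015-Q1"): 80.0,
    ("Gadget", "East", "2015-Q1"): 200.0,
    ("Gadget", "East", "2015-Q2"): 150.0,
    ("Widget", "East", "2015-Q2"): 90.0,
}


def slice_cube(cube, **fixed):
    """Slice/dice: keep only cells whose dimension members match the fixed values."""
    index = {name: i for i, name in enumerate(DIMENSIONS)}
    return {
        key: value
        for key, value in cube.items()
        if all(key[index[dim]] == member for dim, member in fixed.items())
    }


def roll_up(cube, dimension):
    """Sum the measure over every dimension except the named one."""
    i = DIMENSIONS.index(dimension)
    totals = defaultdict(float)
    for key, value in cube.items():
        totals[key[i]] += value
    return dict(totals)


east = slice_cube(sales, region="East")   # fix one dimension
east_by_product = roll_up(east, "product")
east_by_period = roll_up(east, "period")
print(east_by_product)
print(east_by_period)
```

     Fixing the region gives a slice of the cube; rolling that slice up by product or by period shows the same figures from two different points of view, which is the kind of interaction the OLAP description refers to.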
  10. 10. Figure 8 http://www.informationweek.com/big-data/big-data-analytics/gartner-bi-magic-quadrant-2015-spots-market-turmoil/d/d-id/1319214
     Research into the VDD marketplace was conducted with the intention of highlighting the vendors and tools considered “best in the segment”. In order to determine “best in the segment”, and to justify the choice, many products and vendors had to be eliminated. To help with the elimination process, a number of criteria were considered. They are listed below:
     • Platforms the VDD tool is available on: Windows, Mac etc.
     • An estimated rating, sourced from independent journals, websites, academic resources, and industry and social media reviews.
     • Deployment: cloud, mobile and traditional
     • Business size: small, medium and large
     • Key features
     • Availability to the research team, including the costs involved and the availability of demos of the software through vendors and other third-party sites.
     Each member of the research team analysed several vendor tools independently of the others and gave feedback on which vendor tools matched the criteria above and which had been eliminated. This enabled the search to be filtered down quickly, with three tools emerging as potentially “best in the segment”: Tableau, Qlik Sense by Qlik, and Microsoft Power BI.
     Choosing the Best in the Segment
     The following table (Table 1-1) shows the three VDD vendor tools selected as potentially the best in the segment and the key features considered to be important for the success of this type of BI tool.
  11. 11. A rating system has been devised to give a quantitative measure based on the results of the findings (Table 1-2). The measures will help to justify which vendor and BI tool is considered “best in the segment”.
     Best in Segment Table: Table 1-1
     Rating System Table: Table 1-2
     Score  Explanation
     3      Exceptional capabilities
     2      Very good capabilities
     1      Limited capabilities, difficult to do, may require a workaround
     0      Minimal capabilities out of the box. The software may require customisation
     Qlik Sense
     Qlik Sense [4] is the next-generation data visualisation product from Qlik. It provides self-service data visualisation that is designed to be fast, easy to use and easy to understand. Qlik Sense helps business users follow their intuition, prompting them to ask questions of the data that can provide valuable business insights and lead to important business decisions. Qlik Sense supports a range of use cases, including centrally deployed guided analytics apps and dashboards, custom and embedded analytics, and self-service visualization [5], and includes an in-memory engine. Below is a list of strengths and challenges associated with the software.
     Main strengths include:
     • The ability to tell “stories” through the use of storyboards. The user has a PowerPoint-style platform with which to navigate through, and give a highly visual presentation of, the results of analysis.
     • Load scripts perform data cleansing and transformation.
     • Has the potential to take the place of a full ETL tool and data warehouse due to its scripting and in-memory engine.
     • New measures and dimensions can be created and stored as a master item, providing reusability across all the sheets within the application.
     [6]
     Best in Segment Table
                                                           Vendors
     Key Features                                          Tableau  Power BI  Qlik Sense
     Integrated Graphics                                   3        2         2
     Data Connectivity                                     2        3         2
     In-Memory processing                                  2        1         3
     Report Building                                       3        2         2
     Distribution Via Mobile devices and Web portals       3        3         3
     Analysis of Real Time & High Volume Data              3        1         2
     Total Rating                                          16       12        14
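     The Total Rating row can be reproduced by summing each vendor's scores. The short Python sketch below does this using the scores from the Best in Segment Table; the variable names are only illustrative.

```python
# Scores copied from the Best in Segment Table (3 = exceptional ... 0 = minimal).
scores = {
    "Integrated Graphics": {"Tableau": 3, "Power BI": 2, "Qlik Sense": 2},
    "Data Connectivity": {"Tableau": 2, "Power BI": 3, "Qlik Sense": 2},
    "In-Memory processing": {"Tableau": 2, "Power BI": 1, "Qlik Sense": 3},
    "Report Building": {"Tableau": 3, "Power BI": 2, "Qlik Sense": 2},
    "Distribution Via Mobile devices and Web portals": {"Tableau": 3, "Power BI": 3, "Qlik Sense": 3},
    "Analysis of Real Time & High Volume Data": {"Tableau": 3, "Power BI": 1, "Qlik Sense": 2},
}

# Add up every feature score per vendor.
totals = {}
for feature_scores in scores.values():
    for vendor, score in feature_scores.items():
        totals[vendor] = totals.get(vendor, 0) + score

# Highest total first.
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals)
print(ranking)
```

     The totals match the table (16, 12 and 14), which places Tableau first, Qlik Sense second and Power BI third under this equal-weight scoring.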
  12. 12. Challenges
     • Currently does not include native forecasting features
     • Transformations and joining of tables require scripting
     • Calculations can be created, but there’s no GUI for the functions
     Power BI
     Power BI is a visual data discovery tool developed by Microsoft. It was originally known as Power BI for Office 365. Power BI is based on Power Pivot for Microsoft Excel and has now been developed into a standalone BI application that works independently of Excel. Power BI is the newest to market of the three tools. It focuses heavily on collaboration and is integrated with other Microsoft services, and unlike the other two it is free for businesses which have a Microsoft for business account.
     Main strengths include:
     • A gentler learning curve for those who come from a Microsoft background, as many elements are based on existing products
     • Integration with the cloud enables storing, sharing and collaboration
     • No additional cost to businesses who have already adopted Microsoft services
     • Power BI has an app for Windows Phone, Android and iOS
     Challenges
     • The product feels immature and lacks many of the features of existing tools
     • There is a lack of information about the product and only a small online community
     • Content can only be shared with users inside the same organisation
     • A maximum of 10GB of data can be stored on the cloud
     Tableau
     Tableau as a company was founded in 2003 as a commercial spin-off of a PhD research thesis conducted at Stanford University’s Department of Computer Science between 1999 and 2002. The company now offers five main products: Tableau Desktop, Tableau Server, Tableau Online, Tableau Reader and Tableau Public. The last two are free to use. In French, tableau has two meanings: painting and table. Artwork and data.
     An excellent match of brand and product concept. Gartner, the world's leading information technology research and advisory company, moved Tableau into its coveted upper-right quadrant this year and said that “Tableau has the intuitive, visual-based, interactive data exploration experience that customers love to use and competitors love to imitate.”
  13. 13. Main strengths include:
     • State-of-the-art data visualization makes Tableau the stand-out tool in visual data discovery. Tableau people talk about "being creative with data”, and it is easy to see why when looking at the clean and elegant dashboards you can create with Tableau.
     • Tableau is a Research and Development (R&D) driven company. It continues to invest in R&D at a higher rate (29% of revenue in 2014) than most other BI vendors. This approach has been fruitful, as the company has now risen to the position of market leader.
     • Web-based analytics in minutes, not days: Tableau now connects directly to Google Analytics to allow exploration of a website's analytical data in an easy, visual way to find patterns and trends, giving multi-dimensional insights into your website traffic.
     • Tableau's Data Engine can perform ad-hoc analysis of millions of rows of data in seconds. The Data Engine is a high-performing analytics database on your PC. It has the speed benefits of traditional in-memory solutions without the limitation that the data must fit in memory. No custom scripting is needed to use this feature, so people with a low IT skill set can use it.
     Challenges
     • Tableau offers limited advanced analytics capabilities. R integration has recently been added and is a major improvement for users needing more statistical and advanced capabilities. Other vendors, such as SAS, SAP and Tibco, have more advanced native capabilities.
     • The in-memory calculations are slower than those of other vendors' tools, such as Qlik Sense.
     Self-service BI
     VDD tools are becoming synonymous with self-service BI due to their ease of use and lack of IT involvement. Gartner defines self-service BI “as end users designing and deploying their own reports and analyses within an approved and supported architecture and tools portfolio.”[8] Imhoff, C. & White, C. 
(2011) define self-service BI as “the facilities within the BI environment that enable BI users to become more self-reliant and less dependent on the IT organization. These facilities focus on four main objectives: easier access to source data for reporting and analysis, easier and improved support for data analysis features, faster deployment options such as appliances and cloud computing, and simpler, customizable, and collaborative end-user interfaces”[15]. [Figure 9]
  14. 14. Figure 9- The four main objectives of do-it-yourself or self-service BI. (Courtesy of BI Research and Intelligent Solutions, Inc) https://tdwi.org/Articles/2011/11/09/RESEARCH-EXCERPT-Introduction-to-Self-Service-Business-Intelligence.aspx
     Traditional BI created a centralized BI model with a heavy reliance on IT departments for producing reports and analysis. One of the main reasons traditional BI best practices were considered “a bit of a failure” was that requests for information built up as IT departments tried to cope with the demand from every area of the business. This created a BI “bottleneck”, leading to slow response times (days or weeks) from IT departments to requests for reports and analysis. The current move away from the centralized BI model to a more decentralized model, in which business users are at the forefront of defining requirements, has driven a shift towards “self-service BI”. Self-service BI has become a top investment and innovation priority for businesses over the past few years. It has the potential to offer great value to businesses, and as such many industry leaders agree that it should be part of any organisation's BI portfolio. The annual Gartner Business Intelligence and Analytics Summit highlighted that “Self-service analytics is white hot and growing while demand for traditional dashboard BI is in remission”. But the successful implementation of self-service BI has proved difficult. Gartner’s survey on the effectiveness of self-service BI highlighted that 90% of self-service BI projects are projected to fail due to inconsistencies [9]. What is worth noting is that self-service BI should not be about giving business users BI tools and access to BI data and letting them “get on with it”.
     For self-service BI to succeed, organisations must understand the different considerations of each business user group and have the approved and supported architecture and tools portfolio needed for each group to work more autonomously. Howson, C. (2013) highlights four main self-service BI considerations:
     • The use case
     • Sophistication of users
     • Degree of IT involvement
  15. 15. Breadth of data sources [10]
     Differences in users’ technical awareness and ability can affect what they are trying to accomplish, and unless organisations carefully consider the four main points above, their self-service BI deployments may not be as successful as initially hoped. To put Figure 10 into context, an understanding of who accesses data, and at what level, is needed. Information consumers constitute 90% of the BI community in a normal business and need to track perhaps 20% of the data available. Power users total maybe 10% of the community but will require up to 80% of the data. The figure highlights the differences in technical skills needed at the various ends of the self-service BI spectrum, which relate back to the four main self-service BI considerations. It also highlights the centralised and de-centralised BI models and the type of reports and analysis associated with each type of user group.
     Figure 10 http://exia.ca/en/self-service-bi/
     Current self-service BI implementation tries to strike a balance between centralised and de-centralised models but suffers from inconsistencies in data definitions and measures across an organisation. Could this be an indicator as to why Gartner [9] predicts that 90% of self-service BI projects will fail due to these inconsistencies? Self-service BI is still a relatively new trend and has its champions. The idea of offering BI tools, and in particular VDD tools, that business users can use to explore and analyse their data and ask tough business questions, without the need for a centralized BI model, is indeed an interesting proposition.
     Tableau
     The BI industry is growing at an enormous rate and is expected to be worth around 20.1 billion by 2018, up from 13.9 billion in 2013, which has led more companies to try to enter this market. Tableau shook the market in 2013 when it overtook traditional providers such as SAP, Oracle, SAS and Microsoft.
It became a publicly traded company in May 2013, and revenues have almost doubled since then, from 232 million to 432 million, which has in effect enabled it to invest more in research and development.
  16. 16.
With 29% of profits invested in research and development, one of the highest proportions in the industry, Tableau has gained a competitive edge and the ability to innovate at a faster rate.

Tableau was formed as a company in 2003 from a project called Polaris, from which the language VizQL was developed. VizQL combines query, analysis and visualisation; it was developed by PhD students at Stanford University, and Tableau still runs on it today. The first consumer release was Tableau 1.0 in 2005. The software caught the attention of the industry because users without high technical skill could use it, but its severe lack of data connectivity meant it was no real threat to existing VDD software companies. This changed in later revisions of the software, which made Tableau a more viable threat.

Tableau enables users to create visualisations from one or more data sources so that the viewer can interpret the information with ease. Unlike BI software that can only draw graphs, it can create multiple types of visualisation, and it helps the user adhere to best practice by only allowing a visualisation to be created when the correct data is used.

With traditional BI software, only highly technical, highly skilled users could perform analytics on business data. Tableau’s approach is BI for everyone: its intuitive drag and drop interface and advanced analytic capabilities empower users of any skill level to perform data analysis and visual data discovery. This enables the software to meet its mission statement, “Tableau helps people see and understand their data” (Tableau, 2015). Tableau has also been a great choice for large organisations as it is a scalable tool in both hardware and memory.
Backed by its security permissions, it can ensure the confidentiality, integrity and availability (CIA) of the data by hiding users, projects and dashboards from those without the privileges to see them. This, combined with a vast user forum and stellar customer service, has moved Tableau from being named a challenger in 2010 to being the leader in VDD tools in the Gartner Magic Quadrant from 2013 onwards.

Figure 11 (2010 and 2015)
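The growth figures quoted in this section can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, the source does not state the currency, and the compound annual growth rate is an added measure that the report itself does not quote.

```python
def percent_change(start, end):
    """Percentage increase from start to end."""
    return (end - start) / start * 100.0


def compound_annual_growth(start, end, years):
    """Compound annual growth rate (%) that takes start to end over the given years."""
    return ((end / start) ** (1.0 / years) - 1.0) * 100.0


# Revenue figures as quoted in the text (millions).
revenue_growth = percent_change(232.0, 432.0)

# Market size figures as quoted in the text (billions), 2013 to 2018.
market_cagr = compound_annual_growth(13.9, 20.1, 2018 - 2013)

print(f"Revenue increase: {revenue_growth:.1f}%")
print(f"Implied market growth per year: {market_cagr:.2f}%")
```

An increase of this size is consistent with the description "almost doubled", since a full doubling would be a 100% increase.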
  17. 17. Using Tableau
Tableau can be used to discover knowledge by visualising data in many different ways. Three examples are highlighted here to show how Tableau encourages user interaction with data.

Tableau’s forecasting feature can quickly show users predicted future trends (figure 12). In the example below, each university campus is shown with its average pass rate for each academic year and the rate predicted for the following two years. The forecasting feature has a number of options for customising the forecast model, testing forecast accuracy and amending the forecast length.

Figure 12

Tableau uses line graphs to show the results of analysis and allows users to apply filters to drill down into specific areas of the data set (figure 13). The example below highlights the average pass rate per academic year per campus. The filters encourage users to interact with the data directly by drilling down to a specific year or campus. The filters applied in figure 13 drill the data set down to three specific academic years and two campuses.
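For readers without Tableau to hand, the two operations described above (averaging the pass rate per academic year per campus with filters applied, then projecting two years ahead) can be sketched in plain Python. This is an illustration under stated assumptions, not Tableau's method: the records are invented, the function names are hypothetical, and a least-squares straight line is swapped in for Tableau's own forecast models, which are not reproduced here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (academic year start, campus, passed). Not the report's data set.
RECORDS = [
    (2012, "Paisley", True), (2012, "Paisley", False), (2012, "Ayr", True),
    (2013, "Paisley", True), (2013, "Paisley", True), (2013, "Ayr", False),
    (2013, "Ayr", True), (2014, "Paisley", True), (2014, "Ayr", True),
    (2014, "Ayr", True), (2014, "Hamilton", False),
]


def pass_rate_by_year_and_campus(records, years=None, campuses=None):
    """Average pass rate (%) per (year, campus); the optional sets act as filters."""
    groups = defaultdict(list)
    for year, campus, passed in records:
        if years is not None and year not in years:
            continue
        if campuses is not None and campus not in campuses:
            continue
        groups[(year, campus)].append(100.0 if passed else 0.0)
    return {key: mean(values) for key, values in sorted(groups.items())}


def linear_forecast(points, steps=2):
    """Extend (year, value) points by `steps` years using a least-squares line.

    Needs at least two distinct years, otherwise the slope is undefined.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    denom = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / denom
    intercept = y_bar - slope * x_bar
    last = max(xs)
    return [(last + k, intercept + slope * (last + k)) for k in range(1, steps + 1)]


# Drill down to two academic years and two campuses, as a filter would.
filtered = pass_rate_by_year_and_campus(
    RECORDS, years={2013, 2014}, campuses={"Paisley", "Ayr"}
)
for (year, campus), rate in filtered.items():
    print(year, campus, f"{rate:.0f}%")

# Project an invented yearly pass-rate series two years ahead.
print(linear_forecast([(2012, 50.0), (2013, 60.0), (2014, 70.0)]))
```

Passing sets for `years` and `campuses` mirrors ticking values in a filter; leaving them as `None` shows every group.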
  18. 18. Figure 13 Figure 14
The final example (figure 15) highlights Tableau’s ability to show geographical locations, and the results of the data analysed within each location, using the map feature. Tableau calculates the latitude and longitude of each location and then presents the locations in a highly visual way. The main mapping features in Tableau allow maps to be layered, custom geocoding to be imported and locations to be edited. By applying dimensions to the Pages shelf and showing quick filters, Tableau can control the display of the analysis results. This again encourages user interaction with the data.
  19. 19. Figure 15
  20. 20. Evaluating Tableau
Tableau was assessed using the System Usability Scale (SUS), which showed that each member of the group had a similar experience when using Tableau, with the majority of the group giving it a SUS score above the average of 68. After extensive use of the software, a meeting was held to discuss the pros and cons of using Tableau for visual data discovery, and agreement was reached on its strengths and weaknesses. The findings are discussed below.

The intuitive, easy to use drag and drop interface was agreed to be a major strength of the software; the feeling was that it works well and makes the software non-threatening to non-technical users. The interface made it very easy to use key features such as drilling down and slice and dice without prior knowledge of SQL or business intelligence, and made it easy to build dashboards that increase the impact of the analytics on the viewer. Storyboarding was supposed to complement dashboards, but the feeling was that this feature was lacking and did not add any further benefit to the software. The only issue the group had with the user interface was that, on a project with many worksheets, Tableau became cluttered and difficult to navigate: scrolling left and right along the bottom of the screen made it quite difficult to locate a specific sheet.

One of the main issues the group had was the lack of collaboration features built into the software. When multiple members of the group wanted to work in Tableau at the same time, we had to use separate documents, which slowed us down because we had to combine our changes at a later date.
Tableau had plenty of security features for enterprise environments, including the ability to change a wide range of user permissions, meaning that one project could have users from different departments who, depending on their level of access, see different worksheets, dashboards or storyboards.

One area in which Tableau excelled was connectivity: connecting to data was very easy and a vast number of data sources is supported. Unlike some other VDD tools, however, we could not manipulate the data within the software, so we had to go back and edit the data source using other software. Tableau’s aggregations were not as intuitive as the interface: more advanced calculations were hard to write because Tableau has its own language rather than using a standard such as SQL.

Overall, Tableau is a very impressive piece of software. The ability to create advanced visualisations quickly and with relative ease means that even users without a background in business intelligence can find trends and key pieces of information to improve their business.

Below are the results of the System Usability Scale (SUS) evaluation of Tableau. Each group member undertook the evaluation independently of the others. The names of the group members have been removed from the SUS results.
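As background to the SUS figures, a score is conventionally derived from ten questionnaire items answered on a 1 to 5 scale: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the total is multiplied by 2.5 to give a value from 0 to 100. The sketch below applies that standard scoring rule; the function name and the sample responses are hypothetical and are not the group's actual answers.

```python
def sus_score(responses):
    """Return the SUS score (0 to 100) for ten responses on a 1 to 5 scale."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs ten responses, each an integer from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement scores higher.
            total += response - 1
        else:
            # Even items are negatively worded: higher agreement scores lower.
            total += 5 - response
    return total * 2.5


# Hypothetical respondent, for illustration only.
example = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]
score = sus_score(example)
print(score, "above average" if score > 68 else "at or below average")
```

The comparison against 68 uses the average quoted in the evaluation above.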
  21. 21. Dashboards
This report answers fifteen questions relating to the data set used. The questions have been answered and positioned on the relevant dashboards. To analyse the data set properly, the questions are not answered in numerical order (1 to 15) but in what is perceived as the "natural flow" of the questioning. It was agreed this would be a more realistic way to present the visualised data. Appendix 2 shows:
• The questions to be answered given the data set
  22. 22.
• Which dashboard the question is in
• Which worksheet answers the question
Snapshots of each dashboard are shown below.
  23. 23. References
1. https://support.powerbi.com/knowledgebase/articles/471664-getting-started-with-power-bi-designer
2. http://www.slideshare.net/JoelBenway/wwwadvizorsolutionscom-presents-rise-of-data-discovery-11-0509-3
3. Duell, E., 2013. Leading Tools for Data Discovery and Visual Analytics. http://www.clearmeasures.com/media/Leading-Tools-for-Data-Discovery-and-Visual-Analytics-WP.pdf [Retrieved: 01/11/2015]
4. Software Advice, http://www.softwareadvice.com/bi/qlik-sense-profile [Retrieved: 29/10/2011]
5. Llc, A.S.K., This, B.I.S. & Date, C.O.M.R., 2014. Qlik Sense: A review of Qlik’s visual data discovery tool.
6. Visual discovery tools. 2015. [ONLINE] Available at: http://www.slideshare.net/TheMarketingDistillery/visual-discovery-tools-22870544. [Retrieved: 02/11/2015].
7. Gartner, Inc. (NYSE: IT) “IT Glossary” http://www.gartner.com/it-glossary/self-service-business-intelligence [Retrieved: 08/11/2015]
8. Gartner, Inc. (NYSE: IT) “Business Intelligence” http://www.gartner.com/it-glossary/business-intelligence-bi/ [Retrieved: 04/10/2015]
9. Howson, Cindi, 2014. Successful Business Intelligence: Unlock the Value of BI & Big Data. McGraw-Hill Education.
10. Data Warehouse | Gartner. 2015. [ONLINE] Available at: http://www.gartner.com/it-glossary/data-warehouse. [Retrieved: 09/11/2015].
11. What is Hadoop? | SAS. 2015. [ONLINE] Available at: http://www.sas.com/en_us/insights/big-data/hadoop.html. [Retrieved: 02/11/2015].
12. The Big 'Big Data' Answer: Hadoop AND Spark. | Jawad Yaqub | LinkedIn. 2015. [ONLINE] Available at: https://www.linkedin.com/pulse/big-data-answer-hadoop-spark-jawad-yaqub. [Retrieved: 10/11/2015].
13. Oracle Database 11g: The Top Features for DBAs and Developers | Data Warehousing and OLAP. 2015. [ONLINE] Available at: http://www.oracle.com/technetwork/articles/sql/11g-dw-olap-100058.html. [Retrieved: 13/11/2015].
14. Data Mart vs Data Warehouse - DataOnFocus. 2015. [ONLINE] Available at: http://www.dataonfocus.com/data-mart-vs-data-warehouse. [Retrieved: 13/11/2015].
15. Imhoff, C. & White, C. (2011). Self-Service Business Intelligence: Empowering Users to Generate Insights. TDWI Rep. 1–37 at <http://www.sas.com/resources/asset/TDWI_BestPractices.pdf>
  24. 24. Appendix
Appendix 1 – Worksheets & Related Question Number
[Slides 24 to 30: worksheet images numbered 1 to 14]
  31. 31. 15.
Appendix 2 – Questions & Related Dashboards
Use Tableau to examine the data and attempt to discover answers to the following questions through data visualisation.
1) How has the pass rate for this module changed over the past 6 years?
See Cross Campus Dashboard - Pass Rate Monitoring
2) How has the number of students taking this module changed over the past 6 years?
See Cross Campus Dashboard - Student Number Monitoring
3) The UWS introduced a new attendance monitoring process starting in academic year 2011/12 – Did this initiative affect the pass rate?
See Pass Rate Initiatives Dashboard - Attendance Monitoring
4) UWS encouraged the use of early engagement assessments as an indicator of student engagement in academic year 2012/13. For Database Applications this was the introduction of a multiple choice exam in Week 5 – Did this initiative affect the pass rate?
See Pass Rate Initiatives Dashboard - Early Engagement Assessment
5) In 2013/14 the UWS introduced a rule that if a student scored zero for any assessed component of the module, an automatic re-take (i.e. RA or RA2 code) would be assigned to the student. This means that the student is withdrawn from the current session for the module and should re-register for next year’s running of the module – Did this rule change affect the pass rate?
  32. 32.
See Pass Rate Initiatives Dashboard - Introduction of RA2 rule
6) UWS introduced a categorisation of the A grade to reveal A3 (>=70%), A2 (>=80%) and A1 (>=90%) – Did this initiative affect the numbers achieving an A grade?
See Pass Rate Initiatives Dashboard - A Grade categorisation results
7) In 2014/15 the multiple choice exam was integrated into the exam mark and not the coursework mark as in previous years – Did this change affect the pass rate?
See Pass Rate Initiatives Dashboard - Multiple choice exam
8) Does the spread of marks/grades (e.g. for coursework, the exam and the final mark) for this module follow a normal distribution? If not, why?
See Spread of Grades Dashboard
9) The UWS aims for equality of experience across all campuses of the university. Do the results for this module support this notion?
See Lecturer and Campus Analysis Dashboard - Equality of Experience per Campus
10) For 2015/16, students have registered to take the DA module and there are currently 18 students (Ayr campus), 126 students (Paisley campus), 11 students (Dumfries campus) and 15 students (Hamilton campus). Can you predict the future for this module in terms of the numbers taking and passing this module and the likely number of students achieving each grade at each campus based on past trends?
See Forecasting Analysis Dashboard
11) Are there any trends or patterns that emerge over the years based on students sitting the coursework and exam?
See Module and Grade Analysis - Students Results Pattern
12) What is the forecast of pass rate % per lecturer?
See Lecture & Campus Analysis Dashboard - Pass Rate per lecture including Forecast
13) What is the spread of grades between majors?
See Module and Grade Analysis - Spread of Grades Between Major
14) How has the number of students achieving proposal codes changed over the years?
See Module and Grade Analysis - Proposal Code Analysis
15) How many students took the course per Major for each year?
See Module and Grade Analysis - Student Numbers Per Major
