
Article: "Improved Knowledge from Data: Building an Immersive Data Analysis Platform"



Article published on SVR 2018.

Published in: Technology


Improved Knowledge from Data: Building an Immersive Data Analysis Platform

Felipe Augusto Pedroso* Paula Dornhofer Paro Costa†
Dept. of Computer Engineering and Industrial Automation (DCA)
School of Electrical and Computer Engineering - University of Campinas (UNICAMP)

ABSTRACT

We are facing unprecedented growth in data production, with increasing degrees of complexity, which makes data analysis a difficult task. At the same time, Virtual Reality (VR) technology is becoming more popular, with affordable devices and easier content production tools. In this scenario, the current research project proposes the development of an open data analysis platform using VR, allowing researchers to evaluate how the immersion of this environment could add value to the Visual Analytics (VA) process and offer a better way to extract knowledge from data.

Index Terms: Visual Analytics, Virtual Reality, Visualization, Immersive Analytics

1 INTRODUCTION

The growth of data production and complexity is creating new challenges in analyzing large volumes of data and extracting knowledge from them. The Visual Analytics (VA) field provides tools and techniques to aid the process of data analysis and to represent the knowledge acquired. Meanwhile, Virtual Reality (VR) applications are becoming more accessible, since head-mounted displays (HMD) are getting cheaper and VR content development tools are becoming easier to use.

Our research aims to explore and evaluate the application of VA techniques in VR environments. In particular, this research project focuses on developing an open data analysis platform using VR, where researchers could evaluate how the immersion and the experience provided by this kind of environment could improve the VA process and provide a better way to extract knowledge from data.

2 MOTIVATION

Even with the growth of the Immersive Analytics field, researchers are producing immersive data visualizations with implementations built from scratch, without using any dedicated tool or framework to aid the process [2,7,9]. This practice can bias their evaluation results toward the context or limitations of their own solution.

Our proposal is to create an open and extensible platform using VR that allows researchers to work in an immersive environment to manipulate data and create visualizations more easily. With this platform in hand, we hope that researchers could focus their work on extracting knowledge from large amounts of data or on evaluating the feasibility of VR as a VA tool.

3 RELATED WORK

The current production of data is reaching unprecedented levels [8] and, together with the growth in volume, data complexity is increasing dramatically [3]. One of the biggest challenges presented by this data deluge is how to discover and understand meaningful patterns hidden in the data [3].

*e-mail: †e-mail:

Figure 1: The VA process proposed by Keim et al. [6].

Gorodov and Gubarev [4] stated that “Graphical Thinking” is a very simple and natural type of data processing for human beings, supporting decision-making in an effective and understandable way. They also noted that, in the Big Data domain, visualization may not be an effective or applicable option, as it presents the following problems: visual noise; large image perception; information loss; high performance requirements; high rate of image change.

According to Keim et al., the Visual Analytics field tries to address these challenges and problems by combining automated analysis techniques with interactive visualizations to provide effective understanding, reasoning, and decision making on the basis of very large and complex data sets [6].
The techniques and practices developed by VA are well established for: emergency management; astronomy; monitoring climate and weather; security; scientific applications; biology and medicine; business intelligence and fraud detection [6].

While data volume and complexity keep growing, VR technology is becoming more affordable for VA researchers, allowing them to exploit user experience and immersion to provide new ways to see the data [2,3]. Perceptions of the use of VR for data visualization differ: some studies present positive results [7], while others report that a conventional 2D desktop performs better than an immersive solution [9]. In both cases, the authors did not use any standard solution to produce the visualizations for their experiments, which potentially introduces biases related to the context and limitations of their own solutions.

4 PROPOSED SOLUTION

Our work proposes an immersive VA platform capable of implementing the processes depicted in Figure 1. The technologies to be adopted and supported by the platform are still under evaluation, but we plan to use Unity to handle the visualization step and Python to do the work related to data management and analysis. This approach lets the platform take advantage of the best of both worlds: the well-established data analysis environment of Python and the ready-to-use VR integration available in Unity.

Figure 2: The proposed user interaction with the platform.

One big challenge that we want to tackle is to build this platform on an open, free and extensible foundation. This will allow other researchers to use or adapt it according to their needs and contexts without worrying about the underlying technologies.

4.1 Proposed User Experience

The user interaction will happen entirely in the VR environment. This will demand more effort on the VR Human Computer Interaction (HCI) factors, as this type of interface has its own peculiarities and affordances [2,7].

We propose the user interaction illustrated in Figure 2, in which the user passes through four steps: data source selection; data set loading; data visualization; and automatic analysis.

The first step is where the user selects the origin of the data set. At this moment we are planning to implement some of the most common data sources: JSON format, CSV files and Microsoft Excel documents. After this selection, we are going to present a summary of the available data with the option to choose what is going to be used to build the visualization. In this step, the user selects which dimensions will be used, how many records will be shown, whether the data needs some cleaning, etc.

With the data ready, the user can choose whether the next step is some automatic analysis or visualizing the data as is. If the user chooses the automatic analysis, the platform will present the option to execute operations that improve the understanding of the data or create a perspective to be visualized. Some operations that we are planning to offer are basic statistics, pattern recognition, data modeling, and dimensionality reduction.
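
The data source selection, data set loading and basic statistics steps described above can be illustrated with a minimal Python sketch. It is not the platform's implementation: the function names (`load_records`, `summarize`, `select`, `basic_statistics`) and the sample data are hypothetical, only the standard library is used, and Excel support is left out because it would require a third-party package.

```python
import csv
import io
import json
import statistics


def load_records(text, fmt):
    """Parse raw text into a list of dict records (hypothetical helper)."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        return json.loads(text)  # assumed to be a JSON array of objects
    raise ValueError(f"unsupported format: {fmt}")


def summarize(records):
    """Count the non-empty values of each dimension, for a summary screen."""
    summary = {}
    for row in records:
        for key, value in row.items():
            if value not in (None, ""):
                summary[key] = summary.get(key, 0) + 1
    return summary


def select(records, dimensions, limit=None):
    """Keep the chosen dimensions as floats and cap the number of records.

    Rows with missing or non-numeric values are dropped, which stands in
    for a very simple cleaning step.
    """
    cleaned = []
    for row in records:
        if limit is not None and len(cleaned) >= limit:
            break
        try:
            cleaned.append({d: float(row[d]) for d in dimensions})
        except (KeyError, TypeError, ValueError):
            continue  # unusable row: skip it
    return cleaned


def basic_statistics(records, dimension):
    """Mean, population standard deviation, minimum and maximum."""
    values = [row[dimension] for row in records]
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    # Made-up sample data; the second row has a missing "y" value.
    sample = "x,y,label\n1,2,a\n3,,b\n5,6,c\n7,8,d\n"
    records = load_records(sample, "csv")
    print(summarize(records))
    chosen = select(records, ["x", "y"], limit=2)
    print(chosen)
    print(basic_statistics(chosen, "x"))
```

Keeping each step as a separate function mirrors the step-by-step interaction of Figure 2: the summary can be shown before the user commits to a selection, and the statistics run only on the records that survived selection and cleaning.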

The visualization step is where the users will be immersed in the data, allowing them to explore, zoom, manipulate, obtain information about specific data points, etc. We are considering offering some kind of local analysis here, so that the user could filter, identify outliers or even do a simple clustering. At any time, the user will be able to go back and forth between analysis and visualization. We also plan to add the ability to take a “snapshot” of the current state of the visualization, allowing users to share their results or revisit them later.

4.2 Experiments and Evaluation

After the development of the platform, we plan to run an experiment with potential users to observe its usage and to collect feedback. This information will be used to make small adjustments and some fine tuning. At this point, with an almost “ready-to-use” platform, we will run a second experiment aiming to compare the effectiveness of a VR visualization with a conventional 2D desktop visualization. The experiment will have a design very similar to the ones found in the literature, with users executing data analysis tasks using the VR environment and a traditional 2D desktop [7,9].

We plan to position our platform's capabilities among other VA tools, using an approach similar to those found in the literature [5,10]. This will help us to identify mandatory features and gaps that we could fill.

5 PRELIMINARY RESULTS

The project is at an early stage of development, but during the literature review we investigated the technologies involved to understand the technical aspects of the scenario. The activities carried out so far include: A) Creation of a sample data visualization using Unity; B) Testing of Unity’s performance when rendering a large number of data points; C) Investigation of WebVR libraries as an alternative for implementing data visualizations; D) Evaluation of other Visual Analytics tools such as Metabase and Tableau.

We are currently evaluating how to implement the communication between Python and Unity3D in an effective way. To create this integration, we are evaluating the use of Remote Procedure Calls (RPC) through the gRPC library [1], which allows seamless communication between clients and servers using different technologies and supports data streaming with serialization out of the box.

6 CONCLUSION

Our main goal is to contribute to the VA field by creating an open data analysis platform that allows researchers to evaluate or adopt VR visualizations. We hope that the Immersive Analytics field could also benefit from it, as we plan to create something that could be extended to support other immersive interfaces.

REFERENCES

[1] gRPC open-source universal RPC framework. [Online; accessed 30-August-2018].
[2] T. Chandler, M. Cordeil, T. Czauderna, T. Dwyer, J. Glowacki, C. Goncu, M. Klapperstueck, K. Klein, K. Marriott, F. Schreiber, and E. Wilson. Immersive analytics. In 2015 Big Data Visual Analytics (BDVA), pp. 1–8, Sept 2015. doi: 10.1109/BDVA.2015.7314296
[3] C. Donalek, S. G. Djorgovski, A. Cioc, A. Wang, J. Zhang, E. Lawler, S. Yeh, A. Mahabal, M. Graham, A. Drake, et al. Immersive and collaborative data visualization using virtual reality platforms. In Big Data (Big Data), 2014 IEEE International Conference on, pp. 609–614. IEEE, 2014.
[4] E. Y. Gorodov and V. V. Gubarev. Analytical review of data visualization methods in application to big data. JECE, 2013:22:2–22:2, Jan. 2013. doi: 10.1155/2013/969458
[5] J. R. Harger and P. J. Crossno. Comparison of open-source visual analytics toolkits. vol. 8294, 2012. doi: 10.1117/12.911901
[6] D. Keim, J. Kohlhammer, and G. Ellis. Mastering the Information Age: Solving Problems with Visual Analytics. Eurographics Association, 1st ed., 2010.
[7] O. Kwon, C. Muelder, K. Lee, and K. Ma. A study of layout, rendering, and interaction methods for immersive graph visualization. IEEE Transactions on Visualization and Computer Graphics, 22(7):1802–1815, July 2016. doi: 10.1109/TVCG.2016.2520921
[8] B. Marr. Big data: 20 mind-boggling facts everyone must read. big-data-20-mind-boggling-facts-everyone-must-read, Nov 2015. [Online; accessed 30-August-2018].
[9] J. A. Wagner Filho, M. F. Rey, C. M. Freitas, and L. Nedel. Immersive analytics of dimensionally-reduced data scatterplots. In 2nd Workshop on Immersive Analytics. IEEE, 2017.
[10] L. Zhang, A. Stoffel, M. Behrisch, S. Mittelstadt, T. Schreck, R. Pompl, S. Weber, H. Last, and D. Keim. Visual analytics for the big data era: a comparative review of state-of-the-art commercial systems. In Visual Analytics Science and Technology (VAST), 2012 IEEE Conference on, pp. 173–182. IEEE, 2012.