Graph Space Viewer


student project for Drexel University


Juan Crespo
Philadelphia, PA 19103 USA

Copyright © 2009 by Juan Crespo. All Rights Reserved.

ABSTRACT
This paper presents a rapid prototype design for a large scale graph viewing application, where the graphs may comprise hundreds of layers, large numbers of children per node, or combinations of the two.

Keywords
Large scale graphs, GUI, HCI, visualization, hyperbolic geometric projections, viewport, Perl, Java.

INTRODUCTION
Display screens are the main means of computer output, but have limited surface areas, at best measuring about a thousand pixels in height and width. There are also limits on the number of displays that can be driven by a single video card, and large networked multi-displays are too rare and costly. The only current solution to the problem of displaying large quantities of information is the scrolled pane window, an HCI Graphical User Interface (GUI) element that provides a navigable viewport into the data stream. The viewport allows limited access, but cuts off data outside the bounds unless it is scrolled into view. The viewport must also compete with other control widgets on the GUI for real estate.

Attempts to visualize large data sets have used various strategies to fit within the viewport:
• layout algorithms conserve real estate [3][4][9][12],
• nonlinear magnification can reduce scrolling [1][5],
• geometric distortions [2][8][9] and fisheyes enhance local detail while preserving global context [7],
• multiple dimensional data clustering [6] in sparse spaces,
• the use of coloring, transparency [11], topological mappings [7][10] and 3D perspective [8][9][10] in support of all the above.

Data volumes continue to grow into the petabyte (10^15) range. Structural maps of the web can easily number in the millions of nodes; as these nodes are dispersed both geographically and asymmetrically, one can find hundreds of hosts collocated to within several hundred square meters (the footprint of a single skyscraper), while other locations show none. Combinatorial explosion prevents any host display from presenting all of this in real time, while the asymmetry precludes any geographic accuracy except at the very small scale. Psychology and physiology also conspire to limit the amount of data we can retain and comprehend.

The only realistic solution is a combined presentation of bare summary schemas at the global level, and a detailed (high fidelity) local view. Clustering, summarization, sub-setting and pruning of the data space for presentation to the user is the only way large volumes of data can be provided without it all merging into colorful blobs of high density but devoid of meaning.

BACKGROUND
A rapid prototype selects among competing strategies and what can be done within a limited time constraint. The following assumptions were made: no interactive graph changes are allowed, and the model is preprocessed ahead of loading. An early evaluation study [3] removed cone tree candidates from further consideration for this project. The following is the GUI's set of selected features of interest:
• A display of global context (the bird's eye view)
• A zoomed in view (detailed context)
• A property display for the node of interest
• A search by property
• Use of Walrus as a contrasting visualization (generate Walrus compatible .graph files).

Initial Feasibility
Prior to designing the prototype, a test of the development environment yielded an initial bound of 2.5 x 10^5 child nodes hanging off the root before the Java Virtual Machine (JVM) exhausted its default heap space. Because most of the errors were generated by String constructors in the ArrayList class, a strategy similar to the one employed by the Walrus Project was adopted. Node names were minimized to a single monotonic numeric identifier (root node id = 0), assigned by creation order. Following this, an upper bound of 8 x 10^6 child nodes hanging off the root was reached, followed by adding 7 x 10^4 nested nodes before the JVM exhausted its heap space (on a 12GB host, the JVM settings were: -Xms128m -Xmx8000m). The source data was collected from the bash shell (cd /dirname; ls -AlR > ./dirname_doclist.dat), generating a 3 MB file, which was processed by several Perl scripts (in stages) to create the initial graph files, then compiled into Eclipse (Galileo distribution).
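
The replacement of string node names by creation-order numeric identifiers, described above, can be sketched roughly as follows. This is a minimal illustration, not the prototype's or Walrus's actual code: the class name NodeIdRegistry and its methods are hypothetical, and it assumes that each node name (for example a file path) is registered once, with the root registered first so that it receives id 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: swaps long node names for compact integer ids. */
final class NodeIdRegistry {
    private final Map<String, Integer> idByName = new HashMap<>();
    private final List<String> nameById = new ArrayList<>();

    /** Returns the existing id for a name, or assigns the next id in creation order. */
    int idFor(String name) {
        Integer existing = idByName.get(name);
        if (existing != null) {
            return existing;
        }
        int id = nameById.size(); // the first name registered (the root) gets id 0
        idByName.put(name, id);
        nameById.add(name);
        return id;
    }

    /** Looks the original name back up, e.g. for display in a properties pane. */
    String nameOf(int id) {
        return nameById.get(id);
    }

    int size() {
        return nameById.size();
    }

    public static void main(String[] args) {
        NodeIdRegistry registry = new NodeIdRegistry();
        int root = registry.idFor("/dirname");
        int child = registry.idFor("/dirname/a.txt");
        // Asking again for a known name returns the same id instead of a new one.
        System.out.println(root + " " + child + " " + registry.idFor("/dirname")); // prints "0 1 0"
    }
}
```

With this arrangement the graph structures only need to hold int values, while the full names live in one table and are fetched on demand.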
Design Patterns
Lazy Evaluation and Proxy Pattern. A NodeCache object maintains the summary elements of the generated graph (number of levels, child nodes, current reference, last reference). Traversal of the entire node list is delayed unless the tree is restructured (addition/deletion operations or initial loads from disk). The AddressTable (a proxy) provides cross references of the node names with their object references; this allows search and access to the detailed information (the node's PropertyList), without having to repeat the search.

SYSTEM IMPLEMENTATION
The ability to use a consistent dataset was one goal. To this end (and to help minimize the development time), three 3rd party packages were selected: Piccolo2D, GraphStream, and CAIDA's Walrus.

Test data was populated from the development host's terminal, then processed in stages by a set of Perl scripts to format the test data into file formats acceptable to the supporting 3rd party libraries (libsea and graphstream). Piccolo2D is a drawing package and does not depend on any particular file format. GraphStream was selected because it provided several algorithms for speedy traversal of large graphs (>10^4), and the capability of reading/writing in a format that was easily generated by a Perl script. It maintains arbitrary collections of user-defined properties and has a built in 2D rendering package. CAIDA's Walrus was selected for its even more impressive hyperbolic display renderings, but needed to be updated to Java 1.6. It has a file reader whose format (while poorly documented) was also amenable to generation by the same Perl script. A fourth package (JUNG) was considered, but its API was more complex than the other two combined, given the time constraints.

The program datasets were generated by recursive descent (e.g.: ls -AlR [dirname] > dirname_dataset_file). Some directories generated 30 Megabyte files, with over 80,000 entries. The resulting dataset file is then processed by the Perl script, which creates a GraphStream formatted file (.dgs) and a Libsea formatted file (.graph). It should be noted that while this demonstration uses directory trees as its principal item for display, the supporting packages are generic, and the application could be used to view data in other fields (ontologies, molecular modeling, genetic graph groupings, and any other field where data can be structured in any kind of graph format). The script was easily extended to also produce GraphML and ATT's dot formats, while remaining at under 250 lines of code. Currently Perl becomes a performance bottleneck for datasets containing 40K or more nodes, pegging the CPU at 100% for hours at a time. For this reason, datasets were limited to those under 40K. The Perl script enforces a dataset naming convention, and is also responsible for assigning colors from a built-in color table. Nodes at the same level are assigned the same color in the .dgs output file.

The user selects a .graph file from the file chooser (File -> Open). To assist the user's ability to find datasets, a FileFilter is installed on the FileChooser dialog to select directories and .graph files only. The system reads in the selected .graph, and the corresponding .dgs file (if available). The right-side window displays the default coordinates in 3D space (Y pointing up, X to the right, and Z out) at a 1:1 zoom. This is referred to as the Home position. The program can also generate new datasets.

Bird's Eye View
The Bird's Eye View is maintained by a zoom capable view pane that displays a schematic graphic of the hierarchical structure. A means to scale the horizontal size is provided. Two modes are provided: a text only view (PText objects that auto pan and scale based on the user's manipulation of the mouse on the panel), and a circular 2D connected graph layout (called a rosette). See below.

[Figures: Text ZUI Mode; Circular Rosette ZUI Mode]

The CrossReference object (accessed during file opens to populate the data on the ZUI) is responsible for the rosette layout. Translational pans (x-y directions) and in-out zoom for the textual display are handled completely by the Piccolo2D framework, which also allows mouse listeners to be installed on the pane for user selection of the node in question. The rosette layout algorithm iterates over the libsea graph and draws rather striking circular displays of the current structure. To prevent the problem of total obscuration of the graph by overlaying tens of thousands of nodes, it is limited to a depth of ten.

The Text ZUI is self-evident, as it identifies the node by name (each line is a file node). The color assignment was made by the Perl script and stored in the .dgs file. A mouse listener is installed on this pane that sets the detailed property list contents based on which node is selected. Since both ZUI modes have auto-zoom, no scrollbars were needed.

Properties View Pane
This consists of an informational display that summarizes the file contents (source file name, number of nodes and links contained), and a detailed pane that lists the selected node's properties (for directory browsing, this would be filename, file size, creation/mod dates, file type, etc.). The node is selected from the Bird's Eye View, from the Search Pane, or from the Hyperbolic (hyper3D display) pane. If no .dgs file was available, the detail pane remains blank. In addition, on the details pane resides the launcher button, which uses the Java Process mechanism to spawn viewers for the file types that the operating system platform can support. The FileLauncher class is sensitive to the underlying operating system (MacOS X vs. Windows) and supports bitmaps (.bmp/.pict), media files (.mov/.avi), text (.txt), code files (.java/.h/.c/.cpp), and help files (.chm). See the UML package diagram for architectural details.

[Figure: UML Package Diagram]

A CrossReference object maintains the links between the attributes stored in the .dgs file, and the 3D coordinates computed by the CAIDA package during runtime. These coordinates are not stored in the .graph file but reside in memory after the 3D placement algorithms run. A PropertyPane object displays a summary of the file contents, and a detail pane tracks user selections. A SwingWorker thread performs the GraphStream tree traversal while a separate thread does the libsea computations. Internally, the program disables the use of the GraphStream display() function to prevent its built-in viewer frame from popping up. Source code for that viewer was not available for download, so it was next to impossible to extract its canvas objects from their window container or, conversely, to add SpaceViewer components to its frame.

Hyper Pane
The Walrus code was modified to run without warnings on a Java 1.6 environment (tested on both Windows and MacOS X). The Hyper3D canvas object uses Java3D for its rendering. The Klein model projection is used to present a 3D visualization of the graph, calculated on the fly and presented initially at the Home position.
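
The reason a Klein model projection suits very large graphs can be shown with a small worked sketch. It relies on a standard property of the Klein (projective) model at curvature -1: a point at hyperbolic distance d from the origin sits at Euclidean radius tanh(d) inside the unit ball, so ever more distant nodes crowd toward the boundary without leaving it. The class and method names below are hypothetical, and nothing here is taken from the Walrus or Java3D code.

```java
/** Illustrative only: radial placement in the Klein (projective) model of hyperbolic 3-space. */
final class KleinModelSketch {

    /** Euclidean radius of a point at the given hyperbolic distance from the origin; approaches 1 as d grows. */
    static double kleinRadius(double hyperbolicDistance) {
        return Math.tanh(hyperbolicDistance);
    }

    /** Inverse mapping: hyperbolic distance from the origin for a Euclidean radius r in [0, 1). */
    static double hyperbolicDistance(double r) {
        if (r < 0.0 || r >= 1.0) {
            throw new IllegalArgumentException("radius must be in [0, 1)");
        }
        return 0.5 * (Math.log1p(r) - Math.log1p(-r)); // artanh(r)
    }

    /** Places a point along a unit direction vector at the given hyperbolic distance. */
    static double[] place(double[] unitDirection, double hyperbolicDistance) {
        double r = kleinRadius(hyperbolicDistance);
        return new double[] { r * unitDirection[0], r * unitDirection[1], r * unitDirection[2] };
    }

    public static void main(String[] args) {
        // Equal steps in hyperbolic distance take up less and less of the display.
        for (double d = 1.0; d <= 5.0; d += 1.0) {
            System.out.printf("distance %.0f -> radius %.6f%n", d, kleinRadius(d));
        }
    }
}
```

Translating the model so that a selected node moves to the origin then gives that node's neighborhood the largest share of the display, which is the focus-plus-context behavior the Hyper Pane provides.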
[Figure: The Hyper Pane Display (zoomed in)]

Search Pane
A means to rapidly scan the dataset contents for a particular node identifier is provided via a textbox on the main display. The search item is typed in and the search button is pressed to initiate the scan. The search result is published on the detail pane. Two arrow buttons step to the next/previous item in the graph. An up arrow traverses to the parent of the previous selection. A Home button returns to the root of the display. An option menu presents the set of attributes to search on (options are file name label, date, size, owner, and group). See below:

[Figure: Search Pane]

On the hyper display, keyboard listeners allow zoom-in ("." key) and zoom-out ("," key). Mouse listeners allow rotations (CTRL + mouse movement), and mouse clicks disable the rotations. Have your barf bag ready, as some users reported disorientation and dizziness. Frame rates were up to 60 fps on a Dell Optiplex 755 running Windows XP SP3, and 65+ fps on an Apple Mac Pro running Snow Leopard (10.6.2). Right-mouse clicks on a node cause lookups in the cross-reference to extract the properties listed on that node for
display in the details pane. The node id is also written to the search text box. Entering a node id in that field causes the cross-reference to display the details for that item as well. On the hyper display, right-clicking on a node causes the entire model to translate, so that the selected node occupies the center of the display. An option to dynamically update the display is given to the user: the listener tracks every mouse motion event and selects the closest node under the pointer for immediate translation and redisplay.

The system menus allow user selection of the colors used for nodes and links on the hyper display, including legacy CAIDA color schemes. There are also menu controls to enable/disable transparency of nodes and edges, the rendering of the axes and their labels, cycling the display from node root to the 10th rank nodes, pruning the tree, and traversing from leaf to root. A color swatch preview is available on these menus.

Status
A status bar on the bottom of the display shows progress on lengthy tasks such as file reads, as well as short messages. Longer messages may occur in pop up dialogs.

The application was developed using the Eclipse Galileo Java framework.

EVALUATION
Two machines were used for the evaluation sessions: a 10GB Dual Quad-Core Mac Pro (running Mac OS X, version 10.6.2, Java 1.6), and a 2GB Dual Core Mac Book Pro (running 10.5.11 and Java 1.5/1.6). Each was also equipped with Microsoft Windows XP SP3 running Java 1.7 (beta). An additional Optiplex 755 running Windows XP SP3 was used for a demonstration at L-3.

The initial prototype was presented to a group of 5 users (4 males, 1 female; with varying levels of computer skills and age ranges).

The initial task was whether the users could navigate the GUI as is. The second task was the generation of new file sets to explore using the viewer application, and how easy (or difficult) that task proved to be.

Table 1 gives a brief description of the users:

Identifier   Computer Proficiency   Education Level    Age
A            poor                   Ph.D (English)     71
B            fair                   Ma. Sc. (E.E)      47
C            good                   Ms. Sc. (Math)     31
D            excellent              Ba. Sc. (CS)       35
E            excellent              Ba. Sc. (CS)       27

Computer Proficiency was rated on the user's self-confessed experience levels (ratings ranged from poor: very light computer skills bordering on non-existent; fair: some ability to navigate the typical computer; good: familiarity with most features; excellent: defined as an expert capable of making a living at programming). Education levels ranged from full professorship to Ba. Science degrees. No one with High School level or less was available for the trials.

The initial prototype garnered some high marks from the computer literate users (rated 3 and above); the tool tip text help was rated higher by the good-to-poor rated subjects. All users reported the initial ZUI Pane's rosette display looked confusing (no labels), but the mouse tracking updates and the detail property pane's display permitted cognitive context to be quickly established. They also noted the extreme sluggishness of the display's update when zooming the rosette, or when selecting the pane for a drag motion, as well as the delays in responding to mouse events over a menu item. The ZUI pane is located in the upper left, where trying to select a menu item caused the mouse listeners to fire automatically even when the user intended a menu selection and not an info request from the ZUI. The addition of an enable/disable option to mouse tracking on the ZUI cut down on the false selections. Oddly, the ZUI performed much better on the smaller laptop than the big machine (this may be due to host differences in the Java run-times). The Text-only ZUI got low marks, as it repeats what is already available in greater detail in the Property Pane view.

4 out of 5 users requested the Search Pane occupy a more central position. The initial location of center top was considered too distant from the property pane.

Two users made the same new feature request for the prototype. They asked for the ability to see the hyper-pane graph cycle animate from root to leaf nodes. They also asked that the screen refresh menu item be made more convenient (instead of being buried in the Render Menu).

Three users commented on the progress bar's tendency to behave erratically during long operations. The Walrus package has an internal mechanism deep in its code that switches to extended precision during some dataset loads. Because of this switch, what the progress bar reports is not consistent with the actual results.

DISCUSSION
The Java3D environment has known bugs. Labeling of nodes does not work at all (even under Windows). Attempting the addition of a glass pane component to the hyper display caused the entire pane to disappear. The CAIDA organization did announce that the Walrus package may be ported to OpenGL, but details are sketchy at this time.

The lack of source code for the Graphstream viewer prevented that swing component from being immediately incorporated as part of the main window container, and the
time constraints prevented a reverse engineering effort, so it remains in an independent window. Initial size requests are completely ignored and it always comes up in a small frame (but manual resizing is allowed on that viewer component once it is displayed). Aside from this, the rest of the Graphstream API was a flawless performer.

The Piccolo2D API was an easy addition to make, and its performance was quite stable and responsive (except on the Mac Pro machine). However, its added memory hit to the already large allocations for the libsea and graphstream objects cuts down on the maximum size the dataset can achieve. The largest dataset processed (System library: 65K nodes) took 4+ hours for Perl to digest, takes almost half an hour to load (on the Mac Pro), and does not load at all on the laptop. If we forgo the ZUI Pane dependency on GraphStream (i.e., load datasets containing only .graph files), then System loads on the laptop, but we also lose the search capability.

Further work is envisioned:
A. Tie the selection in the Search Pane to a hyper-pane node repositioning itself to the center of the display. The mouse clicks on the arrow buttons would also cause a position update. In addition, the node selected would be highlighted on the ZUI Pane as well (if the node is off screen, an automatic translation of the PCamera object could be forced, to bring it into view).
B. Save the session info (color setting, hyper-pane position, last selected node id). Color selection would also apply to the ZUI Rosette.
C. Fix the progress bar update issue.
D. Post the project under the GNU GPL.

ACKNOWLEDGMENTS
This work would not be possible without the contributions of the members of Sun Microsystems, Piccolo2D, and the GraphStream Library developers. I would also like to thank some of the folks at L-3 Communications in Camden, NJ for their time and comments (you know who you are).

REFERENCES
1. Koike, H., & Yoshihara, H. Fractal Approaches for Visualizing Huge Hierarchies. Proceedings of the 1993 IEEE Symposium on Visual Languages, pp. 55-60, IEEE/CS, 1993. Department of Communications and Systems, 1-5-1, Chofugaoka, Chofu, Tokyo 182, Japan.
2. Cannon, J. W., Floyd, W. J., Kenyon, R., & Parry, W. R. Hyperbolic Geometry. Flavors of Geometry, MSRI Publications, Volume 31, 1997.
3. Cockburn, Andrew, & McKenzie, Bruce. An Evaluation of Cone Trees. ACM. …BCSHCI.pdf
4. Dutot, A., Guinand, F., Olivier, D., & Pigné, Y. GraphStream: A Tool for Bridging the Gap between Complex Systems and Dynamic Graphs.
5. Gansner, Koren & North. Topological Fisheye Views for Visualizing Large Graphs. IEEE Transactions on Visualization and Computer Graphics.
6. Keahey, T. Alan. Visualization of High-Dimensional Clusters Using Nonlinear Magnification. Proceedings of SPIE Visual Data Exploration and Analysis VI, January 1999.
7. Keahey, T. Alan. Area-Normalized Thematic Views. Los Alamos National Laboratory.
8. Kolliopoulos, Alexander. The 1-Hyperbolic Projection for User Interfaces. Journal of Computing Sciences in Colleges, Volume 18, Issue 4, 2003.
9. Munzner, Tamara. Interactive Visualization of Large Graphs and Networks (doctoral thesis), Stanford University, June 2000.
10. Nowell, Lucy Terry, & Hetzler, Elizabeth G. Graphical Encodings: Bet You Can't Use Just One! ACM.
11. Shipman, F. M., III, and Marshall, Catherine C. Spatial Hypertext: An Alternative to Navigational and Semantic Links. ACM Computing Surveys, Vol. 31, Number 4es, December 1999. …37.html
12. Ellson, J., Gansner, E., Koutsofios, E., & North, S. Graphviz and Dynagraph: Static and Dynamic Graph Drawing Tools. ATT.
13. Piccolo2D API Documentation.