Thank you for attending my thesis defense and for being part of this jury. In this presentation I will summarize our work, the Mental Vision platform.
The outline of my presentation is as follows. I will first introduce the context and the goals of this thesis. I will then explain the different elements of our contribution and the results we obtained by evaluating them through in vitro and real-world experiments. I will conclude with some final remarks and future perspectives.
Let’s start with the introduction.
The universe of hardware designed for, or potentially usable for, computer graphics and virtual reality is very wide. It ranges from pocket-sized devices (like mobile phones and personal digital assistants) up to room-sized environments like the CAVE. Between these two extremes, there is a plethora of heterogeneous devices such as desktop and portable computers, ultra-mobile PCs, gaming consoles, dedicated workstations, etc.
The panorama is even wider if we also take software aspects into account: different operating systems, different programming interfaces to access low-level 3D acceleration, 3D content editors, file formats to store and transfer 3D content from one application to another, and so on.
Software developers have to face this heterogeneous reality when they want to use these technologies in projects involving computer graphics and virtual reality. Depending on the scenario (which could be gaming, research, biomedical imagery, education, simulation, mobile entertainment and many others), one specific set of software tools can be better than another. But a global, passe-partout solution is not available.
In fact, computer graphics and virtual reality lack a universal solution working across all these devices and contexts, unlike other fields of computer technology, where widely adopted and robust standards exist (like TCP/IP sockets for communications or MySQL for databases). On the other hand, there are many 3D software frameworks addressing a subgroup of this universe, but they are often oriented towards a specific context (entertainment, simulation) or a specific user pool (elite developers and specialists). These frameworks cannot be easily used and adapted to fit different scenarios. Most existing real-time 3D graphics engines favor speed and performance over versatility and user-friendliness, making them less suited for teaching and research applications.
Usually, projects involving computer graphics first select a target device and then choose the visualization software among the options available on that platform.
With our approach, we wanted to change the point of view. We preferred to postpone this choice by offering users the opportunity to switch from one device to another later. To make our solution usable by a wider user pool, we also reduced the complexity of the programming interface to a minimum, particularly regarding the modifications required to change platform. This way we wanted to create a graphics system accessible to both beginners and advanced users, from students to researchers, that still features advanced functionalities, just through a less sophisticated interface.
To simplify access to less common, “hostile” graphics environments, we extended the range of targeted hardware systems towards two extremes. On one side, we tried to reduce the weight and requirements for mobile 3D graphics. We made our framework run on a wearable setup as well, using devices that are less shock-sensitive and less encumbering than the ones used by previous solutions.
At the other extreme, we created a custom CAVE system using commodity hardware in order to reduce its cost. We tried to find a good compromise between the architecture of professional systems and low-cost approaches. We fixed most of the problems introduced by non-dedicated hardware through software techniques integrated in our solution.
This is an overview of the universe targeted by the Mental Vision platform. We do not claim to create the ultimate standard for computer graphics frameworks. With our approach we want to show how it is possible to combine opposite characteristics under the same roof: how to create a platform giving access to both a CAVE and a PDA, and how to give students and researchers a tool that simplifies the development of their projects.
Since we work every day in research and educational contexts, we oriented our application scenarios in these two directions, but our platform is not limited to them and can also be used more broadly.
As far as I know, we are among the first to propose a graphics framework working on PDA, PC and CAVE under the same interface.
The Mental Vision platform is composed of the following elements.
• MVisio 3D graphics engine
  – Core unit and common denominator of the whole platform
• A series of pedagogical modules
  – Developed on top of MVisio
  – Interactive demonstrators of selected topics in CG and VR
  – Examples of concrete applications developed through MVisio, reusable in other projects
• Corollary tools
  – A custom CAVE system developed by reducing the hardware costs of its components and by making its use more accessible
  – A lightweight wearable 3D system
Mental Vision: a cross-device 2D/3D graphics engine
• Very simple interface (good learning curve)
• Maximizing effects while reducing lines of code
• Minimizing differences among platforms
• Making many operations automatic (like resource management, content adaptation, performance tuning)
• You can start working on the real thing sooner
• Compact in size and fast (important for low-profile/mobile devices)
• Robust (consistent results across different devices)
This is just a list of the different features integrated in the engine. I will not go into too much detail: let us say that most of them are required for MVisio to be considered a graphics engine.
In this slide you can see an example of what MVisio is capable of, bringing modern graphics to the different devices targeted. Effects like bloom lighting, depth of field, soft shadowing and high dynamic range are automatically managed by the engine and can be activated or deactivated at will. Despite its simplified interface, MVisio is not a sandbox preventing the implementation of advanced techniques and algorithms.
The overall MVisio architecture looks like this. Users can create graphics applications by loading 3D content from files and manipulating it through the MVisio API. The complexity of MVisio is hidden behind the interface, which is the same on the three devices targeted by our solution.
If you are on PC, the PC pipeline is used and images are generated according to the version of OpenGL your machine is running. More recent computers benefit from the shader-based pipeline and other optimizations that improve speed and visual appearance. On older and less powerful machines, MVisio uses the fixed pipeline. In any case, MVisio keeps producing results independently of the hardware: only the quality changes.
If you are on PDA, OpenGL ES is used instead of OpenGL. This pipeline is very close to the fixed pipeline available on older PCs. At the time our software was created, shaders were not available on mobile devices (they have been for about a week now, with the new iPhone).
On CAVE, MVisio starts a client/server architecture with one PC per CAVE side. Locally performed graphics instructions are forwarded to the different connected machines, each one running an instance of MVisio for PC. Models, animations and textures are automatically synchronized at startup.
In all three cases, users don’t need to modify their data: MVisio automatically adapts the content if necessary.
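The selection logic just described can be sketched roughly as follows. This is only an illustration and not the MVisio source code: the names Device, Pipeline and selectPipeline are hypothetical, and the OpenGL 2.0 threshold is an assumption (it is the version in which GLSL shaders entered the core specification).

```cpp
#include <cassert>

// All names below are hypothetical and only illustrate the selection logic;
// they are not part of the MVisio API.
enum class Device { PC, PDA, CAVE };
enum class Pipeline { Shader, Fixed, OpenGLES, CaveClientServer };

// Picks a rendering path for the current device.
// glMajorVersion is only consulted on PC. The 2.0 threshold is an assumption:
// it is the OpenGL version in which GLSL shaders became a core feature.
Pipeline selectPipeline(Device device, int glMajorVersion) {
    assert(glMajorVersion >= 1);
    switch (device) {
        case Device::PDA:
            return Pipeline::OpenGLES;          // close to the PC fixed pipeline
        case Device::CAVE:
            return Pipeline::CaveClientServer;  // each client then runs the PC path
        case Device::PC:
            return glMajorVersion >= 2 ? Pipeline::Shader : Pipeline::Fixed;
    }
    return Pipeline::Fixed;  // conservative fallback: still produce an image
}
```

The point of the fallback is the one made above: whatever the hardware, some pipeline is always chosen, and only the image quality differs.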
A basic MVisio-based application looks like this. In a few lines of code users can initialize the engine, load a complex scene from an external file (including light sources, meshes, textures, etc.), display it and free resources. Thanks to our simplified interface, it is easy to set up a working 3D environment in a short amount of time. As we will see later, this same piece of code can be quickly modified and adapted to run in our CAVE or on a PDA.
Pedagogical modules are small demos allowing students and teachers to interact dynamically with the algorithms and concepts introduced during the class. They cover several topics, ranging from spline generation to 3D camera manipulation, passing through animation, shading techniques, etc.
With our modules we wanted to break the limitations of static images and videos, giving teachers a more interactive support for their explanations. Each module uses the MVisio engine and can be executed on virtually any personal computer, thanks to the characteristics of our graphics engine.
Modules are also distributed with their source code, in order to show students a concrete implementation of the different topics. This approach simplifies the implementation of such techniques in student projects (or any project based on MVisio).
Typically, modules feature:
• a screenshot of the lesson slide, to keep a guiding thread with the ex cathedra explanations
• an intuitive interface (few buttons, click & drag interaction), allowing users to easily change the different parameters of the algorithms and see the result
• a PowerPoint-like style, to give a more coherent design to class notes and presentations when modules are started directly from a slide
Our teaching/learning pipeline looks as follows. The teacher uses a module as interactive support during classes to improve the explanations. At home (or remotely, in the case of e-learning), students can download the same module and practice, or try to repeat the manipulations seen during the lesson as an exercise.
If the technique introduced in the class is also one of the assignments for practical work, students can look at the module source code to figure out how to implement the same algorithm in their projects. Thanks to this approach, we have a unified series of tools ranging from theory to practice.
Many assistants in our laboratory also work with MVisio for their research projects. This makes it easier for them to help students fix their problems, since both assistants and students use the same software and assistants don’t need to learn a new one only for teaching activities.
Our CAVE is made of a home-cinema screen folded into the shape of a cube. We used nylon wires to fold it at the corners. The floor is made from a white wooden panel. Four computers (one per side) generate images, orchestrated by a server PC. Computers are connected through a private gigabit LAN. Two beamers are connected to each PC, using dual-output video cards. Up to three users can stand comfortably within the CAVE. Audio is integrated through a standard Dolby home-cinema audio system.
Stereoscopic rendering has been implemented in two ways. The easiest one uses colored glasses. This solution is very inexpensive and practical, mainly when we have many visitors coming to see demos in our laboratory. The other solution is more elegant and sophisticated. It is based on a very simple principle: two beamers, each one continuously projecting images for one eye. Shutters are put in front of each beamer and synchronized with shutter glasses worn by users. This way we don’t need to rely on expensive projectors with high vertical refresh rates (like CRT beamers). The separate shutters and the shutter glasses are synchronized through an external clock and infrared emitters. By using two projectors instead of one we also improved the overall luminosity of the system, since more lamps generate light.
Unfortunately, the adoption of commodity hardware and the use of two beamers introduced a series of problems. The main ones are:
• First: corners curved like parentheses around the nylon wires
• Second: images projected from two beamers were not aligned when superposed on the CAVE sides (even after a fine-tuned calibration of the support)
• Third: different pixel resolutions, coming from side projectors being rotated by 90 degrees
• Fourth: projectors and room conditions may change over time (because of humidity and temperature), so recalibration is often required
We fixed all these problems through a software calibration system. We used render-to-texture to generate images into textures, which were then rendered on a triangle grid following the shape of the CAVE. Control points on the border of these grids made it possible to match the irregular, curved shape of the CAVE sides, while control points inside the grid made it possible to align the superposed images of the two beamers. We developed an application to calibrate the CAVE walls in a few minutes. A single user can comfortably select and adapt the different grids from inside the CAVE by using a wireless joypad.
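To give an idea of how such a grid behaves, here is a minimal sketch under simplifying assumptions: a 2D grid whose vertices keep fixed texture coordinates while their drawing positions are moved by the user, with bilinear interpolation inside each cell. The WarpGrid class and its methods are hypothetical names for illustration, not the actual calibration code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Hypothetical calibration grid (not MVisio code). Vertex (c, r) always samples
// the rendered texture at (c / (cols-1), r / (rows-1)); calibration only moves
// the position at which that vertex is drawn on the wall.
class WarpGrid {
public:
    WarpGrid(std::size_t cols, std::size_t rows)
        : cols_(cols), rows_(rows), pos_(cols * rows) {
        assert(cols >= 2 && rows >= 2);
        for (std::size_t r = 0; r < rows_; ++r)
            for (std::size_t c = 0; c < cols_; ++c)
                pos_[r * cols_ + c] = { static_cast<float>(c) / (cols_ - 1),
                                        static_cast<float>(r) / (rows_ - 1) };
    }

    // Moves one control point, e.g. in response to joypad input.
    void nudge(std::size_t c, std::size_t r, float dx, float dy) {
        assert(c < cols_ && r < rows_);
        pos_[r * cols_ + c].x += dx;
        pos_[r * cols_ + c].y += dy;
    }

    // Calibrated position for texture coordinate (u, v) in [0, 1], obtained by
    // bilinear interpolation inside the grid cell containing (u, v).
    Vec2 warp(float u, float v) const {
        const float fx = std::min(std::max(u, 0.0f), 1.0f) * (cols_ - 1);
        const float fy = std::min(std::max(v, 0.0f), 1.0f) * (rows_ - 1);
        const std::size_t c0 = std::min(static_cast<std::size_t>(std::floor(fx)), cols_ - 2);
        const std::size_t r0 = std::min(static_cast<std::size_t>(std::floor(fy)), rows_ - 2);
        const float tx = fx - c0, ty = fy - r0;
        const Vec2 a = at(c0, r0),     b = at(c0 + 1, r0);
        const Vec2 c = at(c0, r0 + 1), d = at(c0 + 1, r0 + 1);
        return { mix(mix(a.x, b.x, tx), mix(c.x, d.x, tx), ty),
                 mix(mix(a.y, b.y, tx), mix(c.y, d.y, tx), ty) };
    }

private:
    static float mix(float a, float b, float t) { return a + (b - a) * t; }
    Vec2 at(std::size_t c, std::size_t r) const { return pos_[r * cols_ + c]; }

    std::size_t cols_, rows_;
    std::vector<Vec2> pos_;
};
```

In this sketch, border control points would be nudged to follow the curved edges of a wall, and inner ones to bring the two beamers' images on top of each other; the graphics hardware performs the same interpolation when it draws the textured triangles.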
An adaptation for the CAVE of the basic application shown earlier looks as follows. Only very small modifications are required. A #define directive tells MVisio to activate the CAVE pipeline of the engine. Information about the number of CAVE sides to use is stored in a configuration file, which also contains the IP addresses of the different clients. The only important difference concerns the putUser method. This procedure specifies the position of the user's head, in meters, inside the CAVE. This information is used by the graphics engine to compute the correct projection matrices and the stereoscopic rendering.
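As an illustration of why the head position matters, the following sketch computes an off-axis viewing frustum for a single front wall from the position of one eye. It is a simplified, assumed formulation (axis-aligned wall, user facing the wall with the head upright); the types and function names are hypothetical and do not come from MVisio.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// A wall facing the viewer: it lies in the plane z = wall.z and spans
// [xMin, xMax] x [yMin, yMax], all in meters, in CAVE coordinates.
struct Wall { float xMin, xMax, yMin, yMax, z; };

struct Frustum { float left, right, bottom, top, zNear, zFar; };

// Eye position derived from the head position. side is -1 for the left eye
// and +1 for the right eye. Assumes the user faces the front wall.
Vec3 eyePosition(Vec3 head, float eyeSeparation, int side) {
    assert(side == -1 || side == 1);
    return { head.x + side * 0.5f * eyeSeparation, head.y, head.z };
}

// Off-axis frustum for a front wall seen from 'eye' (looking along -z).
// The wall edges are projected onto the near plane by similar triangles.
Frustum frontWallFrustum(const Wall& wall, Vec3 eye, float zNear, float zFar) {
    const float distance = eye.z - wall.z;  // eye-to-wall distance
    assert(distance > 0.0f && zNear > 0.0f && zFar > zNear);
    const float scale = zNear / distance;
    return { (wall.xMin - eye.x) * scale, (wall.xMax - eye.x) * scale,
             (wall.yMin - eye.y) * scale, (wall.yMax - eye.y) * scale,
             zNear, zFar };
}
```

The six values have the same meaning as the arguments of OpenGL's glFrustum. When the user stands in the middle of the wall the frustum is symmetric; as soon as the head moves, it becomes asymmetric, which is why the engine needs the head position for every frame and for each eye.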
Our wearable framework is made of a PDA (used as the core device), a see-through head-mounted display, and some connectivity and power-supply components. We had to build a signal adapter since the VGA signal emitted by the PDA was not compatible with the HMD. The entire system weighs about half a kilo and can run for about 2 hours, depending on the battery used.
The wearable framework uses the PDA version of our graphics engine. Porting an MVisio-based application from PC to PDA is a very easy task, as we can see in this source code example. The code is almost identical to the version for PC.
The Mental Vision platform is completed by some additional tools. One of them is a plugin for 3D Studio MAX. This plugin allows users to export, with just one button click, the whole content available in 3D Studio MAX (including textures, light sources, meshes, material properties, etc.). Exported files can be directly loaded into MVisio with just one line of code, as shown in the different source code examples illustrated so far. From the slides you can see how close the scene loaded in MVisio is to its original in 3D Studio MAX.
To obtain a technical evaluation of our framework, we developed a benchmark application running on the different devices supported by our solution. The goal of this benchmark is to evaluate the speed, code modifications and image fidelity across the different devices. Unlike other existing benchmarks, which heavily stress the hardware to determine which computer can output the most millions of triangles, our speed indicator is used to show that our approach provides enough responsiveness on the target system.
These are the results we obtained by running the software version of MVisio for PDA. It is a bit slow because rendering is not performed by hardware-accelerated units, and the computational power of a PDA is very limited. Nevertheless, we achieve at least interactive frame rates on all the models tested, and this version of MVisio can be used on any PDA based on Windows Mobile. We tested it on 6 different PDAs and always obtained the same results (with some speed variation depending on the power of the CPU).
The hardware-accelerated version of MVisio for PDA obviously performs faster and at a higher screen resolution. Images are sharper than on the previous slide, but they show the same characteristics and aspect ratio.
The same considerations apply to the PC version, which keeps the same image rendition but, obviously, at higher frame rates.
Things are a bit different in the CAVE. Graphical user interfaces are not rendered in the CAVE but kept on the server PC, while all the 3D rendering is executed on the different client PCs. In this case too, images are rendered with the same characteristics shown on the previous devices. The speed impact is mainly due to the network-based synchronization system, which waits for all four CAVE sides to complete their rendering before displaying a new image.
Our framework has been widely used by students in their class projects (like in the two upper images). We asked them to create a curling simulation (one year) and a snooker table (this year). During these VR projects they had to deal with many aspects:
• Real-time graphics
• Basic physics
• Stereoscopic rendering
• User input and force feedback
• Positional audio
• Game mechanics
Thanks to MVisio, and unlike in previous years, we relieved students of the task of creating a CG system from scratch. This way they could focus more on the orchestration of the different elements composing a user-centric virtual environment instead of learning plain OpenGL all the time (which is more the goal of computer graphics classes than of virtual reality ones).
Besides class projects, MVisio has also been used for diploma and master projects, like in the bottom right images, to simulate a truck gearbox interface with haptic feedback.
Our platform has been used in many research and thesis projects. Researchers took advantage of the different MVisio characteristics to spend less time on their visualization needs, thus focusing more on their thesis instead of computer graphics. Among the most welcomed features: the integrated graphical interface, easy portability from PC to PDA or from PC to CAVE, and versatility ranging from the creation of CAD-like applications (upper right image) to 3D simulations (bottom right). It is important to mention that these colleagues first started working with different software, then asked to use MVisio and were happy with their choice.
Our work has been published in one journal and at three international conferences, and has been used and cited in many other works.
The Mental Vision platform is also often used to present scientific technologies to the public during special events or open-door days. For instance, in the upper left image we used a portable adaptation of MVisio (as a V-Cave) to show anatomical 3D models during a medical event. The upper right image shows young students visiting our laboratory. The platform is also used for very important people, like in the last picture, where the former Swiss president Pascal Couchepin enjoyed a haptic simulation using MVisio for visual rendering during the inauguration of one of the new EPFL buildings.
3D everywhere is possible today by using the correct approach and system architecture.
• Cross-device applications open new scenarios and applications, mainly when porting across different systems can be achieved “for free”, without major modifications to the project.
• Creating a high-quality CAVE with off-the-shelf products is possible, easier and less expensive than in the past.
• Most of the hardware problems can be solved through software, with a good calibration system and a bit of practical sense.
• Modern PDAs are a valid alternative for bringing high-quality, interactive, real-time 3D images rendered on board to wearable and mobile contexts.
• But since computational resources are limited, it is important to fine-tune the software to reach acceptable frame rates.
• Client/server network based on TCP/IP
• Only high-level commands are sent (“load this model”, “put the light here”, “render this portion of the scene graph”, …)
• Very few, very small packets are sent; Nagle's algorithm is disabled
• MVisio objects have a unique serial number
• Computationally expensive operations (like particles and skinning) are computed locally on each client and synchronized through unique seeds and timers
• Dynamic texture updates are bandwidth-consuming (RTP as an alternative)
• Dynamic geometry is also bandwidth-consuming
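The idea of synchronizing locally computed effects through shared seeds can be illustrated with the following sketch. It is an assumption about how such a scheme may look, not the MVisio implementation: the server would only send a seed, and every client would generate the same particles from it. The Particle structure and the emitParticles function are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

struct Particle { float vx, vy, vz; };  // initial velocity of one particle

// Generates 'count' particles from a seed. std::mt19937 produces the same
// sequence for the same seed on every standard-conforming implementation.
// The standard distributions are avoided on purpose, because their output
// may differ between standard libraries.
std::vector<Particle> emitParticles(std::uint32_t seed, std::size_t count) {
    std::mt19937 rng(seed);
    // Keeps 24 random bits, which a float represents exactly: result in [0, 1).
    auto unit = [&rng]() {
        return static_cast<float>(rng() >> 8) / 16777216.0f;
    };

    std::vector<Particle> particles;
    particles.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        Particle p;
        p.vx = unit() * 2.0f - 1.0f;  // [-1, 1)
        p.vy = unit();                // [0, 1), upward
        p.vz = unit() * 2.0f - 1.0f;  // [-1, 1)
        particles.push_back(p);
    }
    return particles;
}
```

With this kind of approach, a few bytes (the seed and a start time) replace the transfer of every particle position on every frame, which fits the principle of sending only small, high-level commands.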
Mental Vision
A Computer Graphics Platform for Virtual Reality, Science and Education
Achille Peternier
Thesis director: Prof. Daniel Thalmann
Plan
• Introduction
  – Context
  – Goals
• Our solution
  – Architecture and components
  – MVisio graphics engine
  – Pedagogical modules
  – Low-cost CAVE and wearable 3D system
• Results
  – Benchmark
  – Case studies
• Conclusion
Our solution - architecture
• MVisio 3D graphics engine
• Pedagogical modules
• Corollary tools (low-cost CAVE and wearable system)
Our solution - MVisio
• User-friendly API, based on a C++ class-oriented architecture
• Multi-device rendering on PC, PDA and CAVE (the same source code compiles everywhere)
• Multi-platform (Windows, Linux)
• Full OpenGL and OpenGL|ES support
• Dynamic scene graph management
• Dynamic lighting
• Dynamic soft shadows
• HDR and bloom lighting
• Depth of field
• Vertex and pixel shaders
• Skinning and animations
• Particle emitters
• Terrain engine
• Video2texture from MPEG1/2 files
• Loading of scenes directly exported from 3D Studio MAX through a specific plug-in
• 2D GUI system with event handling
• Loading of complex 2D interfaces from XML files
• Object picking
• Support for Head-Mounted Displays (HMD), even on PDA
• New customizable plug-in objects
Results - benchmark
• Simple cross-device application tracking fps and using three different models:
  – the classic static Stanford bunny
  – a building model (using many separate entities and transparencies)
  – an 86-bone skinned, animated, textured virtual human
• Basic GUI (some text, a couple of buttons)
• We want to evaluate speed issues, code differences and visual consistency among different platforms
Results - publications
A. Peternier, F. Vexo, D. Thalmann, The Mental Vision framework: a platform for teaching, practicing and researching with Computer Graphics and Virtual Reality, LNCS Transactions on Edutainment, 2008
A. Peternier, S. Cardin, F. Vexo, D. Thalmann, Practical Design and Implementation of a CAVE system, GRAPP 2007
A. Peternier, F. Vexo, D. Thalmann, Wearable Mixed Reality system in less than 1 pound, EG Symposium on VR, 2006
A. Peternier, D. Thalmann, F. Vexo, Mental Vision: a Computer Graphics teaching platform, Edutainment, LNCS, 2006
Results - dissemination
• Portable V-Cave
• Demos for visitors
• VIP events
Conclusion
• Optimized for speed vs. optimized for robustness
• Specific contexts vs. multi-purpose
• Complex to use vs. intuitive
• Expensive vs. cost-efficient
• Limited to experts vs. ideal for students too