Literature review of facial modeling and animation techniques
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367 (Print), ISSN 0976 – 6375 (Online), Volume 1, Number 1, May–June 2010, pp. 136–146, © IAEME

LITERATURE REVIEW OF FACIAL MODELING AND ANIMATION TECHNIQUES

Mr. K. Gnanamuthu Prakash, Research Scholar, Anna University of Technology, Coimbatore – 641 047
Dr. S. Balasubramanian, Research Scholar, Anna University of Technology, Coimbatore – 641 047

ABSTRACT

A major unsolved problem in computer graphics is the construction and animation of realistic human facial models. Traditionally, facial models have been built painstakingly by manual digitization and animated by ad hoc parametrically controlled facial mesh deformations or kinematic approximation of muscle actions. Fortunately, animators are now able to digitize facial geometries with scanning range sensors and animate them through dynamic simulation of facial tissues and muscles. However, these techniques still require considerable user input to construct facial models of individuals suitable for animation. Realistic facial animation is achieved through geometric and image manipulations: geometric deformations account for the shape and deformations unique to the physiology and expressions of a person, while image manipulations model the reflectance properties of facial skin and hair to achieve the small-scale detail that is difficult to capture by geometric manipulation alone.

INTRODUCTION

Computer facial animation is primarily an area of computer graphics that encapsulates models and techniques for generating and animating images of the human head and face. Two-dimensional facial animation is commonly based on the transformation of images, including both still photographs and sequences of video. Image morphing is a technique that generates in-between transitional images between a pair of target still images or between frames of video sequences. These morphing techniques usually combine a geometric deformation, which aligns the target images, with a cross-fade, which creates the smooth transition in the image texture. Another form of animation from images concatenates sequences captured from video. A further technique, called video rewrite, cuts existing footage of an actor into segments corresponding to phonetic units, which are blended together to create new animations of a speaker. Video rewrite uses computer vision techniques to automatically track lip movements in video, and these tracked features are used in aligning and blending the extracted phonetic units. Because this technique only generates animations of the lower part of the face, the results are composited with video of the original actor to produce the final animation.

Three-dimensional head models provide the most powerful means of generating computer facial animation. An early model was a mesh of 3D points controlled by a set of conformation and expression parameters; the former group controls the relative locations of facial feature points such as eye and lip corners, and changing these parameters can reshape a base model to create new heads. Different methods for initializing such "generic" models from individual (3D or 2D) data have been proposed and successfully implemented. Parameterized models are effective because they use a limited set of parameters associated with the main facial feature points. The MPEG-4 standard defines a minimum set of parameters for facial animation; animation is performed by changing the parameters over time.
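The geometric-deformation-plus-cross-fade combination behind image morphing, described above, can be sketched in a few lines. This is a minimal illustration in NumPy (the function names are made up for the example), with the geometric warp reduced to linear interpolation of corresponding feature points; a real morpher would use the interpolated points to drive a mesh warp of both images before blending.

```python
import numpy as np

def interpolate_features(pts_a, pts_b, t):
    # Geometric alignment step: move corresponding feature points
    # linearly from image A's positions toward image B's.
    return (1.0 - t) * pts_a + t * pts_b

def cross_fade(img_a, img_b, t):
    # Cross-fade step: blend pixel intensities of the (pre-warped) images.
    return (1.0 - t) * img_a + t * img_b

# At t = 0 the morph shows image A; at t = 1, image B.
frame = cross_fade(np.zeros((4, 4)), np.ones((4, 4)), 0.25)
```

In a full implementation, the interpolated feature points define a triangulated warp applied to each image so that facial features coincide before the cross-fade.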
Facial animation is approached in different ways; traditional techniques include:
  1. shapes/morph targets,
  2. skeleton–muscle systems,
  3. bones/cages,
  4. motion capture of points on the face, and
  5. knowledge-based solver deformations.

Facial animation is now attracting more attention than ever before in its 25 years as an identifiable area of computer graphics. Imaginative applications of animated graphical faces are found in sophisticated human–computer interfaces, interactive games, multimedia titles, VR telepresence experiences and, as always, a broad variety of production animations. Graphics technologies underlying facial animation now run the gamut from key framing to image morphing, video tracking, geometric and physical modeling, and behavioral animation. Supporting technologies include speech synthesis and artificial intelligence. Whether the goal is to synthesize realistic faces or fantastic ones, representing the dynamic facial likeness of humans and other creatures is giving impetus to a diverse and rapidly growing body of cross-disciplinary research.

LITERATURE REVIEW

Facial modeling and animation research falls into two major categories: methods based on geometric manipulations and methods based on image manipulations, each comprising several subcategories. Geometric manipulations include key-framing and geometric interpolation [A. Emmett 1985, F. I. Parke 1991], parameterization [M. Cohen 1993], finite element methods [B. Guenter 1992], muscle-based modeling [K. Waters 1987], visual simulation using pseudo-muscles [P. Kalra 1992], spline models [C. L. Y. Wang 1994] and free-form deformations [S. Coquillart 1990]. Image manipulations include image morphing between photographic images [T. Beier et al. 1992], texture manipulation [M. Oka 1987], image blending [F. Pighin 1998] and vascular expressions [P. Kalra 1994].

As stated by Ekman (1975), humans are highly sensitive to visual messages sent voluntarily or involuntarily by the face. Consequently, facial animation requires specific algorithms able to render the natural characteristics of facial motion with a high degree of realism. Basic facial animation and modeling have been studied extensively, and several models have been proposed.
For example, in the Parke models (1975, 1982) the set of facial parameters is based on both observation and the underlying structures that cause facial expression. The animator can create any facial image by specifying the appropriate set of parameter values; motions are described as pairs of numeric tuples identifying the initial frame, the final frame and the interpolation. Pearce et al. (1986) introduced a small set of keywords to extend the Parke model.

Platt and Badler (1981) designed a model based on underlying facial structure. The skin is the outside level, represented by a set of 3D points that define a deformable surface. The bones represent an inner level that cannot be moved. Between the two levels, muscles are modeled as groups of points connected by elastic arcs.

Waters (1987) represents the action of muscles using primary motivators on a non-specific deformable topology of the face. The muscle actions themselves are tested against FACS (Facial Action Coding System), which maps action units directly to one muscle or a small group of muscles. Two types of muscles are created: linear/parallel muscles that pull, and sphincter muscles that squeeze. Magnenat-Thalmann et al. (1988) defined a model in which the action of a muscle is simulated by a procedure, called an Abstract Muscle Action (AMA) procedure, which acts on the vertices composing the human face. A face can be animated by manipulating the facial parameters through AMA procedures, and by combining the parameters obtained from AMA procedures in different ways, more complex entities corresponding to the familiar concept of facial expression can be constructed. Nahas et al. (1987) propose a method based on the B-spline: a digitizing system captures position data on the face, from which a number of points are extracted and organized in a matrix. This matrix serves as the set of control points for a five-dimensional bicubic B-spline surface, and the model is animated by moving these control points.

CLASSIFICATION OF FACIAL MODELING AND ANIMATION METHODS

The taxonomy in Figure 1 illustrates the diversity of approaches to facial animation. Exact classification is complicated by the lack of sharp boundaries between methods and by the fact that recent approaches often integrate several methods to produce better results.
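The control-matrix idea behind Nahas et al.'s B-spline model can be illustrated by evaluating one surface point of a uniform bicubic B-spline patch. This is a generic sketch (not the authors' actual system, and restricted to 3D positions rather than their five-dimensional formulation), using the standard uniform cubic B-spline basis matrix:

```python
import numpy as np

# Standard uniform cubic B-spline basis matrix.
B = (1.0 / 6.0) * np.array([
    [-1.0,  3.0, -3.0, 1.0],
    [ 3.0, -6.0,  3.0, 0.0],
    [-3.0,  0.0,  3.0, 0.0],
    [ 1.0,  4.0,  1.0, 0.0],
])

def bspline_patch_point(P, u, v):
    # P: 4x4 grid of control points (each a 3-vector); u, v in [0, 1].
    U = np.array([u ** 3, u ** 2, u, 1.0])
    V = np.array([v ** 3, v ** 2, v, 1.0])
    wu = U @ B  # blending weights along u
    wv = V @ B  # blending weights along v
    # Tensor-product blend of the 16 control points.
    return np.einsum('i,ijk,j->k', wu, P, wv)
```

Animation then amounts to moving entries of the control-point matrix over time; every surface point follows through the fixed blending weights.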
The review below first introduces interpolation techniques and parameterizations, followed by animation methods using 2D and 3D morphing and the Facial Action Coding System, a frequently used facial description tool. Physics-based modeling and simulated muscle modeling are then discussed, along with techniques for increased realism, including wrinkle generation, vascular expression and texture manipulation. Finally, individual modeling and model fitting are described.
Figure 1: Classification of facial modeling and animation methods

FACIAL MODELING TECHNIQUES

POLYGONAL

Polygonal modeling specifies each 3D point exactly, with the points connected to one another as polygons. This is an exacting way to put topology (points) where you need it on a face and not where you don't.

PATCHES (NURBS)

Patches (sets of splines) indirectly define a smooth curved surface from a set of control points; a small number of control points (called CVs in Maya) can define a complex surface. One type of spline is the NURBS, which stands for Non-Uniform Rational B-Spline. This type of patch allows each control point to have its own weight, which can affect the "pinch" of the curve at that point, so NURBS are considered the most versatile of patches. They work very well for organic, smooth objects and are therefore well suited to facial modeling, although several issues arise.

SUB-DIVISION SURFACES

Sub-division surfaces are a fairly recent modeling technique that gives you the control and flexibility of polygons with the ease of use and smoothness of patches. Tony DeRose (who wrote the paper on sub-division surfaces and created a working version for Pixar, first used in Geri's Game) has presented slides on their advantages: sub-division surfaces give you detail only where you need it. Paul Aichele discussed this on our Pixar trip using Geri's head.

DIGITIZING

Facial models can be created by digitizing live humans or physical models, and there are several techniques. One that does not need an expensive digitizer uses fiducial points to reconstruct 2D photographs into a 3D model. Today, however, automatic digitizing equipment such as CyberWare scanners is regularly used to create high-resolution 3D models of live human subjects complete with color data. While digitized models are very useful for many applications, such as Stanford's Digital Michelangelo Project, the data from these systems are typically too high-resolution and not semantically organized for facial animation.

PHOTOGRAMMETRIC ACQUISITION

Web-based avatar companies and others use techniques that take a photograph of a human face (sometimes from the front and side) and map it onto a pre-made 3D model that can be animated. This involves a registration process in which key points in the photograph (corners of the eyes, eyebrows, mouth, etc.) are picked with the mouse to register the image with the model. Several researchers are working on automatic registration techniques, but lighting conditions on a live face or photograph and other standardization issues plague the process.
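The registration step described above — mapping the picked photo points onto the model's corresponding feature points — can be sketched as a least-squares similarity fit. This is a standard Procrustes/Umeyama-style solution via SVD; the function name and interface are illustrative, not taken from any particular avatar system.

```python
import numpy as np

def fit_similarity(src, dst):
    # Least-squares 2D similarity transform (scale, rotation, translation)
    # mapping picked photo points (src) onto model feature points (dst).
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    S, D = src - mu_s, dst - mu_d
    U, sig, Vt = np.linalg.svd(D.T @ S)
    R = U @ Vt
    if np.linalg.det(R) < 0:      # guard against a reflection solution
        U[:, -1] *= -1
        R = U @ Vt
    scale = sig.sum() / (S ** 2).sum()
    t = mu_d - scale * R @ mu_s
    return scale, R, t
```

Once the transform is found, the photograph can be mapped onto the model by carrying every image point through the same scale, rotation and translation.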
FACE GENERATION

Face generation systems genetically evolve the type of face you want, or let you search or surf through a theoretical space of faces. Still other face generation systems let you pick from a database of facial parts to create a head, much as you might with a police artist's sketch kit (such as the Faces system we used as an assignment).

FACIAL ANIMATION TECHNIQUES

a) KEYFRAMING

The most widely used animation technique is key-framing, where the animator creates key poses of an articulated model and the animation system interpolates the "in-between" frames from that set of key-frame data. Typically the data being keyed and interpolated are transformations (translation, rotation, scale) of rigid objects, such as the hierarchical parts of a human body. The problem for facial animation is that there really are no rigid parts that move in relation to each other to key-frame. Hence there is a myriad of facial animation techniques based on what is actually being key-framed and interpolated to achieve smooth animation of the flexible surface of the face.

b) MORPH TARGETS

One widely used basic technique is to create a model of a face in a rest position. Then, using essentially the same modeling techniques, the points of a copy of that face are edited to make other faces (typically with the same topology, hence the copying of the rest face) in different phoneme and expression states. A facial animation sequence is then produced by morphing (point interpolation) between this set of related faces.

The disadvantage of this technique is that the animator is only picking from a set of pre-made face morphs and is thereby limited, in the final animated face sequence, to the expressions possible from that set.
There are several variants of this morphing technique, most notably compound or hierarchical morph targets, which allow the animator to blend several faces together with differing weights and/or to morph only specific areas of the face. Still, all versions of this technique limit the creative process by only allowing the animator to pick from a pre-made set of expressions (or force a stop in the animation process to create additional morph targets).
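The weighted blending of morph targets reduces to a sum of vertex deltas over the rest face. A minimal sketch (NumPy; the names are made up for the example):

```python
import numpy as np

def blend_targets(rest, targets, weights):
    # Morph-target blending: each target face (same topology as the rest
    # face) contributes its vertex offsets, scaled by the animator's weight.
    out = rest.astype(float).copy()
    for name, w in weights.items():
        out += w * (targets[name] - rest)
    return out

rest = np.zeros((3, 3))               # 3 vertices at the origin
targets = {"smile": np.ones((3, 3))}  # one morph target
face = blend_targets(rest, targets, {"smile": 0.5})
```

Hierarchical variants work the same way but multiply each target's deltas by a per-region mask so that only part of the face is affected.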
c) CHARACTER ANIMATION TOOLS

Tools that were made for, and are useful in, character and organic animation have also been used to animate the face. These techniques are typically not straightforward and are cumbersome, because they are not well suited to a flexible face. They include free-form deformations, "bones", point-cluster techniques and others.

d) PARAMETERIZED SYSTEMS

Fred Parke's early work in facial animation at the University of Utah and the New York Institute of Technology led to the first facial animation system. It used a control parameterization method in which animation becomes a process of specifying and controlling parameter set values as a function of time. Parameter systems are able to animate facial expressions as well as facial types. The Face Lift program used for The Sims uses a simplified version of a production parameterization system. Most parameter systems use Paul Ekman's FACS (Facial Action Coding System), which describes facial muscle movement, as a basis or starting point for specifying and defining the range of parameters. Ken Perlin's web-based Java system also uses a simple but very effective parameterized technique. Parameter systems create a very compact specification for facial animation and are therefore ideally suited to the web and games.

e) MUSCLE SIMULATION SYSTEMS

Keith Waters' work uses a simple simulation of muscle deformation to animate the face. It uses two types of muscles: linear muscles that pull and sphincter muscles that squeeze. He uses a mass-and-spring technique to animate or deform the skin. The muscle control system has a one-to-one correspondence to known facial muscles and to Ekman's FACS.
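The linear-muscle idea — pulling nearby skin vertices toward a fixed attachment point with distance-based falloff — can be sketched as follows. This is a simplification for illustration: a radial Gaussian falloff stands in for Waters' angular and radial influence zones, and the names are invented for the example.

```python
import numpy as np

def linear_muscle(vertex, attachment, insertion, contraction, falloff):
    # Pull a skin vertex toward the muscle's fixed (bone) attachment point.
    # Influence decays with distance from the insertion (skin) end.
    d = np.linalg.norm(vertex - insertion)
    influence = np.exp(-(d / falloff) ** 2)
    to_attachment = attachment - vertex
    length = np.linalg.norm(to_attachment)
    if length < 1e-9:
        return vertex.copy()
    return vertex + contraction * influence * to_attachment / length
```

With `contraction = 0` the skin is at rest; increasing it pulls vertices near the insertion toward the attachment, mimicking a muscle contracting under the skin.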
An extended, fast and open-source version of Waters' technique, with applications for games and real-time systems, is called Expressions.

f) MOTION CAPTURE

Facial animation has also been achieved with performance systems, where a live performance is digitized and applied to the facial model rather than created by an animator. Motion capture is the most widely used performance technique. Systems typically track, via one or several cameras, small point-like reflective markers attached at strategic positions on the performer's face.
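Driving a mesh from tracked markers can be sketched as a scattered-data retargeting step: each mesh vertex inherits a distance-weighted average of the displacements of nearby markers. This is a generic scheme for illustration, not the method of any specific commercial system.

```python
import numpy as np

def apply_marker_offsets(rest_vertices, marker_rest, marker_frame, radius):
    # For each mesh vertex, average the displacements of tracked markers,
    # weighted by a Gaussian of the vertex-to-marker rest distance.
    out = rest_vertices.astype(float).copy()
    deltas = marker_frame - marker_rest
    for i, v in enumerate(rest_vertices):
        dist = np.linalg.norm(marker_rest - v, axis=1)
        w = np.exp(-(dist / radius) ** 2)
        total = w.sum()
        if total > 1e-9:
            out[i] = v + (w[:, None] * deltas).sum(axis=0) / total
    return out
```

Production systems refine this with explicit marker-to-region rigs, but the core idea — sparse tracked points driving a dense mesh — is the same.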
g) SPEECH GENERATED SYSTEMS

Another form of performance animation system uses the voice alone to create not only the synchronized lip movement but also the movement of other parts of the face, including the eyebrows, blinks and eye movement, and head movement (neck rotation). These systems analyze the voice to obtain lip-sync phoneme positions and also use pitch, volume, sentence semantics (dividing speech into sections based on pauses) and other cues to approximate the animation of a face. Such systems, standalone or combined with photogrammetric techniques, are used both in linear animation systems and in real-time web-based applications.

VIDEO BASED ANIMATION OF PEOPLE

To create animations that have natural motion and photo-realistic appearance, motion-capture and image-based (or video-based) techniques must be combined. The goal is to build video-based representations of annotated example motions. Unlike standard motion capture techniques based on markers or other devices, body and facial configurations must be annotated directly in unconstrained video. In static scenes the user could supply annotations by hand, but for video sequences automatic techniques are crucial: 10 minutes of video contains 18,000 images, and no one has the budget, patience and consistency to annotate these by hand. To build libraries of example motions, we also need techniques that annotate coarse motion categories, and again this has to be done automatically. For example, a 10-minute video of someone talking could be transformed into a video-based library of more than 2,000 phonetic lip motions (phonemes or visemes).

VIDEO REWRITE

Video Rewrite uses existing footage to automatically create new video of a person mouthing words that she did not speak in the original footage.
This technique is useful in movie dubbing, for example, where a movie sequence can be modified to sync the actor's lip motions to a new soundtrack.

VIDEO MOTION CAPTURE

This approach demonstrates a vision-based motion capture technique able to recover high-degree-of-freedom articulated human body configurations in complex video sequences. It does not require any markers, body suits or other devices attached to the subject.

CONCLUSIONS

Computer facial animation is now being used in a multitude of important fields. It brings a more human, social and dramatic reality to computer games, films and interactive multimedia, and it is growing in both use and importance. Authoring computer facial animation with complex and subtle expressions remains difficult and fraught with problems. It is currently authored mostly with generalized computer animation techniques, which often limit the quality and quantity of facial animation production. Given additional computing power, facial understanding and software sophistication, new face-centric methods are emerging, but they are typically ad hoc in nature. This research attempts to define and categorize current and emerging methods, including surveying facial animation experts to establish the current state of the field, its perceived bottlenecks and emerging techniques.

REFERENCES

1. Ekman P. and Friesen W.V. (1975), Unmasking the Face: A Guide to Recognizing Emotions from Facial Clues, Prentice-Hall.
2. Parke F.I. (1975), A Model for Human Faces that Allows Speech Synchronized Animation, Computers and Graphics, Pergamon Press, Vol. 1, No. 1, pp. 1-4.
3. Parke F.I. (1982), Parameterized Models for Facial Animation, IEEE Computer Graphics and Applications, Vol. 2, No. 9, pp. 61-68.
4. Pearce A., Wyvill B., Wyvill G. and Hill D. (1986), Speech and Expression: A Computer Solution to Face Animation, Proc. Graphics Interface '86, pp. 136-140.
5. Platt S. and Badler N. (1981), Animating Facial Expressions, Proc. SIGGRAPH '81, pp. 245-252.
6. Waters K. (1987), A Muscle Model for Animating Three-Dimensional Facial Expression, Proc. SIGGRAPH '87, Vol. 21, No. 4, pp. 17-24.
7. Magnenat-Thalmann N. and Thalmann D. (1987), The Direction of Synthetic Actors in the Film Rendez-vous à Montréal, IEEE Computer Graphics and Applications, Vol. 7, No. 12.
8. Nahas M., Huitric H. and Saintourens M. (1988), Animation of a B-spline Figure, The Visual Computer, Vol. 3, No. 5.
9. Emmett A. (1985), Digital Portfolio: Tony de Peltrie, Computer Graphics World, Vol. 8(10), pp. 72-77.
10. Parke F.I. (1991), Techniques of Facial Animation, in N. Magnenat-Thalmann and D. Thalmann (eds.), New Trends in Animation and Visualization, Chapter 16, pp. 229-241, John Wiley and Sons.
11. Cohen M. and Massaro D. (1993), Modeling Co-articulation in Synthetic Visual Speech, in N. Magnenat-Thalmann and D. Thalmann (eds.), Models and Techniques in Computer Animation, pp. 139-156, Springer-Verlag, Tokyo.
12. Guenter B. (1992), A System for Simulating Human Facial Expression, in State of the Art in Computer Animation, pp. 191-202.
13. Waters K. (1987), A Muscle Model for Animating Three-Dimensional Facial Expression, in M.C. Stone (ed.), Computer Graphics (SIGGRAPH Proceedings), Vol. 21, pp. 17-24.
14. Kalra P., Mangili A., Magnenat-Thalmann N. and Thalmann D. (1992), Simulation of Facial Muscle Actions Based on Rational Free Form Deformations, Eurographics '92, Vol. 11(3), pp. 59-69.
15. Wang C.L.Y. and Forsey D.R. (1994), Langwidere: A New Facial Animation System, Proceedings of Computer Animation, pp. 59-68.
16. Coquillart S. (1990), Extended Free-Form Deformation: A Sculpturing Tool for 3D Geometric Modeling, Computer Graphics, Vol. 24, pp. 187-193.
17. Beier T. and Neely S. (1992), Feature-Based Image Metamorphosis, Computer Graphics (SIGGRAPH Proceedings), Vol. 26, pp. 35-42.
18. Oka M., Tsutsui K., Ohba A., Kurauchi Y. and Tago T. (1987), Real-Time Manipulation of Texture-Mapped Surfaces, Computer Graphics (SIGGRAPH '87), Vol. 21, pp. 181-188.
19. Pighin F., Hecker J., Lischinski D., Szeliski R. and Salesin D.H. (1998), Synthesizing Realistic Facial Expressions from Photographs, SIGGRAPH Proceedings, pp. 75-84.
20. Kalra P. and Magnenat-Thalmann N. (1994), Modeling of Vascular Expressions in Facial Animation, Computer Animation '94, pp. 50-58.