Classifying human motion for active music systems
Tutorial @ AWASS 2012 by Arjun Chandra
Transcript

  • 1. AWASS 2012 Case Study Tutorial - Classifying human motion for active music systems. Arjun Chandra, University of Oslo.
  • 2. Outline of the Tutorial: Why do motion classification for active music systems? The motion classification problem. Established solutions for the problem, and a demonstration. Challenges for the week with regard to motion classification for active music systems.
  • 3. Active Music. Videos from yesterday: iPhone ensemble (UiO), SoloJam (UiO), performance-based music making (Wout Standaert).
  • 4. Active Music. There is a boundary between someone performing music and someone listening/perceiving, with only limited passive interaction (tapping feet etc.). Active music blurs this boundary and allows participation by perceivers. The end user may have little or no training in traditional musicianship or composition, yet gets control of the sonic/musical output to a greater or lesser extent. Users experience the sensation of playing music themselves.
  • 5. Active Music. To build such a system: give control via mobile media devices, like iPods for example. The devices are to be intelligent in order to mediate the participants' control of the musical output. A media device must be able to: sense the inputs from the participants and the environment; process these in various forms; co-ordinate the activities of the participants as they perform; maintain musicality and the interest of the users.
  • 6. Active Music. Key type of input: human motion is an integral part of the types of inputs that may be sensed by the devices. Motion and sound are very closely related! Motion is to be processed by the device in some fashion and eventually mapped to music. In a full-fledged active music system, numerous other types of inputs will be sensed, pertaining to the participant, the device itself, and the external (to the human-device subsystem) environment, including other participants/devices.
  • 7. How Does All This Relate to Self-awareness? Self-awareness is to take the form of devices building models of, or possessing knowledge about: the behaviours of their respective participants; themselves; the environment within which they get used. Such knowledge would help the devices further reason about themselves in order to maintain musicality, maintain user interest, compute efficiently, maintain good response times, manage communication overhead, manage energy needs, manage trade-offs between such goals, etc.
  • 8. Classifying Human Motion for Active Music Systems. One first step towards mapping sensed motion into music: recognise patterns in human motion. We will work on such pattern recognition this week. Triggers or fine-grained mapping: (1) the recognition may be used as a trigger, i.e. recognise the type of motion once it has been performed and trigger sound synthesis; (2) in addition, the system may also be able to anticipate which type of motion is about to be performed, or is ongoing, and thus provide the possibility of more fine-grained mapping with sound synthesis.
  • 10. Classifying Human Motion for Active Music Systems. [Figure: motion classification scheme - a 3D accelerometer stream passes through (optional) pre-processing; training yields a trained classifier; recognition identifies a class, which can then be reported to a sound synthesiser.] Example video for motion classification.
  • 11. Classifying Human Motion for Active Music Systems. Two classic phases: training - take a bunch of data and build a classifier; recognition - use the classifier on new streams to recognise patterns in these streams.
  • 12. Classifying Human Motion for Active Music Systems. Some challenges whilst training: the same type of motion can vary both spatially and temporally; the same type of motion may be performed differently depending on the mood of the user; the intentions of the user have a bearing on the performed motion; the user may be stationary or moving when performing the motion; different users may perform the same motion differently; the motion types may grow or reduce in number over time, as the user operates the system.
  • 13. Classifying Human Motion for Active Music Systems. To make things more challenging: sometimes, quick training is essential - ideally online, with little or no effort from the user - with automated segmentation coupled with classification.
  • 14. Classifying Human Motion for Active Music Systems. Many ways to capture motion: marker-based motion capture systems, e.g. the Qualisys motion capture system (Soundsaber); vision-based systems, e.g. Kinect (Piano via Kinect); sensor-based systems, e.g. iOS devices (SoloJam), Wiimote, the Xsens full-body motion capture suit (Dance Jockey, Mobile Dance Jockey).
  • 15. Classifying Human Motion for Active Music Systems. In this case study, we capture motion data via media devices, e.g. iPods, which have internal motion (acceleration) sensors: 3D accelerometers. We will use the sensor data stream as the device is moved, and classify the performed motion into a relevant category.
  • 16. Classifying Human Motion for Active Music Systems. What category? You can choose to define the categories. You will then collect some data pertaining to the categories you choose. Once you have collected the data, you will train a classifier with it. Once trained, you will use this classifier to recognise the categorised motions within a sensor data stream.
  • 17. Classifying Human Motion for Active Music Systems. Demo... Let's define some categories, train, and get some motion recognition going!
  • 18. Classifying Human Motion for Active Music Systems. As I mentioned yesterday, you are going to be provided: established algorithms for motion classification - two algorithms to play with and build on, to be precise; data sets with different types of motion which you can use to get a feel for the algorithms; exercises pertaining to some challenging active music scenarios where motion classification is required (these will require you to build new data sets). Let us look at the two algorithms briefly now...
  • 19. Motion Classification Algorithms. The two algorithms are: Dynamic Time Warping; Hidden Markov Models. You are encouraged not to be limited by these two algorithms - apply others that you know of.
  • 20-22. Motion Classification Algorithms: Dynamic Time Warping (DTW). Key idea: to be able to compare two signals of different lengths; the result of such a comparison can be used in interesting ways. You might wonder: what should be compared to what? What are these two signals? Template matching!
  • 23. Motion Classification Algorithms: Dynamic Time Warping (DTW). Template matching: match a time-varying signal, in our case a motion data stream, against a stored set of signals. The stored signals are the templates, one representing each category. In effect, a motion data stream is compared against a representative from within the collected data, one for each category. The closest matching template tells us which category the stream most likely belongs to.
  • 24-25. Motion Classification Algorithms: Dynamic Time Warping (DTW). [Figure: point-to-point alignment of two signals over time under Euclidean distance, versus the warped alignment found by DTW - intuitive!]
  • 26. Motion Classification Algorithms: Dynamic Time Warping (DTW). Two signals and their cost matrix. The cost is some distance measure, e.g. Euclidean. Note the valleys (dark - low cost) and hills (light - high cost). Goal: find the alignment with minimal overall cost. The optimal alignment runs along a valley of low cost.
  • 27. Motion Classification Algorithms: Dynamic Time Warping (DTW). We have to find the optimal warping path in this cost matrix. Shown is the optimal warping path, i.e. the optimal alignment. How do we find this warping path? There are exponentially many. A sketch of building the cost matrix follows.
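A minimal NumPy sketch (not from the tutorial) of building the local cost matrix for two signals; `cost_matrix` is a hypothetical helper name reused in the later sketches, and the per-sample cost is Euclidean, as the slides assume:

```python
import numpy as np

def cost_matrix(x, y):
    """Pairwise local costs d(i, j) = Euclidean distance between x[i] and y[j].

    x: (n,) or (n, D) array; y: (m,) or (m, D) array. Returns an (n, m) matrix.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float).T).T  # coerce to (n, D)
    y = np.atleast_2d(np.asarray(y, dtype=float).T).T
    diff = x[:, None, :] - y[None, :, :]               # (n, m, D) differences
    return np.sqrt((diff ** 2).sum(axis=-1))           # (n, m) costs
```

The valleys and hills the slide describes are simply the low and high entries of this matrix.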
  • 28. Motion Classification Algorithms: Dynamic Time Warping (DTW). Let P be a warping path; note that P is a set of pairs of aligned points $p_s$ on the signals. Then $\operatorname{argmin}_P \frac{\sum_{s=1}^{k} d(p_s)\, w_s}{\sum_{s=1}^{k} w_s}$ gives the optimal path, where $d(p_s)$ is the cost, $w_s$ is the weighting coefficient (1 in our case), and the denominator is the length of the path.
  • 29. Motion Classification Algorithms: Dynamic Time Warping (DTW). We first put some restrictions on the paths that may be found: (1) monotonicity; (2) continuity; (3) boundary conditions; (4) warping window; (5) slope constraint.
  • 30. Motion Classification Algorithms: Dynamic Time Warping (DTW). Monotonicity: the path is not allowed to go back in time. This prevents feature comparisons being repeated during matching.
  • 32. Motion Classification Algorithms: Dynamic Time Warping (DTW). [Figure: cost-matrix paths illustrating the monotonicity constraint; both axes are time.]
  • 33. Motion Classification Algorithms: Dynamic Time Warping (DTW). Continuity: the path is not allowed to break. This prevents the omission of features whilst matching.
  • 35. Motion Classification Algorithms: Dynamic Time Warping (DTW). [Figure: cost-matrix paths illustrating the continuity constraint; both axes are time.]
  • 36. Motion Classification Algorithms: Dynamic Time Warping (DTW). Boundary conditions: start at the top-left position and end at the bottom-right position in the matrix. This prevents one of the signals being only partially considered.
  • 38. Motion Classification Algorithms: Dynamic Time Warping (DTW). [Figure: cost-matrix paths illustrating the boundary conditions; both axes are time.]
  • 39. Motion Classification Algorithms: Dynamic Time Warping (DTW). Warping window: a good alignment path is unlikely to wander too far from the diagonal, so stay within a window. This prevents sticking at similar features and skipping features.
  • 41. Motion Classification Algorithms: Dynamic Time Warping (DTW). [Figure: cost-matrix paths illustrating the warping window; both axes are time.]
  • 42. Motion Classification Algorithms: Dynamic Time Warping (DTW). Slope constraint: the path is not allowed to be too steep or too flat. This prevents short parts of one signal from being matched with very long parts of the other.
  • 44. Motion Classification Algorithms: Dynamic Time Warping (DTW). [Figure: cost-matrix paths illustrating the slope constraint; both axes are time.]
  • 45. Motion Classification Algorithms: Dynamic Time Warping (DTW). We then build an accumulated distance matrix. A nicely defined valley emerges when we do so. Building the accumulated distance matrix is done via dynamic programming. Let us see how this is done...
  • 46. Motion Classification Algorithms: Dynamic Time Warping (DTW). This valley reveals the path we are after. The bottom-right corner of the matrix holds the value $\sum_{s=1}^{k} d(p_s)\, w_s$. This value is the un-normalised warping distance. Normalising this distance by the path length gives us the distance between the two signals. A sketch of the dynamic programme follows.
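A minimal sketch, assuming the `cost_matrix` helper above, of the dynamic programme: `acc[i, j]` accumulates the cheapest monotonic, continuous path cost to cell (i, j), an optional Sakoe-Chiba band implements the warping window, and dividing by `max(n, m)` is a common simplification of normalising by the exact path length:

```python
import numpy as np

def dtw_distance(x, y, window=None):
    """Normalised DTW distance between two signals (window = band half-width)."""
    d = cost_matrix(x, y)
    n, m = d.shape
    w = max(window, abs(n - m)) if window is not None else max(n, m)
    acc = np.full((n + 1, m + 1), np.inf)  # inf borders enforce the boundary conditions
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            # Monotonic, continuous steps: match, insertion, deletion.
            acc[i, j] = d[i - 1, j - 1] + min(acc[i - 1, j - 1],
                                              acc[i - 1, j],
                                              acc[i, j - 1])
    return acc[n, m] / max(n, m)  # the un-normalised distance sits in the corner
```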
  • 47. Motion Classification Algorithms: Dynamic Time Warping (DTW). The two signals shown in this example are one-dimensional, but we are going to work with 3D data. The process described above works for N-dimensional data. Remember that the initial cost matrix is built using Euclidean distance.
  • 48. Motion Classification Algorithms: Dynamic Time Warping (DTW). 3D-DTW: [figure]. A usage sketch follows.
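A usage sketch with made-up shapes: because `cost_matrix` computes per-sample Euclidean distances, the `dtw_distance` sketch above already handles 3D (indeed N-dimensional) accelerometer data unchanged:

```python
import numpy as np

acc_a = np.random.randn(120, 3)  # a 3D accelerometer recording (synthetic)
acc_b = np.random.randn(150, 3)  # another recording, of a different length
print(dtw_distance(acc_a, acc_b, window=20))
```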
  • 49. Motion Classification Algorithms: Dynamic Time Warping (DTW). Training/finding the representative template for each category: for each category, find the training example with the minimum average normalised distance against the remaining examples. See equation (7) in Gillian et al. (2011). Note that there are other ways to find templates; you are encouraged to explore them. A sketch follows.
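A sketch of this template selection, assuming the `dtw_distance` above and `examples_by_cat` as a hypothetical mapping from category to a list of recorded examples; the selection rule mirrors the minimum-average-distance idea (cf. eq. (7) in Gillian et al. (2011)), not their exact implementation:

```python
def find_templates(examples_by_cat, window=None):
    """Pick, per category, the example closest on average to the others."""
    templates = {}
    for cat, examples in examples_by_cat.items():
        if len(examples) == 1:
            templates[cat] = examples[0]
            continue
        def avg_dist(candidate):
            others = [e for e in examples if e is not candidate]
            return sum(dtw_distance(candidate, o, window) for o in others) / len(others)
        templates[cat] = min(examples, key=avg_dist)
    return templates
```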
  • 50. Motion Classification Algorithms: Dynamic Time Warping (DTW). Recognition: find the normalised distance between the stream and all the templates. The category of the closest matching template (lowest normalised distance) is the classification result, provided the distance is below a threshold. See equations (10)-(13) in Gillian et al. (2011) for the threshold distance for each category, used to reject false positives. Or come up with your own way! A sketch follows.
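A sketch of nearest-template recognition with rejection; `thresholds` stands in for the per-category threshold distances, however you derive them:

```python
def classify(stream, templates, thresholds, window=None):
    """Return the best-matching category, or None if no template is close enough."""
    distances = {cat: dtw_distance(stream, tpl, window)
                 for cat, tpl in templates.items()}
    best = min(distances, key=distances.get)
    return best if distances[best] <= thresholds[best] else None
```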
  • 51. Motion Classification Algorithms: Hidden Markov Models (HMM). Key idea: a statistical generative model of time-varying signals, one HMM per category. It can help ascertain the probability that a given observation/stream/time-varying observed signal was generated by the model. Knowing this probability, across multiple HMMs, allows us to categorise a stream.
  • 52. Motion Classification Algorithms: Hidden Markov Models (HMM). Schematic of a Markov chain with 5 states (Rabiner (1989)): the probability of being in a state only depends on the predecessor state (first order) and is independent of time. It is denoted by $a_{ij}$, with $\sum_{j=1}^{N} a_{ij} = 1$. But here, each state is observable, e.g. a weather model: what is $P(\text{rain}, \text{rain}, \text{rain}, ... \mid \text{Model})$? A tiny worked example follows.
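A tiny worked illustration, with made-up numbers, of how such a chain assigns a probability to an observable state sequence:

```python
import numpy as np

A = np.array([[0.7, 0.3],   # hypothetical transitions; state 0 = rain, 1 = sun
              [0.4, 0.6]])
pi = np.array([0.5, 0.5])   # initial state probabilities

def chain_probability(states):
    """P(s_1, s_2, ...) = pi[s_1] * prod_t A[s_t, s_{t+1}] (first order, time-independent)."""
    p = pi[states[0]]
    for prev, nxt in zip(states, states[1:]):
        p *= A[prev, nxt]
    return p

print(chain_probability([0, 0, 0]))  # P(rain, rain, rain | Model) = 0.5 * 0.7 * 0.7
```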
  • 53. Motion Classification Algorithms: Hidden Markov Models (HMM). Markov chain → HMM. [Figure: the 5-state chain extended so that each state emits observation symbols v_1, v_2, v_3 with probabilities b_jk.]
  • 54. Motion Classification Algorithms: Hidden Markov Models (HMM). A hidden process generates what you observe; thus, you observe this hidden process via observations only. [Figure: the same HMM schematic.]
  • 55. Motion Classification Algorithms: Hidden Markov Models (HMM). The observation is a probabilistic function of the state! The $v_j$'s are the possible observations in any state. We do not observe the state any more, hence "hidden". Examples: asking a friend about the weather; observing an acceleration stream as a person in another room moves in some way, or not. [Figure: the same HMM schematic.]
  • 56. Motion Classification Algorithms: Hidden Markov Models (HMM). HMM elements: N states, $S = \{S_1, S_2, ..., S_N\}$; $q_t$, the state at time t; M observation symbols (the codebook), $V = \{v_1, v_2, ..., v_M\}$; an observation sequence $O = O_1 O_2 ... O_T$, made up of elements from the codebook, e.g. the sequence $v_1 v_2 v_1 v_3 v_2 ...$ of length T.
  • 57. Motion Classification Algorithms: Hidden Markov Models (HMM). HMM elements (continued): the state transition matrix $A = \{a_{ij}\}$, where $a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i)$; the emission/observation symbol matrix $B = \{b_{jk}\}$, where $b_{jk} = P(v_k \mid q_t = S_j)$; the initial state probability vector $\pi = \{\pi_i\}$, where $\pi_i = P(q_1 = S_i)$.
  • 58. Motion Classification Algorithms: Hidden Markov Models (HMM). An HMM λ is specified by giving N, M, V, A, B and π. Example: N = 4, M = 3, $V = \{v_1, v_2, v_3\}$, the $a_{ij}$'s, the $b_{jk}$'s, $\pi_1 = 1$. We essentially have to specify these, using the motion data, one HMM per category. A sketch of such a parameterisation follows.
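A sketch of such a specification as plain NumPy arrays, with names mirroring the slides (N, M, A, B, π); the random row-stochastic initialisation is an arbitrary starting point for training, not a trained model:

```python
import numpy as np

class HMM:
    def __init__(self, N, M, seed=0):
        rng = np.random.default_rng(seed)
        self.N, self.M = N, M
        self.A = rng.random((N, N))    # state transition matrix, a_ij
        self.A /= self.A.sum(axis=1, keepdims=True)
        self.B = rng.random((N, M))    # emission matrix, b_jk
        self.B /= self.B.sum(axis=1, keepdims=True)
        self.pi = np.full(N, 1.0 / N)  # initial state probabilities, pi_i
```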
  • 59. Motion Classification Algorithms: Hidden Markov Models (HMM). Some procedures: pre-processing via vector quantisation - to build a codebook and process acceleration data in terms of observation symbols, giving observation sequences; the forward algorithm - to calculate $P(O \mid \lambda_c)$, where $\lambda_c$ denotes the c-th HMM and O is an observation sequence; the forward-backward algorithm - to estimate the parameters (A and B) of the HMM using multiple observation sequences, i.e. training; Bayes' rule - together with the forward algorithm, helps ascertain $P(\lambda_c \mid O)$, i.e. recognition that a new observation sequence O belongs to category c.
  • 60. Motion Classification Algorithms: Hidden Markov Models (HMM). Pre-processing by vector quantisation: any acceleration stream (a stream of 3D vectors) has a large range of values and fine granularity, so we abstract each 3D vector into a code. Using k-means clustering, the centroid of each cluster becomes a code word/vector of the codebook. See Klingmann (2009), Sections 3.1 and 4.4, and Schloemer (2008). The index of a code word is what is used as an observation; the string of indices of the code words matching the vectors in the data/stream is then the observation sequence O. A sketch follows.
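A sketch of the quantisation step with a tiny hand-rolled k-means (pure NumPy, not the cited implementations): `kmeans_codebook` learns M code words from recorded acceleration vectors, and `quantise` turns a stream into the index sequence O:

```python
import numpy as np

def kmeans_codebook(data, M, iters=50, seed=0):
    """data: (T, 3) acceleration vectors -> (M, 3) code words (cluster centroids)."""
    rng = np.random.default_rng(seed)
    codebook = data[rng.choice(len(data), size=M, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each vector to its nearest code word, then recompute centroids.
        labels = np.argmin(((data[:, None] - codebook[None]) ** 2).sum(-1), axis=1)
        for k in range(M):
            if np.any(labels == k):
                codebook[k] = data[labels == k].mean(axis=0)
    return codebook

def quantise(stream, codebook):
    """Map each 3D vector to its nearest code word's index -> observation sequence O."""
    return np.argmin(((stream[:, None] - codebook[None]) ** 2).sum(-1), axis=1)
```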
  • 61. Motion Classification Algorithms: Hidden Markov Models (HMM). Forward algorithm: to find $P(O \mid \lambda_c)$. The probabilities/forward variables $\alpha_t(i)$ need to be computed and used to find this. See Rabiner (1989), Section III-A, and Klingmann (2009), Section 3.2.4. It is an efficient way to compute $P(O \mid \lambda_c) = \sum_{\text{all } Q} P(O \mid Q, \lambda_c)\, P(Q \mid \lambda_c)$, where the Q's are the many possible ($N^T$) state sequences that may be visited to generate O.
  • 62. Motion Classification Algorithms: Hidden Markov Models (HMM). Forward algorithm (figures from Rabiner (1989)): $\alpha_t(i) = P(O_1 ... O_t, q_t = S_i \mid \lambda_c)$ and $P(O \mid \lambda_c) = \sum_{i=1}^{N} \alpha_T(i)$. A sketch follows.
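A sketch of the forward recursion for the `HMM` container above, with O a NumPy array of codebook indices; real implementations add the scaling of Rabiner (1989), Section V-A (or work in log space), since these products underflow for long sequences:

```python
import numpy as np

def forward(hmm, O):
    """P(O | lambda) via the forward variables alpha_t(i)."""
    T = len(O)
    alpha = np.zeros((T, hmm.N))
    alpha[0] = hmm.pi * hmm.B[:, O[0]]                      # initialisation
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ hmm.A) * hmm.B[:, O[t]]  # induction
    return alpha[-1].sum()                                  # termination
```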
  • 63. Motion Classification Algorithms: Hidden Markov Models (HMM). Training: the forward-backward algorithm, for estimation of the A and B matrices of each $\lambda_c$, given the respective training observation sequences. The $\alpha_t(i)$'s and the backward variables $\beta_t(i)$'s need to be computed and used to update A and B. $\beta_t(i)$ is the probability of generating the remaining part of the observation sequence from time t+1 to T, given state $S_i$ at time t, i.e. $P(O_{t+1} O_{t+2} ... O_T \mid q_t = S_i, \lambda_c)$. See Rabiner (1989), Sections III-C and V-B, and Klingmann (2009), Sections 3.2.5, 3.2.6 and 4.5.2.
  • 64. Motion Classification Algorithms: Hidden Markov Models (HMM). A update: $\bar{a}_{ij} = \frac{\text{expected number of transitions from } S_i \text{ to } S_j}{\text{expected number of transitions from } S_i}$. B update: $\bar{b}_{jk} = \frac{\text{expected number of times in } S_j \text{ observing symbol } v_k}{\text{expected number of times in } S_j}$. The $\alpha_t(i)$'s and $\beta_t(i)$'s are used within these update equations. See Section V-B in Rabiner (1989) and Section 4.5.2 in Klingmann (2009) for the variant that we will use; this variant takes care of multiple training observation sequences. A single-sequence sketch follows.
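A single-sequence sketch of one re-estimation pass (again unscaled, so only suitable for short sequences); the multi-sequence variant the slide points to averages the expected counts over all training sequences instead:

```python
import numpy as np

def baum_welch_step(hmm, O):
    """One Baum-Welch update of A, B, pi from one observation sequence O (int array)."""
    T, N = len(O), hmm.N
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = hmm.pi * hmm.B[:, O[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ hmm.A) * hmm.B[:, O[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = hmm.A @ (hmm.B[:, O[t + 1]] * beta[t + 1])
    prob = alpha[-1].sum()                                  # P(O | lambda)
    # xi[t, i, j]: expected i -> j transition at t; gamma[t, i]: state occupancy.
    xi = (alpha[:-1, :, None] * hmm.A[None] *
          (hmm.B[:, O[1:]].T * beta[1:])[:, None, :]) / prob
    gamma = alpha * beta / prob
    hmm.A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    for k in range(hmm.M):
        hmm.B[:, k] = gamma[O == k].sum(axis=0) / gamma.sum(axis=0)
    hmm.pi = gamma[0]
```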
  • 65. Motion Classification Algorithms: Hidden Markov Models (HMM). Recognition: let $O_{stream}$ be the stream to be classified. We want to find $P(\lambda_c \mid O_{stream})$, using the forward algorithm and Bayes' rule. $P(\lambda_c \mid O_{stream})$ is the probability that $\lambda_c$, i.e. the HMM indexed by c, generated the sequence $O_{stream}$. The highest probability amongst all the $\lambda_c$'s tells us the category c into which the stream is classified.
  • 66. Motion Classification Algorithms: Hidden Markov Models (HMM). Recognition: $P(\lambda_c)$ may be calculated as the average of $P(O_j \mid \lambda_c)$ across the training observation sequences $O_j$. We compute $P(\lambda_c)$ and $P(O_{stream} \mid \lambda_c)$ for all c. Then $P(O_{stream}) = \sum_c P(O_{stream}, \lambda_c) = \sum_c P(O_{stream} \mid \lambda_c)\, P(\lambda_c)$, and $P(\lambda_c \mid O_{stream}) = \frac{P(O_{stream} \mid \lambda_c)\, P(\lambda_c)}{P(O_{stream})}$. See Klingmann (2009), Section 3.3. A sketch follows.
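A sketch of the recognition step over the per-category models, assuming the `forward` sketch above; `priors` stands in for the P(λ_c) values, however you choose to estimate them:

```python
import numpy as np

def recognise(hmms, priors, O_stream):
    """Posterior P(lambda_c | O_stream) over categories; returns (best c, posterior)."""
    likelihoods = np.array([forward(h, O_stream) for h in hmms])
    joint = likelihoods * np.asarray(priors)  # P(O_stream, lambda_c)
    posterior = joint / joint.sum()           # Bayes' rule: divide by P(O_stream)
    return int(np.argmax(posterior)), posterior
```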
  • 67. Challenging Active Music Scenarios. Lower-level technical challenges: How well does the system classify when the reference point (the user) is stationary versus moving? Can we distinguish these? How well does the system separate impulsive and sustained actions, e.g. hitting a drum versus bowing a violin? Can it differentiate, or not, between using the right or the left hand to do the "same" action?
  • 68. Challenging Active Music Scenarios. Higher-level semantic challenges: Can it separate gestures from actions, i.e. find the meaning-bearing part, e.g. the difference between actions performed with a sad, happy or angry intention? Can it distinguish between an expert and a non-expert user handling the device?