Wearable Computing - Part III: The Activity Recognition Chain (ARC)
Introduction to wearable computing, sensors and methods for activity recognition.

Speaker notes
  • So, how do we recognize a user's activities? With sensors. We use sensors worn on the body or integrated into clothing. Different kinds of sensors help us recognize activities. For example, activity implies movement, so we can use a motion sensor; some of you may have a mobile phone with motion sensors built in. <motion sensor demo> When I wave, you can see that the signal looks characteristic. Activities also often imply typical sounds, such as a drill, so a microphone can also provide input for activity recognition. The goal is to recognize the typical movement or sound of an activity from a sensor. This happens in two steps. First, we determine what an activity looks like in the sensor data: we find the sensor data pattern, e.g. here two activities recorded with a motion sensor. Second, we must find these patterns in the sensor data. Now that I know the patterns, I read a lot of data from the sensors; it could look like this. The question is whether I performed one of the activities. How do we do that? We compare the pattern with the sensor data, and we see that here the pattern and the sensor data are very similar. This means we have found that the user drank from a glass here. We do the same with the other patterns, and here we find a second activity.
  • On the following slides we will have a closer look at the activity spotting method. First we’ll walk through the processing chain to get an overview. The on-body sensors in the Motion Jacket deliver orientation data for the body limbs. From this data body trajectories are computed in Cartesian space. These trajectories are encoded in a discrete string representation. Strings are input to the spotting operation that uses string matching techniques. Activity-specific template strings are matched with the continuous motion string from the sensors. This is done in parallel for each activity class and trajectory. Retrieved segments of the SAME class from DIFFERENT trajectories are fused using a temporal overlap detection scheme.
  • The string matching operation is based on the approximate string matching algorithm. Its basis is the Levenshtein or edit distance. This distance comprises three symbol-level operations that allow two strings to be compared and their distance expressed as a scalar. A key modification of the standard algorithm allows template occurrences to be found at arbitrary positions within the motion string. (The approach requires one template string per activity class.)
  • This is an example of the matching operation with two activity class templates. For each incoming symbol in the motion string, the approximate string matching algorithm is executed to compute the current matching cost value. This is done for both template strings. Taking template string 1 as an example: one substitution and one deletion operation at the symbol level determine the cost at t0: r + d.
  • This plot shows the matching costs over time for one template string, which corresponds to a dedicated activity class. Minima in the cost time series that fall below a trained spotting threshold k are inferred as END-POINTS of activity occurrences. The START-POINTS are found by tracing the cost time series back to a suitable previous maximum. The retrieved segment is implicitly tied to the activity class whose template string was used during matching.
  • Transcript

    • 1. Daniel Roggen 2011 Wearable Computing Part III The Activity Recognition Chain (ARC)
    • 2. © Daniel Roggen www.danielroggen.net droggen@gmail.com Focus: activity recognition • Activity is a key element of context! Fitness coaching iPhone: Location-based services Step counter Wii Fall detection, alarm Elderly assistant
    • 3. © Daniel Roggen www.danielroggen.net droggen@gmail.com There is no « Drink Sensor »  • Simple sensors (e.g. RFID) can provide a "binary" information – Presence (e.g. RFID, Proximity infrared sensors) – Movement (e.g. ADXL345 accelerometer ‘activity/inactivity pin’) – Fall (e.g. ADXL345 accelerometer ‘freefall pin’) • But in general « activity-X sensor » does not exist – Sensor data must be interpreted – Multiple sensors must be correlated (data fusion) – Several factors influence the sensor data • Drinking while standing: the arm reaches the object then the mouth • Drinking while walking: the arm moves, and also the whole body • Context is interpreted from the sensor data with – Signal processing – Machine learning – Reasoning • Can be integrated into a « sensor node » or « smart sensor » – Sensor chip + data processing in a device
    • 4. © Daniel Roggen www.danielroggen.net droggen@gmail.com User Activity Structure [hierarchical timeline figure with labels: Working / Resting over Year 1–3; Go to work, Meeting, Read mail, Shopping, Go home over Week 10–12; Enter, Give talk, Listen, Leave; Walk, Stand, Speak, Show]
    • 5. © Daniel Roggen www.danielroggen.net droggen@gmail.com How to detect a presentation? • Place – Conference room – In front of audience – Generally at the lectern • Sound – User speaks – Maybe short interruptions – Otherwise silence • Motion – Mostly standing, with small walking motion – Hand motion, pointing – Typical head motion
    • 6. © Daniel Roggen www.danielroggen.net droggen@gmail.com Greeting Sensor placement  Upper body  Right wrist  Left upper leg Activity  Person is seated  Stands up  Greets somebody  Sits down again
    • 7. © Daniel Roggen www.danielroggen.net droggen@gmail.com Greeting [acceleration plots for the three sensor placements, axes from -2g to +2g]
    • 8. © Daniel Roggen www.danielroggen.net droggen@gmail.com Data recording Stand up, Sit down, Seating, Standing, Seating Upper body Wrist Hand on table, Arm motion, Handshake, Arm motion, Hand on table Time [s] Acceleration [g] The combination of the individual signals is distinctive of the activity!
    • 9. © Daniel Roggen www.danielroggen.net droggen@gmail.com How to recognize activities? With sensors on the body, in objects, in the environment, … Activity = movement: motion sensor. Activity = sound: microphone. 1. Activities are represented by typical signal patterns ("Drink from a glass", "Turn pages") 2. Recognition: "comparison" between template and sensor data ("Drink recognized", "Turn page recognized")
    • 10. © Daniel Roggen www.danielroggen.net droggen@gmail.com Recognition system characteristics
      – Execution, Offline: The system records the sensor data first; the recognition is performed afterwards. Typically used for non-interactive applications such as activity monitoring for health-related applications.
      – Execution, Online: The system acquires sensor data and processes it on-the-fly to infer activities. Typically used for activity-based computing and interactive applications (HCI).
      – Recognition, Continuous: The system “spots” the occurrence of activities or gestures in streaming data. It implements data stream segmentation, classification and null class rejection.
      – Recognition, Isolated / segmented: The system assumes that the sensor data stream is segmented at the start and end of a gesture by an oracle. It only classifies the sensor data into the activity classes. The oracle can be an external system in a working system (e.g. cross-modality segmentation), or the experimenter when assessing classification performance during design phases.
    • 11. © Daniel Roggen www.danielroggen.net droggen@gmail.com Activity recognition: learning by demonstration • Sensor data • 1) train activity models • 2) recognition [diagram: training data → model training → activity models; sensor data → recognition (comparison against activity models) → context / activity] • Training data is required
    • 12. © Daniel Roggen www.danielroggen.net droggen@gmail.com Activity models
      – World model, Stateless: The recognition system does not model the state of the world. Activities are recognized by spotting specific sensor signals. This is currently the dominant approach when dealing with the recognition of activity primitives (e.g. reach, grasp).
      – World model, Stateful: The system uses a model of the world, such as the user’s context or an environment map with the location of objects. This enhances activity recognition performance, at the expense of design-time knowledge and a more complex recognition system.
    • 13. © Daniel Roggen www.danielroggen.net droggen@gmail.com Assumptions • Constant sensor-signal to activity-class mapping • Design-time: identify sensor-signal/activity-class mapping – Sensor setup – Activity sets • Run-time: "low"-variability – Can't displace sensors or modify garments – Can't change the way activities are done
    • 14. © Daniel Roggen www.danielroggen.net droggen@gmail.com The activity recognition chain (ARC) • A standard set of steps followed by most research in activity recognition (e.g. [1,2,3,4]) • Streaming signal processing • Machine learning • Reasoning [1] J. Ward, P. Lukowicz, G. Tröster, and T. Starner, “Activity recognition of assembly tasks using body-worn microphones and accelerometers,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1553–1567, 2006. [2] L. Bao and S. S. Intille, “Activity recognition from user-annotated acceleration data,” in Pervasive Computing: Proc. of the 2nd Int’l Conference, Apr. 2004, pp. 1–17. [3] D. Figo, P. C. Diniz, D. R. Ferreira, and J. M. P. Cardoso, “Preprocessing techniques for context recognition from accelerometer data,” Personal and Ubiquitous Computing, vol. 14, no. 7, pp. 645–662, 2010. [4] Roggen et al., An educational and research kit for activity and context recognition from on-body sensors, Int. Conf. on Body Sensor Networks (BSN), 2010
    • 15. © Daniel Roggen www.danielroggen.net droggen@gmail.com The activity recognition chain [1]. Runtime (recognition phase), subsymbolic processing: sensor sampling, preprocessing, segmentation, feature extraction, classification, null class rejection and decision fusion; then reasoning / symbolic processing delivers activities with probability and time (A1, p1, t1; A2, p2, t2; ...) to the activity-aware application. Design-time (training phase): low-level activity models (primitives) and high-level activity models are optimized from sensor data and annotations. [1] Roggen et al., Wearable Computing: Designing and Sharing Activity-Recognition Systems Across Platforms, IEEE Robotics&Automation Magazine, 2011
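    A minimal sketch (Python/NumPy, not the original implementation) of the run-time stages named in the ARC above: sliding-window segmentation, feature extraction, classification against class centres, and null-class rejection. All names, window sizes and thresholds are illustrative assumptions.
      import numpy as np

      def arc_runtime(signal, centroids, labels, win=32, step=16, reject_thr=2.0):
          """Recognition phase on a 1-D signal; returns (window start, activity label) events."""
          events = []
          for start in range(0, len(signal) - win + 1, step):
              segment = signal[start:start + win]                    # segmentation
              features = np.array([segment.mean(), segment.std()])   # feature extraction
              dists = np.linalg.norm(centroids - features, axis=1)   # classification (nearest centre)
              best = int(np.argmin(dists))
              if dists[best] < reject_thr:                           # null-class rejection
                  events.append((start, labels[best]))
          return events

      # Toy usage with made-up centroids for two classes
      rng = np.random.default_rng(0)
      signal = np.concatenate([rng.normal(0, 0.1, 200), rng.normal(1.0, 0.5, 200)])
      centroids = np.array([[0.0, 0.1], [1.0, 0.5]])
      print(arc_runtime(signal, centroids, labels=["rest", "move"]))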
    • 16. © Daniel Roggen www.danielroggen.net droggen@gmail.com Segmentation • A major challenge! • Find the boundaries of activities for later classification (e.g. segments classified as Drink, Turn) • Methods: – Sliding window segmentation – Energy-based segmentation – Rest-position segmentation – HMM [1], DTW [2,3], SWAB [4] • Classification without segmentation is here undefined – classifier not trained on "no activity" – "null class" hard to model: can be anything • Or use "null class rejection" after classification [1] J. Deng and H. Tsui. An HMM-based approach for gesture segmentation and recognition. In 15th International Conference on Pattern Recognition, volume 3, pages 679–682, 2000. [2] M. Ko, G. West, S. Venkatesh, and M. Kumar, “Online context recognition in multisensor systems using dynamic time warping,” in Proc. Int. Conf. on Intelligent Sensors, Sensor Networks and Information Processing, 2005, pp. 283–288. [3] Stiefmeier, Wearable Activity Tracking in Car Manufacturing, PCM, 2008 [4] E. Keogh, S. Chu, D. Hart, and M. Pazzani. An online algorithm for segmenting time series. In Proceedings of the IEEE International Conference on Data Mining, pages 289–96, 2001.
    • 17. © Daniel Roggen www.danielroggen.net droggen@gmail.com Segmentation: sliding/jumping window • Commonly used for audio processing – E.g. 20 ms windows • or for periodic activities – E.g. walking, with windows of few seconds
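    A small sketch of the sliding vs. jumping window idea above (Python/NumPy; the sampling rate and window lengths are illustrative). Setting step_s equal to length_s gives a jumping (non-overlapping) window.
      import numpy as np

      def windows(x, fs, length_s, step_s):
          """Yield (start_index, window) pairs; step_s == length_s gives a jumping window."""
          n = int(round(length_s * fs))
          step = int(round(step_s * fs))
          for start in range(0, len(x) - n + 1, step):
              yield start, x[start:start + n]

      x = np.arange(1000)          # stand-in for a sensor stream sampled at fs Hz
      fs = 1000                    # assumed sampling rate
      audio_like = list(windows(x, fs, 0.020, 0.010))   # 20 ms windows, 50% overlap
      walking_like = list(windows(x, fs, 1.0, 1.0))     # 1 s jumping windows
      print(len(audio_like), len(walking_like))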
    • 18. © Daniel Roggen www.danielroggen.net droggen@gmail.com Activity characteristics
      – Activity kinds, Periodic: Activities exhibiting periodicity, such as walking, running, rowing, biking, etc. Sliding window and frequency-domain features are generally used.
      – Activity kinds, Sporadic: The activity or gesture occurs sporadically, interspersed with other activities or gestures. Segmentation plays a key role to isolate the subset of data containing the gesture.
      – Activity kinds, Static: The system deals with the detection of static postures or static pointing gestures. Sliding window and time-domain features are generally used.
    • 19. © Daniel Roggen www.danielroggen.net droggen@gmail.com Segmentation • Energy-based segmentation [1] – Between activities the user does not move – Low energy in the acceleration signal – E.g. standard deviation of acceleration compared to a threshold • Rest-position segmentation [1] – User comes back to a rest position between gestures – Can be trained • Challenge: – Usually no 'pause' or 'rest' between activities! – Combination of segmentation and null class rejection – E.g. DTW [2] [1] Roggen et al., An educational and research kit for activity and context recognition from on-body sensors, Int. Conf. on Body Sensor Networks (BSN), 2010 [2] Stiefmeier, Wearable Activity Tracking in Car Manufacturing, PCM, 2008
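    A hedged sketch of the energy-based segmentation idea above: regions where the standard deviation of the acceleration in a short window exceeds a threshold are treated as activity segments. Window length and threshold are illustrative, not values from the cited work.
      import numpy as np

      def energy_segments(acc, win=50, threshold=0.2):
          """Return (start, end) index pairs of regions with above-threshold motion energy."""
          active = np.array([acc[i:i + win].std() > threshold
                             for i in range(0, len(acc) - win + 1)])
          segments, start = [], None
          for i, a in enumerate(active):
              if a and start is None:
                  start = i
              elif not a and start is not None:
                  segments.append((start, i + win))
                  start = None
          if start is not None:
              segments.append((start, len(acc)))
          return segments

      rng = np.random.default_rng(1)
      acc = np.concatenate([rng.normal(0, 0.01, 300),   # rest
                            rng.normal(0, 1.0, 200),    # gesture
                            rng.normal(0, 0.01, 300)])  # rest
      print(energy_segments(acc))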
    • 20. © Daniel Roggen www.danielroggen.net droggen@gmail.com Feature extraction • Compute features on the signal that emphasize signal characteristics related to the activities • Tradeoffs – Reduce dimensionality – Computational complexity – Maximize separation between classes – Specificity of the features to the classes: robustness, overfitting • Some common features for acceleration data [1]: e.g. mean and standard deviation over a window [1] Figo, Diniz, Ferreira, Cardoso. Preprocessing techniques for context recognition from accelerometer data, Pers Ubiquit Comput, 14:645–662, 2010
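    An illustrative computation of a few of the common time-domain acceleration features used on the following slides (mean, standard deviation, energy, mean-crossing rate) for one window; the exact feature definitions in the cited work may differ.
      import numpy as np

      def window_features(axis_data):
          x = np.asarray(axis_data, dtype=float)
          centered = x - x.mean()
          return {
              "mean": x.mean(),
              "std": x.std(),
              "energy": np.sum(x ** 2) / len(x),
              "mean_crossing_rate": np.mean(np.abs(np.diff(np.sign(centered))) > 0),
          }

      window = np.sin(np.linspace(0, 4 * np.pi, 100)) + 0.5   # toy acceleration window
      print(window_features(window))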
    • 21. © Daniel Roggen www.danielroggen.net droggen@gmail.com Car manufacturing activities Data from Zappi et al, Activity recognition from on-body sensors: accuracy-power trade-off by dynamic sensor selection, EWSN, 2008 Dataset available at: http://www.wearable.ethz.ch/resources/Dataset
    • 22. © Daniel Roggen www.danielroggen.net droggen@gmail.com Feature space: car manufacturing activities [feature-space plots for different feature sets: angle X / angle Y / angle Z; energy X / energy Y / energy Z; energy / angle X / angle Y; energy X / energy Y; energy / angle X; angle X / angle Y] Data from Zappi et al, Activity recognition from on-body sensors: accuracy-power trade-off by dynamic sensor selection, EWSN, 2008 Dataset available at: http://www.wearable.ethz.ch/resources/Dataset
    • 23. © Daniel Roggen www.danielroggen.net droggen@gmail.com 1 = Stand; 2= Walk; 3 = Sit; 4 = Lie • Mean crossing rate of x, y and z axes, std of magnitude Feature space: modes of locomotion (FS1) Calatroni et al, Transferring Activity Recognition Capabilities between Body-Worn Motion Sensors: How to Train Newcomers to Recognize Modes of Locomotion, INSS, 2011
    • 24. © Daniel Roggen www.danielroggen.net droggen@gmail.com • Mean value of x, y and z axes, std of magnitude Feature space: modes of locomotion (FS2) 1 = Stand; 2= Walk; 3 = Sit; 4 = Lie Calatroni et al, Transferring Activity Recognition Capabilities between Body-Worn Motion Sensors: How to Train Newcomers to Recognize Modes of Locomotion, INSS, 2011
    • 25. © Daniel Roggen www.danielroggen.net droggen@gmail.com • Ratio of x and y axes, ratio of y and z axes, std of magnitude Feature space: modes of locomotion (FS3) 1 = Stand; 2= Walk; 3 = Sit; 4 = Lie Calatroni et al, Transferring Activity Recognition Capabilities between Body-Worn Motion Sensors: How to Train Newcomers to Recognize Modes of Locomotion, INSS, 2011
    • 26. © Daniel Roggen www.danielroggen.net droggen@gmail.com • Mean value of x, y and z axes, std of x, y and z axes Feature space: modes of locomotion (FS4) 1 = Stand; 2= Walk; 3 = Sit; 4 = Lie Calatroni et al, Transferring Activity Recognition Capabilities between Body-Worn Motion Sensors: How to Train Newcomers to Recognize Modes of Locomotion, INSS, 2011
    • 27. © Daniel Roggen www.danielroggen.net droggen@gmail.com Classification accuracy per sensor placement and feature set (NCC / 11-NN):
      Feature set:  FS1 NCC  FS1 11-NN  FS2 NCC  FS2 11-NN  FS3 NCC  FS3 11-NN  FS4 NCC  FS4 11-NN
      Knee:         0.64     0.71       0.94     0.95       0.94     0.94       0.95     0.94
      Shoe:         0.53     0.65       0.68     0.86       0.70     0.86       0.77     0.87
      Back:         0.60     0.70       0.79     0.81       0.66     0.74       0.78     0.82
      RUA:          0.53     0.58       0.77     0.84       0.72     0.75       0.73     0.86
      RLA:          0.45     0.59       0.72     0.81       0.67     0.80       0.61     0.84
      LUA:          0.55     0.64       0.86     0.85       0.78     0.85       0.75     0.87
      LLA:          0.60     0.66       0.70     0.82       0.75     0.80       0.68     0.82
      Hip:          0.57     0.62       0.77     0.81       0.81     0.79       0.77     0.79
      Less overlapping features yield better accuracies regardless of the classifier. kNN is better than NCC, which is more evident with more overlapping features.
    • 28. © Daniel Roggen www.danielroggen.net droggen@gmail.com Feature extraction • Ideally: explore as many features as possible – Not limited to "human design space" • Evolutionary techniques to search a larger set of solutions – E.g. genetic programming [1] Förster et al., Evolving discriminative features robust to sensor displacement for activity recognition in body area sensor networks, ISSNIP, 2009 [figures: space of all possible designs vs. human design space; cross-over genetic operator; example evolved feature]
    • 29. © Daniel Roggen www.danielroggen.net droggen@gmail.com Feature selection • Select the "best" set of features • Improve the performance of learning models by: – Alleviating the effect of the curse of dimensionality. – Enhancing generalization capability. – Speeding up the learning process. – Improving model interpretability. • Tradeoffs – Select features that correlate strongest to the classification variable (maximum relevance), ... – ... and are mutually far away from each other (minimum redundancy) – Emphasize characteristics of the signal related to the activity – Computational complexity (minimize feature number) – Complementarity – Robustness [1] Peng et al., Feature selection based on mutual information-criteria of max-dependency max-relevance and min-redundancy, PAMI, 2005
    • 30. © Daniel Roggen www.danielroggen.net droggen@gmail.com Feature selection Filter methods • Do not involve a classifier but a 'filter', e.g. mutual information • + – Computationally light – General: good for a larger set of classifiers • - – Feature set may not be ideal for all classifiers – Larger subsets of features [diagram: set of candidate features → subset selection algorithm → learning algorithm] Wrapper methods • Involve the classifier • + – Higher accuracy (exploits the classifier's characteristics) – Can avoid overfitting with cross-validation • - – Computationally expensive – Features not general [diagram: set of candidate features → subset selection algorithm → subset evaluation with the learning algorithm → learning algorithm]
    • 31. © Daniel Roggen www.danielroggen.net droggen@gmail.com Sequential forward selection (SFS) • "Brute force" is not applicable! – With N candidate features: 2^N feature sets to test 1. Start from an empty feature set Y0 = {Ø} 2. Select the best feature x+ that maximizes an objective function J: x+ = argmax J(Yk + x) over the candidate features x not yet in Yk 3. Update the feature set: Yk+1 = Yk + x+ ; k = k+1 4. Go to 2 • Works well with a small number of features • Objective: measure of "goodness" of the features – E.g. accuracy [1] Peng et al., Feature selection based on mutual information-criteria of max-dependency max-relevance and min-redundancy, PAMI, 2005
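    A minimal sequential forward selection sketch following the steps above; J stands for any objective ("goodness") function such as cross-validated accuracy, and the toy objective used here is only a placeholder.
      import numpy as np

      def sfs(candidate_features, J, max_features):
          selected = []                                          # Y0 = {}
          remaining = list(candidate_features)
          while remaining and len(selected) < max_features:
              scores = [J(selected + [f]) for f in remaining]    # evaluate each extension
              best = int(np.argmax(scores))                      # x+ = argmax J(Yk + x)
              selected.append(remaining.pop(best))               # Yk+1 = Yk + x+
          return selected

      # Toy objective standing in for validation accuracy: prefer low feature indices
      print(sfs(range(10), J=lambda subset: -sum(subset), max_features=3))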
    • 32. © Daniel Roggen www.danielroggen.net droggen@gmail.com Classification • Map feature vector to a class label
    • 33. © Daniel Roggen www.danielroggen.net droggen@gmail.com Bayesian classification • F: sensor reading, features • C: activity class • Bayes theorem: P(C|F) = P(F|C) · P(C) / P(F) – P(F|C): conditional probability of the sensor reading F given the class C (from training data) – P(C): prior probability of the class – P(F): marginal probability (sum over all the ways to obtain F) – P(C|F): posterior probability • With multiple sensors: conditional independence (Naive Bayes): P(C|F1,...,Fn) = P(F1,...,Fn|C) · P(C) / P(F) = P(F1|C) · ... · P(Fn|C) · P(C) / P(F) • In practice only the numerator is important (the denominator is constant) • Classification with a detector: e.g. class with the maximum posterior probability
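    A sketch of the naive Bayes decision rule above, assuming Gaussian per-class, per-feature likelihoods estimated from training data; a test vector is assigned to the class with the largest numerator P(F1|C) · ... · P(Fn|C) · P(C). Data values are toy examples.
      import numpy as np

      def train_gaussian_nb(X, y):
          model = {}
          for c in np.unique(y):
              Xc = X[y == c]
              model[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-6, len(Xc) / len(X))
          return model

      def predict_nb(model, x):
          def log_numerator(c):
              mu, var, prior = model[c]
              log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
              return log_lik + np.log(prior)       # log P(F|C) + log P(C)
          return max(model, key=log_numerator)     # class with maximum posterior

      X = np.array([[0.1, 0.0], [0.2, 0.1], [1.0, 1.1], [0.9, 1.0]])
      y = np.array([0, 0, 1, 1])
      print(predict_nb(train_gaussian_nb(X, y), np.array([0.95, 1.05])))   # -> 1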
    • 34. © Daniel Roggen www.danielroggen.net droggen@gmail.com Nearest centroid classifier (NCC) • One of the simplest classification methods – No parameters – Classify to the nearest class center • Memory: C class centers • Classification: C comparisons • Pros: – Simple implementation – Online model update: add/remove classes, adapt class centers – Fast, low memory footprint • Cons: – Simple class boundaries – Suited when classes cluster in the feature space
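    A nearest centroid classifier sketch matching the description above: training stores one centre per class, classification picks the closest centre. Labels and data are toy values.
      import numpy as np

      class NearestCentroid:
          def fit(self, X, y):
              self.labels = np.unique(y)
              self.centroids = np.array([X[y == c].mean(axis=0) for c in self.labels])
              return self

          def predict(self, x):
              # classify to the class whose centre is closest to the feature vector
              return self.labels[np.argmin(np.linalg.norm(self.centroids - x, axis=1))]

      X = np.array([[0.0, 0.1], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
      y = np.array(["sit", "sit", "walk", "walk"])
      print(NearestCentroid().fit(X, y).predict(np.array([0.8, 0.9])))   # -> "walk"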
    • 35. © Daniel Roggen www.danielroggen.net droggen@gmail.com k-nearest neighbor (k-NN) • Simple classification methods – Instance based learning – Classify to most represented around the test point – Parameter: k – k=1: nearest neighbor (overfit) – k>>: "smoothes" noise in training data [1] Garcia et al, K-nearest neighbor search-fast GPU-based implementations and application to high-dimensional feature matching, ICIP, 2010 Figure from http://jakehofman.com/ddm/2009/09/lecture-02/ • Memory: N training points • Classification: N comparisons • Pros: – Simple implementation – Online model update (add/remove instances, classes) – Complex boundaries • Cons: – Potentially slow, or lots of memory • Some faster versions – GPGPU [1] – Kd-trees to optimize neighborhood search
    • 36. © Daniel Roggen www.danielroggen.net droggen@gmail.com Decision tree • Simple classification methods – Programmatic tree – Parameters: decision boundaries • C4.5 ? F1 F2 t1 t2 F1 F2 < t1 >= t1 < t2 >= t2 • Memory: Decision boundaries • Classification: lightweight if/else comparisons • Pros: – Simple implementation – Continuous and discrete values, symbols • Cons: – Appropriate when classes separate along feature dimensions • Or PCA – Limit the size of the tree to avoid overfitting Quinlan, J. R. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, 1993
    • 37. © Daniel Roggen www.danielroggen.net droggen@gmail.com Null-class rejection • Continuous activity recognition with sliding window segmentation – Gestures are not always present in a segment – such segments must be rejected as the "null class" • Also reject when the confidence in the classification result is too low • Many classifiers can be "calibrated" to have probabilistic outputs [2] – Statistical test / likelihood of an activity (NCC, kNN) [1] Calatroni et al., ETHZ Tech Report, 2010 [2] I. Cohen and M. Goldszmidt, “Properties and benefits of calibrated classifiers,” in Proc. Knowledge Discovery in Databases (PKDD), 2004.
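    A sketch of null-class rejection on top of an NCC-style classifier: a segment is assigned to the closest class only if its distance stays below a threshold (e.g. derived from training-set distances); otherwise it is labelled as the null class. Threshold and data values are illustrative.
      import numpy as np

      def classify_with_rejection(x, centroids, labels, thresholds):
          dists = np.linalg.norm(centroids - x, axis=1)
          best = int(np.argmin(dists))
          # reject as "null" when even the closest class is too far away
          return labels[best] if dists[best] <= thresholds[best] else "null"

      centroids = np.array([[0.0, 0.0], [1.0, 1.0]])
      labels = ["drink", "turn_page"]
      thresholds = np.array([0.3, 0.3])   # e.g. chosen from training-set distance statistics
      print(classify_with_rejection(np.array([0.05, 0.1]), centroids, labels, thresholds))  # -> "drink"
      print(classify_with_rejection(np.array([0.5, 0.5]), centroids, labels, thresholds))   # -> "null"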
    • 38. © Daniel Roggen www.danielroggen.net droggen@gmail.com Sliding window and temporal data structure • Activities where temporal data structure generally not important: – Walking, running, rowing, biking... – Generally periodic activities • Activities where it is important: – Open dishwasher: walk, grasp handle up, pull down, walk – Close dishwasher: walk, grasp handle down, pull up, walk – Opening or closing car door – Generally manipulative gestures – Complex hierarchical activities • Problem with some features: – Different sensor readings but identical features: μ1 = μ2 μ1 μ2 Act A Act B
    • 39. © Daniel Roggen www.danielroggen.net droggen@gmail.com Sliding window and temporal data structure • Time to space mapping • Encode the temporal unfolding in the feature vector – E.g. subwindows μ1,1 μ2,1 μ1,2 μ2,2 Act A Act B sw1 sw2 A B • Other approaches: – Hidden Markov models – Dynamic time warping / string matching – Signal predictors
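    A sketch of the time-to-space mapping above: the window is split into subwindows and per-subwindow features are concatenated, so the temporal order of events inside the window is preserved in the feature vector. The two toy activities below have identical overall means but different subwindow features.
      import numpy as np

      def subwindow_features(window, n_sub=2):
          parts = np.array_split(np.asarray(window, dtype=float), n_sub)
          # concatenate (mean, std) of each subwindow: encodes the temporal unfolding
          return np.concatenate([[p.mean(), p.std()] for p in parts])

      act_a = np.concatenate([np.zeros(50), np.ones(50)])   # low then high
      act_b = np.concatenate([np.ones(50), np.zeros(50)])   # high then low
      print(subwindow_features(act_a))
      print(subwindow_features(act_b))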
    • 40. © Daniel Roggen www.danielroggen.net droggen@gmail.com t-1 Predictor (CTRNN) Error ax t ay t az t ay t-1 az t-1 ax t-1 py t pz t px t Prediction error for gesture of class 1 Gesture recognition using neural-network signal predictors • Signal: 3-D acceleration vector • Predict future acceleration vector • Operation on raw signal a t Predictor t0 t1 Time delayRaw acceleration Prediction Prediction error Class=best predictiont-1 Predictor (CTRNN) Error Prediction error for gesture of class 2 • Predictors “trained” on gesture classes • Prediction error smaller on trained class [1] Bailador et al., Real time gesture recognition using Continuous Time Recurrent Neural Networks, BodyNets, 2007
    • 41. © Daniel Roggen www.danielroggen.net droggen@gmail.com Predictor: Continuous Time Recurrent Neural Network (CTRNN) Continuous-time recurrent neural network (CTRNN) • Continuous model neurons • Fully connected network • Rich dynamics (non-linear, temporal dynamics) • Theoretically: approximation of any dynamical system • Well suited as universal predictor γi γj ωij [1] Bailador et al., Real time gesture recognition using Continuous Time Recurrent Neural Networks, BodyNets, 2007
    • 42. © Daniel Roggen www.danielroggen.net droggen@gmail.com Architecture of the CTRNN predictor • 5 neurons, fully connected • 3 inputs: acceleration vector at the previous step • "Hidden" neurons • 3 outputs: acceleration vector at the next step • Connections between neurons/inputs • Standard CTRNN state equation: τ_i · dγ_i/dt = −γ_i + Σ_j ω_ij · σ(γ_j + β_j) + Σ_k w_ik · I_k, where γ_i is the state of neuron i at time t, ω_ij the connection weight between neurons i and j, w_ik the connection weight of input k to neuron i, I_k the value of input k (X, Y, Z), β_j the bias of neuron j, and τ_i the time constant of neuron i • Discretization using forward Euler numerical integration with a time step of 0.01 s: γ_i(t+Δt) = γ_i(t) + (Δt/τ_i) · [−γ_i(t) + Σ_j ω_ij · σ(γ_j(t) + β_j) + Σ_k w_ik · I_k(t)] [1] Bailador et al., Real time gesture recognition using Continuous Time Recurrent Neural Networks, BodyNets, 2007
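    A hedged sketch of one forward-Euler update of a small fully connected CTRNN, following the standard CTRNN state equation above; the exact formulation, weight ranges and parameter values of the cited predictor may differ, and the random weights here are purely illustrative.
      import numpy as np

      def ctrnn_step(y, W, Win, inputs, bias, tau, dt=0.01):
          """y: neuron states; W: neuron-neuron weights; Win: input weights; dt = 10 ms."""
          activation = 1.0 / (1.0 + np.exp(-(y + bias)))     # sigmoid of biased state
          dy = (-y + W @ activation + Win @ inputs) / tau
          return y + dt * dy                                  # forward Euler integration

      rng = np.random.default_rng(0)
      n, n_in = 5, 3                      # 5 neurons, 3 acceleration inputs as on the slide
      y = np.zeros(n)
      W, Win = rng.normal(size=(n, n)), rng.normal(size=(n, n_in))
      bias, tau = rng.normal(size=n), np.full(n, 0.1)
      y = ctrnn_step(y, W, Win, np.array([0.0, 0.0, 1.0]), bias, tau)
      print(y)                            # 3 of the neuron states serve as the predicted acceleration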
    • 43. © Daniel Roggen www.danielroggen.net droggen@gmail.com Training of the signal predictors • Record instances of each gesture class • Train one predictor for each class • For each class: minimize prediction error • Genetic algorithm – Robust in complex search spaces – Representation of the parameters by a genetic string (binary string) • Global optimization of neural network parameters – Neuron interconnection weights – Neuron input weights – Time constant – Bias [1] Bailador et al., Real time gesture recognition using Continuous Time Recurrent Neural Networks, BodyNets, 2007
    • 44. © Daniel Roggen www.danielroggen.net droggen@gmail.com Genetic algorithm • Encoding: neuron weights, input weights, bias & time constant with 6 bits each; neuron parameters: 60 bits; genetic string (5 neurons): 300 bits • Fitness function – Minimize the prediction error for a given class – Measured on the N instances of a training set T (T1...TN) – Lower is better (smaller prediction error) • GA parameters – 100 individuals – Rank selection of the 30 best individuals – One-point crossover rate: 70% – Mutation rate: 1% per bit – Elitism [1] Bailador et al., Real time gesture recognition using Continuous Time Recurrent Neural Networks, BodyNets, 2007
    • 45. © Daniel Roggen www.danielroggen.net droggen@gmail.com Experiments • 8 gesture classes • Planar • Acceleration sensor on wrist • 20 instances per class (one person) • "Restricted" setup – No motion between gestures – Automatic segmentation (magnitude of the signal >1g indicates gesture) • "Unconstrained" setup – Freely moving in an office, typical activities (sitting, walking, reading …) – Manual segmentation pressing a button [1] Bailador et al., Real time gesture recognition using Continuous Time Recurrent Neural Networks, BodyNets, 2007
    • 46. © Daniel Roggen www.danielroggen.net droggen@gmail.com Results: unconstrained setup • Training: 62%-100% (80.5% average); testing: 48%-92% (63.6% average) • User egomotion Training Testing [1] Bailador et al., Real time gesture recognition using Continuous Time Recurrent Neural Networks, BodyNets, 2007
    • 47. © Daniel Roggen www.danielroggen.net droggen@gmail.com Prediction error: gesture of class A [1] Bailador et al., Real time gesture recognition using Continuous Time Recurrent Neural Networks, BodyNets, 2007
    • 48. © Daniel Roggen www.danielroggen.net droggen@gmail.com Prediction error: one instance per class [1] Bailador et al., Real time gesture recognition using Continuous Time Recurrent Neural Networks, BodyNets, 2007
    • 49. © Daniel Roggen www.danielroggen.net droggen@gmail.com Activity segmentation and classification with string matching: Sensors + signal processing → trajectories → strings (e.g. becfcca, aabadca) → string matching against templates (e.g. bad, cfcc) → spotted segments → fusion by overlap detection → activity spotting / filtering [1] Stiefmeier et al., Wearable Activity Tracking in Car Manufacturing, PCM, 2008
    • 50. © Daniel Roggen www.danielroggen.net droggen@gmail.com Motion encoding [1] Stiefmeier et al., Wearable Activity Tracking in Car Manufacturing, PCM, 2008 a b c d e f g h Codebook x y b b c c b b d d c c b b b b Direction Vector Trajectory
    • 51. © Daniel Roggen www.danielroggen.net droggen@gmail.com String matching • An approximate string matching algorithm is used to spot activity occurrences in the motion string – Based on a distance measure called the Levenshtein or edit distance – The edit distance involves symbol operations associated with dedicated costs • substitution/replacement r • insertion i • deletion d – Crucial algorithm modification to find template occurrences at arbitrary positions within the motion string [1] Stiefmeier et al., Wearable Activity Tracking in Car Manufacturing, PCM, 2008
    • 52. © Daniel Roggen www.danielroggen.net droggen@gmail.com Approximate string matching [1] Stiefmeier et al., Wearable Activity Tracking in Car Manufacturing, PCM, 2008
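    A sketch of approximate string matching for spotting: the usual edit-distance dynamic programme, with the first row initialised to zero so that a template occurrence may start at any position in the motion string. The cost in the last row after each motion symbol corresponds to the matching cost C(t) that is thresholded on the next slide. Strings and unit costs here are toy values.
      import numpy as np

      def matching_costs(template, motion, cost_sub=1, cost_ins=1, cost_del=1):
          m, n = len(template), len(motion)
          D = np.zeros((m + 1, n + 1), dtype=int)
          D[:, 0] = np.arange(m + 1) * cost_del      # matching against an empty prefix
          # D[0, :] stays 0: occurrences may start anywhere in the motion string
          for i in range(1, m + 1):
              for j in range(1, n + 1):
                  D[i, j] = min(D[i - 1, j - 1] + (template[i - 1] != motion[j - 1]) * cost_sub,
                                D[i - 1, j] + cost_del,
                                D[i, j - 1] + cost_ins)
          return D[m, :]                             # matching cost after each motion symbol

      costs = matching_costs("bdcc", "aabadcabdcccb")
      print(costs)                                   # minima below a threshold k mark activity end points
      print(int(np.argmin(costs[1:]) + 1))           # position of the best match end point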
    • 53. © Daniel Roggen www.danielroggen.net droggen@gmail.com Spotting operation t Matching Cost C1(t) b bd cc bb kthr,1 Activity End Point Activity Start Point Spotted Segment [1] Stiefmeier et al., Wearable Activity Tracking in Car Manufacturing, PCM, 2008
    • 54. © Daniel Roggen www.danielroggen.net droggen@gmail.com String matching • + – Easily implemented in FPGAs / ASICs – Lightweight – Computational complexity scales linearly with the number of templates – Multiple templates per activity • - – Needs a string encoding – Hard to decide how to quantize the sensor data – An online implementation requires "forgetting the past"
    • 55. © Daniel Roggen www.danielroggen.net droggen@gmail.com Activity Recognition with Hidden Markov model • Markov chain – Discrete-time stochastic process – Describes the state of a system at successive times – State transitions are probabilistic – Markov property: state transition depends only on the current state – State is visible to the observer – Only parameter: state transition probabilities • Hidden Markov model – Statistical model which assumes the system being modeled is a Markov chain – Unknown parameters – State is NOT visible to the observer – But variables influenced by the state are visible (probability distribution for each state) – Observations generated by HMM give information about the state sequence 0 1 2 3 a01 a02 a12 a13 a23 a00 0 1 2 3 a01 a02 a12 a13 a23 a00 Z1Z0 b21 b20 Z2 b22
    • 56. © Daniel Roggen www.danielroggen.net droggen@gmail.com Hidden Markov model: parameters • aij: state transition probabilities (A = {aij}, here a 4×4 matrix a00...a33) • bij: observation probabilities (B = {bij}, here a 4×3 matrix b00...b32) • Π: initial state probabilities (Π0...Π3) • N: number of states (N = 4) • M: number of symbols (M = 3) • X: state space, X = {x1, x2, x3, ...} • Z: observations, Z = {z1, z2, z3, ...} • λ(A, B, Π): HMM model [state diagram: states 0–3 with transitions a00, a01, a02, a12, a13, a23 and observations Z0, Z1, Z2 emitted from state 2 with b20, b21, b22]
    • 57. © Daniel Roggen www.danielroggen.net droggen@gmail.com Hidden Markov model: 3 main questions Find most likely sequence of states generating Z: {xi}T • Model parameters λ known, output sequence Z known • Viterbi algorithm HMM training: find the HMM parameters λ • (Set of) Output sequence(s) known • Find the observation prob., state transition prob., .... • Statistics, expectation maximization: Baum-Welch algorithm Find probability of output sequence: P(Z¦ λ) • Model parameters λ known, output sequence Z known • Forward algorithm
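    A minimal forward-algorithm sketch for the last question, computing P(Z | λ) from known model parameters; the 2-state, 3-symbol model below is a toy example, not taken from the slides. Classification with separate HMMs (slide 59) then picks the class whose model yields the largest P(Z | λ).
      import numpy as np

      def forward(A, B, pi, observations):
          """A: state transition probs, B: observation probs, pi: initial state probs."""
          alpha = pi * B[:, observations[0]]
          for z in observations[1:]:
              alpha = (alpha @ A) * B[:, z]   # propagate states, then weight by observation likelihood
          return alpha.sum()                  # P(Z | lambda)

      A  = np.array([[0.7, 0.3], [0.0, 1.0]])             # toy 2-state left-to-right chain
      B  = np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])   # 3 observation symbols
      pi = np.array([1.0, 0.0])
      print(forward(A, B, pi, [0, 0, 2, 2]))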
    • 58. © Daniel Roggen www.danielroggen.net droggen@gmail.com • Waving hello (by raising the hand) – Raising the arm – Lowering the arm immediately after • Handshake – Raising the arm – Shaking – Lowering the arm Handraise v.s. Handshake • Measurements: angular speed of the lower arm at the elbow • Only 3 discrete values: – <0, negative angular speed – =0, zero angular speed – >0, positive angular speed α > > > > < < < = < < > > = < = = < > < < > < < < < < = < < < Handraise Handshake ?
    • 59. © Daniel Roggen www.danielroggen.net droggen@gmail.com Classification with separate HMMs • Train HMM for each class (HMM0, HMM1, ....) with Baum-Welch – HMM models the gestures • Classify a sequence of observations – Compute the probability of the sequence with each HMM • Forward algorithm: P(Z / HMMi). – Consider the HMM probability as the a priori probability for the classification • In general the class corresponds to the HMM with the highest probability C=0,1 Gesture 0 Gesture 1 HMM0 HMM1 P(G=0) P(G=1) MaxGesture Gesture 2 Gesture 3 HMM2 P(G=2) HMM3 P(G=3) C=0,1,2,3 Training / testing dataset Likelihood estimation Classification w/maximum likelihood
    • 60. © Daniel Roggen www.danielroggen.net droggen@gmail.com Validation of activity recognition [1] • Recognition performance – Confusion matrix – ROC curve – Continuous activity recognition measures – Latency • User-related measures – Comfort / user acceptance – Robustness – Cost • Processing-related measures – Computational complexity, memory – Energy • ... application dependent! [1] Villalonga et al., Bringing Quality of Context into Wearable Human Activity Recognition Systems, First International Workshop on Quality of Context (QuaCon), 2009
    • 61. © Daniel Roggen www.danielroggen.net droggen@gmail.com Performance measures: Confusion matrix • Instance based • Indicates how an instance is classified / what is the true class • Ideally: diagonal matrix • TP / TN: True positive / negative – correctly detected when there is (or isn't) an activity • FP / FN: False positive / negative – detected an activity when there isn't, or not detected when there is • Substitution: correctly detected, but incorrectly classified [1] Villalonga et al., Bringing Quality of Context into Wearable Human Activity Recognition Systems, First International Workshop on Quality of Context (QuaCon), 2009
    • 62. © Daniel Roggen www.danielroggen.net droggen@gmail.com Performance measures: ROC curve • Receiver operating characteristic • Indicates classifier performance when a parameter is varied – E.g. null class rejection threshold • True positive rate (TPR) or sensitivity – TPR = TP / P = TP / (TP + FN) • False positive rate (FPR) – FPR = FP / N = FP / (FP + TN) – Specificity (true negative rate) = 1 − FPR [1] Villalonga et al., Bringing Quality of Context into Wearable Human Activity Recognition Systems, First International Workshop on Quality of Context (QuaCon), 2009
    • 63. © Daniel Roggen www.danielroggen.net droggen@gmail.com Performance measures: online activity recognition • Problem with the previous measures: they are suited for isolated activity recognition – I.e. the activity is perfectly segmented • They do not reflect the performance of online (continuous) recognition • Ward et al. introduce [2]: – Overfill / underfill: activities detected as longer/shorter than the ground truth – Insertions / deletions – Merge / fragmentation / substitutions [1] Villalonga et al., Bringing Quality of Context into Wearable Human Activity Recognition Systems, First International Workshop on Quality of Context (QuaCon), 2009 [2] Ward et al., Performance metrics for activity recognition, ACM Transactions on Information Systems and Technology, 2(1), 2011 (figures from [1] and [2])
    • 64. © Daniel Roggen www.danielroggen.net droggen@gmail.com Validation Entire dataset Training / evaluation Train set Test set • Optimization of the ARC on the train set • Includes feature selection, classifier training, null class rejection, etc • Never seen during training • Assess generalization • Used only once for testing • (otherwise, indirectly optimizing on test set) Cross-validation Fold 1 Fold 2 Fold 3 Fold 4 • 4-fold cross-validation • Assess whether results generalize to independent dataset
    • 65. © Daniel Roggen www.danielroggen.net droggen@gmail.com Validation • Leave-one-out cross-validation: – Train on all samples minus one – Test on the left-out sample • In wearable computing, various goals: – Robustness to multiple users (user-independent) – Robustness to multiple sensor placements (placement-independent) – ... • What to leave out and what it assesses: – Person: user-independent performance – Day, week, ...: time-independent performance (e.g. if the user can change behavior over time) – Sensor placement: sensor-placement-independent performance – Sensor modality: modality-independent performance – ...
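    A sketch of a leave-one-subject-out split for user-independent evaluation: each fold trains on all users except one and tests on the held-out user. Subject identifiers below are toy values.
      import numpy as np

      def leave_one_subject_out(subject_ids):
          """Yield (held-out subject, train indices, test indices) for each fold."""
          subject_ids = np.asarray(subject_ids)
          for subject in np.unique(subject_ids):
              test = np.where(subject_ids == subject)[0]
              train = np.where(subject_ids != subject)[0]
              yield subject, train, test

      subject_ids = ["anna", "anna", "ben", "ben", "carl"]   # one entry per recorded instance
      for subject, train, test in leave_one_subject_out(subject_ids):
          print(subject, train, test)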
    • 66. © Daniel Roggen www.danielroggen.net droggen@gmail.com For further reading ARC • Roggen et al., Wearable Computing: Designing and Sharing Activity-Recognition Systems Across Platforms, IEEE Robotics&Automation Magazine, 2011 Activity recognition • Stiefmeier et al., Wearable Activity Tracking in Car Manufacturing, PCM, 2008 • J. Ward, P. Lukowicz, G. Tröster, and T. Starner, “Activity recognition of assembly tasks using body-worn microphones and accelerometers,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1553–1567, 2006. • L. Bao and S. S. Intille, “Activity recognition from user-annotated acceleration data,” in Pervasive Computing: Proc. of the 2nd Int’l Conference, Apr. 2004, pp. 1–17. • D. Figo, P. C. Diniz, D. R. Ferreira, and J. M. P. Cardoso, “Preprocessing techniques for context recognition from accelerometer data,” Personal and Ubiquitous Computing, vol. 14, no. 7, pp. 645–662, 2010. • Roggen et al., An educational and research kit for activity and context recognition from on-body sensors, Int. Conf. on Body Sensor Networks (BSN), 2010 Classification / Machine learning / Pattern recognition • Duda, Hart, Stork, Pattern Classification, Wiley Interscience, 2000 • Bishop, Pattern Recognition and Machine Learning, Springer, 2007 (http://research.microsoft.com/en-us/um/people/cmbishop/prml/) Performance measures • Villalonga et al., Bringing Quality of Context into Wearable Human Activity Recognition Systems, First International Workshop on Quality of Context (QuaCon), 2009 • Ward et al., Performance metrics for activity recognition, ACM Transactions on Information Systems and Technology, 2(1), 2011
    • 67. © Daniel Roggen www.danielroggen.net droggen@gmail.com
