6. Applications
- Human behaviour analysis
- Human posture recognition
- Video surveillance systems
- Aware house applications
- Virtual reality
- Intelligent user interfaces
- Sport monitoring
14. Previous work – video sensor
2D approaches
  Advantage: low processing time
  Drawback: dependence on the viewpoint
3D approaches
  Advantage: independence from the viewpoint
  Drawback: high processing time
Hybrid approach = 2D approaches + 3D approaches
  Independence from the viewpoint
  Low processing time
29. Silhouette comparison
Classification of 2D methods to represent silhouettes

2D methods            Computation rapidity   Independence from the silhouette quality
Hu moments            ++                     --
Geometric features    ++                     --
H. & V. projections   ++                     -
Skeletonisation       +                      --
Shape from context    ---                    --
Distance transform    +                      ---
44. Real video – own sequences
[Figure: current image and binary image, with the recognised detailed and general postures, not filtered vs filtered]
45. Real video – own sequences
General posture recognition rate (%) for the different silhouette representations with the watershed algorithm

                      Standing   Sitting   Bending   Lying
Geometric features       94        82        77       83
Hu moments               68        73        27       35
Skeletonisation          93        68        82       65
H. & V. projections     100        89        78       93
47. Real video – gait sequence
New posture of interest: the walking posture
78/81 postures correctly recognised
[Graph: recognised postures vs ground-truth postures over time; 2 = standing posture, 3 = walking posture]
48. Real video – gait sequence
162/186 (87%) postures correctly recognised
For the 5 sequences: 711/911 (78%) postures correctly recognised
[Graph: recognised postures vs ground-truth postures over time; 2 = standing posture, 3 = walking posture]
50. Action recognition – the fall
Based on general postures

State                 Min   Max
Standing              3     ∞
Bending or sitting    0     10
Lying                 3     ∞

Recognised falling action: TP = 10, FP = 0, FN = 0
51. Action recognition – the walk
Based on detailed postures

State                              Min   Max
Standing with arms near the body   2     10
Walking                            3     15

Recognised walking action: TP = 62, FP = 0, FN = 3
61. Proposed approach
[Diagram: video stream → people detection → people tracking → posture detector → posture filter → behaviour analysis, supported by a contextual knowledge base; the silhouette, 3D position and identifier link tracking to the posture detector; the filter outputs the recognised posture]
77. Proposed approach
[Diagram: people detection (object segmentation, object classification) → person tracking → posture detector → posture filter → behaviour analysis; the detected silhouette and identifier feed the posture detector, which uses the camera parameters; the filtered posture feeds behaviour analysis]
79. Posture detector – silhouette generation
[Diagram: the 3D posture avatars are placed at the detected 3D position and projected through a virtual camera, built from the camera parameters, by the 3D silhouette generator to produce the generated silhouettes]
The general posture recognition rate (GPRR) allows postures belonging to the same general posture to be confused with each other, whereas the detailed posture recognition rate (DPRR) distinguishes each detailed posture.
In the following we will focus on the rotation steps 36, 45 and 90 degrees. For rotation steps of 36 degrees and above, the representation and comparison times are negligible compared to the generation time. The representation and comparison times are similar for the other representations.
The GPRRs are higher than the DPRRs. A rotation step of 36 degrees gives the best recognition rates. The best recognition rates are obtained with the H. & V. projections. Hu moments give the worst results because of the invariance properties of this representation, in particular the invariance to orientation: a standing posture can, for example, be confused with a lying posture.
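The orientation confusion can be seen directly in the moments themselves. The following minimal sketch (pure numpy, using only the first Hu invariant φ1 = η20 + η02 on toy binary silhouettes; the actual thesis representation uses the full set of seven invariants) shows that a "standing" blob and the same blob rotated 90 degrees into a "lying" pose produce exactly the same value:

```python
import numpy as np

def first_hu_moment(silhouette):
    """First Hu invariant phi1 = eta20 + eta02 of a binary silhouette."""
    ys, xs = np.nonzero(silhouette)
    m00 = len(xs)                      # area (zeroth moment)
    xc, yc = xs.mean(), ys.mean()      # centroid
    mu20 = ((xs - xc) ** 2).sum()      # second-order central moments
    mu02 = ((ys - yc) ** 2).sum()
    # scale-normalised moments: eta_pq = mu_pq / m00^(1 + (p+q)/2)
    eta20 = mu20 / m00 ** 2
    eta02 = mu02 / m00 ** 2
    return eta20 + eta02

# a crude "standing" silhouette: a tall, thin blob
standing = np.zeros((9, 9), dtype=np.uint8)
standing[1:8, 3:6] = 1
# the same blob rotated 90 degrees looks "lying"
lying = np.rot90(standing)

# phi1 is identical for both poses: the orientation is lost
print(first_hu_moment(standing) == first_hu_moment(lying))  # -> True
```

Rotating the silhouette simply swaps μ20 and μ02, so their sum is unchanged, which is exactly why this descriptor cannot separate standing from lying.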
We are also interested in the problem of intermediate postures, i.e. postures between two postures of interest. This example shows the video sequence of a person putting down her left arm. The video contains two postures of interest: standing with one arm up and standing with arms near the body. We hope to recognise the succession of the three postures: standing, one arm up, and standing. The recognitions are displayed on the graphs for each 2D approach, on the left without temporal filtering and on the right with temporal filtering. First, we can note that the H. & V. representation correctly recognises the succession of the three postures even without filtering. Second, we see that the temporal filtering corrects wrong recognitions for the other representations. Moreover, with the Hu moments representation, standing postures are confused with lying postures.
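One simple way to realise such a temporal filter (a sketch only; the thesis filter may use a different rule) is a sliding majority vote over the per-frame posture labels, so that an isolated misrecognised frame is overruled by its neighbours:

```python
from collections import Counter

def temporal_filter(labels, window=5):
    """Smooth a per-frame posture label sequence with a sliding
    majority vote of the given (odd) window size."""
    half = window // 2
    out = []
    for i in range(len(labels)):
        lo, hi = max(0, i - half), min(len(labels), i + half + 1)
        # most frequent label in the window around frame i
        out.append(Counter(labels[lo:hi]).most_common(1)[0][0])
    return out

# standing (S) -> one arm up (U) -> standing, with two spurious
# "lying" (L) frames produced by the raw per-frame recognition
raw = ["S", "S", "L", "S", "U", "U", "U", "S", "L", "S", "S"]
print(temporal_filter(raw))
```

The two spurious "L" frames disappear from the filtered sequence while the standing / one-arm-up / standing succession is preserved.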
We have also used synthetic videos to identify the ambiguous cases. For example, the table shows how the T-shape posture is recognised for a given viewpoint.
This graphical interface is composed of three parts. The filtered postures can then be used for behaviour analysis.
This table gives the general posture recognition rates of the different 2D approaches with the watershed segmentation algorithm. The H. & V. projections give the best recognition rates, followed by the geometric features; recognition is correct, with rates above 80%. We can notice that the Hu moments representation does not work correctly, in particular, as seen previously, because of its invariance to orientation, and also because a hole in the silhouette affects all the terms of the Hu moments. Similar results are obtained with the VSIP segmentation algorithm. In the following we will focus on the H. & V. projections representation.
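The H. & V. projections descriptor itself is cheap to compute: the silhouette is summed along rows and columns, and the resulting profiles are compared against the profiles of the generated reference silhouettes. A minimal numpy sketch (the bin count, resampling and L1 comparison metric here are illustrative assumptions, not the exact choices of the thesis):

```python
import numpy as np

def hv_projection(silhouette, bins=16):
    """Horizontal and vertical projections of a binary silhouette,
    resampled to a fixed number of bins and normalised so the
    descriptor is comparable across silhouette sizes."""
    h = silhouette.sum(axis=1).astype(float)  # one value per row
    v = silhouette.sum(axis=0).astype(float)  # one value per column
    h = np.interp(np.linspace(0, len(h) - 1, bins), np.arange(len(h)), h)
    v = np.interp(np.linspace(0, len(v) - 1, bins), np.arange(len(v)), v)
    d = np.concatenate([h, v])
    return d / d.sum()

def distance(a, b):
    """L1 distance between two projection descriptors."""
    return np.abs(a - b).sum()

# classify an observed silhouette against generated references
standing = np.zeros((12, 12)); standing[1:11, 5:7] = 1   # tall blob
lying = np.zeros((12, 12)); lying[5:7, 1:11] = 1         # wide blob
observed = np.zeros((12, 12)); observed[1:11, 4:7] = 1   # standing-like

refs = {"standing": hv_projection(standing), "lying": hv_projection(lying)}
obs = hv_projection(observed)
print(min(refs, key=lambda k: distance(obs, refs[k])))  # -> standing
```

Unlike the Hu moments, the two projection profiles keep the orientation of the silhouette, so tall (standing) and wide (lying) shapes stay well separated.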
We see here the recognition of the detailed postures. The recognition rates are similar for both segmentations, except for the sitting-on-a-chair posture. The recognition rates are quite good, from 70 up to 80%.
We have also tested our approach on other kinds of video sequences, in particular video sequences used in gait analysis. For this purpose we have introduced a new posture of interest: the walking posture. During recognition we aim to recognise the succession of the standing-with-arms-near-the-body and walking postures. In this video the silhouettes obtained are good, since there is a strong contrast between the person and the background. We can see on the graph that the postures are well recognised and, in particular, that the gait cycles are well detected.
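Once the per-frame postures are available, gait cycles can be read off the label sequence. A minimal sketch under a simplifying assumption (each standing-to-walking transition is counted as the start of a new cycle, using the slide's codes 2 = standing and 3 = walking; the thesis may delimit cycles differently):

```python
def count_cycles(postures, standing=2, walking=3):
    """Count standing->walking transitions in a per-frame posture
    sequence; each transition marks the start of a gait cycle."""
    cycles = 0
    prev = None
    for p in postures:
        if prev == standing and p == walking:
            cycles += 1
        prev = p
    return cycles

# three alternations between standing (2) and walking (3)
seq = [2, 2, 3, 3, 2, 2, 3, 3, 3, 2, 3, 2]
print(count_cycles(seq))  # -> 3
```

Because the counter only fires on the transition itself, noisy repetitions of the same label inside a phase do not inflate the cycle count.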
We have also tested our approach on video sequences acquired for the gait competition. In the video the person walks from right to left and from left to right along a semi-ellipse. Even if the silhouettes are noisy, the postures and the gait cycles are well recognised, except in a few cases. On the different videos we have tested, … postures are correctly recognised out of … total postures.
Our proposed approach has also been tested for action recognition. We focus on self-actions, i.e. actions in which only one person is involved.
The first action we have recognised is the fall, which is an important action for medical purposes; for example, it can be used to help elderly people at home. The fall action is characterised by the transition from a standing posture to a lying one. We can see in the video that the falling action is well recognised: since it is based on the general postures, and since these postures are well recognised, the action is also well recognised.
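The standing-to-lying transition can be modelled as a small state machine over the filtered general postures. The sketch below uses the duration bounds from the slide (at least 3 standing frames, at most 10 intermediate bending-or-sitting frames, at least 3 lying frames); the exact matching logic of the thesis may differ:

```python
def detect_fall(postures, min_stand=3, max_transition=10, min_lying=3):
    """Detect a fall: >= min_stand standing frames, then at most
    max_transition bending/sitting frames, then >= min_lying lying
    frames. Any other posture resets the state machine."""
    STAND, MID, LIE = "standing", "bending/sitting", "lying"
    stand = trans = lie = 0
    for p in postures:
        if p == STAND:
            stand += 1; trans = lie = 0
        elif p == MID and stand >= min_stand:
            trans += 1
            if trans > max_transition:      # too slow: not a fall
                stand = trans = 0
        elif p == LIE and stand >= min_stand and trans <= max_transition:
            lie += 1
            if lie >= min_lying:            # confirmed fall
                return True
        else:
            stand = trans = lie = 0
    return False

fell = detect_fall(["standing"] * 5 + ["bending/sitting"] * 2 + ["lying"] * 4)
print(fell)  # -> True
```

A person who sits down slowly exceeds the transition bound and resets the machine, so only a fast standing-to-lying transition is reported as a fall.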
The second action we have tested is the walking action. The tests are performed on sequences taken from the gait challenge competition. The table shows the number of gait cycles correctly recognised. The action is correctly recognised except in a few cases.
In conclusion, we can say that the properties highlighted with the synthetic data are verified with the real data. In particular, …. The Hu moments are definitely not adapted to our approach. Finally, the processing time …
In conclusion, our approach is able to recognise 9 detailed postures corresponding to 4 general postures. The approach has been successfully tested on different types of silhouettes, and also for self-action recognition. We identified 4 constraints at the beginning of the introduction. Some work remains on automating the approach and on making it run in real time.