Privacy Protection in Video Surveillance Systems using Scalable Video Coding



Paper presented at AVSS 2009 in Genova, Italy.


Hosik Sohn (1), Esla T. AnzaKu (1), Wesley De Neve (1), Yong Man Ro (1), and Konstantinos N. Plataniotis (2)
(1) Image and Video Systems Lab, Department of Electrical Engineering, KAIST, Daejeon, Korea
(2) Multimedia Lab, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada

I. INTRODUCTION

SVC can be used to fulfill the requirement of omnipresence in surveillance systems. In this paper, we propose a privacy-protected video surveillance system that makes use of Scalable Video Coding (SVC). To address privacy concerns, face regions are detected and subsequently scrambled in the compressed domain.

II. PROPOSED VIDEO SURVEILLANCE SYSTEM

Figure 1. Proposed video surveillance system (surveillance camera; video analyzer & SVC encoder; storage; transmitter; clients: PC, PDA, cellular phone)

1. Moving ROIs in SVC

ROI representation
ROI: privacy-sensitive area (i.e., a face region)
FMO type 2: allows defining rectangular ROIs (foreground / background)

Figure 2. Illustration of FMO type 2 (slice group 0, slice group 1)

Updating ROIs in a coded bit stream
In order to update the location of an ROI in a coded video bit stream, additional PPS NAL units need to be inserted. A PPS (Picture Parameter Set) contains the parameters that define an ROI, such as the slice group map type, the slice group IDs, and the top-left and bottom-right addresses of the slice groups.

Figure 3. SVC bit stream structure (supplemental enhancement information and parameter set NALUs: SEI, SPS, PPS; Video Coding Layer (VCL) NALUs: IDR, non-IDR)
Figure 4. Insertion of PPS NAL units

2. Privacy protection techniques

ROI scrambling
In the proposed system, random sign inversion is applied to non-zero DC and AC transform coefficients of the luma and chroma components.

Figure 5. Simulation of (a) random sign inversion for luma AC coefficients at 4CIF resolution; (b) an error concealment attack
Figure 6. Scrambling in SVC: (a) encoder; (b) decoder (encoder: surveillance video, enhancement layer coding, T/Q, ROI scrambling with secret key and ROI coordinates, CAVLC, compressed bitstream; decoder: extracted bitstream, CAVLC-1, ROI unscrambling with secret key, Q-1/T-1, MC, surveillance video)

Layer key generation for scalable video
Let S be a secret key. Then, [equation not preserved] where H represents a cryptographic hash function and where "||" represents the concatenation operator. The parameter k either denotes spatial scalability (k=1) or temporal scalability (k=2).
A sub-key is computed as, [equation not preserved]
A layer key for the dth spatial and the ith temporal layer is defined as follows:

K(d,i) = Sd,1||Si,2

Figure 7. Use of layer keys and sub-keys

                  Temporal layer 0       Temporal layer 1
Spatial layer 1   K(1,0) = S1,1||S0,2    K(1,1) = S1,1||S1,2
Spatial layer 0   K(0,0) = S0,1||S0,2    K(0,1) = S0,1||S1,2

III. EXPERIMENTS

The video sequence used is "moinas_toni". This video sequence is designed for single person/face visual detection and tracking. Encoding was done using two spatial layers {QVGA and VGA} and two temporal layers {12.5 fps and 25 fps}. At VGA resolution, the average size of the detected face region is 78 × 79 pixels (over 200 frames).

1. Experimental results

1) The percentage of increased bits in the ROI for a varying picture quality at 25 fps

Table 1. Percentage of increased bits in the ROI

Size / QP    25      28      31      34      37      40
VGA         -0.36   -1.89   -0.07   -0.08   -1.42   -3.42
QVGA         0.77    0.77   -0.04   -2.09   -3.88    0.04

2) Privacy-protected surveillance video sequence

Figure 8. Privacy-protected surveillance video sequence at (a) QVGA and (b) VGA resolution

3) Simulation of error concealment attack

For faces with a high spatial resolution, scrambling only the AC transform coefficients in the luma domain cannot sufficiently distort the silhouette of the face.

Figure 9. Simulation of (a) the proposed scrambling technique; (b) an error concealment attack

4) Level of protection against brute force attack

Figure 10. The average number of CAVLC levels in the ROI according to the bit rate at QVGA and VGA resolution

IV. CONCLUSIONS

We proposed a privacy-protected video surveillance system that makes use of the SVC standard. For the purpose of privacy protection, face regions were detected and subsequently encoded as an ROI using FMO type 2. ROIs are scrambled using random sign inversion, having a negligible impact on the coding performance of CAVLC entropy coding. Further, for each layer in an SVC bit stream, the ROI is scrambled using a different key in order to ensure a high level of protection. Finally, the proposed video surveillance system shows reasonable robustness against brute-force attacks.
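
The random sign inversion used for ROI scrambling can be illustrated with a short Python sketch. It is not the authors' encoder integration: the function name scramble_signs, the SHA-256-seeded random.Random key stream, and the flat list of quantized coefficients are all assumptions made for this example. Zero coefficients are skipped and magnitudes never change, so applying the function a second time with the same key restores the original values.

```python
import hashlib
import random


def scramble_signs(coeffs, key):
    """Flip the sign of non-zero coefficients under a keyed pseudo-random bit stream.

    Assumption: a PRNG seeded from SHA-256(key) stands in for whatever key
    stream generator a real system would use. random.Random is NOT
    cryptographically secure; it only keeps the sketch dependency-free.
    """
    seed = int.from_bytes(hashlib.sha256(key).digest(), "big")
    rng = random.Random(seed)
    out = []
    for c in coeffs:
        # A key bit is consumed only for non-zero coefficients, so the
        # scrambler and the unscrambler stay in step with each other.
        if c != 0 and rng.getrandbits(1):
            out.append(-c)
        else:
            out.append(c)
    return out


# Hypothetical quantized coefficients of one block inside the ROI.
block = [12, 0, -3, 0, 5, 1, 0, -7]
scrambled = scramble_signs(block, b"layer-key")
restored = scramble_signs(scrambled, b"layer-key")
```

Only signs change, so the number and magnitude of the coefficient levels handed to the entropy coder stay the same.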
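
A rectangular ROI under FMO type 2 is described by a top-left and a bottom-right macroblock address, as carried in the PPS. The sketch below builds the resulting macroblock-to-slice-group map for a single ROI. The function name fmo_type2_map is hypothetical, and the numbering (ROI as slice group 0, left-over background as slice group 1) follows the H.264 convention for this map type rather than anything stated on the poster.

```python
def fmo_type2_map(width_mbs, height_mbs, top_left, bottom_right):
    """Return a raster-scan macroblock-to-slice-group map with one rectangular ROI.

    top_left and bottom_right are macroblock addresses in raster-scan order.
    Assumption: ROI -> slice group 0, background -> slice group 1.
    """
    roi_group, background_group = 0, 1
    x0, y0 = top_left % width_mbs, top_left // width_mbs
    x1, y1 = bottom_right % width_mbs, bottom_right // width_mbs
    mb_map = [background_group] * (width_mbs * height_mbs)
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            mb_map[y * width_mbs + x] = roi_group
    return mb_map


# QVGA (320x240) is 20x15 macroblocks of 16x16 pixels.
# Hypothetical face ROI covering macroblock columns 3..6 and rows 2..4.
qvga_map = fmo_type2_map(20, 15, top_left=2 * 20 + 3, bottom_right=4 * 20 + 6)
```

Moving the ROI between pictures then amounts to sending a new PPS with updated top-left and bottom-right addresses.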
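
The layer keys shown in Figure 7 each concatenate one spatial sub-key with one temporal sub-key, for example K(1,0) = S1,1||S0,2. The sketch below reproduces that composition. The sub-key derivation itself is not recoverable from this transcript, so sub_key uses a purely hypothetical rule, SHA-256 over the secret followed by k and the layer index, only to make the example runnable. It should not be read as the authors' formula.

```python
import hashlib


def sub_key(secret, k, layer):
    """Hypothetical sub-key S(layer, k); k=1 is spatial, k=2 is temporal.

    Assumption: H(secret || k || layer) with H = SHA-256. The poster's
    actual derivation is not preserved in the source.
    """
    return hashlib.sha256(secret + bytes([k, layer])).digest()


def layer_key(secret, d, i):
    """K(d, i) = S(d, 1) || S(i, 2) for spatial layer d and temporal layer i."""
    return sub_key(secret, 1, d) + sub_key(secret, 2, i)


secret = b"example secret S"
keys = {(d, i): layer_key(secret, d, i) for d in (0, 1) for i in (0, 1)}
```

With two spatial and two temporal layers this yields four distinct layer keys, and layers that share a spatial or temporal index share the corresponding half of the key.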