COMLAB
Multimedia Arts & Technologies

Patrizio CAMPISI, Marco CARLI, Emanuele MAIORANA, Federica BATTISTI,
Anna Maria VEGNI, Veronica PALMA, Marco LEO, Mauro UGOLINI,
Marina SALATINO, Elena MAMMI, Paolo SITA’, Luca COSTANTINI, Daria LA ROCCA

MULTIMEDIA INFORMATION PROCESSING IN SMART ENVIRONMENTS

Alessandro Neri

Engineering Department
University of “Roma Tre”,
Via della Vasca Navale 84, 00146 Roma, Italy
neri@uniroma3.it
Outline

•   Introduction

•   Smart Environments

•   Feature Extraction

•   Object recognition

•   Distributed Video coding for multiple sources

•   New Imaging Techniques

•   Conclusions
SMART ENVIRONMENT
A set of technologies based on a tight integration of
• sensing devices,
• distributed processing systems,
• communication technologies,
which gives rise to environments (home, office, etc.) whose services
adapt to ambient conditions and which, being able to react
appropriately to the presence of people, can produce stimuli and
interact with them proactively, that is, by anticipating their wishes
without conscious mediation, in order to improve quality of life.

INFORMATION PROCESSING CHAIN

Filtering & Denoising → Parameter estimation → Feature extraction → Semantic Analysis
Image Analysis

•       Need for
    –      an efficient and parsimonious representation of the various relevant
           components of a natural scene, such as edges and textures (not
           achievable by means of a unique, non-redundant system).
•       Approach
    –      Adaptation of the basis to the local image contents, by selecting the
           elements from a highly redundant set (waveform dictionary)
•       Critical elements
    –      dictionary setup
    –      construction of the best local representation (Minimum Description
           Length).
•       Objective
    –      a local expansion efficiently approximated by a few waveforms based on
           specific patterns of visual relevance (edges, lines, crosses, etc.) whose
           scale, position and orientation can be varied in a parametric way
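
The idea of approximating a local signal with a few waveforms picked from a redundant dictionary can be illustrated with greedy matching pursuit. This is a generic technique swapped in for illustration only: the slides mention a Minimum Description Length criterion, which is not reproduced here. The toy dictionary and the names `matching_pursuit`, `atoms` and `n_terms` are assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def matching_pursuit(signal, atoms, n_terms):
    """Greedy sparse approximation over unit-norm atoms.

    Returns ([(atom_index, coefficient), ...], residual).
    """
    residual = list(signal)
    chosen = []
    for _ in range(n_terms):
        # Pick the atom most correlated with what is still unexplained.
        scores = [dot(residual, a) for a in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeff = scores[best]
        chosen.append((best, coeff))
        residual = [r - coeff * a for r, a in zip(residual, atoms[best])]
    return chosen, residual

# Hypothetical redundant dictionary for 8-sample signals:
# 8 spikes, one step edge (index 8) and one constant (index 9).
n = 8
atoms = [normalize([1.0 if i == k else 0.0 for i in range(n)]) for k in range(n)]
atoms.append(normalize([0.0] * 4 + [1.0] * 4))
atoms.append(normalize([1.0] * n))

signal = [0, 0, 0, 0, 3, 3, 3, 3]
terms, residual = matching_pursuit(signal, atoms, n_terms=1)
```

With this dictionary a single step-edge atom explains the whole signal, whereas a non-redundant spike basis would need four terms.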
Gauss-Laguerre Wavelets

[Figure: real and imaginary parts of the filters (functions of r and θ) for n = 1, 2, 3, 4 with k = 0; below, a test image and the corresponding response maps (scale 0.0 to 1.0) labelled Edges, Lines, Y-crosses, X-crosses]
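
As a rough sketch of what a k = 0 filter of angular order n looks like, the block below samples the function r^n · exp(-r²) · exp(j n θ) on a square grid. Normalization constants are deliberately omitted, and the function name `laguerre_gauss_k0` and the `size` and `sigma` parameters are assumptions, not taken from the slides.

```python
import cmath
import math

def laguerre_gauss_k0(n, size, sigma):
    """Un-normalized order-n, k=0 filter sampled on a size x size grid.

    The radius r is measured in units of sigma; theta is the polar angle
    in image coordinates (rows grow downwards).
    """
    c = size // 2
    kernel = []
    for row in range(size):
        line = []
        for col in range(size):
            x = (col - c) / sigma
            y = (row - c) / sigma
            r = math.hypot(x, y)
            theta = math.atan2(y, x)
            radial = (r ** n) * math.exp(-r * r)
            line.append(radial * cmath.exp(1j * n * theta))
        kernel.append(line)
    return kernel

k1 = laguerre_gauss_k0(1, 9, 2.0)  # order 1
k2 = laguerre_gauss_k0(2, 9, 2.0)  # order 2
```

Rotating the sampling point by 90° multiplies the filter value by j^n, which is the angular selectivity that makes different orders respond to different local patterns.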
Surround Inhibition

[Figure panels: input image; desired output; Canny edge detector output]

•   Natural images may contain both texture and noise
•   Local luminance changes: strong on texture, weak on contours
•   Task: suppression of edges due to noise only
•   The Human Visual System (HVS) easily discriminates between texture, noise and
    contours
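
A minimal sketch of the surround-inhibition idea, assuming a simplified rule: each gradient-magnitude sample is reduced by a fraction `alpha` of the mean gradient magnitude in a ring around it. The uniform ring weighting and the names `surround_inhibition`, `inner`, `outer` and `alpha` are illustrative assumptions, not the operator used in the slides.

```python
def surround_inhibition(grad, inner, outer, alpha):
    """Suppress responses that are surrounded by similar responses.

    grad  : 2D list of gradient magnitudes
    inner : ring inner radius (pixels, exclusive)
    outer : ring outer radius (pixels, inclusive)
    alpha : inhibition strength
    """
    h, w = len(grad), len(grad[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-outer, outer + 1):
                for dx in range(-outer, outer + 1):
                    d2 = dx * dx + dy * dy
                    yy, xx = y + dy, x + dx
                    in_ring = inner * inner < d2 <= outer * outer
                    if in_ring and 0 <= yy < h and 0 <= xx < w:
                        total += grad[yy][xx]
                        count += 1
            surround = total / count if count else 0.0
            # Half-wave rectification keeps the response non-negative.
            out[y][x] = max(0.0, grad[y][x] - alpha * surround)
    return out

# Dense "texture": every pixel responds, so everything is inhibited.
texture = [[1.0] * 9 for _ in range(9)]
texture_out = surround_inhibition(texture, 1, 3, 1.0)

# Isolated vertical contour: few responding neighbours, so it survives.
contour = [[1.0 if x == 4 else 0.0 for x in range(9)] for _ in range(9)]
contour_out = surround_inhibition(contour, 1, 3, 1.0)
```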
Multiscale Contour Detector

[Figure: output of the Canny edge detector for different scales; annotations: destroyed junction, restored]

• Morphological dilation
• Superposition and logic AND

Fine scale (small σ)
   Texture residuals
   Well detailed contours
   Preserved Junctions

Coarse scale (large σ)
   Texture residuals
   Well detailed contours
   Preserved Junctions
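
One plausible reading of "morphological dilation, superposition and logic AND" is sketched below: the coarse-scale edge map is dilated into a tolerance band, and fine-scale edge pixels are kept only where they fall inside that band. The order of operations, the square structuring element and the names `dilate` and `combine_scales` are assumptions.

```python
def dilate(mask, radius):
    """Binary dilation with a (2*radius+1) square structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = True
    return out

def combine_scales(fine, coarse, radius=1):
    """Keep fine-scale edges that lie near a coarse-scale edge."""
    support = dilate(coarse, radius)
    return [[bool(f) and s for f, s in zip(fine_row, support_row)]
            for fine_row, support_row in zip(fine, support)]

T, F = True, False
# Fine scale: a well-localized contour on row 2 plus a texture residual at (0, 0).
fine = [[T, F, F, F, F],
        [F, F, F, F, F],
        [T, T, T, T, T],
        [F, F, F, F, F],
        [F, F, F, F, F]]
# Coarse scale: no texture, but the contour is displaced and broken at column 2.
coarse = [[F, F, F, F, F],
          [F, F, F, F, F],
          [F, F, F, F, F],
          [T, T, F, T, T],
          [F, F, F, F, F]]
result = combine_scales(fine, coarse, radius=1)
```

In this toy case the texture residual disappears because the coarse scale does not confirm it, while the fine-scale contour is kept in full, including the pixel where the coarse map was broken.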
Numerical results

[Figure panels: noisy input image (SNR = 13 dB); proposed approach; Canny; CARTOON]
Results and Comparison

[Figure panels: noisy input image (SNR = 13 dB); proposed approach; Canny; surround inhibition; CARTOON]
Object Recognition - Video Browsing

[Diagram: image retrieval system]
Image storing: image → features extraction → Image DB and Features DB
Query: query image submission → features extraction → similarity measurement (against the Features DB) → ranked image collection
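
The query path of the diagram (features extraction, similarity measurement, ranked collection) can be sketched as follows. The feature vectors, the Euclidean distance and the names `rank_images` and `features_db` are placeholders chosen for illustration; the slides do not specify the similarity measure.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_images(query_features, features_db):
    """Return image ids ordered from most to least similar to the query."""
    return sorted(
        features_db,
        key=lambda image_id: euclidean(query_features, features_db[image_id]),
    )

# Hypothetical feature database: image id -> feature vector.
features_db = {
    "beach": [0.9, 0.1, 0.0],
    "forest": [0.1, 0.8, 0.1],
    "city": [0.4, 0.4, 0.2],
}

ranking = rank_images([0.8, 0.2, 0.0], features_db)
```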
Multiview Analysis
Key point extraction
Key point matching (invariant with respect to scale, rotation, and perspective changes)

[Figure: scale-space volume with axes x, y and log2 σ]

L. Sorgi, A. Neri. Keypoints Selection in the Gauss Laguerre Transformed Domain - BMVC06
KEYPOINTS SELECTION: SYSTEM OUTLINE

Pre-processing (smoothing and color conversion)
→ Scalogram building
→ Scalogram inspection (keypoints scale-space location)
→ Descriptors construction
→ Descriptors normalization (keypoints descriptors)
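
The "scalogram inspection" step locates keypoints in scale-space. A common way to do this, shown here purely as an assumption about what such an inspection could look like, is to keep the samples that exceed a threshold and are strict local maxima among their 26 neighbours in (scale, y, x). The name `scale_space_maxima` and the `threshold` parameter are hypothetical.

```python
def scale_space_maxima(scalogram, threshold):
    """Find strict local maxima of scalogram[s][y][x] (interior points only)."""
    n_scales = len(scalogram)
    height = len(scalogram[0])
    width = len(scalogram[0][0])
    offsets = [(ds, dy, dx)
               for ds in (-1, 0, 1) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (ds, dy, dx) != (0, 0, 0)]
    keypoints = []
    for s in range(1, n_scales - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                value = scalogram[s][y][x]
                if value < threshold:
                    continue
                if all(value > scalogram[s + ds][y + dy][x + dx]
                       for ds, dy, dx in offsets):
                    keypoints.append((s, y, x))
    return keypoints

# Toy scalogram: 3 scales of 5x5 responses with a single peak.
scalogram = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
scalogram[1][2][3] = 1.0
keypoints = scale_space_maxima(scalogram, threshold=0.5)
```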
Image features
• 2D Patterns: based on Zernike polynomials expansion.

[Figure: pattern f(x) with reference point x0 and indices i, j]

• Texture: histograms of Laguerre-Gauss local expansions
• Edge: relative phase of Laguerre-Gauss expansions
Position, orientation, and scale estimation

• Extensive retrieval experiments, based on quadtree decomposition
  combined with Gauss-Laguerre circular harmonic functions (CHFs) as well
  as on Zernike CHFs, have been performed on the Corel-1000-A database.

• The average fraction of relevant images retrieved is greater than
  0.96, while the other methods reach at most 0.87 (global search).
Distributed Video Coding
Experimental results
“Breakdancer” multiview sequence.
Source: Veronica Palma, PhD Thesis

[Plot: PSNR (dB, axis from 30 to 50) versus bit rate (Kbit/s, ticks at 80, 200, 300, 800) for MDVC_Zernike, H.264/AVC, and Encoder driven fusion [1]]

[1] M. Ouaret, F. Dufaux, and T. Ebrahimi. “Multiview distributed video coding with encoder driven fusion”. In EUSIPCO Proceedings, 2007.

[2] M. Ouaret, F. Dufaux, and T. Ebrahimi. “Recent advances in multi-view distributed video coding”. In SPIE Mobile Multimedia/Image Processing for Military and Security Applications, April 2007.
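
For reference, the PSNR plotted on the vertical axis is conventionally computed from the mean squared error between the original and the decoded frame. The sketch below assumes 8-bit samples (peak value 255) and frames given as 2D lists; the function name `psnr` is illustrative.

```python
import math

def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0.0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for r, d in zip(ref_row, dec_row):
            squared_error += (r - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)

reference = [[100, 100], [100, 100]]
decoded = [[101, 99], [100, 100]]
value = psnr(reference, decoded)  # MSE = 0.5
```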
Experimental Results
Objective Video Quality Assessment
Plenoptic cameras

• Measurement and coding of the intensity of the field received from a
  given direction (at a given wavelength)
PLENOPTIC CAMERA
Single exposure. Different processing.
Plenoptic processing
» Thank you for your attention
Extraction and interpretation of social interactions
