MPEG 4: The ultimate low bit rate format (www.chiariglione.org/mpeg/)
Overview: ISO/IEC 14496, Coding of audio-visual objects. Low bit rate multimedia system, typically lower than MPEG-2 video. Object based: each element is coded separately. Open-ended system that can continue to develop in the future. Scalable and interactive. Version 1: October 1998.
Versions: [Figure: Version 1 and Version 2]
Profiles: [Figure: quality plotted against complexity, placing MPEG-1, MPEG-2 and the MPEG-4 Simple Profile and Advanced Simple Profile against uses including mobiles, Video CD, DVD, HDTV and digital cinema]
Current uses: 3G mobile phones, portable devices, PDAs and video iPods; interactive television / IPTV; new interactive multimedia formats; web pages; interactive music formats; security systems.
Basics: Object-based system using natural and/or synthetic objects. Makes use of local processing power to recreate sounds and images. This makes it one of the most efficient compression systems.
Basics: Object Types. Photos: JPEG, GIF, PNG. Video: MPEG-2, DivX, AVI, H.264, QuickTime. Speech: CELP, HVXC, text to speech. Music: AAC, MP3, surround. Synthetic music. Graphics: Java code. Text. Animated objects, e.g. talking heads.
Basics: The selected objects are put together into a 2D or 3D scene. In 3D the viewer can change the shape of the image and view it from other positions in the 3D space, similar to VRML. Each object is compressed using the most suitable method for that type of data.
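The idea of separately coded objects placed into one scene can be pictured as a small data structure. The following is only a minimal sketch: the class names, fields and layer ordering are hypothetical illustrations and are not the MPEG-4 scene description format.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    name: str
    kind: str       # e.g. "video", "photo", "text"
    codec: str      # coding method chosen for this kind of data
    x: float = 0.0  # position in a 2D scene (hypothetical units)
    y: float = 0.0
    layer: int = 0  # higher layers are drawn on top


@dataclass
class Scene:
    objects: list = field(default_factory=list)

    def add(self, obj):
        self.objects.append(obj)

    def draw_order(self):
        # Back-to-front: the lowest layer is drawn first.
        return [o.name for o in sorted(self.objects, key=lambda o: o.layer)]


scene = Scene()
scene.add(MediaObject("presenter", "video", "H.264", x=120, y=40, layer=1))
scene.add(MediaObject("backdrop", "photo", "JPEG", layer=0))
scene.add(MediaObject("caption", "text", "text", x=20, y=200, layer=2))
print(scene.draw_order())
```

Each object keeps its own codec, so a photo, a video and a text caption can each use a different coding method while still sharing one scene.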
Basics: Virtual Studio. Virtual production techniques are increasingly used in TV production. The well-known chroma-key method uses a blue or green screen background, and actors are overlaid onto a ‘virtual studio’ background image. The composition of the screen image and the sound can now take place in the decoder at home.
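Chroma-key composition at the receiving end can be sketched in a few lines. This is an illustrative sketch only: the threshold, the colour values and the function names are assumptions, and a real decoder would work on decoded video frames rather than nested lists.

```python
def is_key_colour(pixel, threshold=60):
    """Treat a pixel as green-screen background when green clearly
    dominates red and blue. The threshold is an assumed value."""
    r, g, b = pixel
    return g - max(r, b) > threshold


def composite(foreground, background):
    """Replace key-coloured foreground pixels with the background pixel."""
    return [
        [bg if is_key_colour(fg) else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


GREEN = (0, 255, 0)       # green screen
ACTOR = (200, 150, 120)   # a pixel belonging to the actor
STUDIO = (30, 30, 80)     # a pixel of the virtual studio image

foreground = [[GREEN, ACTOR], [ACTOR, GREEN]]
background = [[STUDIO, STUDIO], [STUDIO, STUDIO]]
result = composite(foreground, background)
print(result)
```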
 
BIFS: Binary Format for Scenes. A scene description language, like HTML but written in binary rather than in English text. Has scalable levels for audio and video, which can be set by access rights or by interrogating the receiver to set the best ‘Quality of Service’ (QoS).
Synchronised Streaming: Each element can be time stamped to synchronise with other objects in the frame. Flexi Time: the viewer can vary the playback time. The producer sets three durations: minimum, maximum and optimal. Audio can be set to change pitch or stay fixed.
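One way to read the Flexi Time idea is that the viewer's requested playback time is kept within the limits set by the producer. The sketch below assumes exactly that interpretation; the function names and the seconds-based units are hypothetical and are not taken from the standard.

```python
def playback_duration(requested, minimum, optimal, maximum):
    """Return the duration to use for an object, in seconds.

    With no viewer request the optimal duration is used; otherwise the
    request is clamped to the producer's minimum and maximum.
    """
    if not minimum <= optimal <= maximum:
        raise ValueError("expected minimum <= optimal <= maximum")
    if requested is None:
        return optimal
    return max(minimum, min(requested, maximum))


def pitch_factor(duration, optimal, keep_pitch):
    """Pitch multiplier for audio played over `duration` seconds.

    If the pitch is fixed it stays at 1.0; otherwise it follows the
    playback speed (optimal / duration).
    """
    speed = optimal / duration
    return 1.0 if keep_pitch else speed


print(playback_duration(20, minimum=8, optimal=10, maximum=15))
print(pitch_factor(20, optimal=10, keep_pitch=False))
```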
Compression: Speech. HVXC (Harmonic Vector Excitation Coding) and CELP (Code Excited Linear Prediction): 2 - 24 kbit/sec. Synthesised speech: text-to-speech synthesis, 200 - 1200 bit/sec. Very low delay, 20 ms, for video phone use; MP3 takes too long to encode/decode.
Compression: Natural Audio. MPEG AAC (Advanced Audio Coding). MP3, AAC, 5.1 surround. 6 - 380 kbit/sec.
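To give a feel for the bit rates quoted for speech and natural audio, the following sketch converts a bit rate into the amount of data produced per minute. It assumes decimal units (1 kilobyte = 1000 bytes) and ignores any container or transport overhead.

```python
def kilobytes_per_minute(bits_per_second):
    """Data produced in one minute, in kilobytes (1 kB = 1000 bytes)."""
    return bits_per_second * 60 / 8 / 1000


# Rates taken from the speech and natural audio slides.
rates = {
    "text to speech, low": 200,
    "speech codec, low": 2_000,
    "speech codec, high": 24_000,
    "natural audio, low": 6_000,
    "natural audio, high": 380_000,
}

for label, rate in rates.items():
    print(f"{label}: {kilobytes_per_minute(rate)} kB per minute")
```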
Parametric audio coding. Synthesised audio: spectral noise is re-synthesised. This process separates unique audio sounds from predictable noise shapes, which can then be re-synthesised locally. The signal is represented by three objects: transients (localized in time), sinusoids (localized in frequency) and noise (no strict localization).
Parametric audio coding. Transients: castanets.
Parametric audio coding. Sinusoids: harpsichord.
Parametric audio coding. Noise: heavy metal.
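The three-object model (transients, sinusoids, noise) can be illustrated by rebuilding a signal locally from a handful of parameters. This is a toy sketch, not the MPEG-4 parametric coder: the sample rate, the parameter layout and the use of a single-sample click for a transient are all simplifying assumptions.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)


def resynthesise(sinusoids, noise_level, transients, duration, seed=0):
    """Rebuild a signal from parameters instead of stored samples.

    sinusoids:  list of (frequency_hz, amplitude), localized in frequency
    noise_level: amplitude of locally generated noise
    transients: list of (time_s, amplitude) clicks, localized in time
    """
    rng = random.Random(seed)
    n = round(SAMPLE_RATE * duration)
    out = []
    for i in range(n):
        t = i / SAMPLE_RATE
        sample = sum(a * math.sin(2 * math.pi * f * t) for f, a in sinusoids)
        sample += noise_level * rng.uniform(-1.0, 1.0)
        out.append(sample)
    for time_s, amplitude in transients:
        idx = round(time_s * SAMPLE_RATE)
        if 0 <= idx < n:
            out[idx] += amplitude
    return out


signal = resynthesise(
    sinusoids=[(440.0, 0.5), (880.0, 0.25)],
    noise_level=0.05,
    transients=[(0.01, 1.0)],
    duration=0.02,
)
print(len(signal))
```

Only five numbers describe the tones and the click here, while the noise is generated at the receiver, which is the point of sending parameters rather than waveforms.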
Compression: Structured Audio. SAOL: Structured Audio Orchestra Language (pronounced ‘sail’). Downloadable sound fonts. Wavetable synth + GM2 type spec. Any kind of virtual instrument. Virtual effects algorithms and mixers. MIDI data rates, e.g. 300 bit/sec.
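The low data rate of structured audio comes from sending note events and instrument descriptions rather than samples. As a rough illustration, here is a tiny wavetable oscillator driven by a MIDI-style note number. It is written in Python, not SAOL, and the sample rate and table size are assumed values.

```python
import math

SAMPLE_RATE = 8000   # samples per second (assumed)
TABLE_SIZE = 256     # length of the stored single-cycle waveform (assumed)

# One cycle of a sine wave, standing in for a downloaded wavetable.
WAVETABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def midi_to_hz(note):
    """Convert a MIDI note number to frequency (note 69 = A at 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def render_note(note, n_samples):
    """Read through the wavetable at a speed set by the note's pitch."""
    step = midi_to_hz(note) * TABLE_SIZE / SAMPLE_RATE
    phase = 0.0
    out = []
    for _ in range(n_samples):
        out.append(WAVETABLE[int(phase) % TABLE_SIZE])
        phase += step
    return out


samples = render_note(69, 100)
print(len(samples), midi_to_hz(69))
```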
Interactive Audio: Download and remix tracks. Flash interface and compressed audio loops. www.yourspins.com
Compression: Video. Overall, MPEG-4 supports a wide range of standards, from very small, poor-quality pictures up to HDTV: MPEG-1 and MPEG-2; MPEG-4 Part 2; and MPEG-4 Part 10, ‘Advanced Video Coding’ (AVC), also known as H.264, a high-quality video codec developed jointly by MPEG and the ITU.
Compression: Video, H.264. Half to one quarter of the normal bit rate of MPEG-2. Scalable from 3G to HD. More advanced B-frame operation, where the frame can link to any frame in the video sequence. Smaller 4×4 grids of pixels. 4 profiles and 16 levels. Bit rates from 64 kbit/sec to 240 Mbit/sec.
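The "half to one quarter" figure can be turned into a quick estimate. The 6 Mbit/sec MPEG-2 rate used below is an assumed example value, not a figure from the slides, and actual savings depend on the content and the encoder.

```python
def h264_estimate(mpeg2_mbit_per_sec):
    """Return (low, high) H.264 bit rates for similar quality, using the
    rule of thumb of one quarter to one half of the MPEG-2 rate."""
    return mpeg2_mbit_per_sec / 4, mpeg2_mbit_per_sec / 2


low, high = h264_estimate(6.0)  # hypothetical MPEG-2 stream at 6 Mbit/sec
print(f"H.264 estimate: {low} to {high} Mbit/sec")
```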
Compression: Video. Mixture of pixel-based and vector graphics. Video is no longer restricted to a rectangle; it can be any shape. Synthetic images with bit rates from 5 kbit/sec to 10 Mbit/sec. Supports the mapping of video textures onto moving objects and meshes.
2D mesh model of a fish: by deforming the mesh, the fish can be animated.
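Mesh animation can be sketched as moving the vertices while the texture stays attached to them. The example below uses a plain rectangular grid and a travelling sine wave as the deformation; both are illustrative assumptions and not the mesh coding defined in MPEG-4.

```python
import math


def make_grid(cols, rows, spacing=1.0):
    """Rest-pose mesh: a list of (x, y) vertices, row by row."""
    return [(c * spacing, r * spacing) for r in range(rows) for c in range(cols)]


def deform(vertices, time, amplitude=0.2, wavelength=4.0):
    """Displace each vertex vertically with a travelling sine wave.

    Only the vertex positions change; a texture mapped to the vertices
    would be carried along with them.
    """
    deformed = []
    for x, y in vertices:
        offset = amplitude * math.sin(2 * math.pi * (x / wavelength - time))
        deformed.append((x, y + offset))
    return deformed


rest = make_grid(cols=5, rows=2)
frame = deform(rest, time=0.25)
print(frame[:3])
```

Calling `deform` with increasing `time` values produces successive frames of the animation from the same rest-pose mesh.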
Animated Objects: The Animation Framework eXtension (AFX). Face animation: the face models are not part of MPEG-4, only the movement codes for the expressions, eye movement, etc. Body animation works in the same way and can be used in games.
Future Options: MPEG-4 is still being developed, and all new parts will work with the old formats. Studio-quality versions for HDTV. Digital cinema: 45 - 240 Mbit/sec H.264. Home video cameras with MPEG-4 output straight to the web from the hard drive.
Future Options: Integrated Services Digital Broadcasting (ISDB): newspaper + TV + data. Integration with MPEG-7 databases. Games with 3D texture mapping.
Future Options: TV program Making Language (TVML). Computer-generated TV programs and presenters: Max Headroom?
Future Options: Information booths. Talking objects: fridge, cars, toaster? Security cameras over the web. Interactive manuals and training materials. New downloadable interactive music format, SAOL.
MPEG 7
MPEG 7: Multimedia Content Description Interface. Database system to automatically define, organise and search for text, pictures, sound FX, graphics, video clips, songs, music, etc. On-line music library. Automatic identification of music. Uses XML to store metadata.
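Storing descriptions as XML and searching them can be sketched with Python's standard library. The element names used here (`Clip`, `Title`, `Keyword`) are hypothetical and chosen only for illustration; they are not the real MPEG-7 description schema.

```python
import xml.etree.ElementTree as ET


def describe_clip(title, kind, keywords):
    """Build a small XML description for one media item.
    Element and attribute names are made up for this sketch."""
    clip = ET.Element("Clip", attrib={"kind": kind})
    ET.SubElement(clip, "Title").text = title
    for word in keywords:
        ET.SubElement(clip, "Keyword").text = word
    return clip


def matches(clip, word):
    """True if the description carries the given keyword."""
    return any(k.text == word for k in clip.findall("Keyword"))


library = [
    describe_clip("Harbour at dawn", "video", ["sea", "boats"]),
    describe_clip("Door slam", "sound FX", ["door", "impact"]),
]

hits = [c.findtext("Title") for c in library if matches(c, "door")]
print(hits)
print(ET.tostring(library[0], encoding="unicode"))
```

The search works on the descriptions alone, so the media files themselves never need to be opened to answer a query.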
MPEG 7 Proposed uses: Live broadcast monitoring, e.g. radio output. Digital libraries, e.g. image catalogue, musical dictionary, bio-medical imaging, sound FX, film, video and radio archives. Cultural services, e.g. history museums, art galleries, etc.
MPEG 7: Home entertainment, e.g. systems for the management of personal multimedia collections (music, home video), searching a game, karaoke. E-commerce, e.g. personalised advertising, on-line catalogues, directories of e-shops. Education, e.g. repositories of multimedia courses, multimedia search for support material.
MPEG 7: Investigation services, e.g. human characteristics recognition, forensics. Journalism, e.g. searching speeches of a certain politician using their name, voice or face. Multimedia directory services, e.g. Yellow Pages, tourist information, geographical information systems.
MPEG 7: Multimedia editing, e.g. personalised electronic news service, media authoring. Social, e.g. on-line dating services. Surveillance, e.g. traffic control. http://www.eptascape.com/products/demoflv.htm
MPEG 21
MPEG 21: An infrastructure for the delivery and consumption of multimedia content. Users are seen as either creators, consumers, rights holders, content providers or distributors.
MPEG 21: Every media element is defined as a ‘Digital Item’. Metadata defines what media we can use, what we can do with it and who owns it. Designed to work with MPEG-4 files and MPEG-7 databases.
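The Digital Item idea, a media resource bundled with metadata about ownership and permitted use, can be pictured as follows. This is a minimal sketch: the class, the role names and the action names are hypothetical and do not reproduce the MPEG-21 declaration or rights languages.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalItem:
    resource: str   # the media this item refers to
    owner: str      # who owns it
    permitted: dict = field(default_factory=dict)  # role -> set of actions

    def may(self, role, action):
        """True if the metadata grants `action` to users in `role`."""
        return action in self.permitted.get(role, set())


item = DigitalItem(
    resource="concert.mp4",      # hypothetical file name
    owner="Example Records",     # hypothetical rights holder
    permitted={
        "consumer": {"play"},
        "distributor": {"play", "copy"},
    },
)

print(item.may("consumer", "play"))
print(item.may("consumer", "copy"))
```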
