User-Interface Components [slide diagram labels: MPEG-7 Ontology, Domain-specific Ontology, XML (our application profile), MPEG-7 RDF, Web Server, Annotea Server, RDF Store, ABC]
Conducting Master Class: Michael Tilson Thomas, Musical Director of the New World Symphony (Miami), provides long-distance instruction to conducting student Donato Cabrera (New York). (arts.internet2.edu)
The ability to search, browse, discuss and edit video collaboratively and in real time during group discussions is of great interest to the educational sector (e.g. film & media students), to audiovisual archives and libraries, to e-science (e.g. within the biomedical field), and to the film & media industry.
As mentioned before, GrangeNet provides multi-gigabit connections between Access Grid Nodes in Australia. The backbone is also connected to the AARNet international network at Sydney, providing access to the global research and education networks. Access Grid Nodes are high-end video conferencing facilities, equipped with large-format displays, cameras, microphones, echo cancellation, etc. They are used for large-scale distributed meetings, collaborative work sessions, seminars, lectures, tutorials, and training. Access Grid Nodes are located at various sites in Australia, e.g. at UQ and QUT in Brisbane or at UTS in Sydney. Incidentally, the UTS node is having its official opening today, as we speak.
This slide illustrates the overall system architecture, assuming deployment within an educational context. The scenario is a live discussion between students and lecturers from film/media departments, communicating with curators, archivists or film/media analysts from audiovisual archives and the creative industries in Australia. All of the participants of this hypothetical online videoconference are sharing an application through an Application Server on GrangeNet. They are able to retrieve an MPEG-2 video from one of the Streaming Servers located at the custodial organization, which they can then collaboratively index, browse, and annotate in real time. This architecture also reflects our assumption of the two separate metadata stores mentioned earlier: <CLICK> One store is for the search and retrieval of video content from the servers based on objective indexing and description data. We assume that this will be provided and maintained by the custodial organization. <CLICK> A separate metadata store is for logging the subjective annotation and discussion data, which could even be threaded questions and answers, managed by the Annotation Server.
The challenges we faced, and will face in the future, are: MPEG-2 indexing, search, browse and retrieval; creating a metadata application profile in XML combining Dublin Core and MPEG-7, including automatic shot detection; and logging the subjective annotations by using and extending the Annotea annotation protocol for video. Annotea is a W3C protocol for annotating web pages, which uses RDF over HTTP. Regarding content, we hope to be able to use audiovisual archives that are or will be connected to GrangeNet (e.g. ACRA (Australian Creative Resources Archives), Screensound Australia, ACMI); storage would be on their side. We haven't got to rights clearance and management yet, but we will be using an emerging standard, the MPEG-21 Multimedia Framework. Also further down the road is the development of business models and billing systems, e.g. subscriptions, annual fees, etc.
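To make the Annotea extension concrete, here is an illustrative sketch of what an Annotea-style annotation for a video segment might look like. The Annotea namespace and its core properties (annotates, body, context, created) are real; the time-coded context value, the server URL, and the resource names are our own assumptions for this sketch, since Annotea as specified targets web pages rather than video.

```python
# Sketch only: an Annotea-style annotation serialized as RDF/XML.
# The a: namespace and property names come from the W3C Annotea design;
# the video URL, time-coded context, and body URL are illustrative.

ANNOTATION = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:a="http://www.w3.org/2000/10/annotation-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <a:Annotation rdf:about="">
    <a:annotates rdf:resource="rtp://streaming.example.org/film.mpg"/>
    <a:context>t=00:12:40-00:13:05</a:context>
    <dc:creator>Student A</dc:creator>
    <a:created>2004-01-20T10:30:00Z</a:created>
    <a:body rdf:resource="http://annotea.example.org/bodies/42"/>
  </a:Annotation>
</rdf:RDF>
"""

# Annotea runs over plain HTTP: the client would POST this document to
# the annotation server and retrieve annotations back via a GET query
# (not executed in this sketch).
```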
We started off implementing a standalone prototype with the following three key components of the user interface: The Video Player displays the video content being streamed and provides the usual video controls like stop, play and pause. The Description & Indexing component enables the objective and authorized hierarchical segmentation and description of the content using controlled vocabularies; it also integrates search, browsing and retrieval functionalities. The Annotation & Discussion component enables the input, logging, search and retrieval of shared annotations, which can be plain text, hyperlinks, or even audiovisual data captured during the videoconference.
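The data model behind these components can be sketched as follows. This is our own minimal illustration, not the prototype's actual classes: each video is segmented hierarchically, and each segment carries objective controlled-vocabulary descriptions alongside a thread of subjective annotations.

```python
# Minimal data-model sketch (class and field names are illustrative).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Annotation:
    author: str
    text: str            # plain text, a hyperlink, or a pointer to AV data

@dataclass
class Segment:
    label: str
    start: float         # seconds into the stream
    end: float
    description: dict = field(default_factory=dict)   # controlled-vocabulary terms
    annotations: List[Annotation] = field(default_factory=list)
    children: List["Segment"] = field(default_factory=list)  # hierarchical segmentation

# A scene containing one shot, with a shared annotation attached:
scene = Segment("Scene 1", 0.0, 95.0, {"genre": "documentary"})
scene.children.append(Segment("Shot 1.1", 0.0, 12.5))
scene.annotations.append(Annotation("Student A", "Note the lighting here."))
```

The tree and list views of the interface would then simply be two renderings of this one segment hierarchy.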
Here we can see a screenshot of the prototype we've implemented so far: a tree and list view to browse through the video segments; the interface to input the descriptions; the actual video screen, which also serves as a drawing board when the video is paused, so that users can annotate specific regions; and the interface to search, filter and log the annotations and discussion threads.
After implementing this standalone prototype, the next step was to make it collaborative. The simplest approach would have been to use the application-sharing protocol T.120, probably better known from implementations like NetMeeting and Microsoft Messenger (VNC takes a similar screen-sharing approach with its own protocol). The main advantage is that it provides a single framework to share any application. It is based on capturing and transferring screen display changes, as well as mouse and keyboard events, within one application. However, this approach is unsuitable for applications that involve high-quality video, as it was tailored for low bandwidth. Besides, it doesn't make any sense to recapture digital video off the screen if you already have the digital format. The other major disadvantages are the restrictions and the inflexibility that we would have to deal with. Consequently we had to develop our own application-sharing environment from scratch using .NET Remoting.
Since I am running out of time, I will give you only a brief example of how .NET Remoting combined with MPEG-2 RTP streaming allows shared viewing and control of video content. In this scenario the client master, who holds the token and is in control, presses the pause button in the player. The event is caught by a mediator that triggers an event through a remote coordinator object that sits on the server. The transmitter subscribes to this event and causes the RTP multicast stream, which each receiver consumes, to be paused. In the same fashion we handle other button or mouse events, except that they trigger an event on the server to which each client is subscribed and which the clients handle directly and simultaneously. Eventually every event is mirrored at each location.
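The event flow above is essentially remote publish/subscribe. As a language-agnostic illustration (the real system uses .NET Remoting; the class names here are our own), the pattern can be sketched locally like this:

```python
# Sketch of the event mirroring described above: a server-side
# coordinator fans each UI event out to the transmitter and all clients.

class Coordinator:
    """Stands in for the remote coordinator object on the server."""
    def __init__(self):
        self.subscribers = []          # transmitter + every client player

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def raise_event(self, event):
        for cb in self.subscribers:    # event is mirrored at each location
            cb(event)

class Player:
    """Stands in for one client's video player."""
    def __init__(self, name, coordinator):
        self.name, self.state = name, "playing"
        coordinator.subscribe(self.on_event)

    def on_event(self, event):
        if event == "pause":
            self.state = "paused"      # each client handles it directly

coord = Coordinator()
transmitter_log = []
coord.subscribe(transmitter_log.append)          # would pause the RTP multicast
master = Player("master", coord)
peer = Player("peer", coord)

# The token holder presses Pause; the mediator raises it on the server:
coord.raise_event("pause")
```

In the real system the coordinator lives on the Application Server and the callbacks cross the network, but the fan-out logic is the same.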
The benefit of this approach is not only the excellent video quality through the MPEG-2 streaming, but also the flexibility to allow multiple users to be in control at the same time, using colour-coded mouse pointers for each client. This makes the application similar to a multi-player game, where everyone can act at once and affect the others simultaneously. This may sound chaotic within our domain and may not work for all groups, as it requires very disciplined usage, but exploring this feature is part of our usability studies.
Eventually the collaborative FilmEd application will be used in conjunction with the video conferencing tools provided by the AG node. Participants can see and hear each other while using the application together simultaneously, and log whatever information they feel is relevant.
In the future we will include audiovisual annotation, combined with speech detection algorithms to facilitate search. We also want to be able to annotate other media types, e.g. images, web pages, text and audio, and to add rights management and access controls on annotations using MPEG-21. And finally we'll look at extending the system to be able to edit documents collaboratively.
If you want to follow up on our project you can visit our website. Thank you very much for listening.