#SystemX14
Multimedia and Multilingual Analysis for Open Sources
Demonstration of MediaCentric Solution
Olivier Mesnard and Nabil Bouzerna
Analysis of Open Sources
Project Objectives
• Develop an experimental platform for the analysis of unstructured content (text, audio/video)
• Integrate the software components provided by project partners for the main sectors of the natural language processing and text mining industry (transcription, translation, information extraction, search...)
• Develop innovative application prototypes for information retrieval and monitoring of open sources, based on those components
• Address user needs not covered by existing applications
• Extension to social network data
• Extension to multilingual data
• Reduce the cost and time needed to adapt processing to a new domain or a new language
Work to be done
• At software component level
    • Improve the robustness of processing on noisy data (amateur video, blogs, NFWC…)
    • Integrate new languages
    • Define a low-cost process (in time and resources) for developing linguistic resources, based on learning from targeted corpora
• At application level
    • Interconnect components; share a metadata repository or indexes
• Deployment and infrastructure
    • Anticipate scaling (from a few hundred thousand documents to hundreds of millions of documents) and establish an architecture and deployment strategy that facilitates it
• Immediacy
    • Process data within time constraints (index refresh strategy)
• User interface and interaction
    • Innovate in visualization and interactivity (also to address the need for scalability)
Demonstration of MediaCentric Solution

Future@SystemX - Nabil Bouzerna - Experiment IMM Project
