DAEDALUS Content Management

    © DAEDALUS-Data, Decisions and Language, S.A.


    DAEDALUS Content Management - Presentation Transcript

    1. DAEDALUS: Solutions for Content Management. Making it easier to exploit multimedia and multilingual content
    2. Index: (1) About us; (2) STILUS Corrector; (3) Search engines; (4) Semantic web; (5) Other solutions for the media industry; (6) Multimedia scenarios; (7) Product licensing; (8) Other areas of activity
    3. Index: (1) About us; (2) STILUS Corrector; (3) Search engines; (4) Semantic web; (5) Other solutions for the media industry; (6) Multimedia scenarios; (7) Product licensing; (8) Other areas of activity
    4. About us
      • Since 1998 we have offered solutions, products and services for the information society
      • Limited liability company owned by private Spanish investors
      • Our main lines of activity cover these areas:
        • Linguistic Technology
        • Web Technology
        • Business Intelligence
      • Strong research and development component
      • Mission: innovation
    5. Main technology competences
      • Language technology:
        • Spell checkers, multilingualism, information extraction, speech recognition, etc.
      • Web Technology:
        • Specialized search engines (text/image/video/audio)
        • Website visitor analysis
        • Semantic web, ontologies, service-oriented architectures (SOA)
      • Business Intelligence:
        • Data mining, business rules, prediction, modelling, simulation, optimization, knowledge management
    6. Main clients
      • DAEDALUS supplies specialized solutions for large clients, essentially based on its own technology: Yell, Grupo PRISA, Grupo Unidad Editorial, Grupo Telefónica, Iberdrola, SGAE, Grupo SM, Amper, Instituto Cervantes...
    7. Partners
      • Business partners:
      • Developers/members of:
    8. DAEDALUS: clients and partners
      • Reference clients:
        • Public Administration:
          • Instituto Cervantes, Spanish Tax Agency
        • Media, publishing industry:
          • Grupo PRISA, Unidad Editorial, Grupo SM
        • Information services:
          • YELL Publicidad (Yellow Pages, 11888)
        • Other industries:
          • Telecommunications: Grupo Telefónica
          • Energy: Grupo Iberdrola
          • Defence: Amper
      • Partners:
        • INDRA, Future Space, Caja Madrid
    9. DAEDALUS Clients in Language Technology
      • Grupo PRISA:
        • STILUS, spell, grammar and style checker integrated in Hermes/Newsroom for the newspaper El País
        • Named entity extraction and news classification system for Prisacom
      • Mundinteractivos (Unidad Editorial)
        • STILUS, spell, grammar and style checker integrated in a Linux platform for the digital media
        • Fuzzy search engine
      • Lainformacion.com
        • Automatic classification systems, information extraction, clustering, forum moderation, etc.
      • Grupo SM
        • STILUS, spell, grammar and style checker integrated in MS Office Word with the Elementary, High School and CLAVE dictionaries
      • Yell Publicidad
        • Specialized search engines (fuzzy/semantic/multilingual)
      • Spanish Tax Agency
        • Automatic filtering of Inspection Acts
    10. Index: (1) About us; (2) STILUS Corrector; (3) Search engines; (4) Semantic web; (5) Other solutions for the media industry; (6) Multimedia scenarios; (7) Product licensing; (8) Other areas of activity
    11. DAEDALUS Technology: STILUS Checker
      • Spell, grammar and style checker (available for Spanish and Italian):
        • High quality: coverage and precision
        • Reference to authority sources (Diccionario Panhispánico de Dudas, Diccionario del Español Urgente, etc.)
      • Independent of the electronic format
        • Can recognize more than 200 formats
      • Adaptable to the client:
        • Integration
        • Terminology / Style book
      • Multiple versions:
        • Interactive (MS Word, MS Explorer) or off-line (reports)
        • Personal or corporate
        • Web service
      • Reference clients: EL PAÍS, elmundo.es, Grupo SM, Instituto Cervantes.
      • Available languages: Spanish and Italian (English and French in December 2009).
    12. Successful cases
      • STILUS in EL PAÍS (within HERMES®)
      • HERMES® is the News Content Manager of ATEX (previously owned by UNISYS)
    13. STILUS in El Mundo
      • Used in all the digital media of Unidad Editorial (part of the Italian RCS Media Group): Marca, Expansión, etc.
    14. Successful cases
      • STILUS in Grupo SM
      • Controlled vocabulary
        • Text checking with a dictionary of the desired level
      • Synonyms
    15. STILUS on the Internet
    16. Index: (1) About us; (2) STILUS Corrector; (3) Search engines; (4) Semantic web; (5) Other solutions for the media industry; (6) Multimedia scenarios; (7) Product licensing; (8) Other areas of activity
    17. DAEDALUS Technology: search engines
      • Specialized/multilingual text search engines
        • Telefónica Soluciones, Instituto Cervantes
        • Technology: meta search engine, crawlers, language identification, indexing, automatic summaries
      • Search of audiovisual contents
        • Authors' rights management societies: SDAE, AIE
      • Fuzzy search engines
        • Yell Publicidad (online Yellow Pages)
        • ACC Seguros
      • Image search engine (fuzzy/semantic)
        • LatinStock (commercial photography website)
      • Semantic search engines
        • Semantic search engines for the Yellow Pages (Yell Publicidad)
    18. Successful cases
      • Search engine for the Spanish Yellow Pages
    19. Successful cases
      • Search engine in CESyA
      • Centro Español para Subtitulado y Audiodescripción [Spanish Centre for Subtitling and Audio Description] (www.cesya.es):
        • DAEDALUS Technology for fuzzy search in an audiovisual content database
    20. Successful cases
      • Panhispanic Search Engine
    21. Successful cases
      • Search engine in LatinStock
    22. Successful cases: 11888
      • Multilingual information service at 11888 (Yell Publicidad)
        • Queries in Spanish/Catalan/Valencian/Basque/Galician
        • Language-independent search
      • Other languages available:
        • English, French, Italian, Arabic, German
      • Coming soon:
        • Chinese, Hebrew, Russian
    23. Successful cases: Edas Corp
      • Integration of K-Site indexing and search technology in Edas Corp
        • Document Management Platform of eProm.
        • Windows environment
        • Reference client: Instituto Nacional de Estadística [Spanish Statistics Office]
    24. Successful cases: FutureSpace
      • Integration of multilingual search technology in specialized Document Management applications
        • Search in Spanish over documents in Catalan/Valencian/Basque/Galician/English/French/Italian/Arabic
        • Partner: FutureSpace
        • Final client: Defence sector
    25. Index: (1) About us; (2) STILUS Corrector; (3) Search engines; (4) Semantic web; (5) Other solutions for the media industry; (6) Multimedia scenarios; (7) Product licensing; (8) Other areas of activity
    26. Semantic technology
      • Specialized ontologies development:
        • Experience in Information Services (Yell), Banking and Insurance (CajaMadrid), Commercial Photography (LatinStock), Defence (Amper)
      • Integration with Business Rules systems
        • K-Site Rules: platform for developing and integrating systems based on Business Rules, on top of different rule engines
        • Use case: ITECBAN project
      • Automatic semantic labelling of text:
        • Based on a standard ontology (SUMO, Suggested Upper Merged Ontology, IEEE)
        • Specific ontologies for entities (people, organizations, places)
      • Semantic Search Systems
      • Question Answering Systems
    27. K-Site Rules: Components
      • Rules in performance
      • Business Analyst’s editor
      • Administrator’s tool
    28. Index: (1) About us; (2) STILUS Corrector; (3) Search engines; (4) Semantic web; (5) Other solutions for the media industry; (6) Multimedia scenarios; (7) Product licensing; (8) Other areas of activity
    29. Other solutions: Entities labelling
      • Entity identification from in-house resources:
        • Places: 13,739 (cities, regions, countries, rivers, mountains...)
        • Personalities: 12,921
        • Organizations: 6,353
        • Names/surnames: 28,759
        • Others: 999 (stock market indexes, laws, work titles, products…)
        • Total: 62,771
      • Additional characteristics:
        • Proposal of candidate entities not previously included in the STILUS lexical resources
        • Semantic disambiguation (Ex.: Madrid, geographic entity in several countries, sport team, surname)
        • Referential disambiguation of entities (several references to the same entity with different names in the same text)
      • Labelling technology: STILUS Core ES
      • Extension: entities database (STILUS NER-DB ES)
        • The database would be delivered in source format, including the following fields:
          • Entity/abbreviated form/semantic characterization
        • The database must not be made accessible to third parties by any means, directly or indirectly.
    30. Other solutions: Geopositioning
      • Integration of geographical terminology resources
      • Objectives:
        • To associate coordinates (latitude and longitude) with each geographical term
      • Information sources
        • www.geonames.org (licence cc-by)
      • Technology: STILUS® Geo
      • Possible extensions:
        • Integrating a street map of Spanish localities
        • Link with mapping services
    31. Other solutions: Keywords for SEO
      • Automatic keyword generation for SEO (Search Engine Optimization):
        • Keywords that best represent the content of a news item
          • Names of people, organizations and places (“Named Entities”)
            • Fernando Alonso, Renault, Formula 1, Montreal, Canada
          • Individual words or multiword expressions (mainly noun phrases)
            • accident, abandonment, human mistake, team strategy
        • Keywords that optimize the positioning of the web page (news item) in search engines
          • Frequently used keywords, favouring multiword terms (2, 3 or 4 words) whose number of search results is neither too large nor too small
          • Keyword density between 2% and 7%
        • Input: text, language, maximum/minimum number of keywords
        • Output: list of keywords
      • Technology: STILUS Core ES
      • Efficiency:
        • This module needs to query search engines to find out the number of web matches for each term.
        • The queries go through a cache (each term is looked up only once). Together with the prior linguistic processing, this keeps the impact on system performance small, and that impact decreases over time.
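
The density band and the cached lookups described above can be sketched as follows. This is a minimal illustration, not the STILUS implementation: the density formula (words covered by the keyword divided by total words) is one common definition chosen here as an assumption, and `result_count` is a hypothetical stand-in that returns a placeholder instead of querying a real search engine.

```python
import re
from functools import lru_cache


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def keyword_density(text, keyword):
    """Percentage of the text's words covered by occurrences of the keyword.

    Assumed definition: occurrences * words_in_keyword / total_words * 100.
    """
    words = tokenize(text)
    kw = tokenize(keyword)
    if not words or not kw:
        return 0.0
    n = len(kw)
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == kw)
    return 100.0 * hits * n / len(words)


def in_target_band(density, low=2.0, high=7.0):
    """True when the density falls inside the 2%-7% band from the slide."""
    return low <= density <= high


LOOKUPS = []  # records which terms actually reached the (stubbed) search engine


@lru_cache(maxsize=None)
def result_count(term):
    """Hypothetical search-engine lookup; each term is resolved only once."""
    LOOKUPS.append(term)
    return len(term)  # placeholder value, NOT a real number of web matches
```

Calling `result_count("formula 1")` twice appends to `LOOKUPS` only once, which is the behaviour the slide attributes to the cache: repeated terms cost nothing after the first lookup.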
    32. Other solutions: Thematic classification IPTC
      • Taxonomy from IPTC (International Press Telecommunications Council)
        • 3 levels: 17 main topics, divided into subjects (372 altogether), which are in turn divided into details (976 altogether)
        • TT-MMM-DDD code
          • Ex.: 15-039-001 [Sport - Motoring - Formula 1]
      • Automatic classification of news into classes. Procedure:
        • Term extraction
        • Classification: k-Nearest Neighbour (kNN) algorithm using Euclidean or cosine distance
      • Possible approaches:
        • Automatic inductive learning (from already classified news)
          • Training: semi-supervised assignment of keywords (in every language to be covered) to each node of the taxonomy
        • Assignment of terms to nodes by documentation experts
      • Interface
        • Input: text, language, maximum number of classes
        • Output: list of IPTC codes and their relevance
        • Technology:
          • STILUS® Core
          • STILUS® Glos-IPTC, IPTC terms glossary developed by DAEDALUS
          • K-Site® Class, tool for classification through ontology
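
A rough sketch of the procedure on this slide, under stated assumptions: terms are plain lowercase word counts (no linguistic processing), similarity is cosine, and relevance is each class's share of the similarity mass among the k nearest neighbours. The training texts are invented, and `99-999-999` is a made-up placeholder, not an IPTC code; only `15-039-001` comes from the slide.

```python
import math
import re
from collections import Counter

CODE_RE = re.compile(r"^(\d{2})-(\d{3})-(\d{3})$")


def parse_iptc_code(code):
    """Split a TT-MMM-DDD code into (topic, subject, detail) strings."""
    match = CODE_RE.match(code)
    if match is None:
        raise ValueError(f"not a TT-MMM-DDD code: {code!r}")
    return match.groups()


def vectorize(text):
    """Bag-of-words term vector (term -> count)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(text, training, k=3, max_classes=2):
    """Return [(code, relevance)] from the k most similar training items."""
    query = vectorize(text)
    scored = sorted(
        ((cosine_similarity(query, vectorize(doc)), code) for doc, code in training),
        reverse=True,
    )[:k]
    votes = Counter()
    for similarity, code in scored:
        if similarity > 0.0:  # ignore neighbours that share no terms
            votes[code] += similarity
    total = sum(votes.values())
    if total == 0:
        return []
    return [(code, score / total) for code, score in votes.most_common(max_classes)]


# Invented toy training set; "99-999-999" is a placeholder, not a real code.
TRAINING = [
    ("Alonso wins the Formula 1 race in Montreal", "15-039-001"),
    ("Renault team strategy at the Formula 1 grand prix", "15-039-001"),
    ("parliament votes on the new budget law", "99-999-999"),
    ("government presents budget to parliament", "99-999-999"),
]
```

With this toy data, `classify("Alonso blames Renault strategy in Formula 1", TRAINING)` ranks `15-039-001` first, since only the two motoring items share terms with the query. A production system would presumably weight terms (for example with TF-IDF) rather than use raw counts.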
    33. Other solutions: News clustering
      • Automatic news clustering:
        • To detect duplicate news items coming from different sources
        • To link news items about the same topic
          • Alonso: “We had a terrible strategy mistake”
          • Alonso says that Renault committed “a very big mistake”
      • News items (topics) have an associated time frame
      • Method:
        • Processing + vector space model + fuzzy C-means algorithm with Euclidean distance + maximum intercluster distance
        • Adaptive training (according to the expiry of the news items)
      • Interface
        • Input: text, news expiry
        • Output: list of similar news items (IDs and distance)
      • Base technology: STILUS® Core
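
The fuzzy C-means step named in the method can be sketched as below. This is a textbook formulation with fuzzifier m = 2, not DAEDALUS code; it assumes the news items are already turned into numeric vectors, takes the initial centres as an argument, and leaves out the maximum intercluster distance check and the expiry-based adaptation.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def memberships(points, centers, m=2.0):
    """Degree of membership of every point in every cluster (rows sum to 1)."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for point in points:
        dists = [euclidean(point, center) for center in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on one or more centres: share it among them.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [value / total for value in row]
        else:
            row = [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        rows.append(row)
    return rows


def update_centers(points, u, centers, m=2.0):
    """Move each centre to the membership-weighted mean of the points."""
    dim = len(points[0])
    new_centers = []
    for j, old in enumerate(centers):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        if total == 0.0:
            new_centers.append(list(old))  # degenerate cluster: keep the old centre
            continue
        new_centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
        )
    return new_centers


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and centre updates for a fixed number of rounds."""
    centers = [list(c) for c in initial_centers]
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, centers, m)
    return centers, memberships(points, centers, m)
```

Unlike hard clustering, each news item gets a membership degree in every cluster, so an item that sits between two stories can be linked to both.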
    34. Other solutions: Advanced search
      • Advanced search in news corpora, including:
        • Fuzzy search (did you mean…)
        • Semantic expansion of the query with synonyms/antonyms
          • accident -> mishap, misfortune, setback, catastrophe, vicissitude, hitch, breakdown, injury, disaster, incident, crash.
      • Interface:
        • Input: query, type of expansion
        • Output: expanded query
      • Technology: STILUS® Sem (ES), K-Site® Fuzzy
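
The two features on this slide can be illustrated with a small sketch. It is only an approximation under assumptions: the synonym table holds a few of the slide's own examples for "accident", the `OR` syntax of the expanded query is invented for illustration, and the "did you mean" suggestion uses Python's standard `difflib` rather than K-Site® Fuzzy.

```python
import difflib

# A few of the synonyms listed on the slide; a real table would be far larger.
SYNONYMS = {"accident": ["mishap", "setback", "crash"]}


def did_you_mean(term, vocabulary, cutoff=0.8):
    """Suggest the closest known term, or None when nothing is close enough."""
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def expand_query(query, table=SYNONYMS):
    """Replace each word that has entries in the table with an OR group."""
    parts = []
    for word in query.split():
        alternatives = table.get(word.lower(), [])
        if alternatives:
            parts.append("(" + " OR ".join([word] + alternatives) + ")")
        else:
            parts.append(word)
    return " ".join(parts)
```

The "type of expansion" input from the interface would select which table is passed in, for example a synonym table or an antonym table.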
    35. Other solutions: forum moderation
      • Tool for automatic moderation of media, blogs, forums, etc.
      • Filtering of offensive, illegal, inappropriate or objectionable content
      • Can be integrated into content management systems or offered as a service
    36. Index: (1) About us; (2) STILUS Corrector; (3) Search engines; (4) Semantic web; (5) Other solutions for the media industry; (6) Multimedia scenarios; (7) Product licensing; (8) Other areas of activity
    37. Multimedia scenarios: objectives
      • To make it easier to manage data in audio, video, image and/or text format
        • Analysis, indexing, and subsequent querying.
        • Integration of automatic speech recognition solutions by Sail Labs (Media Mining Indexer)
      • Solutions can be integrated in any environment (hardware platform, transmission medium, operating system)
    38. Multimedia scenarios: audio indexing
      • News search from dialogues
      • Process (slide diagram): Transcription, Indexing, Search, over an Index and the Contents
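
The transcription, indexing and search steps can be pictured with a small inverted index. This is a sketch under assumptions: transcripts are taken to arrive as timestamped text segments per clip (the clip IDs and sentences below are invented), and the speech recognition step itself is out of scope.

```python
import re
from collections import defaultdict


def build_index(transcripts):
    """Map each word to the (clip_id, start_seconds) segments where it is spoken."""
    index = defaultdict(list)
    for clip_id, segments in transcripts.items():
        for start_seconds, text in segments:
            for word in set(re.findall(r"\w+", text.lower())):
                index[word].append((clip_id, start_seconds))
    return index


def search(index, query):
    """Return the segments that contain every query word, sorted by clip and time."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


# Invented sample transcripts: clip id -> list of (start time in seconds, text).
TRANSCRIPTS = {
    "clip-1": [
        (0.0, "Alonso admits a strategy mistake"),
        (12.5, "Renault reviews the race"),
    ],
    "clip-2": [(3.0, "The strategy of Renault was questioned")],
}
INDEX = build_index(TRANSCRIPTS)
```

Because every hit carries a start time, a player could jump straight to the moment in the audio or video where the query words are spoken.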
    39. Showroom: DALI
      • Digital Audio Library Indexing: http://showroom.daedalus.es
      • Videos from TV channels on YouTube:
        • RTVE, Antena3 TV, Telecinco, TeleMadrid, Cuatro, Agencia EFE, EuropaPress, etc.
      • Content is processed automatically for indexing and search.
    40. Multimedia scenarios: subtitling
      • Support for subtitling processes
      • Process (slide diagram): Transcription, TEXT, Processing (checking, proofreading, etc.), Storage
    41. Product: STILUS Subtitler
      • Automatic generation of subtitles from news/scripts
      • Collaboration with CESyA (Spanish Centre for Subtitling and Audio Description)
      • Objectives:
        • Automatic division of the text into subtitles (UNE 153010 standard for subtitling through teletext)
        • Filtering of editor’s notes
        • Integration with FAB Subtitler
        • Optional spell-checking
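
Dividing a script into subtitles can be sketched with the standard `textwrap` module. The limits are parameters, and the defaults used here (37 characters per line, two lines per subtitle) are assumptions to be checked against UNE 153010 rather than values taken from the slides. Filtering of editor's notes and timing are not covered.

```python
import textwrap


def split_into_subtitles(text, max_chars=37, max_lines=2):
    """Wrap the text into lines and group them into subtitles.

    Returns a list of subtitles, each a list of at most max_lines lines
    of at most max_chars characters.
    """
    lines = textwrap.wrap(text, width=max_chars)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]
```

A real subtitler would also prefer to break lines at clause boundaries instead of wherever the width runs out; this sketch only enforces the size limits.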
    42. Index: (1) About us; (2) STILUS Corrector; (3) Search engines; (4) Semantic web; (5) Other solutions for the media industry; (6) Multimedia scenarios; (7) Product licensing; (8) Other areas of activity
    43. Product licensing (I)
      • STILUS® Corrector (ES, IT, EN, FR)
      • STILUS® Core (ES, EN, IT)
        • Collection of tools and resources for the processing and advanced analysis of text
        • http://www.daedalus.es/productos/stilus/stilus-core/
      • STILUS® Sem (ES, EN)
        • Library and collection of linguistic resources with semantic information (synonymy, antonymy, related words, type of entity, relations between entities)
        • http://www.daedalus.es/productos/stilus/stilus-sem/
      • STILUS® Glos-IPTC (ES, EN)
        • Module that maps the terms of a language to nodes of the IPTC hierarchy
      • STILUS® Class
        • Module for automatic text classification
      • K-Site® Fuzzy multi-dictionary
        • “Did you mean…” functionality
        • http://www.daedalus.es/productos/k-site/k-site-fuzzy/
    44. Product licensing (II)
      • STILUS® Lang
        • Module for automatic language detection
        • http://www.daedalus.es/productos/stilus/stilus-lang/
      • STILUS® Trans
        • Software library for word-for-word translation of a text between languages
        • Available languages (bidirectional translation): between Spanish and:
          • Other languages of Spain: Catalan, Basque, Galician
          • English, French, German, Italian and Arabic
        • http://www.daedalus.es/productos/stilus/stilus-trans/
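
Word-for-word translation amounts to a dictionary lookup per token, which the following sketch illustrates. The four-entry Spanish-English lexicon is invented for the example and has nothing to do with the STILUS® Trans resources; unknown words are passed through unchanged.

```python
# Tiny illustrative lexicon (assumption, not a STILUS resource).
LEXICON_ES_EN = {"el": "the", "perro": "dog", "come": "eats", "pan": "bread"}


def invert(lexicon):
    """Build the reverse lexicon for the opposite translation direction."""
    return {target: source for source, target in lexicon.items()}


def translate_word_for_word(text, lexicon):
    """Translate each whitespace-separated word independently."""
    return " ".join(lexicon.get(word.lower(), word) for word in text.split())
```

Because each word is handled in isolation, word order, agreement and ambiguity are ignored, which is why this kind of translation suits tasks such as cross-language search better than producing readable text.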
    45. Product licensing (III)
      • Licence per machine
        • Usually associated with an integration project, depending on the client’s needs
      • Guarantee: 6 months from the date of delivery
      • Maintenance
        • Cost: 15% of the development and licence price
        • Contracted annually
        • Single payment in advance at the beginning of every period
        • Starts when the guarantee period ends
    46. Third-party technology: automatic translation
      • DAEDALUS is the integrator of Reverso™ automatic translation products (of the company Softissimo)
      • Integration in Windows platforms
      • Available in Spanish, English, French and Italian
      • Possibility of adaptation to the client’s domain
    47. Third-party technology: automatic speech recognition
      • DAEDALUS is the integrator of the speech recognition technology of Media Mining Indexer™ (by SAIL LABS)
      • Integration in any platform
      • MMI technology includes:
        • Speaker-independent speech recognition
        • Speaker identification
        • Detection of change of speakers
        • Entities detection (persons, places or organizations)
      • Available languages: Spanish, English (British and American), French, German, Polish, Greek, Norwegian, Russian and Arabic
    48. Index: (1) About us; (2) STILUS Corrector; (3) Search engines; (4) Semantic web; (5) Other solutions for the media industry; (6) Multimedia scenarios; (7) Product licensing; (8) Other areas of activity
    49. Other areas of DAEDALUS activities
      • Marketing and Internet
        • Web analytics: analysis of website visitors’ access and behaviour (www.lawerinto.com)
        • Monitoring of e-mail marketing campaigns
      • Business Intelligence: business rules, data mining, knowledge management, etc.
      • Complex systems: optimization, modelling, simulation, prediction, planning, scheduling
    50. Lawerinto® line
      • System for web analytics: interactive analysis of web traffic in real time:
        • Audience measurement on the Internet
        • Monitoring of e-mail campaigns
        • Reports sent by e-mail and SMS
        • Delivery: installed product or outsourced service
        • Adaptable to the client
      • Main client: Grupo SM
        • >20 corporate websites and >200 client websites
        • >4 million pages/month
      • Telefónica Prize - New Internet Applications 2002
    51. Solutions in Business Intelligence
      • Intelligent Energy
        • Telefónica Soluciones
      • Weather and pollution forecast; prediction of power produced by wind farms
        • Iberinco, Grupo Iberdrola
      • Churn detection in mobile operators
        • TeleSP, Brazil, collaboration with Telefónica R+D
      • Sales prediction for the Pharmaceutical industry
        • Novo Nordisk Pharma, Denmark
      • Works planning for the Spanish Hydrological Plan
        • Ministry of the Environment and Rural and Marine Affairs of Spain
      • General characteristics: optimization, modelling, simulation and data mining in complex systems
      • Integration of third-party products:
        • XPress (optimization and planning, Dash Optimization, United Kingdom)
        • Powersim Studio (modelling and simulation, Powersim, Norway)
        • DataEngine (data mining, MIT GmbH, Germany)
    52. R+D Activities (I)
      • Projects:
        • FLAVIUS (EU 7th Framework Programme, CIP-ICT-PSP, 2009-2011)
          • Foreign Language Versions of Internet and User-generated Sites
          • Partners: Softissimo, JFG Networks (FR), Language Weaver (RO), Across, Qype (DE), Videopolis (BE)
          • Investment: 3.7M€ (total), 700K€ (DAEDALUS)
        • Contents à la carte (AVANZA, Ministry of Industry, 2008-2010)
          • Personalized distribution of news and recommendation system
          • Partners: Germinus XXI, EFE Agency
          • Investment: 1.2M€ (total), 132K€ (DAEDALUS)
        • DISUIPA (AVANZA, Ministry of Industry, 2008-2010)
          • Platform giving disabled people access at public Internet access points
          • Partners: Fractalia, UC3M, Consultec
          • Investment: 650K€ (total), 200K€ (DAEDALUS)
        • CANTIGA (PROFIT, Ministry of Industry, 2007-08)
          • Cataloguing and federated search of digital music content
          • Partners: Germinus XXI, Fundación Albéniz, UPM, UC3M
          • Investment: 1M€ (total), 140K€ (DAEDALUS)
        • ITECBAN (CENIT, Ministry of Industry, 2006-09)
          • Architecture of information systems for the banking sector
          • Partners: INDRA, CajaMadrid, Sun Microsystems and Grid Syst.
          • Investment: 33.3M€ (total), 1.5M€ (DAEDALUS)
    53. R+D Activities (II)
      • Projects (cont.)
        • EDDENN (PROFIT, Ministry of Industry, 2004-05)
          • Extraction of information from unstructured digitized documents
          • Partner: IPSA
          • Investment: 775K€ (total), 235K€ (DAEDALUS)
        • Omnipaper (European Union, IST Programme, 2002-04)
          • Distributed and multilingual access to news services
          • Partners: Leuven University (coordinator) and others
          • Investment: 2.8M€ (total), 423K€ (DAEDALUS)
      • European Research Networks and Platforms
        • ONTOWEB: Ontology-based Information Exchange for Knowledge Management and Electronic Commerce
        • KDNET: Knowledge Discovery Network of Excellence
        • CLEF: Cross Language Evaluation Forum
        • FLaReNet: Fostering Language Resources Network
        • NEM Platform: Networked and Electronic Media
        • ARTEMIS Platform: Embedded Systems
    54. Contact
      • Central Office:
      • López de Hoyos 15
      • 28006 Madrid
      • Technical Office:
      • Vallausa II Building
      • Albufera 321
      • 28031 Madrid
      • Tel: +34 913.32.43.01
      • [email_address]
      • http://www.daedalus.es