1. EXPERT Winter School Partner Introductions

Welcome to the EXPERT
Winter School
11-13th November 2012
Hilton Garden Inn, Brindley Place, Birmingham

University of Wolverhampton

UoW

Ruslan Mitkov
EXPERT Winter School
11 November 2013

Mission Statement

 RIILP produces internationally leading research,
offers first class research supervision and teaching
in the interdisciplinary areas of information and
language processing and delivers cutting-edge
practical (including commercial) applications to the
benefit of the society based on its research output

Structure and context
 Research Group in Computational Linguistics

 Statistical Cybermetrics Research Group
 Benchmark: the very best national and international
expertise in every area
 Both groups enjoy considerable national and
international reputation

 External income generation > £4,000,000 over last
five years

Statistical Cybermetrics
Research Group
 Statistical Cybermetrics entered in Unit of Assessment “Library and
Information Management”: in national context, Wolverhampton
ranked joint second with four more universities.
 According to league tables (The Guardian, The Times and
Research Fortnight), research in Library and Information
Management at the University of Wolverhampton is one of the six
best in the UK.
 Head of SCRG – Prof. Mike Thelwall

 Rated 3rd most successful UK library and information science
researcher of all time (Jan. 2007)

Research Group in
Computational Linguistics
http://clg.wlv.ac.uk

RAE’2008 results
 Computational Linguistics entered in Unit of Assessment
“Linguistics”: Wolverhampton ranked joint third with two
more universities in a large company of old, researchintensive universities.
 According to league tables (The Guardian, The Times
and Research Fortnight), research in Linguistics at the
University of Wolverhampton is one of the six best in the
UK
 Due to Computational Linguistics, in Linguistics we are
ahead of Oxford, Cambridge, UCL, Lancaster,
Manchester, Reading...

Research Group in Computational Linguistics:
People

 Founded in 1997 by Ruslan Mitkov
 Currently:









1 full-time Professor
2 part-time Professors
2 Readers
1 Senior Lecturer
7 research fellow and research associates
12 PhD students
4 Administrators
Project research assistants, Masters students, Visiting
Professors, Honorary Research Fellows, guest
researchers

Research Group in Computational Linguistics: key personnel
Ruslan Mitkov

 Publications: more than 200 publications in areas including:
 anaphora resolution (>2,000 citations, >40 keynote speeches
 generation of multiple-choice tests (>15 keynote speeches)
 Key books:
 Mitkov R. 2002. Anaphora resolution. Longman.
 Mitkov R. (Ed). 2003, 2005. Oxford Handbook of Computational Linguistics. Oxford
University Press.
 Current editorial distinctions:
 Executive Editor of the Journal of Natural Language Engineering (Cambridge University
Press)
 Editor of the Oxford Handbook of Computational Linguistics (Oxford University Press)
 Editor-in-Chief of John Benjamins’ book series in Natural Language Processing (NLP)
 Editor Consultant of Oxford University Press publications in Computational Linguistics
 Chair or member of a number of Programme Committees and Editorial Boards


 Dr. Constantin Orasan (Deputy Head of the Group)

 Reader in Computational Linguistics; 60+ publications, PI on a
number of projects, leading figure in summarisation, extensively
involved in Master programme teaching
 Dr. Michael Oakes
 Reader in Computational Linguistics, leading figure in areas
such as information retrieval, authorship identification, statistical
methods for linguistics and translation


 Prof. Patrick Hanks:
 Professor of Lexicography, leading figure in
(computational) lexicography and corpus-based methods
to dictionary compilation
 Dr. Le An Ha:
 Lecturer in industrial Natural Language Processing, Project
manager of the US NBME-funded project, involved in NLP
application to e-learning and industrial projects
 Richard Evans
 Research fellow, involved in NLP application to healthcare
 Led the FIRST project proposal process

Delivering cutting-edge research in
 Coreference/ anaphora
resolution
 Automatic generation of multiple
choice texts
 Text summarisation
 Question answering
 Temporal processing
 Named entity recognition
 Lexical knowledge acquisition
 Discourse processing
 Information extraction
 Computational lexicography
 Text simplification
 Plagiarism detection
 Evaluation of NLP

•
•
•
•
•
•
•

Topics related to translation
Term extraction
Machine Translation
Multilingual NLP
Translation Memory
Translation Universals
Comparable corpora compilation
for translators
• Statistical methods to translation
• Generation of test items

Some recent highlights
 RAE’2008 feedback: research output internationally leading, internationally
excellent and internationally recognised

 World best performing system in temporal processing (Georgiana Puscasu)
 World best cross-lingual information retrieval system with English as target
language (Iustin Dornescu, Constantin Orasan, Georgiana Puscasu)

 World best GikiP system (a competition dealing with geographical questions
on Wikipedia) (Iustin Dornescu)
 Best anaphora resolution system (Iustin Dornescu)

 Oxford University Press statement that the Oxford Handbook of
Computational Linguistics has been the most successful OUP Handbook
ever.

Project in focus/success story: Rapid Item Generation
 Two projects funded by a major US Board of Medical
Examiners (NBME) on generation of test questions for the
medical domain
 Pioneering computer-aided approach
 First stage successfully passed real user testing
 For English but possibility to extend to other languages
 Since then, the NBME have gone on to request an annual
rolling contract of around £100,000 for us to continue working
on items for them.
 They are currently trialling a second project with us which, if
successful, will bring in an additional £45,000 p.a.

Recent EC-funded projects
 QALL-ME (Question Answering Learning technologies in a
multiLingual and Multimodal Environment)
 Funding body: European Commission FP6 ICT
 Total EC contribution: €2,400,000. WLV share: €700,000.
 Runs from October 2006 – September 2009

 TELL-ME (Towards English Language Learning for
MEdical professionals)
 Lifelong Learning Programme Leonardo da Vinci
 Total EC contribution: €370,401. WLV share: €95,375.
 Runs from January 2012 – December 2013

 FIRST (A Flexible Interactive Reading Support Tool)
 EC FP7
 Total EC contribution: €2,008,754. WLV share €487,440.

Other ongoing projects
 DVC, AHRC, £605,586
 Funding body: AHRC
 Total contribution: £605,586. WLV share: £605,586.
 NBME projects, NBME, more than £1,000,000
 Funding body: NBME
 Total contribution: > £ 1,000,000. WLV share: > £ 1,000,000.
 Runs from January 2004

Strategic topics

 Language technology for medical applications (including
language disorders)
 E-learning
 Translation Technology
 Bridging the gap between academia and the industry
 Impact on society

University of Málaga (Spain)
Research Group in Lexicography and
Translation (Lexytrad, HUM-106)

Index

1. Aims and activities of UMA
2. Research Group HUM-106

3. Expertise - HUM 106
4. Key staff involved in TELL-ME

Aims and activities of UMA
 The Universidad de Málaga (UMA): over 36,000 students
and over 2,500 teaching staff.

 Well established history in regional, national and European
project management: 73 international projects (at present
23 onging European projects).
 National and international patents for the results of its
research.
 UMA is an International Campus of Excellence
(Andalucía TECH) since 2010.
 Watch http://www.youtube.com/watch?v=_nXoV8oiGvo

Research Group HUM 106 (I)
 The research group Lexicography and Translation (HUM-106) at UMA is
an international leader in the field of corpus-based Translation
Studies, E-Learning and Translation Technologies.
 Directed by Prof. Gloria Corpas since 1997.
 The group comprises 14 researchers and is a recognised leader in
areas of E-Learning, Linguistics, Corpus Compilation, Multilingual
Lexicography, Terminology, Translation Training, Translation Studies,
including Revision, Quality Control, Translation Technologies and Usercentred Translation Evaluation.

Research Group HUM 106 (II)

 The group works with a number of languages, including Spanish,
German, Italian, French and English.
 The research group HUM-106 was rated as one of the top
performing units within Arts and Humanities in the 2010
Autonomic assessment exercise by the Andalusian regional
government (97 points out 100).
 Further information at http://www.uma.es/hum106

Expertise - HUM 106 (I)
International R&D Projects
 2004-2006
- Standard Linguistico Europeo per il Settore del Turismo (SLEST) [Linguistic
standard for the tourism industry]. Funding source: European Comission (20042006).
Funding source: Lifelong Learning Programme (LLP)

 2004-2007
- HESPERIA. Repertorio analítico de lexicografía bilingüe: diccionarios italianoespañol y español-italiano. [HESPERIA: Analytical bilingual lexicography index:
Spanish/Italian – Italian/Spanish dictionaries].
Funding source: Italian Ministry of University and Scientific Research (MIUR).

Expertise - HUM 106 (II)
 2005-2008
- ACTUAL: Lingüística contrastiva [Actual: Contrastive Linguistics]. Funding
source: Italian Ministry of University and Scientific Research (MIUR).
 2008-2010
- CHINESECOM – Competences in Elementary Chinese as a mean to improve
competitiveness of European Union companies. Funding source: Lifelong
Learning Programme, (LLP) - Key Activity 2 - Multilateral project.
 2012-2013
- TELL-ME (Towards European Language Learning for MEdical professionals).
Funding source: Lifelong Learning Programme, (LLP) - Key Activity 2 Multilateral project.

Expertise - HUM 106 (III)
National R&D Projects
 1999-2002
- Diseño de un tipologizador textual para la traducción automática de textos jurídicos
español → inglés/alemán/italiano/árabe). [A Textual Typologiser for Machine-Translation
of Legal Texts (Spanish « English/German/Italian/Arabic)].
Funding source: Spanish Ministry of Education: Research & Development National
Programme.

 2003-2006
- TURICOR: Compilación de un corpus de contratos turísticos (alemán, español, inglés,
italiano) para la generación textual multilingüe y la traducción jurídica. [TURICOR: A
multilingual corpus of tourism contracts (German, Spanish, English, Italian) for automatic
text generation and legal translation].
Funding source: Spanish Ministry of Science and Technology.

Expertise - HUM 106 (IV)
 2008-2011
- Espacio único de sistemas de información ontológica y tesauros sobre el
medio ambiente: Ecoturismo
Funding source: Spanish Ministry of Education: Research & Development
National Programme.
 2012-2015
- INTELITERM: Sistema inteligente de gestión terminológica para traductores.
Funding source: Spanish Ministry of Education: Research & Development
National Programme.

Expertise - HUM 106 (V)
Regional R&D Projects
 2006-2009
- La contratación turística electrónica multilingüe como mediación intercultural: aspectos
legales, traductológicos y terminológicos. [Multi-lingual Tourism E-contracts: legal,
translational and terminological aspects].
R&D Project for Excellence. Andalusian Ministry of Education, Science and Technology.
 2008-2012
- Nuevo diccionario de aprendizaje (learners' dictionary) del español como lengua
extranjera de difusión on-line.[New on-line learners’ dictionary of Spanish as a Foreign
Language].
R&D Project for Excellence. Andalusian Ministry of Education, Science and Technology.

Expertise - HUM 106 (VI)
Others
 2 Coordinated Research Activities
 5 Networks
 More than 20 E-learning and Innovation Projects
 More than 20 Thesis Dissertations

 More than 40 M.A. Dissertations
 For further information see http://www.uma.es/hum106/investigacion_en.html

Key staff involved in EXPERT (I)
1. Prof. Gloria CORPAS (gcorpas@uma.es)
-

Professor in Translation and Interpreting at UMA.

-

Prof. G. Corpas in no. 2 in the Spanish national ranking of Translation and Interpreting
(http://hindexscholar.com)

-

She acts as a Ministry advisor on the Bologna Process via the Spanish Agency ANECA.

-

She has been actively involved in the development of the UNE-EN 15038:2006 as AEN/CTN
174 and CEN/BTTF 138 Spanish delegate. Spanish expert for the future ISO Standard (ISO
TC37/SC2-WG6 "Translation and Interpreting").

-

Her publications also deal with didactic innovation, the design of virtual university knowledge
communities for Translation studies, virtual collaborative environments, e-learning platforms and
virtual teaching of subjects specializing in scientific and technical translations.

-

She has one patent (ReCor), and she received in 1995 the Euralex Verbatim Award and in
2007 Spanish Translation Technologies Observatory Award, with Dr. M. Seghiri.

Key staff involved in EXPERT (II)
1. Dr. Jorge LEIVA (leiva@uma.es)
- Senior Lecturer in Translation and Interpreting at UMA
UMA and professional translator.

at

- His research fields include specialised translation and phraseology.

- From September 2008 to March 2009 was awarded a research grant
for Harvard University (Massachusetts, USA).
- He has also been a member of a variety of research projects
focusing on specialised translation, text corpora and e-learning.
- University’s 2005 Ph. D. Best Student Prize.

Key staff involved in EXPERT (III)
3. Dr. Miriam SEGHIRI (seghiri@uma.es)
-

Senior lecturer in Translation and Interpreting at UMA.

-

She has also worked at Dickinson College (PA, USA), the University of Murcia and the
University of Cordoba.

-

She has participated in several European, national and regional R&D projects.

-

She has been awarded several research grants for Dickinson College (PA, USA) and Università
di Perugia (Italy).

-

Her research fields range from specialised translation to corpus linguistics and ICTs, the
outcome of which has been made public in national and international academic conferences
and publications.

-

-She has one patent (ReCor), and she received in 2007 Spanish Translation Technologies
Observatory Award, with Dr. G. Corpas. University’s 2006 Ph. D. Best Student Prize.

Key staff involved in EXPERT (V)
5. ESRs


ESR1:
Anna Zaretskaya, from Russia. Investigation of translators’
requirements from translation technologies (Supervisor: Miriam Serghiri at UMA, and
co-supervised by Elia Yuste from Pangeanic). Permit Visa: pending.



ESR3:
Hernani Costa, from Portugal. Collection and preparation of multilingual
data for multiple corpus-based approaches to translation (Supervisor: Dr. Gloria Corpas
at UMA and co-supervised by Marco Trombetti from Translated and ER1). ERS3 signed
his contract on the 2nd of September 2013 .

6. ERs
 ER1 will work on Investigation of automatic methods for collection and preparation of
multilingual data (Supervisor name: Marco Trombetti at Translated and co-supervised by
Jorge Leiva from UMA).

- An academia-industry research consortium dedicated to delivering
disruptive innovations in digital media and intelligent content such
as multilingual content analysis
- Led by Trinity College Dublin and co-hosted by Dublin City
University
- Sponsor by both Science Foundation Ireland and Industry
Partners including Symantec, DNP, Microsoft, Intel, Xanadu,
WeLocalize, Alchemy

CNGL Research Themes
Tuning Text
Analytics
Event & Opinion
Extraction
Content Aware
Multilingual
Search
Contextualisation
Modality
Independent
Intelligent Machine
Translation
Social Localisation
Intelligent
Post-Editing

CNGL @ Dublin City University







Professor Josef van Genabith: NLP, MT
Professor Qun Liu: MT, NLP
Dr. Gareth Jones: IR, Multi-Modal
Dr. Sharon O'Brien: Translation Technology
Dr. Jennifer Foster: NLP
40+ staffs and PhD students

Welcome to
Hermes Traducciones!

15th company in
Southern Europe and
154th in the world
according to
Common Sense
Advisory’s 2013 listing

hermestr@hermestrans.com
www.hermestrans.com
Madrid Office:
Cólquide, 6 - portal 2, 3.º - I
Edificio Prisma
28230 Las Rozas (Madrid, Spain)
Teléfono: (+34) 91 640 7640
Fax: (+34) 91 637 8023

Malaga Office:
Parque Tecnológico de Andalucía
Av. Juan López Peñalver, 17 - 3.ª - 6
Edificio Centro de Empresas
29590 Campanillas (Malaga, Spain)
Teléfono: (+34) 952 020525
Fax: (+34) 952 020529

COMMITMENT WITH QUALITY:
Cooperation with official agencies

•

Company present in the Spanish Technical Committee #174 at
AENOR for quality translation services, with the support of the
European Committee for Standardisation (CEN), the Spanish

Standardisation Association (AENOR) and the European Union of
Translation Companies Association (EUATC).
•

Juan José Arevalillo, Hermes Traducciones Managing Director, is
the current Chairman of the Spanish Technical Committee #174 in
AENOR for translation and related services.

SGR
PERFORMANCE MANAGEMENT SYSTEM
PRODUCTIVITY AND QUALITY CONTROL
• Daily monitoring of quality and productivity of our team in order to
guarantee an improved control over our translations
• Review, revision and edit of our translations by a second or third
specialist other than the original translator
• Use of proprietary templates for revising,
reviewing and editing our translations in
compliance with EN15038 quality standard
• Use of the LISA QA MODEL standard for
localisation review and SAE J2450
standard for automotive translation review

PLUNET-BASED
TRANSLATION PROJECT MANAGEMENT
• End to end translation project management system through a Plunet
platform
• Compliant with our double quality certification requirements

HERMES DIFFERENCES
•

Founded in 1991 by former employees of the Localisation Group of Digital
Equipment Corporation (currently Hewlett-Packard).

•

Specialising in software and website localisation, as well as technical
translation.

•

70% of our production is done by our own in house resources.

•

Translation services in 30 language pairs.

•

Ongoing training of our staff.

•

End-to-end solutions for our customers.

•

Internal department of applied technology, including MT.
Image of Hermes god at
Louvre Museum.

HERMES EXPERIENCE
• 28 years of localisation experience (22 as a company and 6 at Digital
Equipment Corporation, currently Hewlett-Packard).

• Over 60,000 localisation projects in 22 years, including multi-lingual projects.
• Comprehensive expertise and know-how in computer-assisted translation and localisation-specific
applications: SDL-Trados product family,SDL Studio 2011, memoQ, Déjà-Vu, IBM Translation Manager,
Star Transit, WordFast, Catalyst, Passolo, across, Idiom World Server, Microsoft Helium, Microsoft
Localisation Studio and many others.
• Comprehensive expertise and know-how in quality control programs: HelpQA, HTML HelpQA, Apsic
Xbench, MS Help Workshop, MS HTML Help Workshop and others.
• Comprehensive know-how in DTP, text processing and imaging applications: Adobe FrameMaker,
Microsoft Word, Adobe InDesign, Adobe PageMaker, PaintShop Pro,
Adobe Illustrator, Adobe Photoshop, etc.
• Proprietary terminology database covering more than 1,000,000 entries of different languages and
domains.
• 35 million of managed words per year, and an average of 6,000 translation per year.
• Centralized Plunet-based translation project management system.

•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•

HERMES SERVICES
Software Localisation.
Website Localisation.
General and Technical Translation.
Interpreting.
Review, Revision and Editing.
Quality Control and Proof-reading of
Third-Party Translations.
MT Post-editing.
Style Guides.
Desktop Publishing.
Linguistic Consultancy.
Glossary and Terminology Management.
Documentation Management.
Technical Writing.
Multimedia Translation.
Video and Audio Tape Transcription,
including voice recording at studios.
© Benedikt Hohenau
Style Guides.
Course Giving on Translation and Localisation.
Website design, development and internationalization.

ACADEMIC COMMITMENT:
Hermes works with the following Spanish Universities

Juan José Arevalillo Doval
juanjo.arevalillo@hermestrans.com
www.hermestrans.com
@JJ_Arevalillo
@hermestrans

Hermenet

Pangeanic
• Pangeanic took the initial versions of Moses in 2009 as an in-house project
to help translation production needs. It was the first company in the world to
transition Moses from academic to a commercial environment, as reported
in Euromatrixplus.
• The small in-house project grew into a full platform overcoming many of its
limitations with a full set of new features and offering the translation
community the opportunity to have machine translation for the masses.
• The platform now includes full re-training features, glossary upload, a full
TMX / training material management system, the ability to create engines on
the fly as well as the possibility to hybridate with pre- and post- modules.
• Our presentation will describe the tool we have made available for the
project.

Translated srl
Alessandro Cattelan, Director of Localization Operations

Who is Translated?
Web-based Language Service Provider
Since 1999, providing human translation in 80 languages to over

35,000 customers thanks to 70,000 professional translators.

Tech Company
Focus on technology to automate processes and make
translation more efficient.

Workflow Automation

Fully automated translation management system
that connects customers and translators.
Automate all repetitive tasks and focus only on
what brings value to our customers.

Content Reuse
MyMemory
 Largest translation memory server (6 billion words)
 Integrated in most computer-assisted translation tools
 100% Free
Leverage existing linguistic content to make translators
more productive.

Translation Environment
MateCat
 Deep integration of MT - MT technology that learns from the
users in real time
 Collaborative environment - Online translation with multiple
users
 Fast and easy to use - Virtually no learning curve
 Increased privacy protection – Clients’ documents are not
sent out to translators

Translated
 http://www.translated.net

MateCat
 http://www.matecat.com

MyMemory
 http://mymemory.translated.net

Universität des Saarlandes

USAAR

USAAR: institution
# students: 18 500
16% international students
• Dept. of Applied linguistics, Translation
and Interpreting
• Dept. of Computational Linguistics &
Phonetics
• German Research Centre for Artificial
Intelligence
• Cluster of Excellence on Multimodal
Computing and Interaction
• Max Planck Institute for Computer Science
• Max Planck Institute for Software Systems

USAAR: WP4
WP4 Language technology,
domain ontologies and terminologies
Dr. Paul Schmidt
 Chair of Machine Translation
 in charge of scientific and
technical/technological
aspects

Prof. Elke Teich
 Chair of English Linguistics
and Translation Science
 in charge of administrative,
legal and financial aspects

José Manuel Martínez
 research assistant
 administration

USAAR: ESRs
Santanu Pal – ESR2
 Investigation of an ideal translation
workflow for hybrid translation
approaches
 India
 B.Tech, Computer Science &
Engineering
 Certification course on Linguistics
 M.Tech, Computer Technology
 Thesis: “Improved Alignment in
Statistical Machine Translation”

Liling Tan – ESR5
 Use of terminologies and
ontologies to improve corpusbased approaches to translation
 Singapore
 BA in Linguistics
 MA in Computational Linguistics
 Thesis: Examining Crosslingual
Word Sense Disambiguation

University of Sheffield






Natural Language Processing Group
•
Since 1993
•
Areas: language resources and architectures (GATE), information access
•
Q&A, summarisation), foundational topics
•
Collaboration with Machine Learning and Speech groups
•
Newly created MT lab
Academics doing research on MT
• Lucia Specia
• Trevor Cohn
• Rob Gaizauskas
Other MT people
• 3 post-docs, 2 ESRs/PhD students, 5 PhD students

Projects and areas
of interest (I)

•

Modist (EPSRC): Modeling Discourse in Statistical
Translation

•

Barista (EPSRC): Non-Parametric Models of Phrase-based
Machine Translation

•

Expert (EU): EXPloiting Empirical appRoaches to Translation

•

QTLaunchpad (EU): Preparation and Launch of a LargeScale Action for Quality Translation Technology

Projects and areas
of interest (II)
•

SlaTr (Google): A Joint Model of Spoken Language
Translation

•

QuEst (PASCAL2 Harvest): Open source tool for MT
Quality Estimation

•

TaaS (EU): Terminology as a Service

•

ACCURAT (EU): Analysis and Evaluation of Comparable
Corpora for Under Resourced Areas of Machine Translation

1. EXPERT Winter School Partner Introductions

1. EXPERT Winter School Partner Introductions

Recommended

Recommended

More Related Content

What's hot

What's hot (10)

Viewers also liked

Viewers also liked (16)

Similar to 1. EXPERT Winter School Partner Introductions

Similar to 1. EXPERT Winter School Partner Introductions (20)

More from RIILP

More from RIILP (20)

Recently uploaded

Recently uploaded (20)

1. EXPERT Winter School Partner Introductions