Localizing International Content for Search,
Data Mining and Analytics Applications
Andrew Rufener
E: andrew.rufener@omniscien.com
Copyright © 2017 Omniscien Technologies. All Rights Reserved.
Agenda
• Who we are and what we do
• Setting the scene – an architecture for our discussion and the key challenges
• The localization workflow and why content localization and search are
intertwined
• Illustrating using a practical example
• Summary & Recommendations
COMPANY OVERVIEW
• Founded in 2007 as Asia Online, changed company name in 2016 to
Omniscien Technologies
• Award winning, leading global supplier of specialized and highly scalable
language processing, machine translation and machine learning solutions
offering in excess of 540 global language pairs
• HQ in Singapore, European operation in The Hague, The Netherlands, Asian
operation in Bangkok, Thailand
• Global customer base in North America, Europe and Asia Pacific
MARKETS AND SOLUTIONS
• eCommerce and Online Travel
Automated, high-volume localization of complex product catalogue information as
well as user generated content and reviews
• Online Research System and Digital Publishing
Automated, high-volume tagging, language processing, translation and transliteration
of legal, intellectual property, scientific, financial and business information content as
well as generation of relevant meta data
• Government & Intelligence
Automated, high-volume language identification, entity and entity relationship recognition,
sentiment analysis, linking and translation and transliteration of various information sources
• Technology & Enterprise
Complex language processing, tagging, enriching and localization
• Localization Industry
Support of complex and high-volume localization
• Media and Subtitling
Subtitle extraction and manufacturing from different sources, support of re-writing source
for subtitling, localization and post-editing, automated placement in frames and improvement
• eDiscovery
Automated, high-volume content tagging, localization and discovery for
litigation data gathering, analysis and support
Setting the scene and why content localization
and search are intertwined
31 March 2017
SIMPLIFIED REFERENCE ARCHITECTURE FOR OUR DISCUSSION
Diagram: Unstructured Data, Structured Data, Search “Engine”
HOW DO I KNOW WHAT TO “ASK” FOR?
• How do I construct the right query / search?
• How do I know what keywords to use?
• Semantic or Concept Search
• Keyword lists
• Domain classifications
• Keyword-based domain classification (AI)
• …
HOW DO WE DEAL WITH MULTI-LINGUAL CONTENT?
• Option 1: Normalize to a single language
What domain, how do we maintain quality, what is quality, what language do we normalize to?
• Option 2: Cross-lingual search
What kind of data, is normalization or transliteration needed, how do we deal with variants?
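The two options can be contrasted in a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the system described in this presentation: the tiny term table, the three documents and the function names are hypothetical, and a real deployment would use machine translation and a proper search index instead of substring matching.

```python
# Hypothetical bilingual term table (English -> French variants); illustrative only.
TERM_TABLE = {
    "dental spray": ["jet dentaire", "pulvérisation dentaire"],
    "high byte": ["octet haut", "octet de poids fort"],
}

# Hypothetical mixed-language document collection.
DOCS = [
    {"id": 1, "lang": "en", "text": "A dental spray with adjustable pressure."},
    {"id": 2, "lang": "fr", "text": "Un jet dentaire à pression réglable."},
    {"id": 3, "lang": "fr", "text": "L'octet de poids fort est transmis en premier."},
]


def normalize_to_english(doc):
    """Option 1: map known French terms back to English before indexing."""
    text = doc["text"].lower()
    for english, variants in TERM_TABLE.items():
        for variant in variants:
            text = text.replace(variant, english)
    return text


def search_normalized(query):
    """Search a single-language (normalized) view of every document."""
    q = query.lower()
    return [d["id"] for d in DOCS if q in normalize_to_english(d)]


def search_cross_lingual(query):
    """Option 2: leave documents untouched and expand the query instead."""
    q = query.lower()
    candidates = [q] + TERM_TABLE.get(q, [])
    return [d["id"] for d in DOCS if any(c in d["text"].lower() for c in candidates)]
```

In this toy setting both options find the same documents. The difference lies in where the language knowledge is applied: Option 1 spends it once at indexing time, Option 2 at every query.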
THE GENERIC LOCALIZATION WORKFLOW
1. Extraction: Extract from source format to text or XML
2. Enrichment: Identifying entities, entity relationships, adding meta data, sentiment analysis, etc.
3. Translation: Translation and/or transliteration, normalizing terminology, maintaining meta-data
4. Enrichment: Post-translation corrections, additional enrichment and classification, etc.
5. Delivery: Delivery to user / application with or without enrichments
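The five steps can be sketched as a chain of small Python functions. This is a hedged illustration only: the function names, the record layout, the glossary lookup standing in for machine translation, and the toy entity and domain rules are all assumptions, not the actual workflow implementation.

```python
import re
import xml.etree.ElementTree as ET


def extract(source_xml):
    """Step 1: extract plain text from the source format (here: XML)."""
    root = ET.fromstring(source_xml)
    return {"text": " ".join("".join(root.itertext()).split()), "meta": {}}


def enrich(record):
    """Step 2: add meta data before translation (toy entity spotting)."""
    record["meta"]["entities"] = re.findall(r"\b[A-Z]{2,}\b", record["text"])
    return record


def translate(record, glossary):
    """Step 3: stand-in for MT; a glossary lookup that keeps the meta data."""
    text = record["text"]
    for source_term, target_term in glossary.items():
        text = text.replace(source_term, target_term)
    return {"text": text, "meta": dict(record["meta"])}


def post_enrich(record):
    """Step 4: post-translation enrichment (toy keyword classification)."""
    is_patent = "brevet" in record["text"].lower()
    record["meta"]["domain"] = "patent" if is_patent else "general"
    return record


def deliver(record, with_enrichments=True):
    """Step 5: hand over the text with or without the enrichments."""
    return record if with_enrichments else {"text": record["text"]}


def run_pipeline(source_xml, glossary, with_enrichments=True):
    """Run all five steps in order on one source document."""
    record = extract(source_xml)
    record = enrich(record)
    record = translate(record, glossary)
    record = post_enrich(record)
    return deliver(record, with_enrichments)
```

The point of the sketch is that the meta data created in step 2 travels through translation untouched, so it is still available to the search application at delivery.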
- Translation naturally provides the translated source, using either Statistical or Neural
Machine Translation
- However, by-products and translation capabilities that are interesting in this context
are:
- Ability to normalize terminology
- Pre-processing and enriching content prior to translation (tagging, conversion, etc.)
- Using the term analysis generated during the engine build
Extrémne problémy | extrémne problémy | extrémne problémy | extrémnej problémy
refraktérnym mnohopočetným myelómom | refraktérnym mnohopočetným myelómom | refraktérnym mnohopočetným myelómom | žiaruvzdorné myelómom je mladších
veľkosti nádoru | veľkosť nádoru | veľkosti nádoru | veľkosti nádoru
JA-EN Sample Patent Translations; one is machine, one human
• The coagulation time was determined as described above.
• The setting time was determined as described above.
• The lighting device also typically includes a light source disposed at the end of the light conductor.
• The light device typically also includes a light source arranged at an end of the light guide.
• Such communication between components is but one example of a unidirectional communication system.
• Such communication between components is only one example of a one-way communication system.
• The use of a hearing aid by a healthcare provider is routine.
• The use of a stethoscope by health care providers is routine.
• This can further enhance the electrical and long-term performance of the backsheet.
• This may further increase the electrical properties and long-term performance of the backsheets.
• Initial Binding measurements were performed as described above for Plaque Initial Binding measurements.
• Initial bonding measurements were carried out as described above for Plaque Initial Bonding Measurements.
• The subtractive color mixture selected may depend on the metalized surface area and the resistance material used.
• The subtractive process selected can depend upon the metallized structured surface region and the resist material utilized.
A REAL-LIFE EXAMPLE APPLICATION
• Example term (n-gram) extraction; extracted from actual human translations. The n-gram variants show the
(green) suggested n-gram based on frequency but also the other candidates that were found. “Distance” is
an available parameter.
• This process provides term variants and distance, but also term relationships
• The results can be used for different purposes, amongst others:
• Term normalization
• Term suggestions for search
• In conjunction with other meta data, domain identification
• …
actual swirl speed | Vitesse de rotation réelle | la vitesse de turbulence réelle | vitesse réelle tourbillon | vitesse réelle de remous
high byte | octet haut | octet de poids fort | octet haut | byte élevé
non-freezing fluid | fluide antigel | fluide incongelable | sans gel fluide | fluide de non-congélation
dental spray | jet dentaire | pulvérisation dentaire | jet dentaire | jet dentaire
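A minimal Python sketch of frequency-based suggestion, using two rows of the term table above. Treating each listed variant as one observation is an assumption made for illustration, as the slide does not say how the counts were obtained. The function names are hypothetical, and the “distance” parameter is not modelled.

```python
from collections import Counter

# Candidate French translations per English term, copied from the table above.
CANDIDATES = {
    "high byte": ["octet haut", "octet de poids fort", "octet haut", "byte élevé"],
    "dental spray": ["jet dentaire", "pulvérisation dentaire", "jet dentaire", "jet dentaire"],
}


def suggest(term):
    """Return the most frequent candidate and the other candidates found."""
    counts = Counter(CANDIDATES[term])
    suggested, _ = counts.most_common(1)[0]
    others = [c for c in counts if c != suggested]
    return suggested, others


def normalization_map(term):
    """Map every observed variant to the suggested one (term normalization)."""
    suggested, others = suggest(term)
    return {variant: suggested for variant in [suggested] + others}
```

The same output serves both purposes listed above: the map drives term normalization, while the list of other candidates can be offered as term suggestions for search.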
A REAL-LIFE EXAMPLE APPLICATION (2)
• WIPO Patentscope (Patent Research) uses this
data extensively
• WIPO Pearl is an example application
• Many other examples exist in
• eCommerce (Products, Brands, etc.)
• Business Information (Names,
Locations, etc.)
• Scientific Research Platforms (Medical
Terms, Chemical Compounds, Domain
Identification, etc.)
• ..
Source: http://www.wipo.int/wipopearl/search/linguisticSearch.html
A FEW KEY RECOMMENDATIONS
1. Take a holistic view of your workflow end to end
2. Work from the desired application result backwards
3. Review the data production and localization process, both the
engine build and the production workflow. Ensure valuable meta
data is not discarded. The localization team will have a very different view
of the “value” of certain data elements than the team handling search or
even the application
4. Keep in mind the enrichment capabilities of the localization workflow,
ranging from entities and sentiment right through to the ability to manipulate data on
the fly, call external data sources and subsequently “lock” the data
in for localization
SUMMARY
• The Machine Translation and associated Language Processing workflow
provides a wealth of information that can support search
• Understanding the interaction between content localization and search
is critical to good search results and allows balancing precision and recall
• With Machine Learning entering translation through Neural Machine
Translation, a number of Machine Learning applications are enabled
• Use the localization workflow to your advantage in a multi-lingual
environment
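The precision and recall trade-off mentioned above can be made concrete with the standard definitions. The sketch below is illustrative only: the document ids are invented, and the helper name is hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: documents 1-4 are relevant to the user's need.
relevant_docs = [1, 2, 3, 4]
narrow_query = [1, 2]            # single keyword, one language
expanded_query = [1, 2, 3, 4, 7]  # query expanded with term variants
```

Here the narrow query is fully precise but misses half of the relevant documents, while the expanded query finds all of them at the cost of one irrelevant hit.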
Q & A

 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
 
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
 
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
 
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
 
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
 
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
 
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
 
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
 
AI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterAI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance Center
 
AI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IPAI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IP
 
AI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCAI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOC
 
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
 
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
 

II-SDV 2017: Localizing International Content for Search, Data Mining and Analytics Applications

  • 1. Localizing International Content for Search, Data Mining and Analytics Applications Andrew Rufener E: andrew.rufener@omniscien.com Copyright © 2017 Omniscien Technologies. All Rights Reserved.
  • 2. Agenda
      • Who we are and what we do
      • Setting the scene – an architecture for our discussion and the key challenges
      • The localization workflow and why content localization and search are intertwined
      • Illustrating using a practical example
      • Summary & Recommendations
  • 3. COMPANY OVERVIEW
      • Founded in 2007 as Asia Online, changed company name in 2016 to Omniscien Technologies
      • Award-winning, leading global supplier of specialized and highly scalable language processing, machine translation and machine learning solutions offering in excess of 540 global language pairs
      • HQ in Singapore, European operation in The Hague, The Netherlands, Asian operation in Bangkok, Thailand
      • Global customer base in North America, Europe and Asia Pacific
  • 4. MARKETS AND SOLUTIONS
      • eCommerce and Online Travel: Automated, high-volume localization of complex product catalogue information as well as user generated content and reviews
      • Online Research System and Digital Publishing: Automated, high-volume tagging, language processing, translation and transliteration of legal, intellectual property, scientific, financial and business information content as well as generation of relevant meta data
      • Government & Intelligence: Automated, high-volume language identification, entity and entity relationship recognition, sentiment analysis, linking, translation and transliteration of various information sources
      • Technology & Enterprise: Complex language processing, tagging, enriching and localization
      • Localization Industry: Support of complex and high-volume localization
      • Media and Subtitling: Subtitle extraction and manufacturing from different sources, support of re-writing source for subtitling, localization and post-editing, automated placement in frames and improvement
      • eDiscovery: Automated, high-volume content tagging, localization and discovery for litigation data gathering, analysis and support
  • 5. Setting the scene and why content localization and search are intertwined • 31 March 2017
  • 6. SIMPLIFIED REFERENCE ARCHITECTURE FOR OUR DISCUSSION
      [Diagram: Unstructured Data, Structured Data, Search “Engine”]
  • 7. HOW DO I KNOW WHAT TO “ASK” FOR?
      [Diagram: Unstructured Data, Structured Data, Search “Engine”]
      • How do I construct the right query / search?
      • How do I know what keywords to use?
      • Semantic or Concept Search
      • Keyword lists
      • Domain classifications
      • Keyword based domain classification (AI)
      • …
  • 8. HOW DO WE DEAL WITH MULTI-LINGUAL CONTENT?
      [Diagram: Unstructured Data, Structured Data, Search “Engine”]
      • Option 1: Normalize to a single language
      • Option 2: Cross-lingual search
      • What domain? How do we maintain quality? What is quality? What language do we normalize to?
      • What kind of data? Is normalization or transliteration needed? How do we deal with variants?
  • 9. THE GENERIC LOCALIZATION WORKFLOW
      1. Extraction: Extract from source format to text or XML
      2. Enrichment: Identifying entities, entity relationships, adding meta data, sentiment analysis, etc.
      3. Translation: Translation and/or transliteration, normalizing terminology, maintaining meta data
      4. Enrichment: Post-translation corrections, additional enrichment and classification, etc.
      5. Delivery: Delivery to user / application with or without enrichments
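The five workflow steps above can be sketched as a chain of small functions. This is only an illustrative sketch, not the vendor's implementation: the `Document` class, the function names and the one-entry `GLOSSARY` (standing in for a real machine translation engine) are all assumptions made for the example. The point it shows is that meta data gathered before translation travels with the text through to delivery.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Document:
    text: str
    language: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# Hypothetical stand-in for an MT engine: a tiny phrase glossary.
GLOSSARY = {("fr", "en"): {"fluide antigel": "non-freezing fluid"}}


def extract(raw: str, language: str) -> Document:
    """Step 1: reduce the source format to plain text (here: collapse whitespace)."""
    return Document(text=" ".join(raw.split()), language=language)


def enrich(doc: Document) -> Document:
    """Step 2: add meta data before translation (here: a simple token count)."""
    doc.metadata["token_count"] = len(doc.text.split())
    return doc


def translate(doc: Document, target: str) -> Document:
    """Step 3: translate while keeping the meta data gathered so far."""
    phrases = GLOSSARY.get((doc.language, target), {})
    doc.metadata["source_text"] = doc.text
    doc.metadata["source_language"] = doc.language
    doc.text = phrases.get(doc.text.lower(), doc.text)
    doc.language = target
    return doc


def post_enrich(doc: Document) -> Document:
    """Step 4: post-translation enrichment (here: flag whether the text changed)."""
    doc.metadata["translated"] = doc.text != doc.metadata["source_text"]
    return doc


def deliver(doc: Document, with_enrichments: bool = True) -> Dict[str, Any]:
    """Step 5: hand the result to the application, with or without enrichments."""
    result: Dict[str, Any] = {"text": doc.text, "language": doc.language}
    if with_enrichments:
        result["metadata"] = doc.metadata
    return result


def run_pipeline(raw: str, source: str, target: str = "en") -> Dict[str, Any]:
    doc = extract(raw, source)
    doc = enrich(doc)
    doc = translate(doc, target)
    doc = post_enrich(doc)
    return deliver(doc)
```

Calling `run_pipeline("  Fluide   antigel ", "fr")` returns the English text together with the source text, source language and token count, which a search application could index alongside the translation.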
  • 11. THE GENERIC LOCALIZATION WORKFLOW
      - Translation naturally provides the translated source, using either Statistical or Neural Machine Translation
      - However, by-products and translation capabilities that are interesting in this context are:
          - Ability to normalize terminology
          - Pre-processing and enriching content prior to translation (tagging, conversion, etc.)
          - Using the term analysis generated during the engine build
      Example term variants (Slovak):
          Extrémne problémy | extrémne problémy | extrémne problémy | extrémnej problémy
          refraktérnym mnohopočetným myelómom | refraktérnym mnohopočetným myelómom | refraktérnym mnohopočetným myelómom | žiaruvzdorné myelómom je mladších
          veľkosti nádoru | veľkosť nádoru | veľkosti nádoru | veľkosti nádoru
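Terminology normalization, the first by-product listed above, can be illustrated with a small variant-to-preferred-form mapping. This is a minimal sketch under stated assumptions: the helper `build_normalizer` is hypothetical, and the choice of which Slovak form counts as "preferred" is made here purely for illustration (the slide does not say which variant is canonical).

```python
import re
from typing import Callable, Dict


def build_normalizer(variants: Dict[str, str]) -> Callable[[str], str]:
    """Return a function that rewrites known term variants to a preferred form.

    `variants` maps each variant spelling to its preferred term.
    """
    # Try longer variants first so a long phrase wins over a shorter overlap.
    keys = sorted(variants, key=len, reverse=True)
    # Lookarounds keep matches from starting or ending inside a longer word.
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in variants.items()}

    def normalize(text: str) -> str:
        return pattern.sub(lambda m: lookup[m.group(0).lower()], text)

    return normalize


# Assumed mapping for illustration only: variant -> preferred form.
normalize = build_normalizer({
    "veľkosť nádoru": "veľkosti nádoru",
    "extrémnej problémy": "extrémne problémy",
})
```

Applying the same normalizer to both the indexed content and incoming queries is one way to keep variant spellings from splitting search results across several terms.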
  • 12. JA-EN Sample Patent Translations; one is machine, one human
      1. • The coagulation time was determined as described above.
         • The setting time was determined as described above.
      2. • The lighting device also typically includes a light source disposed at the end of the light conductor.
         • The light device typically also includes a light source arranged at an end of the light guide.
      3. • Such communication between components is but one example of a unidirectional communication system.
         • Such communication between components is only one example of a one-way communication system.
      4. • The use of a hearing aid by a healthcare provider is routine.
         • The use of a stethoscope by health care providers is routine.
      5. • This can further enhance the electrical and long-term performance of the backsheet.
         • This may further increase the electrical properties and long-term performance of the backsheets.
      6. • Initial Binding measurements were performed as described above for Plaque Initial Binding measurements.
         • Initial bonding measurements were carried out as described above for Plaque Initial Bonding Measurements.
      7. • The subtractive color mixture selected may depend on the metalized surface area and the resistance material used.
         • The subtractive process selected can depend upon the metallized structured surface region and the resist material utilized.
  • 14. A REAL-LIFE EXAMPLE APPLICATION
      • Example term (n-gram) extraction; extracted from actual human translations. The n-gram variants show the (green) suggested n-gram based on frequency but also the other candidates that were found. “Distance” is an available parameter.
      • This process provides term variants and distance, but also term relationships
      • The results can be used for different purposes, amongst others:
          • Term normalization
          • Term suggestions for search
          • In conjunction with other meta data, domain identification
          • …
      Example terms and the candidates found:
          actual swirl speed: Vitesse de rotation réelle | la vitesse de turbulence réelle | vitesse réelle tourbillon | vitesse réelle de remous
          high byte: octet haut | octet de poids fort | octet haut | byte élevé
          non-freezing fluid: fluide antigel | fluide incongelable | sans gel fluide | fluide de non-congélation
          dental spray: jet dentaire | pulvérisation dentaire | jet dentaire | jet dentaire
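The frequency-based suggestion and the "term suggestions for search" use described above could look roughly like the sketch below. It is an assumption-laden illustration: the function names are hypothetical, the observation list simply reuses candidate strings from the slide rather than real corpus counts, and the quoted `OR` query syntax is a generic convention, not that of any particular search engine.

```python
from collections import Counter
from typing import Iterable, List


def rank_variants(observed: Iterable[str]) -> List[str]:
    """Order distinct variants by how often they were observed (most frequent first)."""
    counts = Counter(v.strip().lower() for v in observed)
    return [variant for variant, _ in counts.most_common()]


def preferred_variant(observed: Iterable[str]) -> str:
    """Suggest the most frequent variant (raises IndexError on empty input)."""
    return rank_variants(observed)[0]


def expand_query(observed: Iterable[str]) -> str:
    """Build a query that matches any known variant, preferred variant first."""
    return " OR ".join(f'"{v}"' for v in rank_variants(observed))


# Illustrative observations for the English term "dental spray".
dental_spray = [
    "jet dentaire",
    "pulvérisation dentaire",
    "jet dentaire",
    "jet dentaire",
]
```

Here `preferred_variant(dental_spray)` picks the most frequent candidate for normalization, while `expand_query(dental_spray)` keeps the less frequent candidates available so that recall does not suffer when documents use another variant.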
  • 15. A REAL-LIFE EXAMPLE APPLICATION (2)
      • WIPO Patentscope (Patent Research) uses this data extensively
      • WIPO Pearl is an example application
      • Many other examples exist in
          • eCommerce (Products, Brands, etc.)
          • Business Information (Names, Locations, etc.)
          • Scientific Research Platforms (Medical Terms, Chemical Compounds, Domain Identification, etc.)
          • …
      Source: http://www.wipo.int/wipopearl/search/linguisticSearch.html
  • 16. A FEW KEY RECOMMENDATIONS
      1. Take a holistic view of your workflow, end to end
      2. Work backwards from the desired application result
      3. Ensure you review the data production and localization process, both the engine build and the production workflow. Ensure valuable meta data is not discarded. The localization team will have a very different view of the “value” of certain data elements than the team handling search or even the application
      4. Keep in mind the enrichment capabilities of the localization workflow, ranging from entities and sentiment right through to the ability to manipulate data on the fly, call external data sources and subsequently “lock” the data in for localization
  • 17. SUMMARY
      • The Machine Translation and associated Language Processing workflow provides a wealth of information that can support search
      • Understanding the interaction between content localization and search is critical to good search results and allows balancing precision and recall
      • With Machine Learning entering translation through Neural Machine Translation, a number of Machine Learning applications are enabled
      • Use the localization workflow to your advantage in a multi-lingual environment
  • 18. Q & A