Industrial Natural Language Processing & Information Extraction: a research area of the Chair of Technologies and Management of Digital Transformation at the University of Wuppertal, Germany.
For more information, see here: https://www.tmdt.uni-wuppertal.de/de
3. 3
Industrial Natural Language Processing & Information Extraction
Chair of Technologies and Management of Digital Transformation, University of Wuppertal
Industrial Natural Language Processing
Overview
NLP and NLU tasks (word cloud): summarization, semantic parsing, sentiment analysis, dialogue agents, natural language inference, question answering, machine translation, text categorization, syntactic parsing, POS tagging, keyword extraction, named entity recognition, topic recognition
Natural Language Processing: developing and applying techniques and methods for the automatic processing of text.
Industrial Natural Language Processing: developing and applying techniques and methods for the automatic processing of text in industry, explicitly considering the requirements and circumstances of industrial environments.
4. 4
Industrial Natural Language Processing
Research Goals
Reliable application and deployment of NLP in industrial environments
Anonymization of textual data so that it can be forwarded to third parties
Exploration of new areas for the use of natural language processing in the wild
5. 5
Industrial Natural Language Processing
NLP in Industrial Environments
Reliable application and deployment of NLP in industrial environments
Design, develop, and evaluate software architectures that make NLP usable by non-technical users
Improve the process of deploying and using NLP in industrial environments
Analyze textual data using state-of-the-art NLP approaches
Focus areas: Software Architectures, Usability, Analysis
6. 6
Industrial Natural Language Processing
Anonymization
Anonymization of textual data so that it can be forwarded to third parties
Analyzing unstructured company data in the cloud is either undesired by the company itself or even forbidden by law (GDPR; German: DSGVO)
Cloud services, however, provide more accurate and sophisticated analytics
Develop new machine-learning-based anonymization approaches, as current approaches are too inaccurate
7. 7
Industrial Natural Language Processing
Exploiting new Application Domains
Exploration of new areas for the use of natural language processing in the wild
Most available data is unstructured, in particular textual documents
Identify available data sources and derive meaningful use cases from them
Develop appropriate models and applications for the identified use cases that generate additional value
Steps: Identify Data Sources, Derive Use Cases, Explore Possible Solutions
9. 9
Industrial Natural Language Processing
Integrating External Data into an Enterprise Information System
Goal:
  Integrating internal and external data into an enterprise information system to gain faster insights into changing markets, relations, etc.
Approach:
  Identifying relevant data sources (e.g., news websites, social media, internal enterprise data)
  Integrating the data into a common data storage
  Creating dedicated analytical services for specific user requirements, such as
    Natural language processing
    Translations
    Overview of the business knowledge graph
    Sentiment analysis
    Recommendation of relevant data
Results: External Information Extraction Tool
  Personalized, quick, and easy access to a large amount of data from several different sources within a single tool
Diagram labels: News Websites, Social Media, Data, Enterprise Information System, Information
10. 10
Industrial Natural Language Processing
Utilizing Textual Maintenance Data from Production
Goal:
  Utilizing unstructured textual information from machines’ maintenance protocols to gain insights and optimize processes
Approach:
  Extracting textual reports from maintenance staff
  Classifying the text into descriptions of symptoms, causes, and solutions
  Calculating relevant statistics
  Creating dedicated analytical services for staff and decision makers, such as
    Occurrence of similar error descriptions over time and location
    Costs per machine location
    Troubleshooting proposals for specified symptoms
Results: Maintenance Data Insights Tool
  Tool for
    assisted generation of maintenance report texts
    supported finding of solutions
    visualization of errors and costs
Diagram labels: Maintenance Data Platform, Information, Solution Hints
Example report snippets: "… defect, please check", "… part was exchanged", "… machine losing oil", "… spare part ordered"
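The classification and statistics steps of this approach could be prototyped roughly as follows. This is a minimal keyword-based sketch, not the tool described on the slide: the label names, the keyword lists, and the helpers `classify_report` and `label_counts` are assumptions made for illustration. A production system would more likely rely on a trained text classifier.

```python
from collections import Counter

# Hypothetical keyword lists; a real system would learn these from labeled reports.
KEYWORDS = {
    "symptom": ("defect", "losing oil", "leak", "noise"),
    "solution": ("exchanged", "replaced", "ordered", "repaired"),
    "cause": ("worn", "due to", "because"),
}


def classify_report(text: str) -> str:
    """Return the first label whose keywords occur in the text, else 'unknown'."""
    lowered = text.lower()
    for label, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return label
    return "unknown"


def label_counts(reports):
    """Count how many reports fall into each label (a very simple statistic)."""
    return Counter(classify_report(report) for report in reports)


# Report snippets taken from the slide.
reports = [
    "... defect, please check",
    "... part was exchanged",
    "... machine losing oil",
    "... spare part ordered",
]
counts = label_counts(reports)
```

Counting per label is only the simplest possible statistic; grouping the same counts by machine location or by month would follow the same pattern if those fields were available.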
11. 11
Industrial Natural Language Processing
Anonymization of Enterprise Documents using the Cloud
Goal:
  Enable the use of cloud services for data processing and analysis without revealing sensitive information
Approach:
  Development of an anonymization approach based on predefined rules and deep learning
  Implementation and testing of the hybrid anonymizer
  Deployment of the anonymizer within the customer's ecosystem
  Methods: natural language processing, deep learning, microservice architecture
Results: Hybrid Anonymizer
  Functional hybrid system for the automatic anonymization/pseudonymization of textual data
  Enabled the use of cloud analysis for textual documents
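To illustrate how predefined rules and a learned model can be combined, here is a minimal sketch of a hybrid pseudonymizer. It is not the implementation described on the slide: the regular expressions are deliberately simplistic, and `toy_model` is a hypothetical stand-in for a trained deep learning recognizer that returns `(start, end, label)` spans.

```python
import re
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label)

# Predefined rules (illustrative patterns only, far from exhaustive).
RULES = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d ()/-]{6,}\d"),
}


def rule_spans(text: str) -> List[Span]:
    """Collect all spans matched by the predefined rules."""
    return [(m.start(), m.end(), label)
            for label, pattern in RULES.items()
            for m in pattern.finditer(text)]


def pseudonymize(text: str,
                 model: Callable[[str], List[Span]] = lambda t: []) -> str:
    """Replace sensitive spans with stable placeholders such as <EMAIL_1>."""
    spans = sorted(set(rule_spans(text) + model(text)))
    mapping: Dict[Tuple[str, str], str] = {}
    out, cursor = [], 0
    for start, end, label in spans:
        if start < cursor:  # skip spans that overlap an earlier one
            continue
        key = (label, text[start:end])
        if key not in mapping:
            n = sum(1 for k in mapping if k[0] == label) + 1
            mapping[key] = f"<{label}_{n}>"
        out.append(text[cursor:start])
        out.append(mapping[key])
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


def toy_model(text: str) -> List[Span]:
    """Stand-in for a trained NER model: tags one hard-coded name."""
    name = "Jane Doe"
    i = text.find(name)
    return [(i, i + len(name), "PERSON")] if i >= 0 else []


result = pseudonymize(
    "Contact Jane Doe at jane.doe@example.com or +49 123 4567890.", toy_model)
```

With the toy model, the sample sentence becomes `Contact <PERSON_1> at <EMAIL_1> or <PHONE_1>.`. Because identical values map to the same placeholder, the text stays analyzable after pseudonymization, which is the property that makes forwarding it to a cloud service useful.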
12. 12
Industrial Natural Language Processing
AISLE – Support learning academic phrases
Goal:
  Support students at the beginning of their studies in reading and understanding scientific publications
Approach:
  Construct a large domain- and target-group-specific text corpus using NLP methods
  Use recent NLP methods to extract and evaluate words and phrases based on their relevance
  Develop an adaptive learning system that improves vocabulary, based on a purpose-built learning algorithm and the constructed corpora
Results: AISLE
  Web platform that is actively used by students to improve their vocabulary
  User studies showed the system's positive impact on vocabulary growth
Diagram labels: Interact, Enter Word, Vocabulary Size, Evaluate & Select Words, View Results, Analyze Results
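The slide does not say how AISLE scores the relevance of words. One simple option, shown here purely as an assumption, is to compare how often a word occurs in the domain corpus with how often it occurs in a general reference corpus. The function `relevance_scores` and both toy corpora are hypothetical.

```python
from collections import Counter


def relevance_scores(domain_tokens, reference_tokens, smoothing=1.0):
    """Score each domain word by its smoothed relative-frequency ratio
    between the domain corpus and a general reference corpus."""
    domain = Counter(domain_tokens)
    reference = Counter(reference_tokens)
    vocab = set(domain) | set(reference)
    d_total = sum(domain.values()) + smoothing * len(vocab)
    r_total = sum(reference.values()) + smoothing * len(vocab)
    return {
        word: ((domain[word] + smoothing) / d_total)
        / ((reference[word] + smoothing) / r_total)
        for word in domain
    }


# Tiny toy corpora; real corpora would contain many documents.
domain_corpus = "we propose a novel method and evaluate the method empirically".split()
reference_corpus = "we went to the park and the dog ran to the lake".split()

scores = relevance_scores(domain_corpus, reference_corpus)
ranked = sorted(scores, key=scores.get, reverse=True)  # most domain-specific first
```

Words with a ratio well above 1 are characteristic of the domain corpus, while common function words end up below 1. The smoothing term keeps words that never occur in the reference corpus from causing a division by zero.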
14. 14
Information Extraction
Overview
Information Extraction: "… Application of methods from practical computer science, artificial intelligence and computational linguistics to the problem of automatic machine processing of unstructured information …" (Source: Wikipedia)
Diagram (processing flow): different types of unstructured data, data-specific processing (e.g., POS tagging, named entity recognition), structured data, data analysis, results
15. 15
Information Extraction
Research Goals
Leveraging machine learning techniques to improve information extraction
Transformation of unstructured data into useful structured information and knowledge
16. 16
Information Extraction
Structuring unstructured data
Transformation of unstructured data into useful structured information and knowledge
Identify all relevant information that needs to be extracted
Identify approaches for extracting information from unstructured data and turning it into valuable knowledge
Develop processing pipelines that automatically extract the identified information and make it accessible in a structured way
Steps: Identify Information, Choose Approaches, Develop Processing Pipeline
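The three steps can be sketched as a small pipeline that maps each piece of information to an extraction approach and then applies all of them to every document. This is only an illustration: the field names, the regular expressions, and the sample documents are hypothetical, and regular expressions stand in for whatever approach is actually chosen.

```python
import re
from typing import Callable, Dict, List, Optional

Extractor = Callable[[str], Optional[str]]


def regex_extractor(pattern: str) -> Extractor:
    """Build an extractor that returns the first capture group, or None."""
    compiled = re.compile(pattern, re.IGNORECASE)

    def extract(text: str) -> Optional[str]:
        match = compiled.search(text)
        return match.group(1) if match else None

    return extract


# Steps 1 and 2: the information to extract and the approach chosen for each field.
PIPELINE: Dict[str, Extractor] = {
    "order_id": regex_extractor(r"order\s+(?:no\.?|#)\s*(\d+)"),
    "date": regex_extractor(r"(\d{4}-\d{2}-\d{2})"),
    "amount_eur": regex_extractor(r"(\d+(?:\.\d{2})?)\s*EUR"),
}


def run_pipeline(documents: List[str]) -> List[Dict[str, Optional[str]]]:
    """Step 3: apply every extractor to every document."""
    return [{field: extract(doc) for field, extract in PIPELINE.items()}
            for doc in documents]


records = run_pipeline([
    "Order no. 4711 was shipped on 2020-03-15, total 199.90 EUR.",
    "Reminder: invoice for order #12 is still open.",
])
```

Each document yields one dictionary with the same keys, so the result can be loaded directly into a table or database. Fields that cannot be found are `None` rather than missing, which keeps the structure uniform.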
17. 17
Information Extraction
Machine Learning based Information Extraction
Leveraging machine learning techniques to improve information extraction
Combine classical information extraction approaches with machine learning, or substitute them
Develop tools that improve the process of annotating unstructured data in order to create suitable data sets for training ML models
Develop and refine ML models
Steps: Combine Machine Learning & Classical Approaches, Data Annotation, Model Training & Refinement
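Annotated data usually has to be converted into a format that a sequence-labeling model can be trained on. A common choice is the BIO scheme, in which each token is tagged as the beginning (B), inside (I), or outside (O) of an annotated span. The sketch below assumes character-level span annotations and whitespace tokenization; the labels `MACHINE` and `LOCATION` and the sample sentence are hypothetical.

```python
from typing import List, Tuple


def spans_to_bio(text: str,
                 spans: List[Tuple[int, int, str]]) -> List[Tuple[str, str]]:
    """Convert character-level span annotations into token-level BIO tags,
    using simple whitespace tokenization."""
    tagged = []
    position = 0
    previous_span = None
    for token in text.split():
        start = text.index(token, position)
        end = start + len(token)
        position = end
        tag = "O"
        current_span = None
        for span in spans:
            s, e, label = span
            if start < e and end > s:  # token overlaps the annotated span
                current_span = span
                tag = ("I-" if span == previous_span else "B-") + label
                break
        tagged.append((token, tag))
        previous_span = current_span
    return tagged


text = "Pump P-12 in Hall 3 is losing oil"
annotations = [(0, 9, "MACHINE"), (13, 19, "LOCATION")]  # (start, end, label)
bio = spans_to_bio(text, annotations)
```

Here `Pump` and `P-12` receive `B-MACHINE` and `I-MACHINE`, `Hall` and `3` receive `B-LOCATION` and `I-LOCATION`, and all other tokens receive `O`. Real annotation tools typically export spans in a similar form, although tokenization is normally handled by the model's own tokenizer.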
19. 19
Information Extraction
Structuring PDF Documents
Goal:
  Structuring unstructured PDF documents to extract additional information and prepare the data for further analytics
Approach:
  Using deep learning (CNNs) to detect different elements within PDF documents
  Extracting additional information from diagrams and tables for further processing
Results: PDF Analyzer
  Tool for identifying document elements in PDF files
    Header
    Text body
    Tables
    Figures
    Formulas and algorithms
  First approaches for deriving information from diagrams and tables exist
    Classifying diagram types
    Extracting values, axis labels, etc.
    Considering additional context
20. Your Contact Person:
André Pomp, M.Sc.
Tel: +49 (0)202 439 1153
pomp@uni-wuppertal.de
Chair for Technologies and Management of Digital Transformation
Univ.-Prof. Dr.-Ing. Tobias Meisen
https://www.tmdt.uni-wuppertal.de/
Campus Freudenberg
Rainer-Gruenter-Str. 21
D-42119 Wuppertal
Germany
University of Wuppertal
School of Electrical, Information and Media Engineering