http://www.cit.ie
Computer Science Department
haithem.afli@cit.ie
@AfliHaithem
Natural Language Engineering in
the Golden Age of Artificial
Intelligence
Dr Haithem Afli
CIT Seminar Series
February 7th, 2020
If you think the language
industry is new, think again!
Rosetta Stone (British Museum)
Natural Language: An age-old industry?
§ For as far back as we can see, humans have needed to communicate, so the origin of the
language industry is closely intertwined with the need for communication itself.
The Tower of Babel and The House of Wisdom in Baghdad (Bait-al-Hikma)
The importance of Language Processing
Media agencies and translators interpreted the Prime Minister's word (mokusatsu) as “treat
with silent contempt” or “not take into account” (to ignore), and therefore as a categorical
rejection by the Prime Minister.
The Americans concluded that there would never be a diplomatic end to the war and were
naturally annoyed by what they considered the arrogant tone of the translation of the
Japanese Prime Minister’s response. International news agencies reported to the world that,
in the eyes of the Japanese government, the ultimatum was “not worthy of comment.”
Artificial intelligence (AI)
Beyond the Hype
Graph from Tobias Bohnhoff
https://nativevideotube.blogspot.com/
NLP - the language industry
The Rise of Natural Language Processing
(NLP), and How it is Changing the Way we
Retrieve Information
The 'creator' of Bitcoin, Satoshi Nakamoto, is the world's most elusive billionaire. Very few
people outside of the Department of Homeland Security know Satoshi's real name. Satoshi has
taken great care to keep his identity secret, employing the latest encryption and obfuscation
methods in his communications.
Despite these efforts, Satoshi Nakamoto gave investigators the only tool they needed to find
him: his own words. According to the report linked here, using NLP, the NSA (and everyone!)
was able to compare texts to determine the authorship of a particular work.
More info: https://tech.slashdot.org/story/17/08/28/1725232/how-the-nsa-identified-satoshi-nakamoto
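Comparing texts to determine authorship can be illustrated with a very small stylometry sketch. It is not the method used in the reported case; it assumes that an author's habits show up in how often they use common function words, and it compares those frequency profiles with cosine similarity. The word list and all function names are hypothetical choices made for this example.

```python
import math
import re
from collections import Counter

# A short, hypothetical list of English function words (real systems use far more features).
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is",
                  "it", "for", "with", "as", "but", "not", "on"]


def profile(text):
    """Relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_author(unknown_text, candidates):
    """Return the candidate name whose writing profile is closest to the unknown text."""
    target = profile(unknown_text)
    return max(candidates, key=lambda name: cosine(target, profile(candidates[name])))
```

A call such as `most_similar_author(text, {"A": sample_a, "B": sample_b})` returns the name of the closest candidate. With texts this short the result means little; the sketch only shows the shape of the comparison.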
Timeline of (modern) AI
Graph from The University Of Queensland Brain Institute
Annotations on the graph: The first AI winter; The second AI winter; Including CIT MSc in AI, https://www.cit.ie/course/CRKARIN9
The first AI winter
By 1964, the National Research Council (NRC)
had become concerned about the lack of progress
and formed the Automatic Language Processing
Advisory Committee (ALPAC) to look into the
problem.
They concluded, in a famous 1966 report, that
machine translation was more expensive, less
accurate and slower than human translation.
After spending some 20 million dollars, the NRC
ended all support.
Image from Wikipedia
The second AI winter
In 1984, John McCarthy criticized expert systems because they lacked common sense
and knowledge about their own limitations.
Schwarz, Director of DARPA ISTO from 1987 to 1989, concluded that AI research has
always had
“… very limited success in particular areas, followed immediately by failure to reach the
broader goal at which these initial successes seem at first to hint…”.
Ø Decrease in funding for AI research.
Ø Many AI companies closed their doors.
Ø Attendance at the AAAI conference, which attracted over 6000 visitors in 1986,
dropped to just 2000 by 1991.
The survivors
The Deep Learning Godfathers
Turing Award given for:
• “The conceptual and engineering breakthroughs that have made deep neural
networks a critical component of computing.”
Deep Learning Era
2014: Generative Adversarial Networks
§ The neural network at the top is the discriminator, and its task is to distinguish the
training set’s real information from the generator’s creations.
§ In the simplest GAN structure, the generator starts with random data and learns to
transform this noise into information that matches the distribution of the real data.
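The two roles described above can be made concrete with a toy one-dimensional example. This is only a sketch under stated assumptions: the generator is a hypothetical linear map of noise, the discriminator is a single logistic unit with fixed parameters, and no training loop is shown. It only computes the two losses that a GAN would minimise in alternation.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def generator(z, a, b):
    """Transform random noise z into a fake sample (hypothetical linear generator)."""
    return a * z + b


def discriminator(x, w, c):
    """Estimated probability that x comes from the real data."""
    return sigmoid(w * x + c)


def discriminator_loss(real, fake, w, c):
    """Binary cross-entropy: real samples should score 1, generated ones 0."""
    eps = 1e-12  # avoids log(0)
    loss_real = -sum(math.log(discriminator(x, w, c) + eps) for x in real) / len(real)
    loss_fake = -sum(math.log(1.0 - discriminator(x, w, c) + eps) for x in fake) / len(fake)
    return loss_real + loss_fake


def generator_loss(fake, w, c):
    """Non-saturating generator loss: the generator wants its samples scored as real."""
    eps = 1e-12
    return -sum(math.log(discriminator(x, w, c) + eps) for x in fake) / len(fake)


rng = random.Random(0)
real = [rng.gauss(4.0, 0.5) for _ in range(200)]          # toy "real" data
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]         # generator input
fake_untrained = [generator(z, 1.0, 0.0) for z in noise]  # still looks like raw noise
fake_matched = [generator(z, 0.5, 4.0) for z in noise]    # matches the real distribution
```

With a discriminator that separates the two regions (for example `w = 2.0`, `c = -4.0`), the generator loss is much lower for `fake_matched` than for `fake_untrained`, while the discriminator loss rises once the fake samples match the real distribution. That tension is what drives the adversarial training.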
Do you know this person?
https://thispersondoesnotexist.com/
2018: StyleGAN
Failure Cases
CycleGAN (Zhu et al., 2017)
DeepFake
§ The development of deepfakes has taken place to a large extent in two settings:
research at academic institutions, and development by amateurs in online communities.
Applications of GANs
Ø GANs for Image Editing
Ø Using GANs for Security
(SSGAN: Secure Steganography Based on GAN)
Ø De-aging Robert De Niro!
(Martin Scorsese spent millions of Netflix's money to digitally de-age De Niro, Pacino, and
Pesci so they could portray their characters at different stages of their lives.)
2016: Sequence to Sequence Learning with Attention
This mechanism allows the network to refer back to the input sequence, instead of forcing
it to encode all information into one fixed-length vector.
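A minimal sketch of that idea follows. At each decoding step the network scores every encoder state against the current decoder state, turns the scores into weights, and builds a context vector as the weighted sum. Dot-product scoring is used here as a simplifying assumption; attention models for translation often use a learned scoring function instead. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Return (weights, context) for one decoder step using dot-product scoring."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: weighted sum of all encoder states, not just the last one.
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context
```

Because the context vector is recomputed for every output position, the decoder can focus on different input words at different steps, rather than relying on a single fixed-length summary of the sentence.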
Attending the Unattainable
Challenges in Machine Translation
Pre-trained models: BERT
BERT makes use of the Transformer, an attention-based architecture that learns
contextual relations between words (or sub-words) in a text.
From BERT to ALBERT
• BERT (Google)
• XLNet (Google/CMU)
• RoBERTa (Facebook)
• DistilBERT (HuggingFace)
• CTRL (Salesforce)
• GPT-2 (OpenAI)
• Megatron (NVIDIA)
• ALBERT (Google)
2019: OpenAI GPT-2
Challenges with automatically
generated texts
Addressing the commonsense problem
Cunxiang Wang, Shuailong Liang, Yue Zhang, Xiaonan Li and Tian Gao. Does It Make Sense?
And Why? A Pilot Study for Sense Making and Explanation.
Addressing real-world challenges
§ AI Technologies
- Natural Language Processing (NLP)
- Social Media and UGC Analysis
- Computer Vision (CV)
- Machine/Deep Learning (ML-DL)
§ Applications
- Digital Humanities
- Fintech
- Digital Health and Life-science
- Social Science and Psychology
- Security and Cybersecurity
NLP and ML to Address the
European migration crisis
§ ITFLOWS will model migration to the EU in two stages:
The first stage comprises migration flows from third countries to the EU borders. Within
this first stage, migration flows are broadly differentiated into regular and irregular
flows. ITFLOWS will focus on predicting irregular flows at this stage, as regular migration
is authorised and regulated by the receiving countries, in this case the EU member states.
The second stage of movement takes place between the crossing of the borders into the EU
and the final settlement of migrants in the EU member states.
Ø Models for the accurate prediction of irregular migration flows from regions in five
countries of origin to the EU, and
Ø A holistic global model that will give predictions of the arrivals of irregular migrants
in all EU Member States.
https://data2.unhcr.org/en/situations/syria
Ethics and Data Privacy
§ The collection of tweets related to the countries of origin will be based mainly on the
language (and dialect) and an estimated location. Taking Syrian users as an example,
ITFLOWS will focus on collecting public data from users of Levantine Arabic (spoken in
Lebanon, Jordan, Syria, Palestine, and Israel) who are located, according to the Twitter API
information, in at least the following locations: https://data2.unhcr.org/en/situations/syria
§ Since the location is only approximated, there will be no discrimination based on
nationality in this task.
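The collection rule above can be sketched as a simple filter over already-downloaded public records. Everything specific in this sketch is an assumption made for illustration: the record fields (`public`, `lang`, `dialect`, `place`, `text`), the dialect label, and the location set are hypothetical. They are not taken from the Twitter API or from the ITFLOWS project.

```python
# Hypothetical values for illustration only; not taken from ITFLOWS or the Twitter API.
TARGET_LANGUAGE = "ar"
TARGET_DIALECT = "levantine"
TARGET_LOCATIONS = {"Lebanon", "Jordan"}


def keep_tweet(tweet):
    """Keep a public tweet only if language, dialect and estimated location all match."""
    return (
        tweet.get("public", False)
        and tweet.get("lang") == TARGET_LANGUAGE
        and tweet.get("dialect") == TARGET_DIALECT
        and tweet.get("place") in TARGET_LOCATIONS
    )


def collect(tweets):
    """Return the texts of matching tweets; no nationality field is read at any point."""
    return [tweet["text"] for tweet in tweets if keep_tweet(tweet)]
```

The filter never looks at nationality: selection depends only on language, dialect, and the approximate location attached to each record.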
Ethics and Data Privacy
§ De-identification methods (authorship obfuscation) for natural language processing
tasks: multiple steps need to be addressed. The ITFLOWS technological partners (CIT and
FIZ) will extract identifiers from text and anonymise the data set used for NLP tasks. For
example, all addresses, names, and similar identifiers will be removed using a named entity
recogniser.
§ This practice will be conducted according to EU data protection laws and, from a
technical point of view, it will be based on differential privacy for text documents.
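The identifier-removal step can be illustrated with a small sketch. It does not use a trained named entity recogniser: a few regular expressions and a caller-supplied name list stand in for one, purely as an assumption for this example. The patterns, placeholder tags, and function name are all hypothetical.

```python
import re

# Hypothetical patterns standing in for a trained named entity recogniser.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "HANDLE": re.compile(r"(?<!\w)@\w+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def deidentify(text, known_names=()):
    """Replace identifiers with placeholder tags such as [EMAIL] or [NAME]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    # A real pipeline would take names from the recogniser's PERSON entities.
    for name in known_names:
        text = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", text)
    return text
```

Replacing identifiers with typed placeholders, rather than deleting them, keeps the sentence structure intact for downstream NLP tasks.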
CIT team
The CIT team received €528k in H2020 funding and will be led by
Dr Haithem Afli (Computer Science Dep., RIOMH, ADAPT@CIT)
Thanks to
Eileen Crowley (Halpin Centre for Research & Innovation)
Thank you
ML meets NLP to address Digital
Health challenges
The STOP project addresses the societal health challenge of obesity by establishing an
innovative platform to support Persons with Obesity (PwO) with better nutrition under
the supervision of Healthcare Professionals.
https://cordis.europa.eu/project/rcn/218245/factsheet/en
The STOP Platform will capture various PwO data from different kinds of smart sensor
streams and chatbot technology, manage and enrich the available data with existing
knowledge bases, and fuse these using machine-learning-driven data fusion approaches
for sophisticated AI data analysis.
https://cordis.europa.eu/project/rcn/218245/factsheet/en
CIT team
Yanxin Wu, PhD candidate in computer science
Ryan Donovan, PhD candidate in psychology
Dr Haithem Afli, Principal Investigator
ML meets CV to address the limitations of current network infrastructures
[Diagram labels: Digital Service Provider (DSP); E2E eHealth_slice:{type: eMBB}; Vertical: National Ambulance Service; Network Service Provider (NSP) A with NSI1: [RAN + EPC + Core DC + Core IP/MPLS] Network Slice (NSSI: RAN slice, NSSI: MEC slice, NSSI: Core slice); Network Service Provider (NSP) B with NSI2: [RAN, Core IP/MPLS] Network Slice (NSSI: RAN slice, NSSI: core slice); QoE: perceived SNR, RSRP and RSRQ measurements; prediction: "The signal quality will be degraded for the next 5 minutes"; One Stop API / P&P; Vertical feedback]
https://slicenet.eu/
Microbiability in Beef Cattle
Archaea, Bacteria, Protozoa, Fungi
Variations in the microbiome can make better cattle:
- Feed and water efficiency
- Meat tenderness
- Environmental impact
ML meets DA to address Microbiability challenges
- Investigate the relation between the microbiome components.
- Investigate the impact of the microbiome components on the cattle biology.
- Characterize the microbiome composition.
Rumen
Feces
N = 52 animals
- Several phenotypes measured.
- Microbial relative abundances.
- Nelore is the predominant breed in Brazil.
Dr Bruno Gabriel Andrade collecting samples...

More Related Content

Similar to Natural Language Engineering in the Golden Age of Artificial Intelligence

Speak Out - Speak Digital
Speak Out - Speak DigitalSpeak Out - Speak Digital
Speak Out - Speak DigitalBorya3
 
Understanding everyday users’ perception of socio-technical issues through s...
Understanding everyday users’ perception of  socio-technical issues through s...Understanding everyday users’ perception of  socio-technical issues through s...
Understanding everyday users’ perception of socio-technical issues through s...Ahreum lee
 
Big Data, Open data, IOT
Big Data, Open data, IOTBig Data, Open data, IOT
Big Data, Open data, IOTPaolo Nesi
 
Marsden Regulating Disinformation Kluge 342020
Marsden Regulating Disinformation Kluge 342020Marsden Regulating Disinformation Kluge 342020
Marsden Regulating Disinformation Kluge 342020Chris Marsden
 
Work/Technology 2050: Scenarios and Actions
Work/Technology 2050: Scenarios and ActionsWork/Technology 2050: Scenarios and Actions
Work/Technology 2050: Scenarios and ActionsJerome Glenn
 
Work/Technology 2050: Scenarios and Actions (Dubai talk)
Work/Technology 2050: Scenarios and Actions (Dubai talk)Work/Technology 2050: Scenarios and Actions (Dubai talk)
Work/Technology 2050: Scenarios and Actions (Dubai talk)Jerome Glenn
 
Internet of Things (IoT) - Hafedh Alyahmadi - May 29, 2015.pdf
Internet of Things (IoT) - Hafedh Alyahmadi - May 29, 2015.pdfInternet of Things (IoT) - Hafedh Alyahmadi - May 29, 2015.pdf
Internet of Things (IoT) - Hafedh Alyahmadi - May 29, 2015.pdfImXaib
 
Future of AI Smart Networks
Future of AI Smart NetworksFuture of AI Smart Networks
Future of AI Smart NetworksMelanie Swan
 
Artificial Intelligence, other emerging technologies, and social inventions
Artificial Intelligence, other emerging technologies, and social inventionsArtificial Intelligence, other emerging technologies, and social inventions
Artificial Intelligence, other emerging technologies, and social inventionsJerome Glenn
 
Io t malta_2013 Internet of Things IoT Webinar Dec 2013 #iot @Des
Io t malta_2013 Internet of Things IoT Webinar Dec 2013 #iot @DesIo t malta_2013 Internet of Things IoT Webinar Dec 2013 #iot @Des
Io t malta_2013 Internet of Things IoT Webinar Dec 2013 #iot @DesDesiree Miloshevic
 
Complexity of IOT/IOE Architectures for Smart Service Infrastructures Panel:...
Complexity of IOT/IOE Architectures for  Smart Service Infrastructures Panel:...Complexity of IOT/IOE Architectures for  Smart Service Infrastructures Panel:...
Complexity of IOT/IOE Architectures for Smart Service Infrastructures Panel:...Paolo Nesi
 
Service Design Days 2017 - Keynote Jon Rogers (University of Dundee)
Service Design Days 2017 - Keynote Jon Rogers (University of Dundee)Service Design Days 2017 - Keynote Jon Rogers (University of Dundee)
Service Design Days 2017 - Keynote Jon Rogers (University of Dundee)SERVICE DESIGN DAYS
 
Algocracy and the state of AI in public administrations.
Algocracy and the state of AI in public administrations.Algocracy and the state of AI in public administrations.
Algocracy and the state of AI in public administrations.Sandra Bermúdez
 
Introduction - Lecture 1 - Advanced Topics in Information Systems (4016792ENR)
Introduction - Lecture 1 - Advanced Topics in Information Systems (4016792ENR)Introduction - Lecture 1 - Advanced Topics in Information Systems (4016792ENR)
Introduction - Lecture 1 - Advanced Topics in Information Systems (4016792ENR)Beat Signer
 
BSides Rochester 2018: Timothy Duffy: Civic and Humanitarian Open Source
BSides Rochester 2018: Timothy Duffy: Civic and Humanitarian Open SourceBSides Rochester 2018: Timothy Duffy: Civic and Humanitarian Open Source
BSides Rochester 2018: Timothy Duffy: Civic and Humanitarian Open SourceJosephTesta9
 
Technology evolves so fast
Technology evolves so fast Technology evolves so fast
Technology evolves so fast Jyrki Kasvi
 
Human-Centered Machine Learning: Harnessing Visualization and Interactivity f...
Human-Centered Machine Learning: Harnessing Visualization and Interactivity f...Human-Centered Machine Learning: Harnessing Visualization and Interactivity f...
Human-Centered Machine Learning: Harnessing Visualization and Interactivity f...Denis Parra Santander
 
State of AI Report 2023 - Air Street Capital
State of AI Report 2023 - Air Street CapitalState of AI Report 2023 - Air Street Capital
State of AI Report 2023 - Air Street CapitalAI Geek (wishesh)
 

Similar to Natural Language Engineering in the Golden Age of Artificial Intelligence (20)

Speak Out - Speak Digital
Speak Out - Speak DigitalSpeak Out - Speak Digital
Speak Out - Speak Digital
 
Understanding everyday users’ perception of socio-technical issues through s...
Understanding everyday users’ perception of  socio-technical issues through s...Understanding everyday users’ perception of  socio-technical issues through s...
Understanding everyday users’ perception of socio-technical issues through s...
 
Big Data, Open data, IOT
Big Data, Open data, IOTBig Data, Open data, IOT
Big Data, Open data, IOT
 
Marsden Regulating Disinformation Kluge 342020
Marsden Regulating Disinformation Kluge 342020Marsden Regulating Disinformation Kluge 342020
Marsden Regulating Disinformation Kluge 342020
 
S0-Stephen.pptx
S0-Stephen.pptxS0-Stephen.pptx
S0-Stephen.pptx
 
Work/Technology 2050: Scenarios and Actions
Work/Technology 2050: Scenarios and ActionsWork/Technology 2050: Scenarios and Actions
Work/Technology 2050: Scenarios and Actions
 
Work/Technology 2050: Scenarios and Actions (Dubai talk)
Work/Technology 2050: Scenarios and Actions (Dubai talk)Work/Technology 2050: Scenarios and Actions (Dubai talk)
Work/Technology 2050: Scenarios and Actions (Dubai talk)
 
Internet of Things (IoT) - Hafedh Alyahmadi - May 29, 2015.pdf
Internet of Things (IoT) - Hafedh Alyahmadi - May 29, 2015.pdfInternet of Things (IoT) - Hafedh Alyahmadi - May 29, 2015.pdf
Internet of Things (IoT) - Hafedh Alyahmadi - May 29, 2015.pdf
 
Future of AI Smart Networks
Future of AI Smart NetworksFuture of AI Smart Networks
Future of AI Smart Networks
 
Artificial Intelligence, other emerging technologies, and social inventions
Artificial Intelligence, other emerging technologies, and social inventionsArtificial Intelligence, other emerging technologies, and social inventions
Artificial Intelligence, other emerging technologies, and social inventions
 
Io t malta_2013 Internet of Things IoT Webinar Dec 2013 #iot @Des
Io t malta_2013 Internet of Things IoT Webinar Dec 2013 #iot @DesIo t malta_2013 Internet of Things IoT Webinar Dec 2013 #iot @Des
Io t malta_2013 Internet of Things IoT Webinar Dec 2013 #iot @Des
 
top 10 Data Mining Algorithms
top 10 Data Mining Algorithmstop 10 Data Mining Algorithms
top 10 Data Mining Algorithms
 
Complexity of IOT/IOE Architectures for Smart Service Infrastructures Panel:...
Complexity of IOT/IOE Architectures for  Smart Service Infrastructures Panel:...Complexity of IOT/IOE Architectures for  Smart Service Infrastructures Panel:...
Complexity of IOT/IOE Architectures for Smart Service Infrastructures Panel:...
 
Service Design Days 2017 - Keynote Jon Rogers (University of Dundee)
Service Design Days 2017 - Keynote Jon Rogers (University of Dundee)Service Design Days 2017 - Keynote Jon Rogers (University of Dundee)
Service Design Days 2017 - Keynote Jon Rogers (University of Dundee)
 
Algocracy and the state of AI in public administrations.
Algocracy and the state of AI in public administrations.Algocracy and the state of AI in public administrations.
Algocracy and the state of AI in public administrations.
 
Introduction - Lecture 1 - Advanced Topics in Information Systems (4016792ENR)
Introduction - Lecture 1 - Advanced Topics in Information Systems (4016792ENR)Introduction - Lecture 1 - Advanced Topics in Information Systems (4016792ENR)
Introduction - Lecture 1 - Advanced Topics in Information Systems (4016792ENR)
 
BSides Rochester 2018: Timothy Duffy: Civic and Humanitarian Open Source
BSides Rochester 2018: Timothy Duffy: Civic and Humanitarian Open SourceBSides Rochester 2018: Timothy Duffy: Civic and Humanitarian Open Source
BSides Rochester 2018: Timothy Duffy: Civic and Humanitarian Open Source
 
Technology evolves so fast
Technology evolves so fast Technology evolves so fast
Technology evolves so fast
 
Human-Centered Machine Learning: Harnessing Visualization and Interactivity f...
Human-Centered Machine Learning: Harnessing Visualization and Interactivity f...Human-Centered Machine Learning: Harnessing Visualization and Interactivity f...
Human-Centered Machine Learning: Harnessing Visualization and Interactivity f...
 
State of AI Report 2023 - Air Street Capital
State of AI Report 2023 - Air Street CapitalState of AI Report 2023 - Air Street Capital
State of AI Report 2023 - Air Street Capital
 

More from Haithem Afli

How NLP is reshaping Fintech
How NLP is reshaping Fintech How NLP is reshaping Fintech
How NLP is reshaping Fintech Haithem Afli
 
Looking Beyond the AI & IoT Research and Industrial Opportunities: How two Br...
Looking Beyond the AI & IoTResearch and Industrial Opportunities:How two Br...Looking Beyond the AI & IoTResearch and Industrial Opportunities:How two Br...
Looking Beyond the AI & IoT Research and Industrial Opportunities: How two Br...Haithem Afli
 
Affective Analytics and Visualization for Ensemble event-driven stock market ...
Affective Analytics and Visualization for Ensemble event-driven stock market ...Affective Analytics and Visualization for Ensemble event-driven stock market ...
Affective Analytics and Visualization for Ensemble event-driven stock market ...Haithem Afli
 
Introduction to Natural Language Processing
Introduction to Natural Language Processing  Introduction to Natural Language Processing
Introduction to Natural Language Processing Haithem Afli
 
Présentation de thèse Haithem AFLI
Présentation de thèse Haithem AFLIPrésentation de thèse Haithem AFLI
Présentation de thèse Haithem AFLIHaithem Afli
 
Parallel text extraction from multimodal comparable corpora
Parallel text extraction from multimodal comparable corporaParallel text extraction from multimodal comparable corpora
Parallel text extraction from multimodal comparable corporaHaithem Afli
 

More from Haithem Afli (7)

How NLP is reshaping Fintech
How NLP is reshaping Fintech How NLP is reshaping Fintech
How NLP is reshaping Fintech
 
Looking Beyond the AI & IoT Research and Industrial Opportunities: How two Br...
Looking Beyond the AI & IoTResearch and Industrial Opportunities:How two Br...Looking Beyond the AI & IoTResearch and Industrial Opportunities:How two Br...
Looking Beyond the AI & IoT Research and Industrial Opportunities: How two Br...
 
Affective Analytics and Visualization for Ensemble event-driven stock market ...
Affective Analytics and Visualization for Ensemble event-driven stock market ...Affective Analytics and Visualization for Ensemble event-driven stock market ...
Affective Analytics and Visualization for Ensemble event-driven stock market ...
 
Introduction to Natural Language Processing
Introduction to Natural Language Processing  Introduction to Natural Language Processing
Introduction to Natural Language Processing
 
Analytics2017
Analytics2017Analytics2017
Analytics2017
 
Présentation de thèse Haithem AFLI
Présentation de thèse Haithem AFLIPrésentation de thèse Haithem AFLI
Présentation de thèse Haithem AFLI
 
Parallel text extraction from multimodal comparable corpora
Parallel text extraction from multimodal comparable corporaParallel text extraction from multimodal comparable corpora
Parallel text extraction from multimodal comparable corpora
 

Recently uploaded

Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfWadeK3
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physicsvishikhakeshava1
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 

Recently uploaded (20)

Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Artificial Intelligence In Microbiology by Dr. Prince C P

Natural Language Engineering in the Golden Age of Artificial Intelligence

  • 1. http://www.cit.ie Computer Science Department haithem.afli@cit.ie @AfliHaithem Natural Language Engineering in the Golden Age of Artificial Intelligence Dr Haithem Afli CIT Seminar Series February 7th, 2020
  • 2. If you think the language industry is new
  • 3. If you think the language industry is new, think again! (Image: Rosetta Stone, British Museum)
  • 4. Natural language: an age-old industry? For as far back as we can see, humans have needed to communicate, so the origin of the language industry is closely intertwined with the need for communication itself. (Images: the Tower of Babel and the House of Wisdom in Baghdad, Bait al-Hikma)
  • 5. The importance of language processing. Media agencies and translators interpreted the word mokusatsu (“treat with silent contempt” or “take no notice of”, i.e. to ignore) as a categorical rejection by the Prime Minister. The Americans understood that there would never be a diplomatic end to the war and were naturally annoyed by what they considered the arrogant tone of the translated response. International news agencies reported to the world that in the eyes of the Japanese government the ultimatum was “not worthy of comment.”
  • 6. Artificial Intelligence (AI): Beyond the Hype. Graph from Tobias Bohnhoff https://nativevideotube.blogspot.com/
  • 7. NLP - the language industry
  • 8. The Rise of Natural Language Processing (NLP), and How it is Changing the Way we Retrieve Information. The 'creator' of Bitcoin, Satoshi Nakamoto, is the world's most elusive billionaire. Very few people outside of the Department of Homeland Security know Satoshi's real name. Satoshi has taken great care to keep his identity secret, employing the latest encryption and obfuscation methods in his communications. Despite these efforts, Satoshi Nakamoto gave investigators the only tool they needed to find him: his own words. Using NLP, the NSA (and anyone!) can compare texts to determine the authorship of a particular work. More info: https://tech.slashdot.org/story/17/08/28/1725232/how-the-nsa-identified-satoshi-nakamoto
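To make "comparing texts to determine authorship" concrete, here is a minimal stylometry sketch. It is not the method described in the linked story; it assumes a very simple approach (function-word frequencies compared by cosine similarity), and the word list and the names `profile`, `cosine` and `most_likely_author` are chosen only for illustration.

```python
import math
import re
from collections import Counter

# Assumption: a tiny hand-picked list of English function words.
# Real stylometry work uses far larger feature sets.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is",
                  "it", "for", "with", "as", "but", "on", "not"]

def profile(text):
    """Relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def most_likely_author(disputed, candidates):
    """Return the candidate whose known text is stylistically closest."""
    target = profile(disputed)
    return max(candidates,
               key=lambda name: cosine(target, profile(candidates[name])))
```

Given a dictionary that maps author names to known writing samples, `most_likely_author(disputed_text, samples)` returns the closest name. With short texts the result is unreliable, so treat it as a toy illustration of the idea only.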
  • 9. Timeline of (modern) AI. Graph from The University of Queensland Brain Institute. Annotations: the first AI winter; the second AI winter; including the CIT MSc in AI https://www.cit.ie/course/CRKARIN9
  • 10. The first AI winter. By 1964, the National Research Council (NRC) had become concerned about the lack of progress and formed the Automatic Language Processing Advisory Committee (ALPAC) to look into the problem. They concluded, in a famous 1966 report, that machine translation was more expensive, less accurate and slower than human translation. After spending some 20 million dollars, the NRC ended all support. Image from Wikipedia
  • 11. The second AI winter. In 1984, John McCarthy criticized expert systems because they lacked common sense and knowledge about their own limitations. Schwarz, Director of DARPA ISTO from 1987 to 1989, concluded that AI research has always had “… very limited success in particular areas, followed immediately by failure to reach the broader goal at which these initial successes seem at first to hint…”. Consequences: funding for AI research decreased; many AI companies closed their doors; attendance at the AAAI conference fell from over 6000 visitors in 1986 to just 2000 by 1991.
  • 12. The survivors: the deep learning godfathers. Turing Award given for: “The conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.”
  • 14. 2014: Generative Adversarial Networks. The neural network at the top of the diagram is the discriminator, and its task is to distinguish the training set’s real information from the generator’s creations. In the simplest GAN structure, the generator starts with random data and learns to transform this noise into information that matches the distribution of the real data.
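The generator/discriminator game on this slide can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the architecture from the slide: the "real data" is a 1-D Gaussian, the generator is a linear map G(z) = a*z + b, the discriminator is a logistic unit D(x) = sigmoid(w*x + c), and the gradients are written out by hand. All names and hyperparameters are illustrative, and convergence is not guaranteed.

```python
import math
import random

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def train_toy_gan(steps=2000, lr=0.01, real_mean=4.0, real_std=0.5, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters: G(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator parameters: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        x_real = rng.gauss(real_mean, real_std)   # sample from the real data
        z = rng.gauss(0.0, 1.0)                   # random noise
        x_fake = a * z + b                        # generator's creation

        # Discriminator step: gradient ascent on
        # log D(x_real) + log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w += lr * ((1 - d_real) * x_real - d_fake * x_fake)
        c += lr * ((1 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(G(z))
        # (the commonly used non-saturating generator loss).
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (1 - d_fake) * w                 # d/dx of log D(x)
        a += lr * grad_x * z
        b += lr * grad_x
    return (a, b), (w, c)
```

Calling `train_toy_gan()` returns the learned generator and discriminator parameters. If training goes well, the generator offset `b` drifts towards the real mean, but simple GANs like this one can also oscillate, which is one reason GAN training is considered delicate.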
  • 15. Do you know this person? https://thispersondoesnotexist.com/
  • 18. Failure cases: CycleGAN (Zhu et al., 2017)
  • 19. DeepFake. The development of deepfakes has taken place to a large extent in two settings: research at academic institutions, and development by amateurs in online communities.
  • 20. Applications of GANs: GANs for image editing; using GANs for security (SSGAN: Secure Steganography Based on GAN); de-aging Robert De Niro! (Martin Scorsese spent millions of Netflix's money to digitally de-age De Niro, Pacino, and Pesci so they could portray these men throughout different parts of their lives.)
  • 21. 2016: Sequence to Sequence Learning with Attention. This mechanism allows the network to refer back to the input sequence, instead of forcing it to encode all information into one fixed-length vector.
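A minimal sketch of what "referring back to the input sequence" means in practice. It assumes plain dot-product scoring for brevity (early sequence-to-sequence attention models used a small learned scoring network instead), and the names `softmax` and `attend` are illustrative.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)                           # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Weight every encoder state by its relevance to the decoder state.

    Returns the attention weights and the context vector, i.e. the
    weighted average of all encoder states.
    """
    scores = [sum(q * k for q, k in zip(decoder_state, h))
              for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context
```

At every decoding step the decoder calls `attend` with its current state, so the context vector is recomputed from all input positions rather than read from a single fixed-length summary.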
  • 23. Challenges in Machine Translation
  • 24. Pre-trained models: BERT. BERT makes use of the Transformer, an attention-based architecture that learns contextual relations between words (or sub-words) in a text.
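The "(or sub-words)" part can be illustrated with a sketch of greedy longest-match-first sub-word splitting, the scheme used by WordPiece tokenizers such as BERT's. The vocabulary below is a made-up toy, and `wordpiece_tokenize` is a hypothetical helper written for this example, not a library function.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one word into the longest sub-word pieces found in vocab.

    Pieces that do not start the word carry a "##" prefix, following
    the WordPiece convention.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]      # no piece matches: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

# Toy vocabulary, assumed for illustration only.
TOY_VOCAB = {"un", "##aff", "##able", "play", "##ing"}
```

With this toy vocabulary, `wordpiece_tokenize("playing", TOY_VOCAB)` gives `["play", "##ing"]`, so a model can represent a word it has never seen as a combination of pieces it has.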
  • 25. From BERT to ALBERT • BERT (Google) • XLNet (Google/CMU) • RoBERTa (Facebook) • DistilBERT (HuggingFace) • CTRL (Salesforce) • GPT-2 (OpenAI) • Megatron (NVIDIA) • ALBERT (Google)
  • 29. Challenges with automatically generated texts
  • 30. Addressing the commonsense problem. Cunxiang Wang, Shuailong Liang, Yue Zhang, Xiaonan Li and Tian Gao. Does It Make Sense? And Why? A Pilot Study for Sense Making and Explanation.
  • 31. Addressing real-world challenges. AI technologies: Natural Language Processing (NLP); Social Media and UGC Analysis; Computer Vision (CV); Machine/Deep Learning (ML-DL). Applications: Digital Humanities; Fintech; Digital Health and Life Science; Social Science and Psychology; Security and Cybersecurity.
  • 32. NLP and ML to address the European migration crisis. ITFLOWS will model migration to the EU in two stages. The first stage comprises migration flows from third countries to the EU borders. Within this first stage, migration flows are broadly differentiated into regular and irregular flows. ITFLOWS will focus on predicting irregular flows at this stage, as regular migration is authorised and regulated by the receiving countries, in this case the EU member states.
  • 33. NLP and ML to address the European migration crisis. ITFLOWS will model migration to the EU in two stages. The second stage of movement takes place between the crossing of the borders into the EU and the final settlement of migrants in the EU member states. • Models for the accurate prediction of irregular migration flows from regions in five countries of origin to the EU, and • A holistic global model that will give predictions of the arrivals of irregular migrants in all EU Member States.
  • 34. NLP and ML to address the European migration crisis
  • 36. Ethics and Data Privacy. The collection of tweets related to the countries of origin will be based mainly on the language (and dialect) and an estimated location. Taking Syrian users as an example, ITFLOWS will focus on collecting public data from users of Levantine Arabic (spoken in Lebanon, Jordan, Syria, Palestine, and Israel) who are located, based on the Twitter API information, in at least the locations listed at https://data2.unhcr.org/en/situations/syria. Since the location is only approximated, there will be no discrimination based on nationality in this task.
  • 37. Ethics and Data Privacy. De-identification methods (authorship obfuscation) for natural language processing tasks: multiple steps need to be addressed. ITFLOWS technological partners (CIT and FIZ) will extract identifiers from text and anonymise the data set used for NLP tasks. For example, all addresses, names, and so on will be removed using a named entity recogniser. This practice will be conducted according to EU data protection laws and, from a technical point of view, it will be based on differential privacy for text documents.
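The "extract identifiers, then anonymise" step can be sketched as follows. This is not the ITFLOWS pipeline: it swaps the named entity recogniser for three regular expressions so that the example stays self-contained, which means it only catches URLs, e-mail addresses and @handles, not personal names or postal addresses. The pattern list and `deidentify` are assumptions made for illustration.

```python
import re

# Assumption: regex stand-ins for a real named entity recogniser.
# Order matters: URLs and e-mails are replaced before bare @handles.
PATTERNS = [
    ("URL", re.compile(r"https?://\S+")),
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("HANDLE", re.compile(r"(?<!\w)@\w+")),
]

def deidentify(text):
    """Replace each detected identifier with a placeholder tag."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text
```

Replacing identifiers with typed placeholders such as `[EMAIL]`, rather than deleting them, keeps the sentence structure intact for downstream NLP tasks. A production system would add an NER model for names and places and would need an evaluation of what still leaks through.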
  • 38. CIT team. The CIT team received €528k in H2020 funding and will be led by Dr Haithem Afli (Computer Science Dept., RIOMH, ADAPT@CIT) and Eileen Crowley (Halpin Centre for Research & Innovation).
  • 40. Thank you
  • 41. ML meets NLP to address Digital Health challenges. The STOP project is addressing the health societal challenge of obesity through the foundation of an innovative platform to support Persons with Obesity (PwO) with better nutrition under the supervision of Healthcare Professionals. https://cordis.europa.eu/project/rcn/218245/factsheet/en
  • 43. ML meets NLP to address Digital Health challenges. The STOP Platform will capture various PwO data from different kinds of smart sensor streams and chatbot technology, manage and enrich the available data with existing knowledge bases, and fuse these through machine-learning-driven data fusion approaches for sophisticated AI data analysis. https://cordis.europa.eu/project/rcn/218245/factsheet/en
  • 45. CIT team: Yanxin Wu (PhD candidate in computer science); Ryan Donovan (PhD candidate in psychology); Dr Haithem Afli (Principal Investigator)
  • 48. ML meets CV to address the limitations of current network infrastructures. [Diagram: an end-to-end eHealth network slice (E2E eHealth_slice: {type: eMBB}) for the vertical National Ambulance Service, offered by a Digital Service Provider (DSP) over Network Service Provider (NSP) B. Network slice NSI2: [RAN, Core IP/MPLS], with NSSIs for the RAN slice and the core slice across RAN, MEC, EPC, Core DC and Core IP/MPLS.] https://slicenet.eu/
  • 49. ML meets CV to address the limitations of current network infrastructures. [Same network-slice diagram as slide 48: E2E eHealth_slice: {type: eMBB} for the National Ambulance Service over NSP B, NSI2: [RAN, Core IP/MPLS].] https://slicenet.eu/
  • 50. ML meets CV to address the limitations of current network infrastructures. [Diagram: the E2E eHealth_slice: {type: eMBB} for the vertical National Ambulance Service, offered by a Digital Service Provider (DSP) across two Network Service Providers. NSP A: network slice NSI1: [RAN + EPC + Core DC + Core IP/MPLS], with NSSIs for the RAN, MEC and core slices. NSP B: network slice NSI2: [RAN, Core IP/MPLS], with NSSIs for the RAN and core slices. QoE: perceived SNR, RSRP and RSRQ measurements … Example prediction: "The signal quality will be degraded for the next 5 minutes". Other labels: One Stop API / P&P; vertical feedback.] https://slicenet.eu/
  • 51. ML meets DA to address Microbiability challenges. Microbiability in beef cattle: variations in the microbiome (archaea, bacteria, protozoa, fungi) can make better cattle in terms of feed and water efficiency, meat tenderness and environmental impact.
  • 52. ML meets DA to address Microbiability challenges. Dr Bruno Gabriel Andrade. Collecting samples: rumen and feces, N = 52 animals; several phenotypes measured; microbial relative abundances; Nelore is the predominant breed in Brazil. Goals: investigate the relation between the microbiome components; investigate the impact of the microbiome components on cattle biology; characterize the microbiome composition.