Convolutional neural networks for
text classification
Lidia Pivovarova
Research Seminar in Language Technology
1st June 2017
PULS Project
● Web-scale surveillance of news
● Current topic: Business. (Previous topics: Security,
Epidemics, ...)
● Tracking news from thousands of news sites about
business activities
– ≈6000–8000 news items per day
– among hundreds of thousands of entities:
● companies, persons, products, organizations, ...
– tracking many kinds of activities:
● merger, buyout, bankruptcy, layoff, product launch
and recall, ...
TEKES Project, led by Roman Yangarber
http://newsweb.cs.helsinki.fi
Neural network
● Each node computes a function on its inputs to produce an
output
● A network ~ a huge formula with many parameters
● Adjustment of parameters given an output and a true value
(back propagation)
● A network structure and the inference are separated
An image from http://www.opennn.net/
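
The bullets above can be illustrated with a minimal sketch of a single node and one back-propagation update. The sigmoid activation, squared-error loss, learning rate, and all numeric values are assumptions chosen for illustration; the slides do not specify them.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(weights, bias, inputs):
    """One node: weighted sum of its inputs passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def train_step(weights, bias, inputs, target, lr=0.5):
    """One gradient-descent update for the loss 0.5 * (out - target) ** 2."""
    out = forward(weights, bias, inputs)
    # Chain rule: dLoss/dz = (out - target) * sigmoid'(z),
    # where sigmoid'(z) = out * (1 - out)
    delta = (out - target) * out * (1.0 - out)
    new_weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * delta
    return new_weights, new_bias

# Toy parameters and a single training pair (made-up values)
weights, bias = [0.1, -0.2], 0.0
inputs, target = [1.0, 2.0], 1.0

before = forward(weights, bias, inputs)
for _ in range(100):
    weights, bias = train_step(weights, bias, inputs, target)
after = forward(weights, bias, inputs)
```

After the updates, `after` is closer to `target` than `before` was: the parameters are adjusted from the output and the true value alone, while the structure of the "formula" stays fixed.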
Convolutional neural networks
images found in the data science blog: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
CNN for NLP
images found in the WildML blog:
http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/
See also a very good tutorial on CNNs for NLP with TensorFlow:
http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
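
As a rough sketch of what the convolution in those figures does for text: one filter slides over windows of consecutive token embeddings, and max-over-time pooling keeps its strongest response. The embeddings, filter weights, and ReLU activation below are made-up assumptions, and the function name is hypothetical.

```python
def conv1d_max_pool(embeddings, filt, bias=0.0):
    """Slide one filter over consecutive token windows, apply ReLU,
    and keep the maximum response (max-over-time pooling).
    Assumes the sentence has at least as many tokens as the filter spans."""
    width = len(filt)  # number of tokens the filter spans
    responses = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        # Dot product between the filter and the window, row by row
        z = bias + sum(
            f * e
            for f_row, e_row in zip(filt, window)
            for f, e in zip(f_row, e_row)
        )
        responses.append(max(0.0, z))  # ReLU
    return max(responses), responses

# Toy 2-dimensional embeddings for a 4-token sentence (made-up values)
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter spanning 2 tokens (a "bigram detector")
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]

pooled, feature_map = conv1d_max_pool(sentence, bigram_filter)
print(feature_map, pooled)  # [2.0, 1.0, 1.0] 2.0
```

A real model would use many filters of several widths and feed the pooled values into a classifier layer.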
Polarity detection in business news
General Motors:
Daimler:
● The task: to determine the sentiment polarity of a
mention of a given company in a business news
article
● Similar to (aspect-based) sentiment detection
● But:
– business news articles typically do not aim to express
emotions or subjectivity
– business news contains genre-specific word usage
● Thus we cannot apply existing resources
(dictionaries, labelled corpora) developed for more
general sentiment analysis
Polarity detection in business news
Manual annotation
● ~18,000 documents, ~20,000 names
● annotation by groups (Escoter et al., EACL 2017)
● post-processing:
D1: Valeant Pharmaceuticals International Inc., the embattled
Canadian drugmaker, agreed to sell about $2.1 billion in assets to
get cash to streamline its businesses and begin easing its debt
burden.
D2: L’Oreal to buy three skincare brands from Valeant for $1.3 billion.
The French cosmetics giant paid nearly eight times the brand’s
combined annual revenue of $168 million
Knowledge transfer
● A collection of 2M short business reports
● Manual annotation with event labels
– 291 labels in total
– 26 labels imply positive polarity: Investment, New
Product, Sponsorship,...
– 12 labels imply negative polarity: Fraud, Layoff,
Bankruptcy,...
– 200,000 documents with exactly one company name
and non-ambiguous polarity
● High-level feature transfer:
– train a model for event labels
– replace the last layer of the network and continue
training for polarity labels
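
A minimal sketch of how the mapped polarity labels could be derived from event labels. Only the six label names listed above come from the slides; the function name, the "Merger" label, the toy documents, and the company names are hypothetical.

```python
# Subsets of the label sets: the slides name only these examples
# out of 26 positive and 12 negative labels.
POSITIVE = {"Investment", "New Product", "Sponsorship"}
NEGATIVE = {"Fraud", "Layoff", "Bankruptcy"}

def mapped_polarity(event_labels, companies):
    """Return +1 or -1 for a usable document, or None if it is skipped."""
    if len(companies) != 1:
        return None  # need exactly one company name
    has_pos = any(label in POSITIVE for label in event_labels)
    has_neg = any(label in NEGATIVE for label in event_labels)
    if has_pos == has_neg:
        return None  # no polarity-bearing label, or conflicting ones
    return 1 if has_pos else -1

# Toy documents: (event labels, company names), all made up
docs = [
    (["Investment"], ["Acme"]),
    (["Layoff", "Bankruptcy"], ["Acme"]),
    (["Investment", "Fraud"], ["Acme"]),      # ambiguous polarity
    (["Merger"], ["Acme"]),                   # label implies no polarity
    (["New Product"], ["Acme", "Globex"]),    # more than one company
]
labels = [mapped_polarity(events, companies) for events, companies in docs]
print(labels)  # [1, -1, None, None, None]
```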
Token-based model
– Y. Kim (EMNLP 2014)
+ focus – position of the target company
Region-based model
– R. Johnson & T. Zhang (NIPS 2015, NAACL 2015)
+ focus – position of the target company
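
The slides do not say how the focus is encoded. One common option, shown here purely as an assumption, is to give every token its signed distance to the target company mention and feed that as an extra input feature next to the word embedding. The function name, the clipping distance, and the toy sentence are hypothetical.

```python
def relative_positions(tokens, target_index, max_distance=5):
    """Signed distance of each token from the target mention,
    clipped to the range [-max_distance, max_distance]."""
    return [
        max(-max_distance, min(max_distance, i - target_index))
        for i in range(len(tokens))
    ]

# Toy sentence; the target company is the token at index 0
tokens = ["Apple", "shares", "hit", "a", "record", "high"]
focus = relative_positions(tokens, target_index=0, max_distance=3)
print(focus)  # [0, 1, 2, 3, 3, 3]
```

With such a feature the same sentence produces different inputs for different target companies, which is what allows one text to receive a different polarity per company.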
Experiments
● Tune: train on 200,000 documents with mapped polarity labels,
then tune using 12,000 manually annotated documents
● Combine: train on 200,000 mapped + 12,000 manually
annotated documents
● Feature transfer: train on 2M documents with original event
labels, replace the last layer, tune on 12,000 manually annotated
documents
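
The "replace the last layer" step of feature transfer can be sketched as follows. The dictionary layout, layer sizes (other than the 291 event labels), the single polarity output, and the random initialization are assumptions for illustration, not the architecture used in the experiments.

```python
import random

def init_layer(n_in, n_out, rng):
    """A weight matrix with n_out rows and n_in columns, small random values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def replace_last_layer(model, n_outputs, rng):
    """Keep the lower layers and swap in a freshly initialized output layer."""
    hidden_size = len(model["output"][0])
    return {
        "features": model["features"],  # reused; training can continue on it
        "output": init_layer(hidden_size, n_outputs, rng),
    }

rng = random.Random(0)
# Stand-in for a model already trained on the event labels
event_model = {
    "features": init_layer(n_in=50, n_out=16, rng=rng),  # placeholder for conv layers
    "output": init_layer(n_in=16, n_out=291, rng=rng),   # 291 event labels
}
# New head with one output, assuming polarity is predicted as a single score
polarity_model = replace_last_layer(event_model, n_outputs=1, rng=rng)
```

Training would then continue on the manually annotated documents, starting from the features learned for event classification.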
Examples
● Valeant to sell Dendreon unit to China’s
Sanpower for $820 million. Canada’s Valeant
Pharmaceuticals International Inc. said its
affiliate will sell its Dendreon cancer business to
China’s Sanpower Group Co. Ltd. for $819.9
million, as the drugmaker continues to shed its
non-core assets to repay debt.
● True score: -1.0
● With focus: 0.022
● Without focus: -0.322
Examples
● Valued less than Toshiba in 2004, Apple today
has a market capitalization of US$700-billion,
as shares hit a record high close yesterday.
Toshiba shares could soon be relatively
worthless, as they may have to declare
bankruptcy.
● True score: 1.0
● With focus: 0.310
● Without focus: -0.197
Examples
● Bailed-out Lloyds Banking Group reports highest annual profit for
ten years. Bottom line profits at the taxpayer-backed lender more
than doubled to £4.24 billion last year, partly due to lower PPI
compensation payouts. The result marks its best performance at the
UK’s biggest retail banking group since 2006. The government put
£20.3b into the banking group, acquiring a 43 per cent stake to save it
from collapse at the height of the financial crisis. This has now
reduced to less than five per cent following a series of share sales
and the government has indicated that it aims to shed its remaining
stake this year. Announcing the results, Lloyds shares jumped 3.6
per cent and the group said its performance was “inextricably linked
to the health of the UK economy, which has been more resilient than
the market expected” since the referendum on EU membership.
● True score: 0.4
● With focus: 0.179
● Without focus: -0.162
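
For a quick comparison, the mean absolute error can be computed over the three examples above. This is only arithmetic on three hand-picked slides, not an evaluation of the models.

```python
# (true score, with focus, without focus), copied from the three examples above
examples = [
    (-1.0, 0.022, -0.322),  # Valeant example
    (1.0, 0.310, -0.197),   # Apple / Toshiba example
    (0.4, 0.179, -0.162),   # Lloyds example
]

mae_with = sum(abs(true - w) for true, w, _ in examples) / len(examples)
mae_without = sum(abs(true - wo) for true, _, wo in examples) / len(examples)

print(round(mae_with, 3), round(mae_without, 3))  # 0.644 0.812
```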
Examples
● Facebook CEO Mark Zuckerberg and his wife
are dropping controversial suits they filed in
December to buy small plots of land that are
part of a 700-acre waterfront estate they own
on the island of Kauai in Hawaii.
● True score: 0.0
● With focus: -0.743
● Without focus: -0.397
Behind the scenes
Thanks for your attention!
● More details can be found in:
– L. Pivovarova, L. Escoter, A. Klami, & R. Yangarber, SemEval 2017
– Also in my future publications...
