1. Selective Waves:
Transfer Learning for
Sentiment Classification
Ammar Rashed# and Ahmet Bulut#,*
ammarrashed@std.sehir.edu.tr, ahmetbulut@sehir.edu.tr
#Media Lab, *Data Science Lab
Istanbul Sehir University
34865 Dragos, Istanbul
April 10th, 2018 Budapest, Hungary
8. Sentiment Classification
• Given a piece of text (a review, complaint, tweet, response,
email, status update, text message, etc.), can we predict
its author’s tone of voice, i.e., its sentiment?
- A customer complaining about your company’s product.
- A user tweeting negatively on a newly released movie.
- A press conference bolstering your company’s public image.
14. Supervised Learning
• When you have a large labeled dataset, learning a predictor is
pretty straightforward.
• We can learn a multi-class classifier using multinomial logistic
regression, neural networks, decision trees, random forests,
etc.
• What if you do not have a large enough labeled dataset?
• Use crowdsourcing, e.g., Mechanical Turk.
• Hire annotators and pay them by the hour to label for you.
15. Supervised Learning
• What if you do not have a large enough labeled dataset?
• This is especially common for languages other than English.
• If we are going to build a sentiment analyser for Turkish but
we only have a small dataset for it, then we should make
use of the sentiment data available in English.
• We refer to this approach as Transfer Learning.
19. Technical Details
• Facebook’s fastText is used for efficient word representations.
For any word in a given language, it outputs a word
embedding vector of the desired dimension (e.g., d = 300).
• If a review contains two words with word vectors [1,2,3],
[4,5,6], then the BOW average would be computed as:
[(1+4)/2, (2+5)/2, (3+6)/2] = [2.5, 3.5, 4.5]
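The averaging above can be sketched as follows (an illustrative example with toy 3-dimensional vectors standing in for d = 300 fastText embeddings; the words and vectors are made up):

```python
import numpy as np

# Toy word vectors; in practice these would come from fastText.
word_vectors = {
    "great": np.array([1.0, 2.0, 3.0]),
    "movie": np.array([4.0, 5.0, 6.0]),
}

def bow_average(tokens, vectors):
    """Bag-of-words representation: element-wise mean of the word vectors."""
    return np.mean([vectors[t] for t in tokens], axis=0)

review_vec = bow_average(["great", "movie"], word_vectors)
print(review_vec)  # [2.5 3.5 4.5]
```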
• Since our sentiment classification is a multi-class classification
problem, we use softmax at the last layer.
• softmax returns a probability for each class, and the target
class should receive the highest probability.
• The sum of all the probabilities equals 1.
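A minimal, numerically stable softmax (a standard formulation, not taken from the presented model's code):

```python
import numpy as np

def softmax(z):
    """Map raw scores (logits) to class probabilities that sum to 1."""
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
# probs sums to 1; the largest logit gets the highest probability.
```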
21. Technical Details
• During model training, we used the cross-entropy loss function
to compute the deltas used to update the weight vectors
through backpropagation.
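For a softmax output layer with cross-entropy loss, the delta at the output simplifies to (probabilities − one-hot target), a standard identity; the sketch below is illustrative, not the authors' implementation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(probs, target):
    """Cross-entropy loss for one example with an integer target class."""
    return -np.log(probs[target])

logits = np.array([2.0, 1.0, 0.1])
target = 0
probs = softmax(logits)

one_hot = np.zeros_like(probs)
one_hot[target] = 1.0
# Delta (gradient of the loss w.r.t. the logits), backpropagated
# to update the weight vectors of the last layer.
delta = probs - one_hot
```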
23. Technical Details
• In order to quantify the accuracy of predicted scores, we used
the Mean Absolute Error (MAE) as the error metric.
• We used k-fold cross validation (k = 10) for reporting test
results.
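Both the error metric and the evaluation protocol are simple to state in code. A sketch of MAE and plain (unshuffled) k-fold splitting, as one possible implementation of what the slide describes:

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """MAE: mean of the absolute differences between predictions and labels."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def kfold_indices(n, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    idx = np.arange(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

print(mean_absolute_error([1, 2, 3], [2, 2, 5]))  # 1.0
```

Each of the k = 10 test folds is held out exactly once, so every example contributes to the reported test error.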
25. Thank you!
Ammar Rashed# and Ahmet Bulut#,*
ammarrashed@std.sehir.edu.tr, ahmetbulut@sehir.edu.tr
#Media Lab, *Data Science Lab
Istanbul Sehir University
34865 Dragos, Istanbul