Cork Institute of Technology - Candidate for Master of Science Degree 1
Using Big Data Analytics in a Social
Domain
Master’s in Cloud Computing 2013/2014
Ahmed Abdel-Aziz
May 2015
EMCCAe, CISSP, PMP
Objective
1) Social Media, Analytics and the Marketing Campaign
2) Sentiment Analysis – Methodology & Techniques
3) The Need for Case-Study & an Analytics Prototype
4) Learning Outcomes & Future Work
Social Media & Social Analytics
•  Social media sites are an offspring of the Web 2.0 movement and are
based on the cloud computing model (Software-as-a-Service)
•  88% of companies use social media for marketing
Section 1 of 4
Marketing Campaign Lifecycle
Section 1 of 4
•  Consists of 5 phases
•  Social analytics answers social questions for each phase
Example: What is the sentiment trend?
•  Social analytics projects based on sentiment analysis
benefit from a well-thought-out methodology
Section 2 of 4
Sentiment Analysis Methodology &
Techniques
Section 2 of 4
Sentiment Analysis Methodology &
Techniques
•  Social sentiment analysis starts with social listening
–  Social listening can be performed with a variety of open source tools such as PostgreSQL, R, Wordle, and Circos, as well as tools such as Attensity 360 and Analyze
•  Social data comes from 3 main categories of sources
–  Social user’s account: analytic capability is limited by the social media provider (Facebook, Twitter, LinkedIn)
–  Social APIs: the social media provider offers an API to tap into social data, which allows development of custom analytic programs
–  3rd-party tools: provide very fast results but do not offer the same level of analytic capability as a custom program
Section 2 of 4
Sentiment Analysis Methodology
& Techniques
•  Sentiment analysis techniques are grouped into two main categories:
–  Supervised machine learning methods
–  Unsupervised methods
•  A supervised learning method learns the features/words that correlate with +ve/-ve sentiment from labeled examples, and can then identify the sentiment of new text
•  An unsupervised method uses a lexicon of words pre-scored for polarity; the sum of the scores of the words in a text gives its sentiment
•  Both techniques are widely used and offer comparable results
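The lexicon-based (unsupervised) approach described above can be sketched in a few lines of R. This is a minimal illustration only: the toy lexicon, its scores, and the function name score_sentiment are assumptions made for the example and do not come from the presentation.

```r
# Hypothetical toy lexicon: word -> polarity score (illustrative values only).
lexicon <- c(good = 1, great = 2, love = 2, bad = -1, terrible = -2, hate = -2)

score_sentiment <- function(text, lexicon) {
  # Lowercase, drop everything except letters and spaces, split on whitespace.
  words <- strsplit(gsub("[^a-z ]", "", tolower(text)), "\\s+")[[1]]
  # Words that are not in the lexicon index to NA and are ignored in the sum.
  total <- sum(lexicon[words], na.rm = TRUE)
  if (total > 0) "positive" else if (total < 0) "negative" else "neutral"
}

score_sentiment("I love it, great!", lexicon)
score_sentiment("This is terrible and I hate it", lexicon)
```

A real lexicon would be far larger, and simple summing ignores negation ("not good"), which is one reason results depend heavily on the lexicon chosen.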
•  Company launched a new product to market. The marketing campaign was launched long ago and is now in the Account Performance phase
•  Marketing team needs to measure upticks in the sentiment trend for the new product in order to take appropriate action
•  Data science team believes continuous user surveys are ineffective and a computational approach is necessary -> better results and much less intrusive
Need for Case-Study/Analytics Prototype
Section 3 of 4
Need for Case-Study/Analytics Prototype
•  Decision made to build a prototype of a tool that measures the sentiment trend, starting with Twitter specifically
•  Twitter found to be the social network of choice for brand and product sentiment topics -> thus Twitter
•  Data science team key objectives:
–  Produce useful results quickly and cost-efficiently
Cloud Computing Value Proposition!?
–  Get buy-in from marketing management to build the full app
Section 3 of 4
•  Applying the Analytic Project Lifecycle to the prototype
–  Data Preparation
Input data: raw tweets
Output data: clean tweet text ready for sentiment analysis
–  Analytic Model Planning & Building
Input data: clean tweet text and a learned Naïve Bayesian model
Output data: sentiment of the analyzed tweets
–  Communicate Results
Input data: the tweets and their analyzed sentiment
Output data: sentiment trend graph for both +ve and -ve sentiments
Section 3 of 4
Need for Case-Study/Analytics Prototype
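The data preparation step (raw tweets in, clean tweet text out) might look like the sketch below. The presentation does not show the cleaning helper used by the prototype, so the function name clean_tweet_text and the specific cleaning rules here are assumptions chosen for illustration.

```r
# Hypothetical cleaning helper; the rules are illustrative, not the author's.
clean_tweet_text <- function(x) {
  x <- gsub("http[^[:space:]]*", " ", x)        # drop URLs
  x <- gsub("\\bRT\\b", " ", x, perl = TRUE)    # drop the retweet marker
  x <- gsub("@[[:alnum:]_]+", " ", x)           # drop user mentions
  x <- gsub("[^[:alpha:][:space:]]", " ", x)    # drop punctuation, digits, symbols
  x <- gsub("[[:space:]]+", " ", x)             # collapse runs of whitespace
  tolower(trimws(x))
}

clean_tweet_text("RT @user: Loving the new iPhone!! http://t.co/abc #wow")
```

The function is vectorized, so it can be applied directly to the character vector of tweet texts.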
Need for Case-Study/Analytics Prototype
Section 3 of 4
•  Technology decisions made by the data science team
–  R programming language for social listening
–  Twitter social APIs as the source of social data
–  Ready-made R packages to shorten build time
–  R for data preparation
–  Analytics cloud services such as Datumbox, which offers a supervised machine learning method based on Naïve Bayes
–  R for the main body of the prototype analytics application
–  R plotting capabilities to present easy-to-understand results to non-technical members of the marketing team
–  Technologies for the full-blown application that handles much larger data sets: GPText/Pivotal HD
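To make the supervised Naïve Bayes idea concrete, here is a small self-contained sketch in base R. It is not the Datumbox service and not the prototype's code: the prototype delegates classification to the cloud API, whereas this local stand-in uses a hypothetical four-tweet training set and hypothetical function names (tokenize, train_nb, predict_nb) purely to show how word counts per class turn into a prediction.

```r
tokenize <- function(text) strsplit(tolower(text), "[^a-z]+")[[1]]

train_nb <- function(texts, labels) {
  vocab <- unique(unlist(lapply(texts, tokenize)))
  vocab <- vocab[vocab != ""]
  classes <- unique(labels)
  # Log-likelihood of each word per class, with add-one (Laplace) smoothing.
  loglik <- sapply(classes, function(cl) {
    words <- unlist(lapply(texts[labels == cl], tokenize))
    counts <- table(factor(words, levels = vocab))
    log((as.numeric(counts) + 1) / (sum(counts) + length(vocab)))
  })
  rownames(loglik) <- vocab
  logprior <- log(as.numeric(table(factor(labels, levels = classes))) / length(labels))
  names(logprior) <- classes
  list(vocab = vocab, loglik = loglik, logprior = logprior)
}

predict_nb <- function(model, text) {
  words <- tokenize(text)
  words <- words[words %in% model$vocab]  # unseen words are skipped
  scores <- model$logprior + colSums(model$loglik[words, , drop = FALSE])
  names(which.max(scores))
}

# Hypothetical labeled training tweets (illustrative only).
texts  <- c("love this phone", "great battery and great screen",
            "hate the battery", "terrible screen, bad phone")
labels <- c("positive", "positive", "negative", "negative")
model  <- train_nb(texts, labels)
predict_nb(model, "love the great screen")
```

A hosted service is trained on a much larger labeled corpus, which is the main reason to call it rather than train locally in a quick prototype.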
Need for Case-Study/Analytics Prototype
Section 3 of 4
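•  R code snippet for the analytics application – main loop

    library(twitteR)   # provides searchTwitter(); assumes Twitter OAuth is already set up
    possentiments = 0  # running counts, one entry appended per polling round
    negsentiments = 0
    for (i in 1:5)
    {
      # Fetch 5 recent English tweets that mention the product
      tweets = searchTwitter("iPhone", n=5, lang="en")
      tweet_txt = sapply(tweets, function(x) x$getText())
      tweet_clean = clean.text(tweet_txt)  # cleaning helper defined elsewhere in the application
      tweet_num = length(tweet_clean)
      ……..
      for (j in 1:tweet_num)  # j keeps the inner index distinct from the outer i
      {
        # Second argument is presumably the Datumbox API key (masked here)
        tmp = getSentiment(tweet_clean[j], "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
        tweet_df$sentiment[j] = tmp$sentiment
        ………..
      }
      possentiments <- c(possentiments, sum(tweet_df$sentiment=="positive"))
      negsentiments <- c(negsentiments, sum(tweet_df$sentiment=="negative"))
      Sys.sleep(5)  # wait 5 seconds before the next polling round
    }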
Need for Case-Study/Analytics Prototype
•  Plotting the trend of both positive and negative sentiments
Section 3 of 4
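The trend plot itself is an image on the original slide and is not reproduced here. As a rough sketch of how such a plot could be produced with base R graphics, the example below uses made-up counts; in the prototype the two vectors would be the possentiments and negsentiments built by the main loop. The function name plot_sentiment_trend and the styling choices are assumptions.

```r
# Hypothetical counts per polling round (illustrative values only).
possentiments <- c(0, 3, 2, 4, 3, 5)
negsentiments <- c(0, 1, 2, 1, 2, 0)

plot_sentiment_trend <- function(pos, neg) {
  rounds <- seq_along(pos) - 1L  # round 0 is the initial zero entry
  plot(rounds, pos, type = "o", col = "darkgreen",
       ylim = range(c(pos, neg)),
       xlab = "Polling round", ylab = "Number of tweets",
       main = "Sentiment trend")
  lines(rounds, neg, type = "o", col = "red")
  legend("topleft", legend = c("Positive", "Negative"),
         col = c("darkgreen", "red"), lty = 1, pch = 1)
  invisible(rounds)
}

plot_sentiment_trend(possentiments, negsentiments)
```

Plotting both series on shared axes lets a non-technical reader see at a glance whether positive mentions are pulling away from negative ones.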
Learning Outcomes
•  Initial state
–  Good foundation in cloud computing and data analytics
–  Very little knowledge of the social domain (not even a Facebook account :) )
–  Last coding experience was in Java, 13 years earlier
•  Initial research project stages
–  Social media university
–  Addictive analytics workshop -> introduction to the marketing domain
–  Pivotal workshop to learn data analytics in the social domain -> relevant Pivotal data analytics platforms: GPText and Pivotal HD
•  Later research project stages (practical)
–  Learning enough R to build a small-scale analytics application
–  Learning how to leverage the Datumbox analytics-as-a-service offering
Section 4 of 4
Summary
•  The synergy of cloud, social, and data analytics serves marketing
•  "Is there an uptick in +ve/-ve sentiment for my product?" is a strategically important question in the Account Performance phase of a marketing campaign
•  The research answered the question using a computational, cloud-based approach built on a supervised learning method for sentiment analysis
•  Data source and data analytics are in the cloud; data preparation and data presentation are on-premise using R. Future work: optimize and tune for large datasets -> can be all cloud
