Wednesday, November 25, 2015
TEXT MINING
AU FOOTBALL FACEBOOK PAGE
Sai Praneeth Reddy
Auburn University
Executive Summary
The Auburn athletic department is interested in analyzing the Facebook posts on the AU football
page to get an idea of the topics that people talk about most. The department would also
like to know whether public sentiment towards Auburn football is positive or
negative.
In addition, the department wants to know which players and coaches are discussed most
and whether the comments about them are positive or negative. The athletic department would like to
use these answers to improve its football team.
We use text analytics to answer the questions above and then perform sentiment
analysis to determine whether the general outlook towards the team is positive or negative.
Table of Contents
Introduction
Description of Business Problem
Methodology
Text Mining
Analysis and Results
Tables and figures
Analysis
Conclusions


Introduction
To improve the performance of Auburn’s football team, the Auburn athletic department is interested in
analyzing the Facebook posts, comments, and replies on the AU football page. The athletic department
would like to know the general outlook of the public towards its football team. In addition, the
athletic department wants to know which players are talked about the most and what the
public thinks of these players.
Methodology
Text mining is one of the most commonly used methods for analyzing text. In our case, we
analyze the Facebook posts using SAS Enterprise Miner. We first import the posts, comments,
and replies on the AU Facebook page into an Excel file using a web crawler. Once the posts are imported
into an Excel file, the file is converted into a SAS-readable format using the File Import node.
We then use the Text Parsing node to remove unwanted words, followed by the Text Filter node, where we group
words that are synonyms and drop certain words that we are not interested in. In the Text Filter
node we can also view text snippets containing a word we are interested in analyzing. The Text Cluster
node groups the terms into clusters, where each cluster represents terms that occur together.
Analysis and RESULTS
DATA PREPARATION
The Facebook posts are imported into an Excel file using web crawling. The Excel file is then
imported into SAS and converted into a SAS-readable format using the File Import node.
TEXT PARSING
All the variables in the input data are set to Rejected except for the post id, whose role is set to
ID, and the message, whose role is set to Text. The Text Parsing node enables us to parse the
text and analyze the number of terms and documents by frequency. In our Text Parsing
node we dropped all words except for nouns, proper nouns, and adjectives.
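As a rough illustration of the role assignment described above, the following Python sketch reads an exported file and keeps only the post id and message columns. The column names `post_id` and `message`, the extra columns, and the sample rows are assumptions made for this example; they are not taken from the actual export, and the analysis in this report was done in SAS, not Python.

```python
import csv
import io

# Hypothetical export; the real file's columns and contents are not shown in this report.
SAMPLE_CSV = """post_id,message,likes,created_time
1,"Great win today, War Eagle!",12,2015-10-01
2,"The offense has to improve.",3,2015-10-02
"""

def load_posts(handle, id_col="post_id", text_col="message"):
    """Keep only the id and text columns; every other column is ignored ('rejected')."""
    reader = csv.DictReader(handle)
    return [{"id": row[id_col], "text": row[text_col]} for row in reader]

posts = load_posts(io.StringIO(SAMPLE_CSV))
for post in posts:
    print(post["id"], post["text"])
```

With a real export, the `io.StringIO` object would be replaced by an open file handle.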
fig 1
The above Zipf plot shows that Gus Malzahn is one of the most widely discussed topics, along
with Jeremy Johnson and Will Muschamp.
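A Zipf plot ranks terms by how often they occur. The sketch below shows one way such a rank-frequency table could be computed in Python. The messages are invented for this example, and the simple regular-expression tokenizer is an assumption; it does not reproduce the parsing done by the Text Parsing node.

```python
import re
from collections import Counter

# Hypothetical messages, invented for this example.
messages = [
    "Malzahn needs a new plan",
    "Johnson should start, Malzahn knows it",
    "Muschamp fixed the defense",
    "Malzahn and Muschamp both spoke",
]

def rank_frequency(docs):
    """Return (rank, term, count) tuples, most frequent term first."""
    counts = Counter()
    for doc in docs:
        counts.update(re.findall(r"[a-z']+", doc.lower()))
    return [(rank, term, n)
            for rank, (term, n) in enumerate(counts.most_common(), start=1)]

table = rank_frequency(messages)
for rank, term, n in table[:3]:
    print(rank, term, n)
```

Plotting count against rank for the full table would give the shape of a Zipf plot.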
Some of the most widely discussed players and coaches are as follows:
Table 1
Name              Weight
Malzahn           0.354
Jonathan Wallace  0.534
Rhett Lashlee     0.614
Jeremy Johnson    0.618
Carl Lawson       0.608
Will Muschamp     0.510
Sean White        0.6
fig 2
The above number-of-documents-by-weight plot shows that Jeremy Johnson has a relatively
high weight compared to all other players.
Some of the most widely discussed topics / words are:
Table 2
Topic     Number of Documents
Defense   27
Offense   18
good      30
QB        163
Running   12
Receiver  10
TEXT FILTERING
The Text Filter node is used to keep or drop terms that are either too frequent or highly infrequent,
as these terms are not of much use in grouping topics. The node also helps us group
words that are similar to one another (i.e., synonyms).
Using the interactive text filter, we can also see the context in which people use someone’s
name or a word we are interested in. This helps us understand the sentiment of people
towards a particular person or topic. Using text filtering, it is also possible to see which
words are strongly associated, based on the concept link diagram, which shows the relationships
between terms.
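The three text-filtering ideas described in this section (dropping terms by document frequency, merging synonyms, and viewing a term in context) can be sketched in Python as follows. The posts, the synonym list, and the `min_docs`/`max_docs` thresholds are all hypothetical, chosen only to make the example small; they are not the settings used in the Text Filter node.

```python
import re
from collections import Counter

# Hypothetical posts and synonym list, invented for this example.
posts = [
    "Our QB played well",
    "The quarterback needs time",
    "Defense was good",
    "Good game by the defense and the QB",
]
SYNONYMS = {"quarterback": "qb"}  # map each variant to one parent term

def tokenize(text):
    """Lowercase, split into words, and replace synonyms with their parent term."""
    return [SYNONYMS.get(tok, tok) for tok in re.findall(r"[a-z']+", text.lower())]

def document_frequency(docs):
    """Count the number of documents in which each term appears at least once."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return df

def keep_terms(df, min_docs, max_docs):
    """Keep terms that are neither too rare nor too common."""
    return {term for term, n in df.items() if min_docs <= n <= max_docs}

def snippets(docs, term, width=2):
    """Return each occurrence of `term` with up to `width` words of context per side."""
    out = []
    for doc in docs:
        toks = tokenize(doc)
        for i, tok in enumerate(toks):
            if tok == term:
                out.append(" ".join(toks[max(0, i - width): i + width + 1]))
    return out

df = document_frequency(posts)
kept = keep_terms(df, min_docs=2, max_docs=3)
print(sorted(kept))
print(snippets(posts, "qb"))
```

Because synonyms are merged before counting, "quarterback" and "QB" contribute to the same document count, and the snippets show the parent term rather than the original wording.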
Sentiment Analysis:
• Gus Malzahn
Table 3
The above text snippets indicate that many people have a negative perception of
coach Gus Malzahn; many seem to blame him for the defeat.
• Jeremy Johnson
Table 4
From the above text snippets, it appears that even though Jeremy Johnson did not have a great year, many people
still seem to trust his abilities. It also appears that people think Auburn’s offense is better when Jeremy Johnson is
the quarterback rather than Sean White.
• Sean White
From the above text snippet, it appears that there is no clear favorite at quarterback,
as opinion is divided on who the starting quarterback should be.
• Offense
Table 6
It appears that many people blame the offense for Auburn’s bad performance.
There seems to be a general opinion that the defense is doing better and the offense is letting
the team down.
• Defense
Table 7
From the above text snippet, it appears that there is a generally positive outlook on
Auburn’s defense. People think that the defense has improved a lot under Muschamp and that it is the
offense that is letting the team down.
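The sentiment readings in this section come from inspecting text snippets by hand. A simple word-list count, sketched below in Python, is one way such readings could be cross-checked. This is a swapped-in technique, not the method used in this report: the word lists and snippets are hypothetical, and the scoring ignores negation and sarcasm.

```python
import re

# Hypothetical word lists and snippets, invented for illustration only.
POSITIVE = {"good", "great", "improved", "trust", "better"}
NEGATIVE = {"bad", "blame", "fire", "poor", "worse"}

snippets_by_topic = {
    "defense": ["the defense has improved a lot", "great job by the defense"],
    "offense": ["blame the offense for this", "the offense was bad again"],
}

def score(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def tally(snips):
    """Count snippets with a positive, negative, or neutral score."""
    result = {"positive": 0, "negative": 0, "neutral": 0}
    for s in snips:
        value = score(s)
        label = "positive" if value > 0 else "negative" if value < 0 else "neutral"
        result[label] += 1
    return result

for topic, snips in snippets_by_topic.items():
    print(topic, tally(snips))
```

A tally like this gives only a coarse signal, so the snippets themselves should still be read before drawing conclusions about a player or coach.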
TEXT CLUSTER
The Text Cluster node groups the terms into clusters, where each cluster represents related
terms that occur together. This is particularly useful because the related terms are
grouped into clusters, and the biggest sector of the circle represents the topic that most
people are talking about.
Table 8
The words defense, explosive, and Muschamp are placed in a single cluster, indicating
that people are generally happy with the defense and attribute the improvement in
performance to Will Muschamp.
The words don’t, improvement, and Lashlee are often used together, indicating that people in
general want the offense and the offensive coach Rhett Lashlee to do better.
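The idea of terms that occur together can be illustrated with a simple co-occurrence count, sketched below in Python. This is plain pair counting, not the clustering algorithm used by the Text Cluster node, and the posts and stop-word list are invented for this example.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical posts, invented for this example.
posts = [
    "Muschamp has the defense playing explosive football",
    "Explosive defense under Muschamp",
    "Lashlee needs improvement on offense",
    "No improvement from Lashlee",
]
STOPWORDS = {"the", "has", "on", "no", "from", "under", "needs"}

def cooccurrence(docs):
    """Count how many documents contain each unordered pair of terms."""
    pairs = Counter()
    for doc in docs:
        terms = sorted(set(re.findall(r"[a-z']+", doc.lower())) - STOPWORDS)
        pairs.update(combinations(terms, 2))
    return pairs

pairs = cooccurrence(posts)
for (a, b), n in pairs.most_common(4):
    print(a, b, n)
```

Pairs with high counts, such as defense and Muschamp in this toy data, are the kind of terms one would expect to land in the same cluster.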
Fig 3
Fig 4
TEXT TOPIC
From the Text Topic node output, we can find the terms that are grouped together and their
cutoffs. The Text Topic node results can be refined further by using the Text Cluster node. The Text Topic
node performs cluster analysis to combine words that are of interest to analysts.
Table 9
CONCLUSION
From the analysis of the Facebook posts, it appears that people are in general disappointed with
the overall performance of the team. Though they feel that the defense has done better than
last season, it is the offense that let them down.
It also appears that people prefer Jeremy Johnson as the team’s quarterback over Sean White.
In addition, the majority of people seem to blame the head coach, Gus Malzahn, for the
team’s failure and think that the defensive coach, Will Muschamp, has done a good job.

APPENDIX