Text Message Predictive Analysis 
This report describes the model “Text Message Predictive Analysis”. I backed up
the text messages on my cell phone with the app “SMS Backup & Restore”. The
backup is in XML format, and I use it as the data set for the analysis.
I built a word cloud of the text messages and a model that predicts which
contact might have sent a given message. The prediction works by matching the
input message against the bodies of the stored text messages.
Program: 
We first find the working directory and set it to our workspace.
We then read the backup as our input. The backup file is named sms.xml. We
parse it into a variable called file, then take the XML root node (xmlRoot) and
save it in a variable called root.
We then display root to inspect it.
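The loading step can be sketched in R as follows. This is a minimal sketch, not the original program: it assumes the XML package, and it parses a small made-up sample in place of sms.xml. The attribute names (type, body, contact_name) and the smses root element are assumptions about the backup format.

```r
library(XML)

# Assumption: a tiny inline sample stands in for the real sms.xml backup.
sample_xml <- '<smses count="3">
  <sms address="+15550001" type="1" body="Call me when you are home" contact_name="Mom" />
  <sms address="+15550002" type="2" body="On my way" contact_name="Alex" />
  <sms address="+15550001" type="1" body="Dinner is ready" contact_name="Mom" />
</smses>'

getwd()                       # show the current working directory
# setwd("path/to/workspace")  # hypothetical path; point it at the folder with sms.xml

# With the real backup this would be: file <- xmlParse("sms.xml")
file <- xmlParse(sample_xml, asText = TRUE)
root <- xmlRoot(file)

xmlName(root)                     # name of the root element
length(getNodeSet(root, "//sms")) # number of <sms> nodes in the document
```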
We now get all the received messages and store them in s1.
We then get all the messages from the contact named Mom and store them in s2.
We get the contact of every message in the file and store them in s3.
Finally, we take the unique contacts from s3 and store them in m.
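One way to express these four selections is with XPath queries from the XML package. The sketch below is illustrative only: the sample data is made up, and it assumes that received messages carry type="1" and that the contact is stored in a contact_name attribute.

```r
library(XML)

# Assumption: made-up sample data in the assumed backup layout.
sample_xml <- '<smses count="3">
  <sms address="+15550001" type="1" body="Call me when you are home" contact_name="Mom" />
  <sms address="+15550002" type="2" body="On my way" contact_name="Alex" />
  <sms address="+15550001" type="1" body="Dinner is ready" contact_name="Mom" />
</smses>'
root <- xmlRoot(xmlParse(sample_xml, asText = TRUE))

# s1: all received messages (assumes type="1" marks a received message)
s1 <- getNodeSet(root, "//sms[@type='1']")

# s2: every message in the conversation with the contact named Mom
s2 <- getNodeSet(root, "//sms[@contact_name='Mom']")

# s3: the contact name of every message, one entry per message
s3 <- xpathSApply(root, "//sms", xmlGetAttr, "contact_name")

# m: the distinct contacts, in order of first appearance
m <- unique(s3)
```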
Next, we count the number of messages for each contact and store the counts in
m2.
We then plot a bar graph of the contacts and their message counts.
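The counting and plotting step can be done in base R. In this sketch the contact vector s3 is made up for illustration, and the sorting and plot labels are choices of this example rather than details from the report.

```r
# Assumption: a made-up contact vector, one entry per message.
s3 <- c("Mom", "Alex", "Mom", "Sam", "Mom", "Alex")

# m2: number of messages per contact, most active contact first
m2 <- sort(table(s3), decreasing = TRUE)

# Bar graph of message counts; las = 2 turns the contact labels vertical
barplot(m2, las = 2, main = "Messages per contact", ylab = "Number of messages")
```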
Next, we get the bodies of the text messages and store them in s4.
We create a VCorpus from the message bodies and store it in doc.
We then clean the corpus by converting it to lower case, removing punctuation,
removing numbers, etc.
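With the tm package, the corpus and its cleaning steps might look like the sketch below. The message bodies are made up, and the final stripWhitespace step is an assumption about what "etc." covers; the report does not list the remaining transformations.

```r
library(tm)

# Assumption: made-up message bodies; in the report they come from the body attribute.
s4 <- c("Call me when you are home!",
        "Dinner is ready at 7",
        "On my way, call you later.")

# doc: one document per message
doc <- VCorpus(VectorSource(s4))

# Clean the corpus: lower case, no punctuation, no digits
doc <- tm_map(doc, content_transformer(tolower))
doc <- tm_map(doc, removePunctuation)
doc <- tm_map(doc, removeNumbers)
doc <- tm_map(doc, stripWhitespace)  # assumption: collapse repeated spaces

content(doc[[1]])  # inspect the first cleaned message
```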
We now build a TermDocumentMatrix from the corpus and store it in tdm.
We then get the unique terms in tdm and store them in tem.
We can then find the frequent terms. Here I find the terms which occur more
than 10 times.
We can also find the terms associated with the term "call".
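A sketch of these steps with tm follows. The corpus is made up so that a few terms pass the frequency threshold, and the correlation limit of 0.5 is an arbitrary choice of this example. Note that findFreqTerms treats lowfreq as an inclusive bound, so "more than 10 times" corresponds to lowfreq = 11.

```r
library(tm)

# Assumption: a made-up corpus, repeated so some terms occur more than 10 times.
s4 <- c(rep("call me later", 12),
        rep("dinner is ready", 3),
        "call mom about dinner")
doc <- VCorpus(VectorSource(s4))

# tdm: rows are terms, columns are messages.
# By default tm drops terms shorter than three characters ("me", "is").
tdm <- TermDocumentMatrix(doc)

# tem: the unique terms
tem <- Terms(tdm)

# Terms occurring more than 10 times (lowfreq is inclusive, hence 11)
frequent <- findFreqTerms(tdm, lowfreq = 11)

# Terms whose occurrence correlates with "call" at 0.5 or above
assoc <- findAssocs(tdm, "call", corlimit = 0.5)
```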
We now convert the TermDocumentMatrix tdm to a matrix nm.
We then calculate the number of occurrences of each term in nm, save the counts
in dt, and convert them to numeric.
We create a word cloud of the terms in the VCorpus.
We store the row names of the matrix nm in r. The row names are the unique
terms, so r now contains the unique terms of the VCorpus.
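The matrix conversion, term counts, and word cloud can be sketched as below. This assumes the wordcloud package and a made-up corpus. Passing the terms and their counts to wordcloud is one possible call; the report does not show which arguments were used.

```r
library(tm)
library(wordcloud)

# Assumption: a made-up corpus standing in for the message bodies.
s4 <- c(rep("call me later", 12),
        rep("dinner is ready", 3),
        "call mom about dinner")
tdm <- TermDocumentMatrix(VCorpus(VectorSource(s4)))

# nm: dense matrix with one row per term and one column per message
nm <- as.matrix(tdm)

# dt: total occurrences of each term, as a plain numeric vector
dt <- as.numeric(rowSums(nm))

# r: the unique terms, in the same order as dt
r <- rownames(nm)

# Word cloud sized by term frequency; min.freq = 1 keeps rare terms too
wordcloud(words = r, freq = dt, min.freq = 1, random.order = FALSE)
```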
We now take an input message from the user. The system returns the names of
the contacts who could have sent it.
We clean the input message the same way we cleaned the VCorpus.
The system then compares the cleaned message with the stored message bodies to
find the contacts who could have sent it.
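The original matching code is not reproduced in this report, so the following is only one possible reading of "matching the input message with the body of the text message". It is a base R sketch under stated assumptions: the helper names clean and predict_contacts are hypothetical, the scoring rule (number of shared words, best message per contact) is this example's choice, and the data is made up.

```r
# Hypothetical helper: clean text as the corpus was cleaned
# (lower case, no punctuation, no digits, single spaces).
clean <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", "", x)
  x <- gsub("[[:digit:]]", "", x)
  trimws(gsub("\\s+", " ", x))
}

# Hypothetical helper: return contacts whose messages share words with the
# input, ordered by the largest number of shared words in any one message.
predict_contacts <- function(input, bodies, contacts) {
  words <- unique(strsplit(clean(input), " ", fixed = TRUE)[[1]])
  words <- words[nzchar(words)]
  body_words <- strsplit(clean(bodies), " ", fixed = TRUE)
  overlap <- vapply(body_words, function(w) sum(words %in% w), integer(1))
  if (length(words) == 0 || max(overlap) == 0) return(character(0))
  scores <- sapply(split(overlap, contacts), max)  # best match per contact
  names(sort(scores[scores > 0], decreasing = TRUE))
}

# Assumption: made-up message bodies and their senders.
bodies   <- c("Call me when you are home", "Dinner is ready",
              "On my way", "Are you home yet?")
contacts <- c("Mom", "Mom", "Alex", "Sam")

predict_contacts("Is dinner ready?", bodies, contacts)
```

Because the score only counts shared words, common words such as "you" or "are" can match many contacts at once; restricting the comparison to the terms in r, or removing stop words, would be natural refinements.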