Text mining with RStudio
@AshLindley
Who am I?
I'm Ashley. I SEO things for a media agency in London.
Why?
As part of a website redesign and
migration we undertook a large piece of
analysis.
The output was fantastic but it took a lot
of people a very long time.
Great for a one-off project, but
unfeasible to run with any kind of
regularity in its current format.
Keyword Analysis
Social Listening
COMPETITIVE LANDSCAPE
Landscape
Gaps / Opportunities
STRUCTURE, CONTENT & FORMAT OF NEW SITE
Keyword Mapping
URL Structure
Landing Pages
NEEDS ANALYSIS
Why use R?
Recipe: Packages
tm: A framework for text mining
applications within R
reshape2: Reshape data
dplyr: A grammar of data manipulation
stylo: Easy-to-use implementations of
various established analyses in the field
of computational stylistics
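As a rough sketch, the four packages above could be installed and loaded like this. The CRAN mirror URL and the `pkgs` and `missing_pkgs` names are assumptions for illustration, not taken from the talk.

```r
# Packages named on this slide.
pkgs <- c("tm", "reshape2", "dplyr", "stylo")

# Install only what is missing (the mirror URL is an assumption; any CRAN mirror works).
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}

# Attach each package by name.
invisible(lapply(pkgs, library, character.only = TRUE))
```

On some systems stylo may also need Tcl/Tk available before it will load.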
Recipe: Functions
First are the functions that will be used
to remove all the rubbish:
• shortened links
• “RT” and “via”
• usernames
• non-alphanumeric
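The four clean-up steps listed above might be sketched in base R as follows. The function names and regular expressions are hypothetical stand-ins, not the originals from the talk, and the patterns are deliberately simple.

```r
# Hypothetical helpers: each one replaces a kind of "rubbish" with a space.
remove_links     <- function(x) gsub("https?://\\S+", " ", x, perl = TRUE)
remove_rt_via    <- function(x) gsub("\\b(RT|via)\\b", " ", x, perl = TRUE)
remove_usernames <- function(x) gsub("@\\w+", " ", x, perl = TRUE)
remove_non_alnum <- function(x) gsub("[^[:alnum:][:space:]]", " ", x, perl = TRUE)

# Apply the helpers in order. Links and usernames go first, because
# stripping punctuation first would break the patterns that find them.
clean_tweet <- function(x) {
  x <- remove_links(x)
  x <- remove_rt_via(x)
  x <- remove_usernames(x)
  x <- remove_non_alnum(x)
  trimws(gsub("\\s+", " ", x, perl = TRUE))  # collapse leftover whitespace
}

clean_tweet("RT @AshLindley: Text mining with R! http://t.co/abc123 via @someone")
# "Text mining with R"
```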
Second is to set up the word/phrase
frequency function to use after the data
has been cleaned.
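A frequency function along these lines could look like the sketch below. It uses only base R for brevity. The name `word_freq` and the `top_n` parameter are assumptions for illustration, and the talk's own version may well have used reshape2 or dplyr instead.

```r
# Hypothetical helper: count how often each word occurs in already-cleaned text.
word_freq <- function(texts, top_n = 10) {
  words <- unlist(strsplit(tolower(texts), "\\s+"))
  words <- words[nzchar(words)]  # drop empty strings left by leading spaces
  if (length(words) == 0) {
    return(data.frame(term = character(0), freq = integer(0)))
  }
  counts <- sort(table(words), decreasing = TRUE)
  result <- data.frame(
    term = names(counts),
    freq = as.integer(counts),
    stringsAsFactors = FALSE
  )
  head(result, top_n)
}

freqs <- word_freq(c("seo text mining", "text mining with r", "text"))
freqs$term[1:2]  # "text" "mining"
```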
Recipe: Cleaning
Here we remove numbers,
capitalisation, common words,
punctuation, and otherwise prepare the
text for analysis.
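With the tm package, this cleaning step might look roughly like the sketch below. The sample `tweets` vector is made up for illustration, and the order of the transformations is one reasonable choice rather than the talk's confirmed pipeline.

```r
library(tm)

# Made-up sample input, for illustration only.
tweets <- c("Text Mining with R in 2015!",
            "The SEO team loves R, and R loves text.")

corpus <- VCorpus(VectorSource(tweets))
corpus <- tm_map(corpus, removeNumbers)                      # numbers
corpus <- tm_map(corpus, content_transformer(tolower))       # capitalisation
corpus <- tm_map(corpus, removeWords, stopwords("english"))  # common words
corpus <- tm_map(corpus, removePunctuation)                  # punctuation
corpus <- tm_map(corpus, stripWhitespace)                    # tidy leftover gaps

# Pull the cleaned text back out as a plain character vector.
cleaned <- trimws(unname(sapply(corpus, as.character)))
```

Lower-casing comes before the stop-word step because the stop-word list is lower case.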
Recipe: ngrams
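As a sketch of the n-gram step, stylo's `make.ngrams()` can turn a vector of words into two-word phrases, which are then counted. The sample sentence and variable names are assumptions for illustration.

```r
library(stylo)

# Made-up, already-cleaned text for illustration.
cleaned_text <- "text mining with r makes text mining repeatable"

# Split into words, then join neighbouring words into 2-grams.
words   <- strsplit(cleaned_text, "\\s+")[[1]]
bigrams <- make.ngrams(words, ngram.size = 2)

# Count the phrases, most frequent first.
bigram_freq <- sort(table(bigrams), decreasing = TRUE)
head(bigram_freq, 3)
```

Changing `ngram.size` to 3 gives three-word phrases from the same word vector.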
What next?
shinyapps.io
Feedback? Questions?
@AshLindley