Who am I?
I'm Ashley. I SEO things for a media agency in London.
As part of a website redesign and
migration we undertook a large piece of
The output was fantastic but it took a lot
of people a very long time.
Great for a one-off project, but
unfeasible to run with any kind of
regularity in its current format.
STRUCTURE, CONTENT & FORMAT OF NEW SITE
URL Structure Landing Pages
tm: A framework for text mining
applications within R
reshape2: Reshape data
dplyr: A grammar of data manipulation
stylo: Easy-to-use implementations of
various established analyses in the field
of computational stylistics
First are the functions that will be used
to remove all the rubbish:
• shortened links
• “RT” and “via”
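A minimal sketch of what such clean-up functions might look like, using base R only. The function names and regular expressions here are illustrative assumptions, not the ones from the original script. Note that the link pattern removes every http/https URL, not only shortened ones, and that "via" is dropped even where it is used as an ordinary word.

```r
# Hypothetical helpers; names and patterns are assumptions for illustration.

remove_links <- function(x) {
  # Drop http/https URLs, which covers shortened links such as t.co or bit.ly.
  gsub("https?://\\S+", "", x, perl = TRUE)
}

remove_rt_via <- function(x) {
  # Drop standalone "RT" and "via" markers, ignoring case.
  gsub("\\b(RT|via)\\b", "", x, ignore.case = TRUE, perl = TRUE)
}

squash_whitespace <- function(x) {
  # Collapse the gaps left behind and trim both ends.
  trimws(gsub("\\s+", " ", x))
}

clean_text <- function(x) {
  squash_whitespace(remove_rt_via(remove_links(x)))
}

clean_text("RT @ashley: New site structure http://t.co/abc123 via @someone")
```

Keeping each removal in its own small function makes it easy to switch one off or add another kind of rubbish later.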
Second is to set up the word/phrase
frequency function to use after the data
has been cleaned.
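One possible shape for a word/phrase frequency function, sketched in base R. The name `phrase_frequency` and its `n` argument (the number of words per phrase) are assumptions for illustration, and it expects text that has already been cleaned.

```r
# Hypothetical helper: counts n-word phrases across a character vector.
phrase_frequency <- function(texts, n = 1) {
  phrases <- unlist(lapply(texts, function(txt) {
    # Split on whitespace and drop any empty strings.
    words <- strsplit(txt, "\\s+")[[1]]
    words <- words[nzchar(words)]
    if (length(words) < n) return(character(0))
    # Slide a window of n words along the text.
    starts <- seq_len(length(words) - n + 1)
    vapply(starts,
           function(i) paste(words[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  if (length(phrases) == 0) {
    return(data.frame(phrase = character(0), freq = integer(0),
                      stringsAsFactors = FALSE))
  }
  # Tabulate and sort with the most frequent phrases first.
  counts <- sort(table(phrases), decreasing = TRUE)
  data.frame(phrase = names(counts), freq = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Two-word phrases across two example strings.
phrase_frequency(c("new site launch", "new site map"), n = 2)
```

With `n = 1` the same function gives plain word counts, so one function covers both words and phrases.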
Here we remove numbers,
capitalisation, common words,
punctuation, and otherwise prepare the
text for analysis.
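The preparation steps can be sketched in base R as follows. This is an assumption about the approach rather than the original code: the tm package ships its own transformations for the same jobs (for example `removeNumbers`, `removePunctuation` and `removeWords` with `stopwords("english")`), and the short stop-word list below is a stand-in for a fuller one.

```r
# Small illustrative list of common words (an assumption).
common_words <- c("the", "a", "an", "and", "of", "to", "in", "for", "is", "on")

# Hypothetical helper that applies the preparation steps in order.
prepare_text <- function(x, stop_words = common_words) {
  x <- tolower(x)                      # remove capitalisation
  x <- gsub("[[:digit:]]+", " ", x)    # remove numbers
  x <- gsub("[[:punct:]]+", " ", x)    # remove punctuation
  # Remove common words, matching whole words only.
  pattern <- paste0("\\b(", paste(stop_words, collapse = "|"), ")\\b")
  x <- gsub(pattern, " ", x, perl = TRUE)
  # Collapse leftover whitespace and trim the ends.
  trimws(gsub("\\s+", " ", x))
}

prepare_text("The 10 Best Landing Pages of 2015, and more!")
```

Lowercasing comes first so that the stop-word match does not need to handle capitalised forms such as "The".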