Text mining with R-studio


MeasureCamp 9.

Just to note that the contents of slide 4 are from Mark Edmondson's deck from MeasureCamp V. Forgot to add a citation before uploading.

Published in: Data & Analytics

  1. Text mining with RStudio | @AshLindley
  2. Who am I? I'm Ashley, I SEO things for a Media Agency in London.
  3. Why? As part of a website redesign and migration we undertook a large piece of analysis. The output was fantastic, but it took a lot of people a very long time. Great for a one-off project, but unfeasible to run with any kind of regularity in its current format. [Diagram labels: Keyword Analysis; Social Listening; Competitive Landscape; Landscape; Gaps / Opportunities; Structure, Content & Format of New Site; Keyword Mapping; URL Structure; Landing Pages; Needs Analysis]
  4. Why use R?
  5. Recipe: Packages. tm: a framework for text mining applications within R; reshape2: reshape data; dplyr: a grammar of data manipulation; stylo: easy-to-use implementations of various established analyses in the field of computational stylistics.
  6. Recipe: Functions. First are the functions that will be used to remove all the rubbish: shortened links; “RT” and “via”; usernames; non-alphanumeric characters. Second is to set up the word/phrase frequency function to use after the data has been cleaned.
  7. Recipe: Cleaning. Here we remove numbers, capitalisation, common words, and punctuation, and otherwise prepare the text for analysis.
  8. Recipe: ngrams
  9. What next?
  10. Feedback? Questions? | @AshLindley
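
The transcript does not include the code from the "Recipe: Functions" slide, so the following is only a minimal base R sketch of the two steps it describes: stripping the tweet noise, then counting word frequencies. The function names `strip_noise` and `term_frequency`, the regular expressions, and the sample tweets are all assumptions made for illustration, not the author's originals.

```r
# Hypothetical helper: remove the four kinds of "rubbish" named on the slide.
# Links go first, because stripping punctuation would break the URL pattern.
strip_noise <- function(x) {
  x <- gsub("https?://\\S+", " ", x, perl = TRUE)            # shortened (and other) links
  x <- gsub("@\\w+", " ", x, perl = TRUE)                    # usernames
  x <- gsub("\\b(RT|via)\\b", " ", x, perl = TRUE)           # "RT" and "via" markers
  x <- gsub("[^[:alnum:][:space:]]", " ", x, perl = TRUE)    # non-alphanumeric characters
  trimws(gsub("\\s+", " ", x, perl = TRUE))                  # collapse leftover whitespace
}

# Hypothetical helper: word frequency table, most frequent term first.
# Tokens are lower-cased here so "New" and "new" are counted together.
term_frequency <- function(docs) {
  tokens <- tolower(unlist(strsplit(docs, " ", fixed = TRUE)))
  tokens <- tokens[nzchar(tokens)]
  counts <- sort(table(tokens), decreasing = TRUE)
  data.frame(term = names(counts), n = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Made-up sample data, for illustration only.
tweets <- c(
  "RT @user1: Loving the new site design! http://t.co/abc123",
  "The new site is slow on mobile via @user2",
  "New site, new design, same old navigation #redesign"
)

cleaned <- strip_noise(tweets)
freq <- term_frequency(cleaned)
head(freq, 3)
```

Note that removing "via" as a whole word also drops legitimate uses of it, which is usually acceptable for tweet data but worth checking on other text.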
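
The "Recipe: Cleaning" slide lists what gets removed but its code is not in the transcript. Since the deck names the tm package, one plausible way to express those steps is a chain of `tm_map` calls; the sketch below assumes tm is installed and uses its built-in English stop word list as the "common words". The sample text and the order of the steps are assumptions.

```r
library(tm)

# Made-up sample text, for illustration only.
texts <- c(
  "Loving the new site design in 2016!",
  "The NEW site is slow on mobile, 3 times out of 10."
)

corpus <- VCorpus(VectorSource(texts))

# Lower-case first so stop word matching is case-insensitive.
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removeNumbers)
# Stop words are removed before punctuation, because tm's list
# contains entries with apostrophes (for example "don't").
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, stripWhitespace)

# Pull the cleaned text back out as a plain character vector.
cleaned <- vapply(corpus, as.character, character(1))
print(cleaned)
```

`content_transformer()` is needed around `tolower` because it is a base R function rather than a tm transformation; the other steps are tm's own transformations and can be passed to `tm_map` directly.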
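
The "Recipe: ngrams" slide has no surviving text. The deck lists stylo, which may be what the slide used, but rather than guess at that code the sketch below builds n-grams in base R. The helper name `make_word_ngrams` and the sample documents are hypothetical.

```r
# Hypothetical helper: split one cleaned string into word n-grams.
make_word_ngrams <- function(text, n = 2) {
  tokens <- unlist(strsplit(text, "\\s+", perl = TRUE))
  tokens <- tokens[nzchar(tokens)]
  if (length(tokens) < n) {
    return(character(0))          # too short to form a single n-gram
  }
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Made-up, already-cleaned sample documents.
docs <- c("new site design", "new site is slow")

# Build bigrams per document so phrases never span two documents.
bigrams <- unlist(lapply(docs, make_word_ngrams, n = 2))
bigram_counts <- sort(table(bigrams), decreasing = TRUE)
print(bigram_counts)
```

Counting phrases rather than single words is what makes the frequency table useful for keyword analysis: "new site" appearing twice says more than "new" and "site" appearing separately.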