Social Media Mining with R
By Richard Heimann and Nathan Danneman
Keywords: social media, data mining, sentiment analysis, supervised learning, unsupervised learning,
R, statistics, social science, Twitter, big data, data science
Social Media Mining with R is a concise, hands-on guide with many practical examples of mining
social media data and a detailed treatise on inference and social science research that will help you in
mining data in the real world.
Whether you are an undergraduate who wishes to get hands-on experience working with social data
from the Web, a practitioner wishing to expand your competencies and learn unsupervised sentiment
analysis, or someone simply interested in social data analysis, this book will prove to be an essential
asset. No previous experience with R or statistics is required, though having knowledge of both will
enrich your experience.
We are proud to announce the forthcoming release of our first book. Packt Publishing first approached
us in June 2013 to write on a topic of our choosing closely related to social media and data mining.
We turned down the offer because of an already busy work schedule and Packt's initial expectation
that we complete the material in 60 days. The offer was interesting, however, and we speculated about
the content and direction of a book in a world where we had the time to take on such a task. When
Packt revisited the idea in July with a revised timeline, we accepted, knowing in advance that the new
timeline would still be challenging. We missed our target by three months, but we agree that the topic
was important enough to deserve the extra time, and the resulting book reflects those extra months.
The expected release date is March 24, 2014. Our publisher, Packt, is about to release its 2,000th
book (https://www.packtpub.com/about-us) and, to celebrate, is running a promotion that happens to
coincide with the release of our book, “Social Media Mining with R.” During this offer, Packt is giving
its readers a chance to dive into its comprehensive catalog and Buy One, Get One Free across its
entire range of eBooks.
The campaign begins on March 18, 2014 and continues until March 26, 2014.
• Unlimited purchases during the offer period
• Offer is automatically applied at checkout
The book can easily be found on many of the common outlets and some are listed here for
Barnes & Noble: http://bit.ly/1nS0Hde
Social Media Mining with R exposes readers both to the techniques practitioners commonly use to
extract sentiment from social media data and to lesser-known, nontrivial unsupervised sentiment
analysis. These techniques are rather complex, at times counterintuitive, and often assumption-laden.
The authors provide readers with a how-to guide to implementing these models, and think it is
critical to explain the techniques in depth so that users can deploy them appropriately and interpret
the results. This book explains the theoretical grounds for the techniques developed and serves as a
bridge between the discussion of the pitfalls of social media mining and the execution of that mining.
Social Media Mining with R argues for the value of data and explains, from a social science
perspective, how to mine data from the web. Readers are not assumed to know R or statistical analysis;
the book gives them the practical tools required to execute sophisticated data mining techniques on
data from the web.
Social Media Mining with R begins by introducing the reader to the topic of social media data,
including its sources and properties. The book then explains the basics of R programming in a
straightforward, unassuming way. Thereafter, the authors make readers aware of the inferential
dangers associated with social media data, and how to avoid them, before describing and
implementing a suite of social media mining techniques.
Readers will learn the basics of R, social media, and data mining. If you have ever been interested in
programming, social media, supervised or unsupervised learning, data science, or big data,
particularly as it relates to finding value in data on the web, then this book is for you. We are excited
to share our experiences with you!
Overall, Social Media Mining with R provides a light theoretical background, comprehensive instruction,
and state-of-the-art techniques, such that readers will be well equipped to embark on their own
analyses of social media data.
You will learn the basics of programming in R, the world’s fastest growing, most flexible, open source
statistical programming language. You will also learn about both the pitfalls and the possibilities
inherent in social media data. Most consequentially, you will learn the skills necessary to implement
non-trivial social media analyses, as well as:
• The basics of R and various data types
• Social science research
• Data potential, pitfalls, and inferential gotchas
• Supervised and unsupervised learning
• Social media
• Visualization and some cognitive pitfalls
• Sentiment analysis
Social Media Mining with R is intended for a wide audience, from the undergraduate who wishes
to get hands-on experience working with social data from the web to practitioners wishing to expand
their competencies and learn unsupervised sentiment analysis. No previous experience with R or
statistics is required, though having knowledge of both will enrich your experience.
About the Authors:
Richard Heimann leads the Data Science team at L-3 Data Tactics and focuses on advanced
analytics, data science, and cloud computing. He has followed the big data turn closely and consults
with government and industry on its implications.
In addition to teaching Human Terrain Analysis (HTA) at George Mason University, Richard is
adjunct faculty at the University of Maryland, Baltimore County, where he stimulates and facilitates
similar discussions built upon related principles. His investigation has been a nearly 15-year critique
of current theory and methodology. Find him on Twitter: @rheimann
Richard has also participated in Operation Iraqi Freedom and Operation Enduring Freedom -
Afghanistan, most recently in 2012 where he worked directly with the 82nd Airborne Division in
Kandahar. He has recently supported the Pentagon, Department of Homeland Security and the Defense
Advanced Research Projects Agency.
Nathan Danneman has a background in the quantitative study of international and civil conflict.
Recently, his research has included the analysis of textual and geospatial data and the study of
multivariate outlier detection. Nathan is currently a Data Scientist at Data Tactics, and he supports
programs at DARPA and the Department of Homeland Security. Find him on Twitter: @ndanneman