Carma internet research module: Future data collection

Published in Education, Technology, Business
Transcript

  • 1. Future Data Collection
    CARMA Internet Research Module
    Jeff Stanton
  • 2. Several Promising Environments and Techniques
      • Visual Surveys
      • Audio and video interviewing
      • Virtual Worlds
      • Web scraping
      • Network extraction and mapping
      • Polls Everywhere
      • Mobility
  • 3. Visual Surveys
    Visual DNA: http://www.visualdna.com/
    Also try: http://www.youniverse.com/
  • 4. Visual Surveys
      • Provides an engaging alternative to text-based surveys; more fun for respondents
      • Requires considerable set-up time: each screen is like an item, each picture is like an item response, and every item and response must be keyed against one or more criteria
      • Example: the item on the previous page, “How do you approach stress,” could be keyed against other subjective or objective measures of stress, coping, general health, immune response, etc.
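The keying step described on this slide can be sketched in a few lines of base R. This is only an illustration: the picture labels, the key values, the "coping" scale, and the respondent data are all invented for the example and do not come from Visual DNA or any real instrument.

```r
# Hypothetical key: each picture (response option) on one screen (item)
# is assigned a score on an invented "coping" criterion.
stress_key <- c(beach = 3, gym = 2, sofa = 1, desk = 0)

# Invented picture choices from five respondents
choices <- c("gym", "beach", "desk", "beach", "sofa")

# Look up the keyed score for each respondent's chosen picture
coping_score <- unname(stress_key[choices])
print(coping_score)

# Invented criterion measure (e.g., a self-reported stress scale)
# for the same five respondents
self_reported_stress <- c(4, 2, 7, 3, 6)

# Validating the key means checking how keyed scores relate to the criterion
key_validity <- cor(coping_score, self_reported_stress)
print(key_validity)
```

In practice every screen would need its own key, and each key would be checked against one or more criteria in this way before the survey is used.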
  • 5. Audio and Video Interviewing
      • Methods: Structured, semi-structured, and unstructured interviewing; focus groups
      • Products: Skype, WebEx, Adobe Connect, Cisco Telepresence
      • Advantages: Reduced travel costs, speed
      • Disadvantages: High bandwidth, user technology requirements, unreliable connections
  • 6. Virtual Worlds
    VastPark: http://www.vastpark.com/
  • 7. Virtual Worlds
      • Combine text and audio chat with social networking and 3D model building
      • Methods: Structured, semi-structured, and unstructured interviewing; focus groups; unobtrusive observation; participant observation; possibly some experimental perceptual, cooperative, or navigational tasks
      • Products: VastPark, OpenSim, EduSim, Teleplace
      • Advantages: Speed, low cost
      • Disadvantages: Steep learning curve; high bandwidth; user technology requirements; unreliable connections
  • 8. Web Scraping
  • 9. Web Scraping
      • Retrieval and processing of text or images, e.g., from blogs; processing may include semantic analysis of people, events, emotions
      • Methods: Archival document analysis
      • Products: Hundreds of commercial products, mainly focused on brand, reputation, and marketing; open source product: WebHarvest
      • Advantages: Data are plentiful and cover a wide range of topics
      • Disadvantages: Technology hard to master; even after considerable automated processing, analysis has an intensive, qualitative flavor
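As a rough sketch of the "retrieval and processing" idea, the base R snippet below strips markup from a page and counts words. The HTML string is invented and stands in for a downloaded blog post; a real project would fetch pages over the network and would likely use a dedicated scraping tool rather than regular expressions.

```r
# Invented HTML snippet standing in for a downloaded blog post
page <- paste0("<html><body><h1>Solar news</h1>",
               "<p>Solar panels are cheaper. Panels last longer.</p>",
               "</body></html>")

# Replace tags with spaces, lower-case the text, and split on non-letters
text  <- gsub("<[^>]+>", " ", page)
words <- strsplit(tolower(text), "[^a-z]+")[[1]]
words <- words[nchar(words) > 0]   # drop empty strings left by the split

# Word frequencies, most common first
freq <- sort(table(words), decreasing = TRUE)
print(head(freq, 3))
```

Counting words is only the automated first pass; deciding what the frequent terms mean is where the qualitative work noted above begins.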
  • 10. Make a Wordcloud with Twitter and R
      • Download R, the open source statistical platform; for more fun, also download RStudio; both are available for Windows, Mac, and Linux
      • You will need four packages to make a word cloud: twitteR, stringr, tm, and wordcloud
          • Use the install.packages() and library() commands to prepare the packages for use in R
      • Code appears on the following page; explanation is in my free eBook, Introduction to Data Science, on the iTunes Bookstore
  • 11. R code for the word cloud:
        # Load the four packages named on the previous slide
        library(twitteR)    # searchTwitter()
        library(stringr)    # str_replace(), str_replace_all()
        library(tm)         # Corpus(), VectorSource(), TermDocumentMatrix()
        library(wordcloud)  # wordcloud()

        # TweetFrame() - Return a dataframe based on a search of Twitter
        TweetFrame <- function(searchTerm, maxTweets) {
          tweetList <- searchTwitter(searchTerm, n = maxTweets)
          tweetDF <- do.call("rbind", lapply(tweetList, as.data.frame))
          # This last step sorts the tweets in arrival order
          return(tweetDF[order(as.integer(tweetDF$created)), ])
        }

        # CleanTweets() - Takes the junk out of a vector of tweet texts
        CleanTweets <- function(tweets) {
          tweets <- str_replace_all(tweets, "  ", " ")                          # double spaces
          tweets <- str_replace_all(tweets, "http://t.co/[a-zA-Z0-9]{8}", "")   # shortened URLs
          tweets <- str_replace(tweets, "RT @[a-zA-Z]*: ", "")                  # retweet header
          tweets <- str_replace_all(tweets, "#[a-zA-Z]*", "")                   # hashtags
          tweets <- str_replace_all(tweets, "@[a-zA-Z]*", "")                   # mentions
          return(tweets)
        }

        # Command line code
        tweetDF <- TweetFrame("#yourhashtag", 100)
        cleanText <- CleanTweets(tweetDF$text)
        tweetCorpus <- Corpus(VectorSource(cleanText))
        tweetTDM <- TermDocumentMatrix(tweetCorpus)
        tdMatrix <- as.matrix(tweetTDM)
        sortedMatrix <- sort(rowSums(tdMatrix), decreasing = TRUE)
        cloudFrame <- data.frame(word = names(sortedMatrix), freq = sortedMatrix)
        wordcloud(cloudFrame$word, cloudFrame$freq)
  • 12. Example Wordcloud: Hashtag “#solar”
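To see what the tweet-cleaning step does without a Twitter connection, the sketch below applies the same sequence of substitutions to two invented tweets. It uses base R's gsub() and sub() in place of stringr so that it runs with no packages installed; the sample tweets, the function name clean_tweets_base, and the final whitespace trimming are additions for this illustration, not part of the slide code.

```r
# Two invented tweets containing the kinds of junk the cleaning step targets
tweets <- c("RT @someone: Solar is  booming http://t.co/abcd1234 #solar",
            "@friend check the new  panels #energy")

# Hypothetical base R counterpart of the cleaning function
clean_tweets_base <- function(x) {
  x <- gsub("http://t.co/[a-zA-Z0-9]{8}", "", x)  # shortened URLs
  x <- sub("RT @[a-zA-Z]*: ", "", x)              # retweet header (first match only)
  x <- gsub("#[a-zA-Z]*", "", x)                  # hashtags
  x <- gsub("@[a-zA-Z]*", "", x)                  # mentions
  x <- gsub(" +", " ", x)                         # runs of spaces left behind
  trimws(x)                                       # leading and trailing spaces
}

cleaned <- clean_tweets_base(tweets)
print(cleaned)
```

The cleaned vector is what would be passed on to the corpus and term-document matrix steps that produce the word frequencies for the cloud.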
  • 13. Network Mapping
  • 14. Mapping Social Networks
      • Nicholas Christakis of the Framingham Heart Study has shown the power of social networks to influence a variety of health outcomes
      • Methods: Traditional self-report and objective measures; topographical measures such as network centrality; “neighbor” measures
      • Products: Depends on data types; TouchGraph is a network web search engine; InFlow; UCInet; see: http://en.wikipedia.org/wiki/Social_network_analysis_software
      • Advantages: Meaningful improvement in predictive capability
      • Disadvantages: Intensive technique requires careful planning and setup; data collection difficult and time consuming
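Network centrality can be illustrated with its simplest form, degree centrality: the number of ties each person has. The sketch below computes it in base R from an invented list of friendship ties; the names and ties are hypothetical, and dedicated tools such as those listed above offer many more measures.

```r
# Invented undirected friendship ties, one row per tie
edges <- data.frame(
  from = c("Ann", "Ann", "Ann", "Bob", "Cat"),
  to   = c("Bob", "Cat", "Dan", "Cat", "Eve"),
  stringsAsFactors = FALSE
)

# Degree centrality: count how often each person appears at either end of a tie
degree <- sort(table(c(edges$from, edges$to)), decreasing = TRUE)
print(degree)

# Normalized degree: share of the other people each person is tied to
normalized <- degree / (length(degree) - 1)
print(round(normalized, 2))
```

Here Ann and Cat are each tied to three of the four other people, so a centrality score like this could be entered as a predictor alongside traditional self-report measures.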
  • 15. Facebook Polls
  • 16. Embedded Polls
      • Collection of short-format survey data from social networking and membership sites
      • Methods: Primarily standard, closed-ended self-report; single item scales
      • Products: Example: Vizu provides a “widget” that allows embedding of polls on Facebook pages
      • Advantages: Quick, cheap, possible to get a large sample in a short time
      • Disadvantages: Difficult to control access; short format limits use of multi-item scales
  • 17. Mobility
    http://www.surveyonthespot.com/
  • 18. Data Collection from Mobile Devices
      • Using smartphones and other mobile devices as a basis for interacting with participants
      • Methods: Primarily self-report but can include location and movement data
      • Products: Example: Survey On The Spot allows location-aware surveys to be delivered to smartphones; TrailGuru collects route data from hikers and joggers
      • Advantages: Platform is becoming ubiquitous; location data provide new options for understanding behavior
      • Disadvantages: Privacy issues, small screen, complex programming interfaces
  • 19. iPhone Fun
    Reaching Mobile Participants
      • Micropayment system built into the platform
      • Feasible for short instruments
      • Can be tied to particular experiences, e.g., museum visits
      • Responses can be geotagged to support mapping
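One way geotagged responses might be used is to measure how far each response was submitted from a site of interest, such as a museum. The sketch below does this in base R with the haversine great-circle formula. The coordinates, the 0.5 km "on site" cutoff, and the function name haversine_km are assumptions made up for the example.

```r
# Great-circle distance in kilometres between points given in decimal degrees
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(sqrt(a))
}

# Invented geotagged survey responses and an invented museum location
responses <- data.frame(
  id  = c("r1", "r2", "r3"),
  lat = c(43.04, 43.05, 43.10),
  lon = c(-76.14, -76.15, -76.20)
)
museum <- c(lat = 43.04, lon = -76.14)

# Distance of each response from the museum, and a hypothetical on-site flag
responses$km_from_museum <- haversine_km(responses$lat, responses$lon,
                                         museum[["lat"]], museum[["lon"]])
responses$on_site <- responses$km_from_museum < 0.5
print(responses)
```

A flag like on_site could then separate responses given during the visit from those completed later elsewhere.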