Mining Smartphone Data to Classify Life-Facets of Social Relationships at CSCW 2013

People engage with many overlapping social networks and enact diverse social roles across different facets of their lives. Unfortunately, many online social networking services reduce most people’s contacts to “friend.” A richer computational model of relationships would be useful for a number of applications such as managing privacy settings and organizing communications. In this paper, we take a step towards a richer computational model by using call and text message logs from mobile phones to classify contacts according to life facet (family, work, and social). We extract various features such as communication intensity, regularity, medium, and temporal tendency, and classify the relationships using machine-learning techniques. Our experimental results on 40 users showed that we could classify life facets with up to 90.5% accuracy. The most relevant features include call duration, channel selection, and time of day for the communication.

Authors are Jun-Ki Min, Jason Wiese, Jason I. Hong, John Zimmerman

Published in: Technology, Economy & Finance

  1. Mining Smartphone Data to Classify Life-Facets of Social Relationships
     CSCW 2013
     Jun-Ki Min, Jason Wiese, Jason Hong, John Zimmerman
     Computer Human Interaction: Mobility Privacy Security
     Image source: BYU Photo
  2. “Lost Job Because of Social Media”
     Carnegie Mellon University
     Image source: Web, Educator
  3. Faceted Identity, Faceted Lives
     Farnham and Churchill, Faceted identity, faceted lives: Social and technical issues with being yourself online, CSCW 2011.
     Ozenc and Farnham, Life "modes" in social media, CHI 2011.
  4. Faceted Identity, Faceted Lives
     •  People have many identities
     •  Family, Work, and Social are very common contexts
     •  But those “facets” are often incompatible
     Farnham and Churchill, Faceted identity, faceted lives: Social and technical issues with being yourself online, CSCW 2011.
     Ozenc and Farnham, Life "modes" in social media, CHI 2011.
  5. Our Real-World Social Graph
     Diagram labels: Social, School, Others, Client, Hobby, Roommate, University, Soccer, Pizza, Doctor, Online friend
  6. What Social Media Knows
     Diagram labels: “Friends”, School, Family, Neighbor, Work
  7. People Don’t Use Groups
     Only 16% of mobile phone users and 5% of Facebook users create groups [Grob 2009, Cluestr: mobile social networking for enhanced group communication; CBSNews 2011, http://www.cbsnews.com/8301-501465_162-20105561-501465.html]
  8. Related Work
     •  Strength of ties between individuals
        o  Predicting tie strength: Gilbert 2009
     •  Community detection
        o  Newman 2006, Rosvall 2008, Doreian 2005
        o  Smartphone proximity networks: Do 2012
     •  Group-based privacy control
        o  Facebook friend grouping: Kelley 2011
        o  Multiple group co-presence: Lampinen 2009
     •  Friendship detection
        o  Identify friendship: Eagle 2009
     •  Our work: Classify life-facets
  9. Overview of Our Work
     •  Recruited 40 participants
     •  Took smartphone-available data
     •  Built a machine-learning model to infer life-facets
     •  Achieved around 90% accuracy classifying Family / Work / Social (F / W / S)
     Diagram labels: “Friends”, Family, Work, Social
     Infer life-facets as a first step towards an augmented social graph
  10. User Study
      •  Recruiting criteria
         o  Have used the same Android phone for six months
         o  Facebook (FB) membership (at least 50 friends)
      •  40 participants
         o  13 male and 27 female (age = 19 ~ 50)
         o  55% student, 35% employed, 10% unemployed
      •  Data collection
         o  Phone: Contact info (anonymized), call and SMS logs
         o  Facebook: Friend list from a Facebook backup file
         o  Self-report: Demographic info, group of relationships, closeness (1: feel very distant ~ 5: feel very close), etc.
  11. User Study: Process
      For each participant:
      1.  Ask them to upload smartphone and Facebook data
      2.  Create a list of 70 contacts
          1)  Ask about people they live with, family members, people they work with, feel close to, and do hobbies with
          2)  Pick the contacts they communicate with most frequently (from phone and Facebook logs)
          3)  Randomly select the rest until the list reaches 70 people
      3.  Ask them to provide information about each of the 70 contacts
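The three-step list construction on slide 11 could be sketched roughly as below. This is only an illustration: the function name, the argument names, and the `top_k` cutoff for "most frequent" contacts are assumptions, since the slide does not say how many frequent contacts were taken before the random fill.

```python
import random

def build_contact_list(named, comm_counts, all_contacts, size=70, top_k=20, seed=0):
    """Assemble a study list: named contacts, then frequent ones, then random fill."""
    chosen = []

    def add(contact):
        # Skip duplicates and stop once the list is full.
        if contact not in chosen and len(chosen) < size:
            chosen.append(contact)

    # Step 1: contacts the participant named (housemates, family, coworkers, ...).
    for c in named:
        add(c)
    # Step 2: most frequently communicated contacts (top_k is an assumed cutoff).
    frequent = sorted(comm_counts, key=comm_counts.get, reverse=True)[:top_k]
    for c in frequent:
        add(c)
    # Step 3: fill the remainder with randomly ordered remaining contacts.
    rest = [c for c in all_contacts if c not in chosen]
    random.Random(seed).shuffle(rest)
    for c in rest:
        add(c)
    return chosen
```

With a phonebook smaller than `size`, the function simply returns every contact it can find.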
  12. Basic Descriptive Statistics of Data
      •  About 6 months of data per participant
         o  By default, the Android system keeps the last 500 calls (200 SMS per contact)
      •  #Contacts in a phonebook: 14 ~ 2355 (Q1 = 236, Med = 519, Q3 = 962)
         o  #Contacts that had phone numbers: 12 ~ 772 (Q1 = 125, Med = 172, Q3 = 301)
      •  60% of communications were made with just 1 to 31 contacts (Q1 = 5, Med = 6, Q3 = 8)
  13. Groups Created by Participants
      Facet, category, and group names created by participants:
      •  Family
         o  Immediate family: Parents, Close Family, Siblings, Children, …
         o  Extended family: Cousins, Uncle, Brother-in-laws, Mother’s side family, …
         o  Significant other: Boyfriends, Husband, Ex-boyfriends, Sig other, …
      •  Work
         o  Work: Friends of work, Clients, Previously worked with, …
      •  Social
         o  School: UIC, Indiana high school, Roommates this year, …
         o  Hobby: Poker, Marathon, Chess, Old dance people, …
         o  Neighborhood: Neighbors, Roommate, Met while lived in Morgan, …
         o  Religious: Church friends, …
         o  Family friend: Friends of parents, Children’s friends’ parents, …
         o  Know through: People from Greensburg, Boyfriend’s friends, …
      •  Others: Facebook friends, My doctor, Not sure, Mentor, …
  14. Size of Facets in Our Data
      •  Sizes of facets are imbalanced [Dong 2011]
      •  Family: 14%
      •  Work: 11%
      •  Social: 70%
      •  Others: 4%, Missing: 1%
      •  Family-Social: 0.9%
      •  Work-Social: 2.3%
  15. Features from Communication Logs (132 features)
      •  Communication intensity [Hill 2003, Roberts 2011]
         o  #Calls, length of SMS, …
      •  Communication regularity [Do 2011]
         o  #Calls per week, #Days called, …
      •  Temporal tendency of communication [Eagle 2006]
         o  #Calls at time of a day, …
      •  Communication channel selection [Mesch 2009]
         o  #Calls vs. #SMS, #Outgoing vs. #Incoming, …
      •  Maintenance cost [Roberts 2011]
         o  #Calls for the past two weeks, …
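To make the feature categories on slide 15 concrete, here is a minimal sketch that derives one or two features per category from a single contact's log. The record layout (`kind`, `time`, `duration_s`, `outgoing`) and the feature names are hypothetical; the slides do not enumerate the 132 features or describe the log schema.

```python
from datetime import datetime

def contact_features(events):
    """events: list of dicts with keys kind ('call' or 'sms'), time (datetime),
    duration_s (calls only), outgoing (bool). All field names are assumptions."""
    calls = [e for e in events if e["kind"] == "call"]
    total = len(events)
    weekend_calls = sum(e["time"].weekday() >= 5 for e in calls)
    return {
        # Communication intensity
        "n_calls": len(calls),
        "total_call_duration_s": sum(e["duration_s"] for e in calls),
        # Communication regularity: distinct days with at least one call
        "n_days_called": len({e["time"].date() for e in calls}),
        # Temporal tendency: share of calls placed on Saturday or Sunday
        "weekend_call_ratio": weekend_calls / len(calls) if calls else 0.0,
        # Channel selection: calls vs. all communication, outgoing vs. all
        "call_ratio": len(calls) / total if total else 0.0,
        "outgoing_ratio": sum(e["outgoing"] for e in events) / total if total else 0.0,
    }
```

Ratios rather than raw counts are used for the channel and temporal features so that contacts with very different overall volumes remain comparable.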
  16. Features from Phonebook and Others
      Phonebook data: 17 features
      •  Similarity between a user and contact
         o  Email, phone#, zip-code, …
      •  Effort to fill the contact info
         o  %Completion of the info, has-note, …
      Self-reported data: 4 features
      •  Social media’s profile info
         o  Is-same gender, age-difference, is-Facebook friend
      •  Frequently seen (could be from Bluetooth or GPS) [Cranshaw et al. 2010, Do and Gatica-Perez 2011]
  17. Which Features Are Most Correlated?
      Figure labels: Self-reported features, Phonebook features, Communication features; Family, Work, Social; (Favorite list)
  18. Evaluation (Classify Contact by Life Facets)
      •  Dataset
         o  70-contact list: 2680 contacts
         o  In-phonebook list: 1847 contacts
         o  Communication list: 817 contacts
      •  Conduct three runs of ten-fold cross validation
      •  Machine-learning algorithms
         o  Decision tree C4.5: Rule-based model
         o  Naïve Bayes (NB) classifier: Probabilistic model
         o  SVMs: Statistical model
      Diagram labels: 70-contact list, In-phonebook list, Communication list
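The "three runs of ten-fold cross validation" protocol on slide 18 can be sketched in plain Python as follows. The classifier here is a majority-class stand-in, not the C4.5, Naïve Bayes, or SVM models from the talk, and the function names are assumptions made for illustration.

```python
import random
from collections import Counter

def repeated_kfold_accuracy(X, y, train_and_predict, k=10, runs=3, seed=0):
    """Return one accuracy per fold over `runs` reshuffled k-fold splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        # Deal shuffled indices into k folds of (nearly) equal size.
        folds = [idx[i::k] for i in range(k)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            preds = train_and_predict(
                [X[i] for i in train_idx],
                [y[i] for i in train_idx],
                [X[i] for i in test_idx],
            )
            correct = sum(p == y[i] for p, i in zip(preds, test_idx))
            scores.append(correct / len(test_idx))
    return scores

def majority_baseline(X_train, y_train, X_test):
    """Stand-in classifier: always predict the most common training label."""
    label = Counter(y_train).most_common(1)[0][0]
    return [label] * len(X_test)
```

Because the facets are imbalanced, a baseline like this is a useful reference point: on data where one class makes up 70% of the labels it already scores about 70%, so reported accuracies are best read against that floor.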
  19. Classification Accuracy (%) for SVMs
      Data set           | Gender, age, is-Facebook friend, #seen | Phonebook features | Comm. features | Phonebook & Comm. | Using all
      70-contact list    | 65.5(4.4)                              | N/A                | N/A            | N/A               | 81.0(4.5)
      In-phonebook list  | 66.7(7.4)                              | 51.1(7.2)          | 67.9(6.4)      | 68.5(6.0)         | 83.1(5.9)
      Communication list | 60.8(10.)                              | 52.9(8.4)          | 87.1(5.0)      | 88.0(5.3)         | 90.5(4.8)
      N/A: Some Facebook friends were not in the phonebook
  20. Confusion Matrix (In-Phonebook list)
      •  Using the features from own smartphone (phonebook and communication)
      •  + Social media profile (Gender, Age-diff, FB friendship) and #Seen (GPS/Bluetooth co-location)
      Many “Work” and “Social” contacts had only a few communication logs
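The matrices themselves did not survive in this transcript, so as a reminder of how such a table is built, here is a small sketch that tallies actual versus predicted facets. The label names and helper functions are illustrative assumptions, and any data passed in would be the reader's own, not the study's.

```python
from collections import Counter

FACETS = ["family", "work", "social"]

def confusion_matrix(actual, predicted, labels=FACETS):
    """Rows are actual facets, columns are predicted facets."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

def per_class_recall(matrix):
    """Fraction of each actual facet that was classified correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]
```

Per-class recall is worth reading alongside overall accuracy here, since contacts with few logged communications can be absorbed into the largest class without moving the headline number much.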
  22. Top 3 Info Gain Features for Family-Facet
      •  Total duration of calls (Info Gain = 0.547): 90% had more than 588 sec. of calls
      •  Total #lengthy-calls (0.481): 73% had more than 2 lengthy-calls
      •  #Calls / Total communications on Sunday (0.478): for 83%, more than 2% of phone comm. on Sunday were calls
  23. Top 3 Info Gain Features for Work-Facet
      •  Total duration of calls (Info Gain = 0.225): 65% had 17 ~ 588 sec. of calls
      •  Total #calls (0.217): 52% had 2 ~ 4 calls
      •  Duration of calls on weekdays / Total calls (0.144): for 48%, more than 20% of calls were made on weekdays
  24. Top 3 Info Gain Features for Social-Facet
      •  Total duration of calls (0.442): 72% had less than 17 sec. of calls
      •  #Days called for the past six months (0.441): 75% had been called on fewer than 2 days
      •  #Calls / Total communications (0.421): for 80%, less than 7% of phone comm. were calls
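The information-gain scores on slides 22 to 24 rank features by how much they reduce label entropy. A minimal sketch of that calculation for a single threshold split is shown below; the slides do not say how the features were discretized, so the single-threshold form and the function names are assumptions.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values()) if n else 0.0

def info_gain(values, labels, threshold):
    """Entropy reduction from splitting the data on value > threshold."""
    above = [l for v, l in zip(values, labels) if v > threshold]
    below = [l for v, l in zip(values, labels) if v <= threshold]
    n = len(labels)
    remainder = (len(above) / n) * entropy(above) + (len(below) / n) * entropy(below)
    return entropy(labels) - remainder
```

For a two-class problem such as "family vs. not family", the gain ranges from 0 (the split tells us nothing) to at most 1 bit (the split separates the classes perfectly on a balanced sample).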
  25. Discussion
      •  Un-measurable factors in relationships
         o  Duration (years-known) and history
      •  Other communication channels
         o  Email, VoIP (Skype), social media, …
      •  F / W / S facets could be extended
         o  Facets are not static (School: Work → Social)
         o  Sub-facets (Work: Coworker vs. Client)
         o  Additional facets (Services: Pizza shop, Plumber, Doctor, …)
  26. Summary
      •  Augmented social graph
         o  Privacy control, social graph management
      •  Classify contacts into Family, Work, and Social
         o  The most relevant factors include intensity, channel selection, and temporal tendency
         o  Achieved around 90% accuracy using a machine-learning algorithm
      Diagram labels: “Friends”, Family, Work, Social
  27. Thanks!
      •  More info at cmuchimps.org or email loomlike@cs.cmu.edu
      •  Special thanks to:
         o  DARPA
         o  Google