Data mining for social media

Introduction to Data Mining for Social Media

Published in: Technology, Business

  1. Data Mining for Social Media
     VNG Corporation - R&D Team
     4/23/2011
  2. Content
     • Social Media Growth
     • Social Media Data
     • Data Mining for Social Media
     • Conclusion & Discussion
  3. 1. Social Media Growth
     • Top sites globally: Google, Facebook, YouTube, Yahoo, Live, Baidu, Wikipedia, Blogger, MSN, Tencent, Twitter
     • Top sites in Vietnam: Google, Vnexpress, Zing.vn, Yahoo, YouTube, Facebook, Dantri.com.vn, 24h.com.vn, Mediafire, Vatgia.com
  4. 1. Social Media Growth: Some Statistics
     • Facebook - largest social network site
         • 600,000,000 users, half of whom log in every day
         • 35,000,000,000 online friendships
         • 900,000,000 objects people interact with
         • 30,000,000,000 shared content items per month
     • YouTube - largest video sharing site
         • 2,000,000,000 views per day
         • 1,000,000 video hours uploaded per month
     • Twitter - largest microblogging site
         • 200,000,000 users per month
         • 65,000,000 tweets per day (about 750 per second)
         • 8,000,000 followers of the most popular user
     • ZingMe - largest Vietnamese social network
         • 35,000,000 users, 10,000,000 monthly active
         • 260,000,000 online friendships
         • Plenty of services: music, video, karaoke, games, news, chat, photo, blog …
  5. 2. Social Media Data
     • Social media data is everywhere
     • Social overload:
         • Information overload: blogs, microblogs, forums, wikis, news, bookmarked web pages, photos, videos, etc.
         • Interaction overload: friends, followers, followees, commenters, co-members, voters, "likers", taggers, etc.
     • How can useful information be extracted from this chaos?
  6. 2. Social Media Data: Opportunities
     • Social media captures the pulse of humanity!
     • We can directly study the opinions and behaviors of millions of users to gain insights into:
         • Human behaviors
         • Marketing analytics, product sentiment
     • Applications & problems:
         • WWW: search, information retrieval (group web sites or documents)
         • Targeted marketing: identify groups of customers or products to make recommendations (targeted advertising, viral marketing)
         • Personalization (interfaces, services)
         • Epidemiology, fraud detection, security (counterterrorism)
         • …
  7. Quick Recap
     • Social Media Growth
     • Social Media Data
     • Data Mining for Social Media
         • Social Network as a Graph
         • Interesting Problems
             • Community Detection
             • Node Classification
             • Link Prediction & Tie Strength
             • Information Flow
     • Conclusion & Discussion
  8. 3. Data Mining for Social Media
     • Data mining in social networks:
         • Graph mining: friendship graphs, contact lists, and interactions between users.
         • Text mining: blogs, status updates, tweets, and texts or messages sent between users.
     • Some interesting problems for data miners:
         • Modeling information flow (e.g. viral marketing)
         • Modeling network evolution (e.g. link prediction)
         • Extracting information for learning (e.g. node classification, community detection)
  9. 3.1 Social Network as a Graph
     • A social network is a graph, but:
         • nodes can have attributes
         • edges (links) may or may not be weighted and/or directed
         • so the similarity (tie strength, affinity) between two nodes is f(attributes; links)
         • the network's graph is not a simple random graph (it has special structural properties)
     • Large-scale graphs
         • Mining of large-scale graphs
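The idea that similarity is a function of both attributes and links can be sketched as follows. This is only an illustration, not the deck's method: the toy users, interests, edge weights, the Jaccard attribute score, and the mixing parameter `alpha` are all assumptions.

```python
# Hypothetical toy network: user names, interests, and weights are made up.
attributes = {
    "an": {"music", "games"},
    "binh": {"music", "news"},
    "chi": {"games", "photo"},
    "dung": {"news"},
}

# Undirected weighted edges stored as a symmetric adjacency dict.
edges = {
    "an": {"binh": 3.0, "chi": 1.0},
    "binh": {"an": 3.0, "dung": 2.0},
    "chi": {"an": 1.0},
    "dung": {"binh": 2.0},
}

# Largest edge weight, used to scale link strength into [0, 1].
max_weight = max(w for nbrs in edges.values() for w in nbrs.values())


def jaccard(a, b):
    """Overlap of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(u, v, alpha=0.5):
    """Assumed form of f(attributes; links): a weighted mix of both parts."""
    attr_part = jaccard(attributes[u], attributes[v])
    link_part = edges[u].get(v, 0.0) / max_weight  # 0.0 when not connected
    return alpha * attr_part + (1 - alpha) * link_part


print(similarity("an", "binh"))
print(similarity("an", "dung"))
```

Setting `alpha` to 1 ignores the links entirely and setting it to 0 ignores the attributes, which makes the trade-off between the two information sources explicit.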
  10. 3.1 Social Graph Characteristics
      • Sparse networks: the number of links is proportional to the number of nodes.
      • Small-world effect:
          • The shortest path between two random nodes is small on average.
          • This property is related to the distribution of node degrees: scale-free network (Barabasi, 2000).
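The three characteristics above (sparsity, short paths, degree distribution) can each be measured directly. The sketch below uses a tiny hypothetical path graph and assumes an undirected, unweighted network stored as an adjacency dict; the function names are illustrative only.

```python
from collections import Counter, deque

# Hypothetical 4-node path graph 0 - 1 - 2 - 3 (undirected, unweighted).
path_graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}


def density(adj):
    """Fraction of possible links that exist; small values mean sparse."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def bfs_distances(adj, source):
    """Hop counts from source to every reachable node (breadth-first)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def average_shortest_path(adj):
    """Mean hop count over all reachable ordered pairs of distinct nodes."""
    total, pairs = 0, 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


def degree_histogram(adj):
    """Map each degree to the number of nodes having it."""
    return Counter(len(nbrs) for nbrs in adj.values())


print(density(path_graph))
print(average_shortest_path(path_graph))
print(degree_histogram(path_graph))
```

Running one breadth-first search per node is fine for a toy graph; on large-scale graphs the average path length is usually estimated from a sample of source nodes instead.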
  11. 3.2 Interesting Problems: Community Detection
      • Community detection in social networks:
          • Partition the graph into clusters
          • Find the (small) community around a given node
      • Why community detection?
          • Capture the network's dynamics
          • Allow local analysis of interactions
          • Reveal network properties without disclosing individuals' private information
      • Methods:
          • Clustering based on shortest-path betweenness
          • Clustering based on network modularity
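As a rough illustration of the modularity-based approach, the sketch below scores a given partition with the standard modularity measure Q = Σ_c [L_c / m − (d_c / 2m)²], where L_c is the number of links inside community c, d_c the sum of its node degrees, and m the total number of links. The two-triangle graph and the function name are assumptions; a real detection method would also search over partitions to maximize Q.

```python
# Hypothetical graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
two_triangles = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}


def modularity(adj, communities):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # total number of links
    q = 0.0
    for community in communities:
        members = set(community)
        # Each internal link is seen from both endpoints, hence the / 2.
        internal = sum(1 for u in members for v in adj[u] if v in members) / 2
        degree_sum = sum(len(adj[u]) for u in members)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Splitting along the bridge scores higher than one big community.
print(modularity(two_triangles, [{0, 1, 2}, {3, 4, 5}]))
print(modularity(two_triangles, [{0, 1, 2, 3, 4, 5}]))
```

Putting every node into a single community always gives Q = 0, so a positive score indicates more internal links than a random graph with the same degrees would have.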
  12. 3.2 Interesting Problems: Node Classification
      • Node classification for social networks:
          • Label nodes in the network to indicate demographic values, interests, beliefs, or other characteristics.
      • Applications: used as input for recommendation
          • Suggest new connections and objects.
          • Personalize ads to match users' interests.
          • Find communities based on interests or affiliation.
          • Study how ideas spread over time.
      • Methods:
          • Methods based on traditional classifiers using graph information
          • Graph-based methods
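One simple graph-based method is neighbor voting: an unlabeled node takes the most common label among its already-labeled neighbors, and the process repeats until nothing changes. The sketch below is a minimal version under assumptions (toy graph, made-up interest labels, alphabetical tie-breaking); the slides do not specify which graph-based method is meant.

```python
from collections import Counter

# Hypothetical friendship graph: two triangles joined by the edge 2-3.
friend_graph = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}


def propagate_labels(adj, seed_labels, max_rounds=10):
    """Spread known labels to unlabeled nodes by majority vote of neighbors."""
    labels = dict(seed_labels)
    for _ in range(max_rounds):
        updates = {}
        for node in adj:
            if node in labels:
                continue  # keep labels that are already assigned
            votes = Counter(labels[n] for n in adj[node] if n in labels)
            if votes:
                # Highest count wins; ties are broken alphabetically.
                updates[node] = min(votes, key=lambda lab: (-votes[lab], lab))
        if not updates:
            break
        labels.update(updates)  # apply all updates at once (synchronous)
    return labels


# Only two users have a known interest; the rest are inferred.
result = propagate_labels(friend_graph, {0: "music", 5: "games"})
print(result)
```

Applying the updates synchronously keeps the outcome independent of the order in which nodes are visited.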
  13. 3.2 Interesting Problems: Link Prediction & Tie Strength
      • Link prediction: given a snapshot of a social network, infer which new interactions among its members are likely to occur in the near future.
      • Tie strength: a combination of the amount of TIME, the emotional INTENSITY, the INTIMACY (mutual confiding), and the reciprocal SERVICES.
      • Applications:
          • Predict future friends.
          • Find influential users in the network.
          • Find possible links between users and objects (e.g. online items for sale).
      • Methods:
          • Supervised learning: decision trees, logistic regression, support vector machines …
          • Graph-based methods
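A common graph-based baseline for link prediction is to rank unconnected pairs by how many neighbors they share. The sketch below assumes that scoring rule and a made-up snapshot; the slides do not name a specific graph-based method.

```python
from itertools import combinations

# Hypothetical snapshot of an undirected friendship network.
snapshot = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}


def common_neighbor_scores(adj):
    """Rank currently unlinked pairs by their number of shared neighbors."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already connected, nothing to predict
        scores[(u, v)] = len(adj[u] & adj[v])
    # Highest score first; pair names break ties for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranked = common_neighbor_scores(snapshot)
for pair, score in ranked:
    print(pair, score)
```

The same scores could also serve as one input feature for the supervised learners listed above, alongside attribute-based features.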
  14. 3.2 Interesting Problems: Information Flow
      • Information flow through social media:
          • Analyze the underlying mechanisms for the real-time spread of information through online networks.
      • Motivating questions:
          • How do messages spread through social networks?
          • How can the spread of information be predicted?
          • How can the networks over which messages spread be identified?
      • Applications:
          • Indicate trends and attention
          • Predictive modeling of the spread of new ideas and behaviors
          • Search: real-time search, social search
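To make the question of how messages spread concrete, the sketch below simulates the independent cascade model, a widely used diffusion model that the slides do not name: each newly activated user gets one chance to pass the message to each follower with a fixed probability. The follower graph, the probability values, and the function names are assumptions.

```python
import random

# Hypothetical directed graph: followers[u] lists users who see u's posts.
followers = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}


def independent_cascade(adj, seeds, prob, rng):
    """Simulate one cascade and return the set of users who were reached."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for nbr in adj[node]:
                # Each newly active node gets one attempt per follower.
                if nbr not in active and rng.random() < prob:
                    active.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return active


def expected_spread(adj, seeds, prob, runs=1000, seed=0):
    """Average cascade size over repeated simulations (Monte Carlo)."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(adj, seeds, prob, rng)) for _ in range(runs))
    return total / runs


print(expected_spread(followers, ["a"], prob=0.3))
```

Comparing the estimated spread for different seed users is one simple way to look for influential users, which ties this model back to viral marketing.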
  15. 4. Conclusion and Discussion
      • Social media: rich, big & open data
          • Billions of users, billions of content items
          • Textual and multimedia content (images, videos, etc.)
          • Billions of connections
          • Behaviors, preferences, trends...
      • Challenges:
          • Large-scale problems
          • Noise in data
      • Recommender systems for users and enterprises:
          • Maintain users' interest and attract new users to the network
          • Targeted marketing: Show appropriate ads and items personalized for users to
          • Predict users' interests and trends: make effective plans
          • …
  16. Thank you for your attention!
