The International Conference on Robotics, Informatics, and Intelligent Technology (RIIT2009)
December 11th - 14th, 2009 at Bangkok, Thailand




                       AN IMPROVED PEOPLE-SEARCH TECHNIQUE
                       FOR DIRECTED SOCIAL NETWORK GRAPHS

Thiti Vacharasintopchai, Nguyen Huu Phong
School of Technology, Shinawatra University
Pathumthani, Thailand 12160 Email: thitiv@siu.ac.th, phongnh174@yahoo.com.vn

ABSTRACT

Social networks offer incredible opportunities for users to share content and experiences. The number of users joining these social networks has been rising dramatically. However, in a social network several users may share the same name. This causes name ambiguity, in which a search engine returns homogeneous search results for each queried name. To solve this problem, we propose an approach to improve search results for finding friends within a large social network by using friendships among users as our backbone feature. Our approach finds the most highly ranked seeds by using the PageRank algorithm before computing approximate shortest paths in a directed graph. We also retrieved real data from the social network Twitter to verify our approach. The results show that our approach outperforms the SeedBase approach, which selects seeds randomly, by a large margin.

Index Terms-- Search algorithm, social network analysis, authority analysis, shortest-path algorithm, graph algorithm

1. INTRODUCTION

Recently, social networks have gained explosive growth in popularity on the Web, and the number of people joining these networks is increasing significantly. These social networks assist users in creating a network of friends and help in maintaining relationships among long-distance friends, finding friends, and sharing information among networks. Moreover, in the very near future, social network sites will play an important role as a source of knowledge and information [1].

Popular social networks on the Web include MySpace and Twitter. These sites assist users in creating and customizing their personal information, blogs, multimedia, groups and other features. MySpace began in July 2003 and was the largest social network in the world in November 2006 with more than 130 million users [2]. Twitter, a microblogging service, has increased its number of users significantly since it began in October 2006 [3].

Research demonstrates that the majority of users' activity on social networks is searching for friends with whom they already have offline relationships, in order to reconnect with them [4, 5]. Users can find their friends by providing their names in addition to other information. However, searches on large popular sites return longer lists of irrelevant users than one can imagine.

In this paper, we propose an approach to improve searching for friends in a social network. Our approach employs the PageRank algorithm to find seeds in order to compute approximate shortest paths within a social network. We use the friendship among friends as the backbone feature. We also conduct experiments to verify the effectiveness of the proposed approach.

The rest of this paper is organized as follows. First, we investigate previous developments of people-search techniques in Section 2. Then we present our approach to search for friends in a social network in Section 3. Later, results are presented in Section 4. Finally, conclusions are discussed in Section 5.

2. RELATED WORK

The top web search providers such as Google and Yahoo offer standard search services where users can search by keying in keywords. A keyword may be the lyrics of a song, a movie trailer, the time of a fashion show, the title of a textbook, or the name of a friend. In the traditional method, search engines match the provided keyword to contents in their database and return a list of homogeneous search results to users.

In the web search and information retrieval area, the accuracy of search results provided by search engines is evaluated by a method called a ranking algorithm. PageRank is one of the well-known algorithms in this area [6, 7]. This algorithm takes the number of forward links and the number of back links to a web page as important factors to rank each web page [8]. In this way, the search engine returns to users a list of ordered and ranked web pages for each particular keyword. To get better search results, a user can provide further information about the searched keyword. For instance, when finding a friend, users could provide the name of the high school where they studied together and the school year, so that search engines are able to filter out irrelevant data and return more desired results. Furthermore, search results can be improved by using




implicit users' information such as social annotation [7] and relationship queries [9, 10].

To improve web search results, the authors of [7] discuss two algorithms, namely SocialSimRank and SocialPageRank. The former is based on the observation that when users browse and annotate a web page, this can be a good indicator of the web page content [7]. The latter is based on another observation, that the number of users who annotate a web page can demonstrate the quality of the web page [7]. That research shows that both types can improve web search significantly [7].

Another study observes that the top ranked web page pairs could contain relationships between the two entities, and that relationship can be used to improve the web search [10].

In the social network context, search engines could apply the same patterns as in the web search area for the particular purpose of people search, as mentioned in [5]. However, the use of this approach in searching could meet the same problem as in web search, which is returning the same search result for every user who provides the same keyword [5, 9].

In general, a social network can be represented as a graph in which vertices represent users and edges represent their relationships. The simplest form of relationship is friendship, where a user is a friend of another. In a social network, when a user searches for a person's name, they would likely recognize people who have a closer relationship with them. In other words, the friend a person is looking for is more likely a person who has the "shortest path" to them in their relationship graph. Figure 1 shows an example of searching for a person's name in the social network. The user named Thuan searches for his friend whose name is Huyen. Two results are returned, in which the first person is at a distance of 1 and the second person is at a distance of 2. The correct result should be the first person since she is closer to Thuan than the second person.

Figure 1: Searching for People Name in a Social Network

In this case, the authors of [5] use the approximate shortest path in a friendship graph as a factor in their ranking algorithms. These algorithms include on-the-fly ranking and seed-based ranking. The former algorithm computes distances at scoring time by running Breadth First Search (BFS) from a finder to all users [5]. The number of calculations is limited by stopping BFS after a desired hop. The latter algorithm uses seeds and computes approximate shortest distances from these seeds to all users [5]. The research demonstrates that the seed-based ranking algorithm outperforms other algorithms in terms of performance and precision [5].


In this section, we propose an approach to select better seeds than the approach discussed in Section 2, in which seeds are selected randomly before computing approximate shortest paths. In our approach, all vertices are first ranked by using the PageRank algorithm. Then these vertices are sorted in descending order of rank and the top-ranked vertices are selected as seeds. After that, these seeds are used to compute shortest paths from them to all vertices.

3.1 Seed Distances

According to the authors of [11, 12], for a given graph structure with n vertices and m edges, a query for the distance between any pair of vertices takes a smaller amount of time and space than computing all-pairs shortest paths when these distances are precomputed.

The authors of [5] applied the concept above by selecting a small fraction of vertices randomly. These seeds (landmarks) are used as navigational beacons in their friendship graph. Then shortest paths from these seeds to all vertices are computed. Later, the shortest path between any pair of vertices can be queried.

Figure 2 shows an example of the seed distance approach for the convenience of demonstration. Suppose that we need to compute the shortest path between Vertex 1 and Vertex 7. We also suppose that Vertex 5 is chosen as a seed. We first find the shortest path between Vertex 5 and Vertex 1; this shortest path is D_{51} = 1. In the same manner, the shortest path between Vertex 5 and Vertex 7 is D_{57} = 1. Finally, the shortest path between Vertex 1 and Vertex 7 is the sum of the above two shortest paths, which is D_{17} = D_{51} + D_{57} = 2. In this case, the shortest path is correct. However, in another case, such as when we need to compute the shortest path between Vertex 1 and Vertex 2 and the seed is Vertex 5, this approach gives the shortest path between Vertex 1 and Vertex 2 as D_{12} = D_{51} + D_{52} = 3, which is incorrect.

Table 1 shows the pre-processing result from seed vector 1. From this table, we can find seed distances for any pair of vertices by computing the sum of their distances to the seed.
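The single-seed estimate described above can be sketched in Python as follows. This is only an illustration under stated assumptions: Figure 2 is not reproduced here, so the adjacency list GRAPH is hypothetical, treated as undirected and unweighted, and chosen only so that the distances quoted in the text (D_{51} = 1, D_{57} = 1, D_{52} = 2) hold. The helper names bfs_distances and seed_estimate are likewise illustrative and do not come from [5]. Because the graph is unweighted, a plain BFS is enough to obtain exact hop counts from the seed.

```python
from collections import deque

# Hypothetical undirected graph (assumption): picked so that the distances
# quoted in the text hold, namely D_51 = 1, D_57 = 1 and D_52 = 2.
GRAPH = {
    1: [2, 3, 4, 5, 6],
    2: [1],
    3: [1],
    4: [1],
    5: [1, 7],
    6: [1],
    7: [5],
}


def bfs_distances(graph, source):
    """Return exact hop counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def seed_estimate(seed_dist, u, v):
    """Estimate the distance between u and v as d(seed, u) + d(seed, v)."""
    return seed_dist[u] + seed_dist[v]


# Pre-processing step: distances from the seed (Vertex 5) to all vertices.
seed_dist_5 = bfs_distances(GRAPH, 5)

# Exact distances from Vertex 1, used here only to check the estimates.
exact_from_1 = bfs_distances(GRAPH, 1)

estimate_1_7 = seed_estimate(seed_dist_5, 1, 7)  # matches the exact distance
estimate_1_2 = seed_estimate(seed_dist_5, 1, 2)  # overestimates the exact distance

print("1 -> 7: estimate", estimate_1_7, "exact", exact_from_1[7])
print("1 -> 2: estimate", estimate_1_2, "exact", exact_from_1[2])
```

In an undirected graph the estimate can never fall below the true distance (triangle inequality), so a poorly placed seed only ever overestimates, as in the Vertex 1 to Vertex 2 case.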




Figure 2: Example Graph of Vertices and Edges

In a social network with 100 million users, we would need to compute up to 10^16 times to know distances from all vertices. By using a small fraction of seeds, the runtime and space required can be reduced significantly.

Table 1: Example of Pre-processing Seed Distance

The approach in [5] selects the seeds randomly, which may cause lower accuracy than if better seeds are chosen. Therefore, we propose an approach to select the most important seeds instead of choosing seeds randomly. Since our social network forms a directed graph, which has a common feature with the web links in the PageRank algorithm, we decided to use the PageRank algorithm to select seeds in our approach.

3.2 PageRank

The PageRank algorithm is used to rank web pages based on the number of forward links and back links to a web page [8, 13, 14, 15, 16]. The intuition of this algorithm is that when users link from a web page to other web pages, this could indicate endorsement of the web page content [13, 14]. We observed that in our social network Twitter, one tweeter may follow other tweeters (friends) and may be followed by others (followers). Therefore, applying the PageRank algorithm could help to find better seeds.

We use the PageRank algorithm in our approach to rank each tweeter in our social network, in which friends act as forward links and followers act as back links. According to [8, 15], the PageRank algorithm can be stated as follows. Given a graph of Twitter, the number of Followers (F) that follow a Tweeter (T) is denoted n. A parameter d is the damping factor, which ranges between 0 and 1 and is set at 0.85. The number of Friends that a Tweeter follows is defined as C(T). The PageRank of a Tweeter, PR(T), is computed as follows:

PR(T) = (1 - d) + d \sum_{i=1}^{n} PR(F_i) / C(F_i)    (1)

To compute the PageRank of all Tweeters, we first set the PageRank of each of them to one. Then we iterate over all tweeters and compute their PageRank by using Equation 1.

Figure 3 shows an example directed graph in Twitter. In this graph, the user named Thuan has five followers. Each of them also has some other followers. From Equation 1, Thuan has the highest rank since he is followed by many important followers. The next ranked is Huyen, since she has more followers than Hanh and Thanh. The remaining followers are ranked equally.

Figure 3: Example Directed Graph in Twitter

3.3 Vectors Distances

Vectors distances of seeds consist of distances from a fraction of vertices (seeds) to all vertices. First, all vertices are ranked by using the PageRank algorithm as described in Section 3.2. Then they are sorted in descending order so that the highest ranked ones come first. Next, a fraction of seeds is selected from the top ones. Later, seed distances from these selected seeds to all vertices are computed. The exact shortest path between two given vertices is computed using the classical Dijkstra's algorithm as described in [17, 18] instead of using BFS and Map-reduce computation as presented in [5]. Though several algorithms achieve faster running time, such as implementing Dijkstra's algorithm with a Fibonacci heap [11, 19], this goes beyond our scope. Finally, the approximate shortest path between two given vertices is the smallest sum of shortest paths from these vertices to the selected seeds.

In Figure 2, suppose that we choose Vertex 1 and Vertex 5 as seeds and we need to find the approximate shortest




path between two vertices, Vertex 5 and Vertex 7. The seed distances from Vertex 1 to all vertices are D_1 = [0, 1, 1, 1, 1, 1, 2]. Also, the distances from Vertex 5 to all vertices are D_5 = [1, 2, 2, 2, 0, 2, 1]. The approximate shortest paths between the two vertices using the two seed distances are 3 and 1, respectively. The approximate shortest path as described above is the smallest distance, 1.

3.4 Experimental Setup

First, we selected a pair of vertices randomly since we did not have access to data logs from the data resource (Section 3.5) for name queries. We then computed the approximate shortest path between them using our approach. This result was compared to the on-the-fly ranking as described in [5] since it yields 100% accuracy.

In order to measure the performance of our approach, we implemented the seed-based ranking algorithm (SeedBase) as described in [5]. For comparative purposes, we modified this algorithm by replacing its combination of BFS and Map-reduce with Dijkstra's algorithm.

We ran each experiment 10 times to compute accuracies and running times. We performed experiments on a virtual machine with a 1.8 GHz processor and 512 MB RAM.

3.5 Data Collection

We evaluated the accuracy of our approach with real data from the social network Twitter. These data were gathered by using the snowball method described by [2]. The algorithm was executed through several steps: selecting tweeters as initial seeds, then running a BFS over all of their friends until it reaches a desired hop.

First, some tweeters were selected as initial seeds. These tweeters were retrieved from the Twitter public timeline, in which a list of 20 tweeters was generated randomly. As a result, these tweeters may not be connected to each other. Since the focus of this research is to examine the relationships among users, we decided to pick only one tweeter from the public timeline at a time.

Second, the number of friends of each chosen tweeter varies. Twitter limits the number of tweeters that one can follow to 2000. Even though this limitation can be lifted by increasing the number of those who follow, it can be observed that these tweeters are very likely to be representatives of an organization rather than individuals. For this reason, these tweeters are not retrieved.

Finally, the number of hops can also be chosen in variety, depending on the maximum desired size of the community. For example, if each tweeter has 10 friends and the number of hops is equal to 5, the maximum size of this community is 1 x 10^5.

4. EXPERIMENTS AND RESULTS

In this section, we present our results and discussion of the two methods: SeedBase and PageRank.

In Experiment 1, two sub-experiments were conducted using two different datasets. The maximum size of each dataset was set to 125. The first dataset contained 87 vertices and 103 edges. The second dataset contained 107 vertices and 120 edges. The number of vertices was less than the maximum number 125 since some tweeters had protected data which are only accessible by those in their friend lists. The number of seeds was varied from 1 to 10 in increments of 1. The mean accuracies of SeedBase and PageRank were compared to the on-the-fly ranking, which yields 100% accuracy. The outcomes are presented in Table 2 and Table 3, respectively. These data were also plotted in Figure 4 for the convenience of comparison.

Table 2: Accuracies of SeedBase and PageRank from Experiment 1 Dataset 1

#seed    SeedBase (%)    PageRank (%)
1        14              46
2        14              78
3        21              74
4        20              83
5        27              85
6        38              80
7        35              81
8        37              95
9        47              95
10       34              99

Table 3: Accuracies of SeedBase and PageRank from Experiment 1 Dataset 2

#seed    SeedBase (%)    PageRank (%)
1        4               41
2        15              41
3        21              52
4        17              81
5        26              78
6        30              83
7        27              79
8        43              86
9        39              88
10       41              87

With both datasets, the accuracies of both PageRank and SeedBase went higher as the number of seeds increased. However, PageRank outperformed SeedBase by a large margin. Even when given only one seed,




                                                                                                          In these experiments, the accuracies of both PageRank
                                                                                                     and SeedBase were also higher as the numbers of seed
                                                                                                     increased. PageRank's accuracies were between 18% and
                                                                                                     30% when -fue·first seed was given, whereas the accuracy of
      eo                                                                                             ~ed.Base was lower than 100/0. PageRank's accuracy
                                                                                                     mcreased constantly at first, then grew rapidly after seven or

                   I
                              __,___ ~~~~==-~ __:_ ~":~R~;k2
                                I                  I        .          I              1
                                                                                                     thirteen seeds and then kept increasing slowly until it
                                                                                                     reached up to above 90'% when the number of seeds was
                   I            !                  I            '.~.   I              I


           -   -
                   1         seedBase1~~
                                  -----/"4-- _
                           -----1----
                                               I
                                               ~-_
                                                                                                     between nineteen and twenty five. SeedBase's accuracy also
                   I            I                  ,        f    -               ":"   ,             increased constantly but reached up to only about 66% and

           --~-~-----~--
                                                                                                     32%.
      20                                                                                                  In summary,with all datasets the Page Rank
               /'                                                                                    outperfurmed the SeedBase significantly. In addition, it can
               ~I
       ~~--~2-----74----~6~--~8~--~'O~
                                                                                                     be seen that trends of accuracies of both PageRank and
                                                                                                     SeedBase were increas¢ as the numbers of seed went
                                                                                                     higher.

Figure 4: Accuracies of SeedBase and PageRank from Experiment 1, Dataset 1 and Dataset 2

PageRank's accuracy was between 40% and 50%, whereas the accuracy of SeedBase was below 20%. PageRank's accuracy increased slowly at first, then grew rapidly after two or four seeds. After that, it kept increasing slowly and exceeded 90% when the number of seeds was between eight and ten. SeedBase's accuracy increased steadily as the number of seeds increased but reached only 40%, which is much lower than that of PageRank.

In Experiment 2, two different datasets were also used. However, the maximum size of each dataset was increased to 1000. The first dataset contained 181 vertices and 230 edges. The second dataset contained 482 vertices and 550 edges. The number of seeds was varied from 1 to 25 in increments of 1. The mean accuracies of SeedBase and PageRank were compared against the on-the-fly ranking. The results are plotted in Figure 5.

Our Twitter network forms a directed graph in which each edge from one tweeter to another has a direction. As a result, a tweeter has a higher rank than others when many highly ranked tweeters follow it. Our results also agree with those in [20], where a centrality method is used to choose seeds (landmarks) in undirected graphs and the important vertices are those at the center of the graph, through which many shortest paths pass.

The PageRank method takes longer to run than SeedBase. The reason is that, under the PageRank equation (Equation 1), each vertex may be traversed several times in order to rank all vertices before seeds are picked, so in the worst case the runtime is O(n^2). In SeedBase, by contrast, the runtime is constant, O(1), since time is spent only on picking seeds. However, in our Twitter network the number of friends that a tweeter follows is many times smaller than the total number of tweeters, so the worst case is rarely approached and our approach is reasonable and effective.
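The runtime comparison above can be illustrated with a minimal Python sketch of the two seed-selection strategies. It is not the authors' PHP implementation. It assumes the standard damped PageRank formulation with a damping factor of 0.85 and a fixed number of iterations, which may differ from the paper's Equation 1, and the names `pagerank`, `pagerank_seeds`, and `random_seeds` are hypothetical.

```python
import random


def pagerank(follows, damping=0.85, iterations=50):
    """Rank users in a directed follow graph by power iteration.

    follows maps each user to the list of users they follow.
    Assumption: standard damped PageRank, not necessarily the paper's Equation 1.
    """
    nodes = set(follows)
    for targets in follows.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by users who follow nobody is spread evenly over all users.
        dangling = sum(rank[v] for v in nodes if not follows.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Every follow edge is visited once per iteration.
        for u, targets in follows.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank


def pagerank_seeds(follows, k):
    """Pick the k highest-ranked users as seeds (ties broken by name)."""
    rank = pagerank(follows)
    return sorted(rank, key=lambda v: (-rank[v], v))[:k]


def random_seeds(follows, k, rng=random):
    """Random baseline in the spirit of SeedBase: pick k users at random."""
    nodes = sorted(set(follows) | {v for t in follows.values() for v in t})
    return rng.sample(nodes, k)
```

In this sketch each iteration visits every follow edge once, so the ranking cost grows with the number of edges. That number can approach n^2 only if nearly everyone follows nearly everyone, which matches the remark that sparse follow lists keep the cost manageable.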
Figure 5: Accuracies of SeedBase and PageRank from Experiment 2, Dataset 1 and Dataset 2 (horizontal axis: number of seeds)

5. CONCLUSIONS

In this research, the approximate shortest path between tweeters in Twitter is used as the backbone factor in ranking search results. We applied our strategy by using the PageRank algorithm to select the most important tweeters as seeds before computing approximate shortest paths among them.

In terms of accuracy, the results show that our strategy outperforms the SeedBase method in [5], which selects seeds randomly, by a large margin. The results also show that high accuracy can be achieved with a small fraction of seeds. Our approach uses a small number of seeds (about 2%-5%) but yields very high accuracy. Applying this approach in social networks will make the search results for finding friends more effective. Future work includes reducing the preprocessing time by speeding up the seed-ranking process. The source code, implemented in the PHP programming language, is made available.
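As a rough illustration of how seeds can yield approximate shortest paths in a directed follow graph, the Python sketch below uses the generic landmark estimate d(u, v) ≈ min over seeds s of d(u, s) + d(s, v). This is an assumption about the general technique, not a reproduction of the authors' PHP code, and the names `bfs_distances`, `precompute`, `estimate_distance`, and `rank_candidates` are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop counts from source to every vertex reachable along adjacency."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def reverse_graph(follows):
    """Flip every edge so that a search from s finds users who can reach s."""
    reversed_edges = {}
    for u, targets in follows.items():
        for v in targets:
            reversed_edges.setdefault(v, []).append(u)
    return reversed_edges


def precompute(follows, seeds):
    """Run one forward and one backward breadth-first search per seed."""
    followers = reverse_graph(follows)
    from_seed = {s: bfs_distances(follows, s) for s in seeds}
    to_seed = {s: bfs_distances(followers, s) for s in seeds}
    return to_seed, from_seed


def estimate_distance(u, v, to_seed, from_seed):
    """Upper bound on the u -> v distance via the best seed, or None."""
    if u == v:
        return 0
    best = None
    for s in to_seed:
        if u in to_seed[s] and v in from_seed[s]:
            d = to_seed[s][u] + from_seed[s][v]
            if best is None or d < best:
                best = d
    return best


def rank_candidates(searcher, candidates, to_seed, from_seed):
    """Order same-name candidates by estimated distance; unreachable ones last."""
    def key(c):
        d = estimate_distance(searcher, c, to_seed, from_seed)
        return (d is None, d if d is not None else 0, c)
    return sorted(candidates, key=key)
```

The estimate is exact whenever some seed lies on a true shortest path and otherwise overestimates the distance, which is one way to see why well-connected seeds would be expected to give higher accuracy than randomly chosen ones.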




6. REFERENCES

[1] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel and B. Bhattacharjee, "Measurement and analysis of online social networks", Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, IMC '07, pp. 29-42, 2007.

[2] Y. Ahn, S. Han, H. Kwak, S. Moon and H. Jeong, "Analysis of topological characteristics of huge online social networking services", Proceedings of the 16th International Conference on World Wide Web, WWW '07, pp. 835-844, 2007.

[3] A. Java, X. Song, T. Finin and B. Tseng, "Why we twitter: understanding microblogging usage and communities", Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis, WebKDD/SNA-KDD '07, pp. 56-65, 2007.

[4] A. N. Joinson, "Looking at, looking up or keeping up with people?: motives and use of facebook", Proceeding of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems, CHI '08, pp. 1027-1036, 2008.

[5] M. V. Vieira, B. M. Fonseca, R. Damazio, P. B. Golgher, D. d. Reis and B. Ribeiro-Neto, "Efficient search ranking in social networks", Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management, CIKM '07, pp. 563-572, 2007.

[6] E. Amitay, D. Carmel, N. Har'El, S. Ofek-Koifman, A. Soffer, S. Yogev and N. Golbandi, "Social search and discovery using a unified approach", Proceedings of the 20th ACM Conference on Hypertext and Hypermedia, HT '09, pp. 199-208, 2009.

[7] S. Bao, G. Xue, X. Wu, Y. Yu, B. Fei and Z. Su, "Optimizing web search using social annotations", Proceedings of the 16th International Conference on World Wide Web, WWW '07, pp. 501-510, 2007.

[8] L. Page, S. Brin, R. Motwani and T. Winograd, "The pagerank citation ranking: Bringing order to the web", Technical report, Stanford Digital Library Technologies Project, 1998.

[9] D. V. Kalashnikov, R. Nuray-Turan and S. Mehrotra, "Towards breaking the quality curse: a web-querying approach to web people search", Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '08, pp. 27-34, 2008.

[10] G. Luo, C. Tang and Y. Tian, "Answering relationship queries on the web", Proceedings of the 16th International Conference on World Wide Web, WWW '07, pp. 561-570, 2007.

[11] M. Thorup and U. Zwick, "Approximate distance oracles", Journal of ACM 52, 1, pp. 1-24, 2005.

[12] S. Baswana and S. Sen, "Approximate distance oracles for unweighted graphs in expected O(n^2) time", ACM Transactions on Algorithms 2, 4, pp. 557-577, 2006.

[13] L. Ding, T. Finin, A. Joshi, R. Pan, R. S. Cost, Y. Peng, P. Reddivari, V. Doshi and J. Sachs, "Swoogle: a search and metadata engine for the semantic web", Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, CIKM '04, pp. 652-659, 2004.

[14] M. Richardson, A. Prakash and E. Brill, "Beyond PageRank: machine learning for static ranking", Proceedings of the 15th International Conference on World Wide Web, WWW '06, pp. 707-715, 2006.

[15] S. Brin and L. Page, "The anatomy of a large-scale hypertextual Web search engine", Comput. Netw. ISDN Syst. 30, 1-7, pp. 107-117, 1998.

[16] Y. Zhang, L. Zhang, Y. Zhang and X. Li, "XRank: Learning More from Web User Behaviors", Computer and Information Technology, International Conference on, pp. 36, Sixth IEEE International Conference on Computer and Information Technology (CIT'06), 2006.

[17] M. T. Goodrich and R. Tamassia, "Data Structures and Algorithms in Java", John Wiley & Sons, Inc., ISBN 0471644528, 2004.

[18] E. Dijkstra, "A note on two problems in connexion with graphs", Numerische Mathematik, 1, pp. 269-271, 1959.

[19] M. Holzer, F. Schulz, D. Wagner and T. Willhalm, "Combining speed-up techniques for shortest-path computations", ACM Journal on Experimental Algorithmics 10, 2.5, 2005.

[20] M. Potamias, F. Bonchi, C. Castillo and A. Gionis, "Fast Shortest Path Distance Estimation in Large Networks", to appear in Proceedings of the Eighteenth ACM Conference on Conference on Information and Knowledge Management, CIKM '09, 2009.



Weblog, Digital Library, and Semantic Web Services Approach to Computer-Aided...
 
A Structural Engineering Support System using Semantic Computing
A Structural Engineering Support System using Semantic ComputingA Structural Engineering Support System using Semantic Computing
A Structural Engineering Support System using Semantic Computing
 
Semantic Web Services for Computational Mechanics : A Literature Survey and R...
Semantic Web Services for Computational Mechanics : A Literature Survey and R...Semantic Web Services for Computational Mechanics : A Literature Survey and R...
Semantic Web Services for Computational Mechanics : A Literature Survey and R...
 
A Parallel Implementation of the Element-Free Galerkin Method on a Network of...
A Parallel Implementation of the Element-Free Galerkin Method on a Network of...A Parallel Implementation of the Element-Free Galerkin Method on a Network of...
A Parallel Implementation of the Element-Free Galerkin Method on a Network of...
 

Recently uploaded

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 

Recently uploaded (20)

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 

Improving people search in social networks using friendship graphs

  • 1. The [...] International Conference on Robotics, Informatics, and Intelligent Technology (RIIT2009), December 11-14, 2009 at Bangkok, Thailand

AN IMPROVED PEOPLE-SEARCH TECHNIQUE FOR DIRECTED SOCIAL NETWORK GRAPHS

Thiti Vacharasintopchai, Nguyen Huu Phong
School of Technology, Shinawatra University, Pathumthani, Thailand 12160
Email: thitiv@siu.ac.th, phongnh174@yahoo.com.vn

ABSTRACT

Social networks offer incredible opportunities for users to [...] content and share their experiences. The number of users joining these social networks has been rising dramatically. However, in a social network several users may share the same name. This causes name ambiguity, in which the search engine returns homogeneous search results for each queried name. To solve this problem we propose an approach to improve search results for finding friends within a large social network by using friendships among users as our backbone feature. Our approach finds the top-ranked seeds by using the PageRank algorithm before computing approximate shortest paths in a directed graph. We also retrieved real data from the social network Twitter to verify our approach. The results show that our approach outperforms the SeedBase approach, which selects seeds randomly, by a large margin.

Index Terms: Search algorithm, social network analysis, authority analysis, shortest-path algorithm, graph algorithm

1. INTRODUCTION

Recently, social networks have seen explosive growth in popularity on the Web, and the number of people joining these networks is increasing significantly. These social networks help users to create a network of friends, maintain relationships with long-distance friends, find friends, and share information among networks. Moreover, in the very near future, the social network site will become an important source of knowledge and information [1].

Popular social networks on the Web include MySpace and Twitter. These sites help users to create and customize their personal information, blogs, multimedia, groups and other features. MySpace began in July 2003 and was the largest social network in the world in November 2006, with more than 130 million users [2]. Twitter, a microblogging service, has increased its number of users significantly since it began in October 2006 [3].

Research demonstrates that the majority of user activity on a social network is searching for friends with whom users already have offline relationships, in order to reconnect with them [4, 5]. Users can find their friends by providing their names in addition to other information. However, search results from large popular sites return longer lists of irrelevant users than one can imagine.

In this paper, we propose an approach to improve searching for friends in a social network. Our approach employs the PageRank algorithm to find seeds in order to compute approximate shortest paths within a social network. We use the friendship among friends as the backbone feature. We also conduct experiments to verify the effectiveness of the proposed approach.

The rest of this paper is organized as follows. First, we investigate previous developments of people-search techniques in Section 2. Then we present our approach to searching for friends in a social network in Section 3. Later, results are presented in Section 4. Finally, conclusions are discussed in Section 5.

2. RELATED WORK

The top web search providers such as Google and Yahoo offer standard search services where users search by entering keywords. A keyword may be the lyrics of a song, a movie trailer, the show time of a fashion show, the title of a textbook, or the name of a friend. In the traditional method, search engines match the provided keyword to contents in their database and return a list of homogeneous search results to users.

In the web search and information retrieval area, the accuracy of search results provided by search engines is evaluated by a method called a ranking algorithm. PageRank is one of the well-known algorithms in this area [6, 7]. This algorithm takes the number of forward links and the number of back links to a web page as important factors to rank each web page [8]. In this way, the search engine returns a list of ordered and ranked web pages for each particular keyword. To get better search results, a user can provide further information about the searched keyword. For instance, when finding a friend, users could add the name of the high school where they studied together and the school year to the input, so that search engines can filter out irrelevant data and return more desirable results. Furthermore, search results can be improved by using
  • 2. implicit user information such as social annotation [7] and relationship queries [9, 10].

To improve web search results, the authors of [7] discuss two algorithms, namely SocialSimRank and SocialPageRank. The former is based on the observation that when users browse and annotate a web page, the annotations can be a good indicator of the web page content [7]. The latter is based on the observation that the number of users who annotate a web page can indicate the quality of the web page [7]. That research shows that both algorithms can improve web search significantly [7].

Another study observes that the top-ranked web page pairs could contain relationships between two entities, and that such relationships can be used to improve web search [10].

In the social network context, search engines could apply the same patterns as in the web search area for the particular purpose of people search, as mentioned in [5]. However, this approach could meet the same problem as in web search, which is returning the same search result to every user who provides the same keyword [5, 9].

In general, a social network can be represented as a graph in which vertices represent users and edges represent their relationships. The simplest form of relationship is friendship, where a user is a friend of another. In a social network, when users search for a person's name, they would likely recognize people who have a closer relationship with them. In other words, the friend a person is looking for is more likely a person who has the "shortest path" to them in their relationship graph. Figure 1 shows an example of searching for a person's name in the social network. The user named Thuan searches for his friend whose name is Huyen. Two results are returned, in which the first person is at a distance of 1 and the second person is at a distance of 2. The correct result should be the first person since she is closer to Thuan than the second person.

[Figure 1: Searching for People Name in a Social Network]

In this case, the authors of [5] use the approximate shortest path in a friendship graph as a factor in their ranking algorithms. These algorithms include on-the-fly ranking and seed-based ranking. The former algorithm computes distances at scoring time by running Breadth First Search (BFS) from a finder to all users [5]. The number of calculations is limited by stopping BFS after a desired hop. The latter algorithm uses seeds and computes approximate shortest distances from these seeds to all users [5]. The research demonstrates that the seed-based ranking algorithm outperforms other algorithms in terms of performance and precision [5].

3. [...]

In this section, we propose an approach to select better seeds than the approach discussed in Section 2, in which seeds are selected randomly before computing approximate shortest paths. In our approach, all vertices are first ranked by using the PageRank algorithm. Then these vertices are sorted in reverse order and the top-ranked vertices are selected as seeds. After that, these seeds are used to compute shortest paths from them to all vertices.

3.1 Seed Distances

According to the authors of [11, 12], for a given graph structure with n vertices and m edges, a query for the distance between any pair of vertices takes a smaller amount of time and space than computing all-pairs shortest paths when these distances are precomputed.

The authors of [5] applied the concept above by selecting a small fraction of vertices randomly. These seeds (landmarks) are used as navigational beacons in their friendship graph. Then shortest paths from these seeds to all vertices are computed. Later, the shortest path between any pair of vertices can be queried.

Figure 2 shows an example of the seed distance approach for the convenience of demonstration. Suppose that we need to find the shortest path between Vertex 1 and Vertex 7. We also suppose that Vertex 5 is chosen as a seed. We first find the shortest path between Vertex 5 and Vertex 1, which is $D_{51} = 1$. In the same manner, the shortest path between Vertex 5 and Vertex 7 is $D_{57} = 1$. Finally, the shortest path between Vertex 1 and Vertex 7 is the sum of the above two shortest paths, which is $D_{17} = D_{51} + D_{57} = 2$. In this case, the shortest path is correct. However, in another case, such as when we need to compute the shortest path between Vertex 1 and Vertex 2 and the seed is Vertex 5, this approach gives the shortest path between Vertex 1 and Vertex 2 as $D_{12} = D_{51} + D_{52} = 3$, which is incorrect.

Table 1 shows the pre-processing result from seed vector 1. From this table, we can find seed distances for any pair of vertices by computing the sum of their distances to the seed.
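The seed-distance estimate described above can be sketched in Python. This is only an illustration under stated assumptions: Figure 2 did not survive in this transcript, so `EDGES` is a hypothetical undirected 7-vertex graph chosen to agree with the distances quoted in the text ($D_{51} = 1$, $D_{57} = 1$, $D_{52} = 2$), and the helper names are invented for the example.

```python
from collections import deque

# Hypothetical stand-in for Figure 2 (assumption): undirected edges chosen so
# that the distances from Vertex 5 match the values quoted in the text.
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (5, 7)]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of vertex pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Exact hop counts from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def seed_estimate(seed_dist, u, v):
    """Approximate d(u, v) as d(seed, u) + d(seed, v)."""
    return seed_dist[u] + seed_dist[v]


adj = build_adjacency(EDGES)
from_seed_5 = bfs_distances(adj, 5)   # precomputed once per seed
exact_from_1 = bfs_distances(adj, 1)  # ground truth, for comparison only

# Vertex 1 to Vertex 7: the estimate through seed 5 equals the exact distance.
print(seed_estimate(from_seed_5, 1, 7), exact_from_1[7])
# Vertex 1 to Vertex 2: the estimate overshoots because seed 5 is off the path.
print(seed_estimate(from_seed_5, 1, 2), exact_from_1[2])
```

The estimate is an upper bound on the true distance, so it is exact only when the seed lies on a shortest path between the two vertices. This is why the choice of seeds matters.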
  • 3. [Figure 2: Example Graph of Vertices and Edges]

In a social network with 100 million users, we would need up to $10^{16}$ computations to know the distances between all vertices. By using a small fraction of seeds, the required runtime and space can be reduced significantly.

[Table 1: Example of Pre-processing Seed Distance]

The approach in [5] selects the seeds randomly, which may cause lower accuracy than if better seeds are chosen. Therefore, we propose an approach that selects the most important seeds instead of choosing seeds randomly. Since our social network forms a directed graph, which shares a common feature with web links in the PageRank algorithm, we decided to use the PageRank algorithm to select seeds in our approach.

3.2 PageRank

The PageRank algorithm is used to rank web pages based on the number of forward links and back links to a web page [8, 13, 14, 15, 16]. The intuition of this algorithm is that when users link from a web page to other web pages, this could indicate endorsement of the web page content [13, 14]. We observed that in our social network Twitter, one tweeter may follow other tweeters (friends) and may be followed by others (followers). Therefore, applying the PageRank algorithm could help to find better seeds.

We use the PageRank algorithm in our approach to rank each tweeter in our social network, in which friends act as forward links and followers act as back links. According to [8, 15], the PageRank algorithm can be stated as follows. Given a graph of Twitter, the number of Followers ($F$) that follow a Tweeter ($T$) is denoted $n$. A parameter $d$ is the damping factor, which ranges between 0 and 1 and is set at 0.85. The number of Friends that a Tweeter follows is defined as $C(T)$. The PageRank of a Tweeter, $PR(T)$, is computed as follows:

$$PR(T) = (1 - d) + d \sum_{i=1}^{n} \frac{PR(F_i)}{C(F_i)} \qquad (1)$$

To compute the PageRank of all Tweeters, we first set the PageRank of each of them to one. Then we iterate over all tweeters and compute their PageRank by using Equation 1.

Figure 3 shows an example directed graph in Twitter. In this graph, the user named Thuan has five followers. Each of them also has some other followers. From Equation 1, Thuan has the highest rank since he is followed by many important followers. The next ranked is Huyen since she has more followers than Hanh and Thanh. The remaining followers are ranked equally.

[Figure 3: Example Directed Graph in Twitter]

3.3 Vectors Distances

Vectors distances of seeds consist of distances from a fraction of vertices (seeds) to all vertices. First, these seeds are ranked by using the PageRank algorithm as described in Section 3.2. Then these seeds are sorted in reverse order so that the highest-ranked ones come first. Next, a fraction of seeds is selected from the top ones. Later, seed distances from these selected seeds to all vertices are computed. The exact shortest path between two given vertices is computed using the classical Dijkstra's algorithm as described in [17, 18] instead of the BFS and Map-reduce computation presented in [5]. Though there are several algorithms with faster running time, such as implementing Dijkstra's algorithm with a Fibonacci heap [11, 19], they are beyond our scope. Finally, the approximate shortest path between two given vertices is the smallest sum of shortest paths from these vertices to the selected seeds.

In Figure 2, suppose that we choose Vertex 1 and Vertex 5 as seeds and we need to find the approximate shortest
  • 4. path between the two vertices Vertex 5 and Vertex 7. The seed distances from Vertex 1 to all vertices are $D_1 = [0, 1, 1, 1, 1, 1, 2]$. Also, the distances from Vertex 5 to all vertices are $D_5 = [1, 2, 2, 2, 0, 2, 1]$. The approximate shortest paths between the two vertices using the two seed distances are 3 and 1, respectively. The approximate shortest path as described above is the smallest distance, 1.

3.4 Experimental Setup

First, we selected a pair of vertices randomly since we did not have access to data logs from the data resource (Section 3.5) for name queries. We then computed the approximate shortest path between them using our approach. This result was compared to the on-the-fly ranking described in [5] since it yields 100% accuracy.

In order to assess the performance of our approach, we implemented the seed-based ranking algorithm (SeedBase) as described in [5]. For comparative purposes, we modified this algorithm by replacing its combination of BFS and Map-reduce with Dijkstra's algorithm.

We ran each experiment 10 times to compute accuracies and running times. We performed the experiments on a virtual machine with a 1.8 GHz processor and 512 MB RAM.

3.5 Data Collection

We evaluated the accuracy of our approach with real data from the social network Twitter. These data were gathered by using the snowball method described by [2]. The algorithm was executed in several steps: selecting tweeters as initial seeds, then running a BFS over all of their friends until it reaches a desired hop.

First, some tweeters were selected as initial seeds. These tweeters were retrieved from the Twitter public timeline, in which a list of 20 tweeters was generated randomly. As a result, these tweeters may not be connected to each other. Since the focus of this research is to examine the relationships among users, we decided to pick only one tweeter from the public timeline at a time.

Second, the number of friends of each chosen tweeter varies. Twitter limits the number of tweeters that one can follow to 2000. Even though this limit can be lifted as the number of followers increases, it can be observed that such tweeters are very likely to represent an organization rather than an individual. For this reason, these tweeters were not retrieved.

Finally, the number of hops can also be varied, depending on the maximum desired size of the community. For example, if each tweeter has 10 friends and the number of hops is equal to 5, the maximum size of this community is $1 \times 10^5$.

4. EXPERIMENTS AND RESULTS

In this section, we present our results and a discussion of the two methods: SeedBase and PageRank.

In Experiment 1, two sub-experiments were conducted using two different datasets. The maximum size of each dataset was set to 125. The first dataset contained 87 vertices and 103 edges. The second dataset contained 107 vertices and 120 edges. The number of vertices was less than the maximum of 125 since some tweeters had protected data, which are only accessible to those in their friend lists. The number of seeds was varied from 1 to 10 in increments of 1. The mean accuracies of SeedBase and PageRank were compared to the on-the-fly ranking, which yields 100% accuracy. The outcomes are presented in Table 2 and Table 3, respectively. These data are also plotted in Figure 4 for the convenience of comparison.

Table 2: Accuracies of SeedBase and PageRank from Experiment 1 Dataset 1

#seed   SeedBase (%)   PageRank (%)
1       14             46
2       14             78
3       21             74
4       20             83
5       27             85
6       38             80
7       35             81
8       37             95
9       47             95
10      34             99

Table 3: Accuracies of SeedBase and PageRank from Experiment 1 Dataset 2

#seed   SeedBase (%)   PageRank (%)
1       4              41
2       41             15
3       21             52
4       17             81
5       26             78
6       30             83
7       27             79
8       43             86
9       39             88
10      41             87

With both datasets, the accuracies of both PageRank and SeedBase went higher as the number of seeds increased. However, PageRank outperformed SeedBase by a large margin. Even with only one seed given,
PageRank's accuracy was between 40% and 50%, whereas the accuracy of SeedBase was below 20%. PageRank's accuracy increased slowly at first, then grew rapidly after two or four seeds. After that, it kept increasing slowly and exceeded 90% when the number of seeds was between eight and ten. SeedBase's accuracy increased steadily as the number of seeds increased but reached only 40%, which is much lower than that of PageRank.

Figure 4: Accuracies of SeedBase and PageRank from Experiment 1 Dataset 1 and Dataset 2

In Experiment 2, two different datasets were also used. However, the maximum size of each dataset was increased to 1000. The first dataset contained 181 vertices and 230 edges. The second dataset contained 482 vertices and 550 edges. The number of seeds was varied from 1 to 25 in increments of 1. The mean accuracies of SeedBase and PageRank were compared to the on-the-fly ranking. The results are plotted in Figure 5.

Figure 5: Accuracies of SeedBase and PageRank from Experiment 2 Dataset 1 and Dataset 2 (x-axis: number of seeds)

In these experiments, the accuracies of both PageRank and SeedBase were also higher as the number of seeds increased. PageRank's accuracy was between 18% and 30% when the first seed was given, whereas the accuracy of SeedBase was lower than 10%. PageRank's accuracy increased steadily at first, then grew rapidly after seven or thirteen seeds, and then kept increasing slowly until it exceeded 90% when the number of seeds was between nineteen and twenty-five. SeedBase's accuracy also increased steadily but reached only about 66% and 32%.

In summary, PageRank outperformed SeedBase significantly on all datasets. In addition, the accuracies of both PageRank and SeedBase increased as the number of seeds went higher.

Our Twitter network forms a directed graph where the directions from one tweeter to others are ordered. As a result, a tweeter has a higher rank than others when many highly ranked tweeters follow that tweeter. Our results also agree with those from [20], where a centrality method is used for choosing seeds (landmarks) in undirected graphs, and where vertices at the center of the graph with many shortest paths going through them are important.

The PageRank method takes longer to run than SeedBase. The reason is that, from the PageRank Equation 1, each vertex may be traversed several times to rank all vertices before the seeds are picked, so that in the worst case the runtime is O(n^2), whereas in SeedBase the runtime is constant, O(1), since it is spent only on picking seeds. However, in our Twitter social network, the number of friends that a tweeter follows is many times smaller than the number of all tweeters, so our approach is reasonable and effective.

5. CONCLUSIONS

In this research, the approximate shortest path between tweeters in Twitter is used as our backbone factor in ranking search results. We have applied our strategy by using the PageRank algorithm to select the most important tweeters before computing approximate shortest paths among them. In terms of accuracy, the results show that our strategy outperforms the SeedBase method in [5], which selects seeds randomly, by a large margin. The results also show that high accuracy can be achieved with a small fraction of seeds. Our approach uses a small number of seeds (about 2%-5%) but yields very high accuracy. Applying this approach in social networks will make the search results for finding friends more effective. Future work includes reducing the preprocessing time by speeding up the seed-ranking process. The implemented source code in the PHP programming language is made available.
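The strategy summarized in the conclusions (rank tweeters with PageRank, keep the top-ranked ones as seeds, then approximate shortest paths through those seeds) can be illustrated with a minimal Python sketch. This is not the authors' PHP implementation. All function names and the toy graph `follows` are hypothetical. The damping factor 0.85 and the fixed iteration count are conventional defaults rather than values taken from this section. The distance estimate assumes the common landmark formula, the minimum over seeds s of d(u, s) + d(s, v), which this section does not spell out.

```python
from collections import deque


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {node: [nodes it follows]}."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            # A node that follows nobody spreads its rank over all nodes.
            targets = graph[u] or nodes
            share = damping * rank[u] / len(targets)
            for v in targets:
                new_rank[v] += share
        rank = new_rank
    return rank


def bfs_distances(graph, source):
    """Hop counts from source to every node reachable along edge direction."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def reverse(graph):
    """Return the graph with every edge direction flipped."""
    rev = {v: [] for v in graph}
    for u, targets in graph.items():
        for v in targets:
            rev[v].append(u)
    return rev


def build_index(graph, num_seeds):
    """Pick the top-ranked nodes as seeds and precompute distances to/from them."""
    rank = pagerank(graph)
    seeds = sorted(graph, key=lambda v: (-rank[v], v))[:num_seeds]
    rev = reverse(graph)
    from_seed = {s: bfs_distances(graph, s) for s in seeds}  # d(s, v)
    to_seed = {s: bfs_distances(rev, s) for s in seeds}      # d(u, s)
    return seeds, to_seed, from_seed


def estimate_distance(u, v, index):
    """Assumed landmark estimate: min over seeds s of d(u, s) + d(s, v)."""
    seeds, to_seed, from_seed = index
    best = None
    for s in seeds:
        if u in to_seed[s] and v in from_seed[s]:
            d = to_seed[s][u] + from_seed[s][v]
            if best is None or d < best:
                best = d
    return best  # None if no seed connects u to v


# Hypothetical toy network: "hub" is followed by everyone and follows back.
follows = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "hub": ["a", "b", "c", "d"],
    "d": ["hub", "e"],
    "e": ["d"],
}
```

On this toy graph, a single seed gives the exact distance from "a" to "b" (2 hops through "hub") but overestimates the 1-hop distance from "d" to "e" as 3. With two seeds the second estimate becomes exact, which mirrors the trend reported above that accuracy grows with the number of seeds.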
6. REFERENCES

[1] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel and B. Bhattacharjee, "Measurement and analysis of online social networks", Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, IMC '07, pp. 29-42, 2007.

[2] Y. Ahn, S. Han, H. Kwak, S. Moon and H. Jeong, "Analysis of topological characteristics of huge online social networking services", Proceedings of the 16th International Conference on World Wide Web, WWW '07, pp. 835-844, 2007.

[3] A. Java, X. Song, T. Finin and B. Tseng, "Why we twitter: understanding microblogging usage and communities", Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis, WebKDD/SNA-KDD '07, pp. 56-65, 2007.

[4] A. N. Joinson, "Looking at, looking up or keeping up with people?: motives and use of facebook", Proceeding of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems, CHI '08, pp. 1027-1036, 2008.

[5] M. V. Vieira, B. M. Fonseca, R. Damazio, P. B. Golgher, D. d. Reis and B. Ribeiro-Neto, "Efficient search ranking in social networks", Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management, CIKM '07, pp. 563-572, 2007.

[6] E. Amitay, D. Carmel, N. Har'El, S. Ofek-Koifman, A. Soffer, S. Yogev and N. Golbandi, "Social search and discovery using a unified approach", Proceedings of the 20th ACM Conference on Hypertext and Hypermedia, HT '09, pp. 199-208, 2009.

[7] S. Bao, G. Xue, X. Wu, Y. Yu, B. Fei and Z. Su, "Optimizing web search using social annotations", Proceedings of the 16th International Conference on World Wide Web, WWW '07, pp. 501-510, 2007.

[8] L. Page, S. Brin, R. Motwani and T. Winograd, "The pagerank citation ranking: Bringing order to the web", Technical report, Stanford Digital Library Technologies Project, 1998.

[9] D. V. Kalashnikov, R. Nuray-Turan and S. Mehrotra, "Towards breaking the quality curse: a web-querying approach to web people search", Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '08, pp. 27-34, 2008.

[10] G. Luo, C. Tang and Y. Tian, "Answering relationship queries on the web", Proceedings of the 16th International Conference on World Wide Web, WWW '07, pp. 561-570, 2007.

[11] M. Thorup and U. Zwick, "Approximate distance oracles", Journal of ACM 52, 1, pp. 1-24, 2005.

[12] S. Baswana and S. Sen, "Approximate distance oracles for unweighted graphs in expected O(n^2) time", ACM Transactions on Algorithms 2, 4, pp. 557-577, 2006.

[13] L. Ding, T. Finin, A. Joshi, R. Pan, R. S. Cost, Y. Peng, P. Reddivari, V. Doshi and J. Sachs, "Swoogle: a search and metadata engine for the semantic web", Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, CIKM '04, pp. 652-659, 2004.

[14] M. Richardson, A. Prakash and E. Brill, "Beyond PageRank: machine learning for static ranking", Proceedings of the 15th International Conference on World Wide Web, WWW '06, pp. 707-715, 2006.

[15] S. Brin and L. Page, "The anatomy of a large-scale hypertextual Web search engine", Comput. Netw. ISDN Syst. 30, 1-7, pp. 107-117, 1998.

[16] Y. Zhang, L. Zhang, Y. Zhang and X. Li, "XRank: Learning More from Web User Behaviors", Computer and Information Technology, International Conference on, pp. 36, Sixth IEEE International Conference on Computer and Information Technology (CIT'06), 2006.

[17] T. G. Michael and T. Roberto, "Data Structure and Algorithms in Java", John Wiley & Son, Inc., ISBN-0471644528, 2004.

[18] E. Dijkstra, "A note on two problems in connexion with graphs", Numerische Mathematik, 1: pp. 269-271, 1959.

[19] M. Holzer, F. Schulz, D. Wagner and T. Willhalm, "Combining speed-up techniques for shortest-path computations", ACM Journal on Experimental Algorithmics 10, 2.5, 2005.

[20] P. Michalis, B. Francesco, C. Carlos and G. Aristides, "Fast Shortest Path Distance Estimation in Large Networks", to appear in Proceedings of the Eighteenth ACM Conference on Conference on Information and Knowledge Management, CIKM '09, 2009.