PhD Thesis presentation

  1. 1. DETECTION OF DISHONEST BEHAVIORS IN ON-LINE NETWORKS USING GRAPH-BASED RANKING TECHNIQUES. Francisco Javier Ortega Rodríguez. Supervised by Prof. Dr. José Antonio Troyano Jiménez
  2. 2. Motivation2
  3. 3. Motivation3  WWW: Web Search A new business model  Advertisements on the web pages  More web traffic  More visits to (or views of) the ads  Search Engine Optimization (SEO) is born  White Hat SEO  Black Hat SEO Web Spam!
  4. 4. Motivation4  Social Networks  Reputation of users similar to relevance of web pages  Higher reputation can imply some benefits  Malicious users manipulate the TRS’s  On-line marketplaces: money  Social news sites: slant the contents of the web site  Simply for “trolling” (for pleasure)
  5. 5. Motivation5
  6. 6. Motivation6  Hypothesis The detection of dishonest behaviors in on- line networks can be carried out with graph- based techniques, flexible enough to include in their schemes specific information (in the form of features of the elements in a graph) about the network to be processed and the concrete task to be solved.
  7. 7. Roadmap7
  8. 8. Web Spam Detection8  Web spam mechanisms try to increase the web traffic to specific web sites  Reach the top positions of a web search engine  Relatedness: similarity to the user query  Changing the content of the web page  Visibility: relevance in the collection  Getting a high number of references
  9. 9. Web Spam Detection9  Content-based methods: self promotion  Hidden HTML code  Keyword stuffing
  10. 10. Web Spam Detection10  Link-based methods: mutual promotion  Link-farms  PR-sculpting
  11. 11. Roadmap11
  12. 12. Web Spam Detection12  Relevant web spam detection methods:  Link-based approaches  PageRank-based  Adaptations:  Truncated PageRank [Castillo et al. 2007]  TrustRank [Gyöngyi et al. 2004]
  13. 13. Web Spam Detection13  Relevant web spam detection methods:  Link-based approaches  Pros:  Tackle the link-based spam methods  The ranking can be directly used as the result of a user query  Cons:  Do not take into account the content of the web pages  Need human intervention in some specific parts
  14. 14. Web Spam Detection14  Relevant web spam detection methods:  Content-based approaches  [diagram: a database of per-page features (size, compressibility, average word length, ...) feeds a classifier that labels each web page as Spam / Not Spam]
  15. 15. Web Spam Detection15  Relevant web spam detection methods:  Content-based approaches  Pros:  Deal with content-based spam methods  Binary classification methods  Cons:  Very slow in comparison to the link-based methods  Based on user-specified features  Do not take into account the topology of the web graph
  16. 16. Web Spam Detection16  Relevant web spam detection methods:  Hybrid approaches  [diagram: the feature database combines content-based features (size, compressibility, average word length, ...) with link-based metrics (in-links, out-links / in-links ratio)]
  17. 17. Web Spam Detection17  Relevant web spam detection methods:  Hybrid approaches  Pros:  Combine the pros of link and content-based methods.  Really effective in the classification of web pages  Cons:  Need user-specified features for both the content and the link-based heuristics.  Opportunity:  Do not take advantage of the global topology of the web graph
  18. 18. Roadmap18
  19. 19. PolaritySpam19  Intuition  Include content-based knowledge in a link-based system. [pipeline diagram: Content Evaluation → Database → Selection of Sources → Propagation Algorithm → Ranking]
  20. 20. PolaritySpam20  Content Evaluation  [diagram: the Content Evaluation step feeds the Database]
  21. 21. PolaritySpam21  Content Evaluation  Acquire useful knowledge from the textual content  Content-based heuristics  Adequate for spam detection  Easy to compute  Highest discriminative ability  A-priori spam likelihood of a web page
  22. 22. PolaritySpam22  Content Evaluation  Small set of heuristics [Ntoulas et al., 2006]  Compressibility  Average length of words A high value of the metrics implies an a-priori high spam likelihood of a web page
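The two heuristics named on the slide can be sketched as follows (a minimal illustration; the function names and the exact normalization are ours, not the thesis code):

```python
import zlib

def compressibility(text: str) -> float:
    """Compression ratio (original size / compressed size); pages
    stuffed with repeated keywords compress unusually well, so a
    high value is an a-priori sign of spam."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))

def avg_word_length(text: str) -> float:
    """Mean word length of the page text; keyword-stuffed pages
    often deviate from the average of natural prose."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0

# A keyword-stuffed page compresses far better than natural prose
stuffed = "cheap pills buy now cheap pills buy now " * 100
natural = "The committee reviewed the proposal and suggested several changes."
```

Under this reading, `compressibility(stuffed)` is much larger than `compressibility(natural)`, matching the slide's "high value implies high spam likelihood".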
  23. 23. PolaritySpam23  Selection of Sources  [diagram: the Database feeds the Selection of Sources step]
  24. 24. PolaritySpam24  Selection of Sources  Automatically pick a set of a-priori spam and not-spam web pages, Sources- and Sources+, respectively  Take into account the content-based heuristics  Given a web page wp_i with metrics M_i = {m_i1, m_i2, ..., m_ij}
  25. 25. PolaritySpam25  Selection of Sources  Most Spamy/Not-Spamy sources (S-NS) Sources  Content-based S-NS (CS-NS) Sources  Content-based Graph Sources (C-GS)
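One plausible reading of the S-NS strategy (the thesis also defines CS-NS and C-GS variants not sketched here; the combined score below is our own assumption):

```python
def select_sources(metrics, k):
    """Rank pages by a combined spam score (here simply the mean of
    their metric values, assumed normalized to [0, 1]) and take the
    k least spammy pages as Sources+ and the k most spammy as
    Sources-."""
    score = {page: sum(vals) / len(vals) for page, vals in metrics.items()}
    ranked = sorted(score, key=score.get)           # ascending spamminess
    return set(ranked[:k]), set(ranked[-k:])        # (Sources+, Sources-)

# Hypothetical per-page metric values (e.g. compressibility, word length)
metrics = {"wp1": [0.1, 0.2], "wp2": [0.9, 0.8], "wp3": [0.5, 0.4]}
sources_pos, sources_neg = select_sources(metrics, 1)
```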
  26. 26. PolaritySpam26  Propagation algorithm  [diagram: the Propagation Algorithm produces the final Ranking]
  27. 27. PolaritySpam27  Propagation algorithm:  PageRank-based algorithm  Idea: propagate a-priori information from a specific set of web pages, Sources  A-priori scores for the Sources: e_i ≠ 0 ⇔ wp_i ∈ Sources
  28. 28. PolaritySpam28  Propagation algorithm:  Two scores for each web page, v_i:  e_i+ ≠ 0 ⇔ wp_i ∈ Sources+ (set of a-priori non-spam web pages)  e_i- ≠ 0 ⇔ wp_i ∈ Sources- (set of a-priori spam web pages)
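A minimal sketch of this double propagation (our own simplified reading of the slides; the thesis formulation may differ in normalization and convergence details):

```python
def polarity_spam(links, e_pos, e_neg, d=0.85, iters=50):
    """PageRank-style propagation of two scores per page: a non-spam
    score seeded by Sources+ (via e_pos) and a spam score seeded by
    Sources- (via e_neg); each page splits its scores evenly among
    its out-links."""
    nodes = list(links)
    pos, neg = dict(e_pos), dict(e_neg)
    for _ in range(iters):
        new_pos, new_neg = {}, {}
        for v in nodes:
            in_pos = sum(pos[u] / len(links[u]) for u in nodes if v in links[u])
            in_neg = sum(neg[u] / len(links[u]) for u in nodes if v in links[u])
            new_pos[v] = (1 - d) * e_pos[v] + d * in_pos
            new_neg[v] = (1 - d) * e_neg[v] + d * in_neg
        pos, neg = new_pos, new_neg
    return pos, neg

# Toy graph: "a" is an a-priori trusted source, "c" an a-priori spam source
links = {"a": ["b"], "b": ["c"], "c": ["b"]}
e_pos = {"a": 1.0, "b": 0.0, "c": 0.0}
e_neg = {"a": 0.0, "b": 0.0, "c": 1.0}
pos, neg = polarity_spam(links, e_pos, e_neg)
```

Both scores flow through the same link structure; a page's final ranking can then demote pages whose spam score dominates.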
  29. 29. PolaritySpam29  [diagram: the Propagation Algorithm produces the final Ranking]
  30. 30. PolaritySpam30  Evaluation:  Dataset  Baseline  Evaluation Methods  Results
  31. 31. PolaritySpam31  Evaluation:  Dataset  WEBSPAM-UK 2006 (Università degli Studi di Milano)  Metrics:  98 million pages  11,400 hosts manually labeled  7,423 hosts are labeled as spam  About 10 million web pages are labeled as spam  Processed with Terrier IR Platform  http://terrier.org
  32. 32. PolaritySpam32  Evaluation:  Baseline: TrustRank [Gyöngyi et al., 2004]  Link-based web spam detection method  Personalized PageRank equation  Propagation from a set of hand-picked web pages
  33. 33. PolaritySpam33  Evaluation:  Evaluation methods: PR-Buckets  [diagram: the ranking is split into Bucket 1, Bucket 2, ..., Bucket N]
  34. 34. PolaritySpam34  Evaluation:  Evaluation methods: PR-Buckets  Evaluation metric: number of spam web pages in each bucket  [diagram: Bucket 1, Bucket 2, ..., Bucket N]
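The PR-Buckets evaluation described above can be sketched as follows (a minimal illustration; names are ours, not the thesis code):

```python
def pr_buckets(scores, spam_pages, n_buckets=10):
    """Sort pages by descending score, split the ranking into
    n_buckets equal-size buckets and count the labelled spam pages
    in each; a good anti-spam ranking concentrates spam in the
    last buckets."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    size = len(ranked) // n_buckets
    return [sum(1 for page in ranked[i * size:(i + 1) * size]
                if page in spam_pages)
            for i in range(n_buckets)]

# Toy ranking of 10 pages where the two spam pages score lowest
scores = {f"wp{i}": 10 - i for i in range(10)}
counts = pr_buckets(scores, spam_pages={"wp8", "wp9"}, n_buckets=2)
```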
  36. 36. PolaritySpam36  Evaluation:  Normalized Discounted Cumulative Gain (nDCG):  Global metric: measures the demotion of spam web pages  Sum the “relevance” scores of not-spam web pages
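The nDCG metric can be sketched like this (standard formulation; the binary not-spam relevance assignment follows the slide, the code itself is ours):

```python
import math

def ndcg(ranking, relevance):
    """nDCG: DCG of the given ranking divided by the DCG of the
    ideal ordering; with relevance 1 for not-spam pages and 0 for
    spam, a ranking that demotes all spam to the bottom scores 1.0."""
    dcg = sum(relevance[p] / math.log2(i + 2) for i, p in enumerate(ranking))
    ideal = sorted(relevance.values(), reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg else 0.0

relevance = {"wp1": 1, "wp2": 1, "wp3": 0}   # wp3 is the spam page
```

Ranking the spam page last gives nDCG = 1.0; promoting it to the top lowers the score, which is what makes nDCG a global measure of spam demotion.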
  39. 39. PolaritySpam39  Evaluation:  PR-Buckets evaluation  [chart: number of spam web pages per bucket (log scale, buckets 1-10) for TrustRank, S-NS, CS-NS and C-GS]
  40. 40. PolaritySpam40  Evaluation:  nDCG evaluation
     Method     nDCG
     TrustRank  0.7381
     S-NS       0.4230
     CS-NS      0.8621
     C-GS       0.8648
  41. 41. PolaritySpam41  Evaluation:  Content-based heuristics  [chart: number of spam web pages per bucket (log scale, buckets 1-10) for AverageLength, Compressibility, AllMetrics, TrustRank and PolaritySpam]
  42. 42. Roadmap42
  43. 43. Trust & Reputation in Social Networks43  Trust and reputation are key concepts in social networks  Similar to the relevance of web pages in the WWW  Reputation: assessment of the trustworthiness of a user in a social network, according to their behavior and the opinions of the other users.
  44. 44. Trust & Reputation in Social Networks44  Example: On-line marketplaces  Trustworthiness is as decisive as the price  Higher reputation implies more sales  Positive and negative opinions
  45. 45. Trust & Reputation in Social Networks45  Main goal: gain high reputation  Obtain positive feedbacks from the customers  Sell some bargains  Special offers  Give negative opinions for sellers that can be competitors.  Obtain false positive opinions from other accounts (not necessarily other users). Dishonest behaviors!
  46. 46. Roadmap46
  47. 47. Trust & Reputation in Social Networks47  TRS’s in real world  Moderators  Special set of users with specific responsibilities  Example: Slashdot.org  A hierarchy of moderators  A special user, No_More_Trolls, maintains a list of known trolls  Drawbacks:  Scalability  Subjectivity
  48. 48. Trust & Reputation in Social Networks48  TRS’s in real world  Unsupervised TRS’s  Users rate the contents of the system (and also other users)  Scalability problem: rely on the users  Subjectivity problem: decentralized  Examples: Digg.com, eBay.com  Drawbacks:  Unsupervised!
  49. 49. Trust & Reputation in Social Networks49  Transitivity of Trust and Distrust [Guha et al., 2004]  Multiplicative distrust  The enemy of my enemy is my friend  Additive distrust  Don't trust someone not trusted by someone you don't trust  Neutral distrust  Don't take into account your enemies' opinions
  50. 50. Trust & Reputation in Social Networks50  Threats of TRS’s Orchestrated attacks Camouflage behind good behavior Malicious Spies Camouflage behind judgments
  51. 51. Trust & Reputation in Social Networks51  Threats of TRS's  Orchestrated attacks: obtaining positive opinions from other accounts (not necessarily other users)  [graph diagram]
  52. 52. Trust & Reputation in Social Networks52  Threats of TRS's  Camouflage behind good behavior: feigning good behavior in order to obtain positive feedback from others.  [graph diagram]
  53. 53. Trust & Reputation in Social Networks53  Threats of TRS's  Malicious spies: using an "honest" account to provide positive opinions to malicious users.  [graph diagram]
  54. 54. Trust & Reputation in Social Networks54  Threats of TRS's  Camouflage behind judgments: giving negative feedback to users who can be competitors.  [graph diagram]
  55. 55. Roadmap55
  56. 56. PolarityTrust56  Intuition  Compute a ranking of the users in a social network according to their trustworthiness  Take into account both positive and negative feedback  Graph-based ranking algorithm to obtain two scores for each node:  PT+(v_i): positive reputation of user i  PT-(v_i): negative reputation of user i
  57. 57. PolarityTrust57  Intuition  Propagation algorithm for the opinions of the users  Given a set of trustworthy users  Their PT+ and PT- scores are propagated to their neighbors  … and so on  [graph diagram]
  58. 58. PolarityTrust58  Algorithm  Propagation schema of the opinions of the users  Different behavior depending on the type of relation between the users: positive or negative  [diagram: a positive link raises the target's PT+, a negative link raises the target's PT-]
  59. 59. PolarityTrust59  Algorithm  The scores of the nodes influence the scores of their neighbors: PT(v_i)
  60. 60. PolarityTrust60  Algorithm  The scores of the nodes influence the scores of their neighbors: PT(v_i) = (1 - d)·e_i + d·(...)  e_i: a-priori score, non-zero only for the set of sources
  61. 61. PolarityTrust61  Algorithm  The scores of the nodes influence the scores of their neighbors:  PT+(v_i) = (1 - d)·e_i+ + d·Σ_{j ∈ In+(i)} ( p_ij / Σ_{k ∈ Out(v_j)} |p_jk| )·PT+(v_j)  Direct relation with the PT+ of positively voting users
  62. 62. PolarityTrust62  Algorithm  The scores of the nodes influence the scores of their neighbors:  PT+(v_i) = (1 - d)·e_i+ + d·[ Σ_{j ∈ In+(i)} ( p_ij / Σ_{k ∈ Out(v_j)} |p_jk| )·PT+(v_j) + Σ_{j ∈ In-(i)} ( |p_ij| / Σ_{k ∈ Out(v_j)} |p_jk| )·PT-(v_j) ]  Inverse relation with the PT- of negatively voting users
  63. 63. PolarityTrust63  Algorithm  The scores of the nodes influence the scores of their neighbors:  PT+(v_i) = (1 - d)·e_i+ + d·[ Σ_{j ∈ In+(i)} ( p_ij / Σ_{k ∈ Out(v_j)} |p_jk| )·PT+(v_j) + Σ_{j ∈ In-(i)} ( |p_ij| / Σ_{k ∈ Out(v_j)} |p_jk| )·PT-(v_j) ]  PT-(v_i) = (1 - d)·e_i- + d·[ Σ_{j ∈ In+(i)} ( p_ij / Σ_{k ∈ Out(v_j)} |p_jk| )·PT-(v_j) + Σ_{j ∈ In-(i)} ( |p_ij| / Σ_{k ∈ Out(v_j)} |p_jk| )·PT+(v_j) ]
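A minimal Python sketch of this double propagation (our own simplification of the slide's equations; variable names are illustrative and the thesis may normalize differently):

```python
def polarity_trust(votes, e_pos, e_neg, d=0.85, iters=50):
    """PT+/PT- propagation: a positive vote passes the voter's own
    polarity to the target, while a negative vote flips it (so a
    negative vote from a distrusted user counts in the target's
    favour, modeling the transitivity of distrust).
    votes maps each user to a list of (target, weight) pairs."""
    nodes = set(e_pos)
    pt_pos, pt_neg = dict(e_pos), dict(e_neg)
    out_w = {u: sum(abs(w) for _, w in votes.get(u, [])) for u in nodes}
    for _ in range(iters):
        new_pos = {v: (1 - d) * e_pos[v] for v in nodes}
        new_neg = {v: (1 - d) * e_neg[v] for v in nodes}
        for u in nodes:
            for v, w in votes.get(u, []):
                share = d * abs(w) / out_w[u]
                if w > 0:   # positive vote: polarity is preserved
                    new_pos[v] += share * pt_pos[u]
                    new_neg[v] += share * pt_neg[u]
                else:       # negative vote: polarity is inverted
                    new_pos[v] += share * pt_neg[u]
                    new_neg[v] += share * pt_pos[u]
        pt_pos, pt_neg = new_pos, new_neg
    return pt_pos, pt_neg

# "src" is a trusted source voting for "good" and against "troll";
# the troll retaliates with a negative vote against "good"
votes = {"src": [("good", 1), ("troll", -1)], "troll": [("good", -1)]}
e_pos = {"src": 1.0, "good": 0.0, "troll": 0.0}
e_neg = {"src": 0.0, "good": 0.0, "troll": 0.0}
pt_pos, pt_neg = polarity_trust(votes, e_pos, e_neg)
```

In the toy example the trusted source's votes give "good" a positive reputation and "troll" a negative one, and the troll's retaliation does not harm "good".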
  64. 64. PolarityTrust64  Non-Negative Propagation  Problems caused by negative opinions from malicious users  Solution: dynamically avoid the propagation of these opinions from malicious users  [diagram: the negative opinion cast by a distrusted node is not propagated]
  65. 65. PolarityTrust65  Action-Reaction Propagation  Problems caused by dishonest voting attacks  Positive votes to malicious users  Orchestrated attacks, malicious spies…  Negative votes to good users  Camouflage behind bad judgments  React against bad actions: dishonest voting  Penalize users who perform these actions  Proportional to the trustworthiness of the nodes being affected
  66. 66. PolarityTrust66  Action-Reaction Propagation  Computation:  Relation between the number of dishonest votes and the total number of votes  Applied after each iteration of the ranking algorithm  [graph diagram]
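The slide does not reproduce the exact penalty formula; one plausible realisation of "proportional to the number of dishonest votes over the total number of votes" is (names and thresholding are our assumptions):

```python
def action_reaction_factors(votes, pt_pos, pt_neg):
    """After each iteration, compute a damping factor per user in
    proportion to the fraction of their votes that look dishonest:
    positive votes cast for users currently ranked as distrusted,
    or negative votes cast against users ranked as trusted."""
    factors = {}
    for user, cast in votes.items():
        dishonest = sum(1 for target, w in cast
                        if (w > 0 and pt_neg[target] > pt_pos[target])
                        or (w < 0 and pt_pos[target] > pt_neg[target]))
        factors[user] = 1.0 - dishonest / len(cast) if cast else 1.0
    return factors

# Current scores: "troll" is distrusted, "good" is trusted
pt_pos = {"troll": 0.0, "good": 0.6}
pt_neg = {"troll": 0.7, "good": 0.0}
votes = {"spy": [("troll", 1)],        # supports a distrusted user
         "attacker": [("good", -1)],   # attacks a trusted user
         "honest": [("troll", -1)]}    # correctly flags the troll
factors = action_reaction_factors(votes, pt_pos, pt_neg)
```

The returned factors would then scale each user's PT+ score before the next iteration, so dishonest voters lose influence.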
  67. 67. PolarityTrust67  Complete Formulation
  68. 68. PolarityTrust68  Evaluation  Datasets  Baselines  Results
  69. 69. PolarityTrust69  Evaluation  Datasets  Barabási & Albert  Preferential attachment property  Randomly generated attacks  Metrics of the dataset  10^4 nodes per graph  10^3 malicious users  100 malicious spies
  70. 70. PolarityTrust70  Evaluation  Datasets  Slashdot Zoo  Graph of users in Slashdot.org  Friend and Foe relationships  Gold Standard = list of Foes of the special user No_More_Trolls  Metrics of the dataset  71,500 users in total  24% negative edges  96 known trolls  Source set: CmdrTaco and his friends  6 users in total
  71. 71. PolarityTrust71  Evaluation  Baselines  EigenTrust [Kamvar et al. 2003]  It does not take into account negative opinions  Fans Minus Freaks  (Number of friends – Number of foes)  Signed Spectral Ranking [Kunegis et al. 2009]  Negative Ranking [Kunegis et al. 2009]
  72. 72. PolarityTrust72  Evaluation  Results: Randomly generated datasets (nDCG)
     Threats  ET     FmF    SR     NR     PTNN   PTAR   PT
     A        0.833  0.843  0.599  0.749  0.876  0.906  0.987
     AB       0.833  0.844  0.811  0.920  0.876  0.906  0.987
     ABC      0.842  0.719  0.816  0.920  0.877  0.903  0.984
     ABCD     0.823  0.723  0.818  0.937  0.879  0.903  0.984
     ABCDE    0.753  0.777  0.877  0.933  0.966  0.862  0.982
     A: no strategies; B: orchestrated attack; C: camouflage behind good behavior; D: malicious spies; E: camouflage behind judgments. ET: EigenTrust; FmF: Fans minus Freaks; SR: Spectral Ranking; NR: Negative Ranking; PTNN: Non-Negative Propagation; PTAR: Action-Reaction Propagation; PT: PolarityTrust
  73. 73. PolarityTrust73  Evaluation  Results: Slashdot Zoo dataset
             ET     FmF    SR     NR     PTNN   PTAR   PT
     nDCG    0.310  0.460  0.479  0.477  0.593  0.570  0.588
     ET: EigenTrust; FmF: Fans minus Freaks; SR: Spectral Ranking; NR: Negative Ranking; PTNN: Non-Negative Propagation; PTAR: Action-Reaction Propagation; PT: PolarityTrust
  74. 74. PolarityTrust74  Evaluation  Results: Trolling Slashdot (nDCG)
     Threats  ET     FmF    SR     NR     PTNN   PTAR   PT
     A        0.310  0.460  0.479  0.477  0.593  0.570  0.588
     AB       0.308  0.460  0.478  0.477  0.593  0.570  0.588
     ABC      0.311  0.460  0.474  0.484  0.593  0.570  0.588
     ABCD     0.370  0.476  0.501  0.501  0.580  0.570  0.586
     ABCDE    0.370  0.475  0.501  0.496  0.580  0.574  0.588
     A: no strategies; B: orchestrated attack; C: camouflage behind good behavior; D: malicious spies; E: camouflage behind judgments. ET: EigenTrust; FmF: Fans minus Freaks; SR: Spectral Ranking; NR: Negative Ranking; PTNN: Non-Negative Propagation; PTAR: Action-Reaction Propagation; PT: PolarityTrust
  75. 75. PolarityTrust75  Evaluation  Include a set of sources of distrust  In Slashdot Zoo Dataset:  Sources of trust: CmdrTaco and friends  Sources of distrust: 5 random foes of No_More_Trolls  Many possible methods to choose the sources of distrust
  76. 76. PolarityTrust76  Evaluation  Results: Sources of trust and distrust (nDCG)
              Sources of Trust       Sources of Trust & Distrust
     Threats  PTNN   PTAR   PT       PTNN   PTAR   PT
     A        0.593  0.570  0.588    0.846  0.790  0.846
     AB       0.593  0.570  0.588    0.846  0.790  0.846
     ABC      0.593  0.570  0.588    0.846  0.790  0.846
     ABCD     0.580  0.570  0.586    0.775  0.739  0.782
     ABCDE    0.580  0.574  0.588    0.774  0.741  0.781
     A: no strategies; B: orchestrated attack; C: camouflage behind good behavior; D: malicious spies; E: camouflage behind judgments. PTNN: Non-Negative Propagation; PTAR: Action-Reaction Propagation; PT: PolarityTrust
  77. 77. Roadmap77
  78. 78. Conclusions78  Final Remarks  Development of two systems for the detection of dishonest behaviors in on-line networks  Web Spam Detection: PolaritySpam  Trust and Reputation: PolarityTrust  Propagation of some a-priori information  Web Spam: Textual content of the web pages  Trust and Reputation: Trust and distrust sources sets
  79. 79. Conclusions79  Final Remarks  Web Spam Detection  Unlike existing approaches, include content-based knowledge into a link-based technique  Unsupervised methods for the selection of sources  Propagate information of the sources through the network  Two simple metrics improve state-of-the-art methods
  80. 80. Conclusions80  Final Remarks  Trust and Reputation in social networks  Negative links improve the discriminative ability of TRS's  Propagation strategies to deal with different attacks against a TRS  Non-Negative propagation  Action-Reaction propagation  Interrelated scores modeling the transitivity of trust and distrust  Flexible enough to be adapted to different situations and tasks
  81. 81. Conclusions81  Future Work  PolaritySpam  Applicability of more content-based metrics  Additional methods for the selection of sources  Propagation ability of each source  Infer negative relations between web pages  According to their textual content  Apply similar propagation schemas as in PolarityTrust
  82. 82. Conclusions82  Future Work  PolarityTrust  Study other possible attacks  Playbook sequences (omniscience of the attackers)  Analyze the particular characteristics of the different social networks  Selection of sources of trust and distrust  Link-based methods  Study other contexts with positive and negative relations:  Trending topics  Authorities in the blogosphere
  83. 83. Conclusions83  Future Work  Both techniques  Study of the parallelization of both algorithms  Many works on the parallelization of PageRank  Saving time and memory  Detection of Spam on the social networks  Spam messages and spam user accounts  Recommender Systems  NLP and Opinion Mining techniques in a link-based system  Use the positive and negative information
  84. 84. Curriculum Vitae84  Academic and Research milestones  2006: Degree in Computer Science  2006: Funded Student in the Itálica research group  2008: Master of Advanced Studies:  “STR: A graph-based tagger generator”  2010: Research stay at the University of Glasgow  IR Group (Dr. Iadh Ounis and Dr. Craig Macdonald)
  85. 85. Curriculum Vitae85  26 contributions to conferences and journals  5 JCR  10 International Conferences  2 CORE B  4 CORE C  4 ISI Proceedings  3 Lecture Notes in Computer Science  3 CiteSeer Venue Impact Ratings  Research projects
  86. 86. Curriculum Vitae86  Contributions related to the thesis  [diagram mapping contributions to venues (National Conf., International Conf., JCR): TextRank for Tagging, System Combination Methods, STR, Improving a Tagger Generator in IE, PolarityRank, PolaritySpam, PolarityTrust, Web Spam Detection]
  87. 87. Curriculum Vitae87  Contributions related to the thesis (National Conf. / International Conf. / JCR): TextRank for Tagging, System Combination Methods, Improving a Tagger Generator in IE  TextRank como motor de aprendizaje en tareas de etiquetado (TextRank as a learning engine for tagging tasks), SEPLN 2006  Bootstrapping Applied to a Corpus Generation Task, EUROCAST 2007  Improving the Performance of a Tagger Generator in an Information Extraction Application, Journal of Universal Computer Science (2007)
  88. 88. Curriculum Vitae88  Contributions related to the thesis (National Conf. / International Conf. / JCR): STR  STR: A Graph-based Tagging Technique, International Journal on Artificial Intelligence Tools (2011)
  89. 89. Curriculum Vitae89  Contributions related to the thesis (National Conf. / International Conf. / JCR): PolarityRank, Web Spam Detection  A Knowledge-Rich Approach to Feature-based Opinion Extraction from Product Reviews, SMUC 2010 (CIKM 2010)  Combining Textual Content and Hyperlinks in Web Spam Detection, NLDB 2010
  90. 90. Curriculum Vitae90  Contributions related to the thesis (National Conf. / International Conf. / JCR): PolarityTrust, PolaritySpam  PolarityTrust: Measuring Trust and Reputation in Social Networks, ITA 2011  PolaritySpam: Propagating Content-based Information Through a Web Graph to Detect Web Spam, International Journal of Innovative Computing, Information and Control (2012)
  91. 91. DETECTION OF DISHONEST BEHAVIORS IN ON-LINE NETWORKS USING GRAPH-BASED RANKING TECHNIQUES. Francisco Javier Ortega Rodríguez. Supervised by Prof. Dr. José Antonio Troyano Jiménez