A PU Learning Approach to Quality Flaw Prediction in Wikipedia

Edgardo Ferretti†

Web-Technology and Information Systems Group
Bauhaus-Universität Weimar
June 26th, 2012

† Universidad Nacional de San Luis, Departamento de Informatica.
Outline
   Motivation
   State of the Art
   PU Learning
   Research questions
   Results
   Conclusions 
                                   Motivation

         Web Information Quality → critical task.
                 Increasing popularity of user-generated Web content.
                 Unavoidable divergence of the content’s quality.
         Wikipedia is a paradigmatic undertaking.
                 Content contributed by millions of users:
                 both the main strength and the main challenge.
                               State of the Art
            Featured articles identification:
                     Number of edits and editors.[1]
                     Character trigram distributions.[2]
                     Number of words.[3]
                     Factual information.[4]

  [1] D. Wilkinson and B. Huberman. Cooperation and quality in Wikipedia. In Proceedings of the 3rd international symposium on wikis and open collaboration (WikiSym’07), ACM, 2007.
  [2] N. Lipka and B. Stein. Identifying featured articles in Wikipedia: writing style matters. In Proceedings of the 19th international conference on World Wide Web, ACM, 2010.
  [3] J. Blumenstock. Size matters: word count as a measure of quality on Wikipedia. In Proceedings of the 17th international conference on World Wide Web (WWW’08), pages 1095–1096. ACM, 2008.
  [4] E. Lex, M. Völske, M. Errecalde, E. Ferretti, L. Cagnina, C. Horn, B. Stein, and M. Granitzer. Measuring the quality of web content using factual information. In Proceedings of the 2nd joint WICOW/AIRWeb workshop on Web quality (WebQuality’12), pages 7–10. ACM, April 2012.
                                State of the Art
           Development of quality measurement metrics.[5-7]
           Quality flaws detection.[8-10]

  [5] D. Dalip, M. Gonçalves, M. Cristo, and P. Calado. Automatic quality assessment of content created collaboratively by Web communities: a case study of Wikipedia. In Proceedings of the 9th ACM/IEEE-CS joint conference on digital libraries (JCDL’09), pages 295–304. ACM, 2009.
  [6] M. Hu, E. Lim, A. Sun, H. Lauw, and B. Vuong. Measuring article quality in Wikipedia: models and evaluation. In Proceedings of CIKM’07, ACM, 2007.
  [7] B. Stvilia, M. Twidale, L. Smith, and L. Gasser. Assessing information quality of a community-based encyclopedia. In Proceedings of ICIQ’05, MIT, 2005.
  [8] M. Anderka, B. Stein, and N. Lipka. Detection of text quality flaws as a one-class classification problem. In Proceedings of CIKM’11, ACM, 2011.
  [9] M. Anderka, B. Stein, and N. Lipka. Towards Automatic Quality Assurance in Wikipedia. In Proceedings of the 20th international conference on World Wide Web (WWW’11), pages 5–6. ACM, 2011.
  [10] M. Anderka, B. Stein, and N. Lipka. Using Cleanup Tags to Predict Quality Flaws in User-generated Content. In Proceedings of SIGIR’12, ACM, 2012.
                                State of the Art
           Quality flaws detection.[8-10]
                      One-class classification problem: impossibility to
                        properly characterize the “class” of documents not
                        containing a particular flaw.
                      Supervised learning approaches.



  [8]
       M.  Anderka,  B.  Stein,  and  N.  Lipka.  Detection  of  text  quality  flaws  as  a  one­class  classification  problem.  In 
  Proceedings of CIKM’11, ACM, 2011.
  [9]
       M. Anderka, B. Stein, and N. Lipka. Towards Automatic Quality Assurance in Wikipedia. In Proceedings of the 
  20th international conference on World Wide Web (WWW’11), pages 5–6. ACM, 2011.
  [10]
        M. Anderka, B. Stein, and N. Lipka. Using Cleanup Tags to Predict Quality Flaws in User­generated Content. 
  In Proceedings of SIGIR’12, ACM, 2012.
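                                       PU Learning
             This method takes as input a small labelled set of the
               positive class to be predicted and a large
               unlabelled set to help learning.[11]

             [Figure: two-stage PU learning pipeline. P is the training input
               of both classifiers. 1st stage: U is the test input of
               Classifier 1, and the documents it labels "no" form the reliable
               negatives (RNs). The RNs join P to train Classifier 2, which is
               tested on P' and U', yielding precision and recall.]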

  [11] B. Liu, Y. Dai, X. Li, W. S. Lee, and P. Yu. Building text classifiers using positive and unlabeled examples. In Proceedings of the 3rd IEEE International Conference on Data Mining, 2003.
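
The first stage can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: documents are token lists, the naive Bayes model is a hand-written multinomial one with Laplace smoothing (not the WEKA setup used in the talk), U is provisionally treated as the negative class, and the toy documents and the names `train_nb`, `pos_log_odds` and `reliable_negatives` are hypothetical.

```python
import math
from collections import Counter

def train_nb(pos_docs, neg_docs):
    """Multinomial naive Bayes with Laplace smoothing over token lists."""
    vocab = {t for d in pos_docs + neg_docs for t in d}
    total = len(pos_docs) + len(neg_docs)
    model = {}
    for label, docs in (("pos", pos_docs), ("neg", neg_docs)):
        counts = Counter(t for d in docs for t in d)
        denom = sum(counts.values()) + len(vocab)
        model[label] = {
            "prior": math.log(len(docs) / total),
            "loglik": {t: math.log((counts[t] + 1) / denom) for t in vocab},
        }
    return model

def pos_log_odds(model, doc):
    """log P(pos | doc) - log P(neg | doc); unknown tokens are skipped."""
    score = model["pos"]["prior"] - model["neg"]["prior"]
    for t in doc:
        if t in model["pos"]["loglik"]:
            score += model["pos"]["loglik"][t] - model["neg"]["loglik"][t]
    return score

def reliable_negatives(P, U):
    """Stage 1: train on P (positive) vs. U (assumed negative), then keep
    the documents of U that the classifier itself labels negative."""
    model = train_nb(P, U)
    return [d for d in U if pos_log_odds(model, d) < 0]

# Toy data (hypothetical): P holds flawed articles, U is unlabelled and
# hides one flawed article among clean ones.
P = [["unsourced", "claim", "promo"], ["promo", "buy", "now"], ["promo", "claim"]]
U = [["history", "city", "river"], ["promo", "buy", "claim"],
     ["river", "history", "museum"], ["city", "museum", "history"]]

rns = reliable_negatives(P, U)
print(len(rns), "reliable negatives out of", len(U))
```

The RNs returned here would then be combined with P to train the second-stage classifier.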
 Research question #1: What classifier in each stage?
             Liu's benchmark system on 20-Newsgroups and
                Reuters-21578:[11]
                     1st stage: Spy, 1-DNF, Rocchio, NB.
                     2nd stage: EM, SVM, SVM-I, SVM-IS.
             kNN as 1st stage classifier.[12]
             Our choice: NB + SVM.

  [12] B. Zhang and W. Zuo. Reliable Negative Extracting Based on kNN for Learning from Positive and Unlabeled Examples. Journal of Computers, 4(1):94–101, 2009.
     PAN@CLEF training release
 Flaw name            # Articles   Description
 Unreferenced         37572        The article does not cite any references or sources.
 Orphan               21356        The article has fewer than three incoming links.
 Refimprove           23144        The article needs additional citations for verification.
 Empty section         5757        The article has at least one section that is empty.
 Notability            3150        The article does not meet the notability guideline.
 No footnotes          6068        The article lacks inline citations.
 Primary sources       3682        The article relies on references to primary sources.
 Wikify                1771        The article needs to be wikified (links and layout).
 Advert                1109        The article is written like an advertisement.
 Original research      507        The article contains original research.
                      50000        Untagged articles
   Research question #2: Untagged sampling strategy

   [Figure: the untagged subsets U1, U2, ..., U10 laid out in cyclic order
    (U10 is followed by U1), with the nested samples Ui.1, Ui.2 and Ui.3
    drawn as spans over them.]

                                               |Ui| = 5000, for i = 1..10
   Untagged sampling strategy
     1-sample             2-sample             ...    10-sample
     U1.0 = U1            U2.0 = U2            ...    U10.0 = U10
     U1.1 = U1 + U2       U2.1 = U2 + U3       ...    U10.1 = U10 + U1
     U1.2 = U1.1 + U3     U2.2 = U2.1 + U4     ...    U10.2 = U10.1 + U2
     U1.3 = U1.2 + U4     U2.3 = U2.2 + U5     ...    U10.3 = U10.2 + U3

     (P + Ui.j), i = 1..10, j = 0..3 ⇒ 40 different training sets

                  Training                       Test
           P size    Proportions          P size    Proportions
           1000      1:5                  110       1:1
           2500      1:10
                     1:15
                     1:20
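
The construction of the 40 untagged samples can be sketched as follows. This is a minimal illustration, assuming the ten subsets are held as Python lists and that "+" on the slide means concatenation in cyclic order; the function name `build_untagged_samples` and the toy subsets of 5 ids (instead of 5000 articles) are hypothetical.

```python
def build_untagged_samples(subsets, max_extra=3):
    """Return {(i, j): sample}, where sample Ui.j joins Ui with the next j
    subsets in cyclic order (i is 1-based, as on the slide)."""
    n = len(subsets)
    samples = {}
    for i in range(1, n + 1):
        current = []
        for j in range(max_extra + 1):
            # Append subset U(i+j), wrapping from the last subset back to U1.
            current = current + subsets[(i - 1 + j) % n]
            samples[(i, j)] = current
    return samples

# Toy stand-in: 10 subsets of 5 article ids each.
subsets = [list(range(k * 5, (k + 1) * 5)) for k in range(10)]
samples = build_untagged_samples(subsets)

print(len(samples))          # number of (i, j) combinations
print(len(samples[(1, 3)]))  # size of U1.3 = U1 + U2 + U3 + U4
```

Each sample, combined with the positive set P of a flaw, gives one training set.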
 Research question #3: Strategies to select the negative set from the RNs
     0. Selecting all RNs as negative set.[11]
     1. Selecting |P| documents at random from the RNs set.
     2. Selecting the |P| best RNs (those assigned the highest
         confidence prediction values by classifier 1).
     3. Selecting the |P| worst RNs (those assigned the lowest
         confidence prediction values by classifier 1).
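
The four strategies can be sketched in one function. This is an illustration only: it assumes each RN comes with the confidence that classifier 1 assigned to its negative prediction (higher means more confidently negative), and the name `select_negatives`, its parameters and the toy values are hypothetical.

```python
import random

def select_negatives(rns, scores, p_size, strategy, seed=0):
    """Pick the negative set for classifier 2 from the reliable negatives.

    rns      -- list of reliable negatives
    scores   -- scores[k] is classifier 1's confidence for rns[k] (assumed)
    p_size   -- |P|, the size of the positive set
    strategy -- 0: all RNs, 1: |P| at random, 2: |P| best, 3: |P| worst
    """
    if strategy not in (0, 1, 2, 3):
        raise ValueError("strategy must be 0, 1, 2 or 3")
    if strategy == 0:
        return list(rns)
    k = min(p_size, len(rns))
    if strategy == 1:
        return random.Random(seed).sample(list(rns), k)
    # Strategy 2 sorts by descending confidence, strategy 3 by ascending.
    order = sorted(range(len(rns)), key=lambda idx: scores[idx],
                   reverse=(strategy == 2))
    return [rns[idx] for idx in order[:k]]

rns = ["a", "b", "c", "d", "e"]
scores = [0.90, 0.60, 0.99, 0.55, 0.70]

best = select_negatives(rns, scores, p_size=2, strategy=2)
worst = select_negatives(rns, scores, p_size=2, strategy=3)
print(best, worst)
```

Strategies 1 to 3 keep the negative set the same size as P, so classifier 2 is trained on balanced classes.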


 Research question #4: SVM: Which kernel?
         Linear SVM (WEKA's default parameters)
         RBF SVM
                   γ ∈ {2^-15, 2^-13, 2^-11, …, 2^1, 2^3}
                   C ∈ {2^-5, 2^-3, 2^-1, …, 2^13, 2^15}
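
The parameter grid above can be enumerated as in the sketch below, which also writes out the RBF kernel, K(x, y) = exp(-γ ||x - y||²), to show what γ controls. The exponent step of 2 is read off the listed values; the variable and function names are illustrative, and the actual training calls (WEKA in the talk) are not reproduced here.

```python
import itertools
import math

# Exponents step by 2, as in the listed values.
gamma_grid = [2.0 ** e for e in range(-15, 4, 2)]  # 2^-15, 2^-13, ..., 2^3
c_grid = [2.0 ** e for e in range(-5, 16, 2)]      # 2^-5, 2^-3, ..., 2^15

# Every (C, gamma) pair that a grid search would evaluate.
grid = list(itertools.product(c_grid, gamma_grid))

def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

print(len(gamma_grid), len(c_grid), len(grid))
# A low gamma keeps distant points similar; a high gamma does not.
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], 2.0 ** -9))
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], 2.0 ** 3))
```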
Results for RBF SVM: m1 for RNs
                          SVM (Param.)                                    Benchmarks[8]
 Flaw      Training Set   C       γ        Precision  Recall  F1       Precision  Recall  F1
 Advert    U06.2          2^15    2^-7     0.802      0.955   0.871    0.650      0.580   0.613
 Empty     U06.2          2^15    2^-7     1.000      0.991   0.995    0.740      0.700   0.719
 No-foot   U04.2          2^15    2^-5     0.911      0.927   0.919    0.590      0.590   0.590
 Notab     U06.2          2^15    2^-11    1.000      0.991   0.995    0.660      0.610   0.634
 OR        U06.1          2^15    2^-9     0.681      0.991   0.807    0.560      0.800   0.659
 Orph      U10.2          2^15    2^-9     0.991      1.000   0.995    0.720      0.590   0.649
 PS        U04.2          2^15    2^-5     0.842      0.918   0.878    0.610      0.590   0.600
 Ref       U06.3          2^15    2^-9     1.000      0.991   0.995    0.570      0.560   0.565
 Unref     U09.3          2^15    2^-9     1.000      0.991   0.995    0.630      0.630   0.630
 Wiki      U06.3          2^15    2^-9     0.991      0.991   0.991    0.640      0.580   0.609
 AVG.                                                         0.944                       0.629
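
As a reading aid, the F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R), and the AVG. row is the plain mean of the per-flaw F1 values. The sketch below recomputes both from the RBF rows; because the table's precision and recall are already rounded to three decimals, a recomputed F1 can differ from the reported one in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) per flaw, copied from the RBF SVM table.
rbf_rows = {
    "Advert": (0.802, 0.955, 0.871),
    "Empty": (1.000, 0.991, 0.995),
    "No-foot": (0.911, 0.927, 0.919),
    "Notab": (1.000, 0.991, 0.995),
    "OR": (0.681, 0.991, 0.807),
    "Orph": (0.991, 1.000, 0.995),
    "PS": (0.842, 0.918, 0.878),
    "Ref": (1.000, 0.991, 0.995),
    "Unref": (1.000, 0.991, 0.995),
    "Wiki": (0.991, 0.991, 0.991),
}

recomputed = {flaw: f1_score(p, r) for flaw, (p, r, _) in rbf_rows.items()}
avg_reported_f1 = sum(f for _, _, f in rbf_rows.values()) / len(rbf_rows)

for flaw, (_, _, reported) in rbf_rows.items():
    print(f"{flaw:8s} reported={reported:.3f} recomputed={recomputed[flaw]:.4f}")
print(f"average of reported F1: {avg_reported_f1:.3f}")
```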
Results for linear SVM: m1 for RNs
                                                               Benchmarks[8]
 Flaw      Training Set   Precision  Recall  F1       Precision  Recall  F1
 Advert    U04.2          0.862      0.909   0.885    0.650      0.580   0.613
 Empty     U04.1          0.936      0.927   0.932    0.740      0.700   0.719
 No-foot   U04.3          0.860      0.782   0.819    0.590      0.590   0.590
 Notab     U07.2          0.908      0.900   0.904    0.660      0.610   0.634
 OR        U04.1          0.877      0.845   0.861    0.560      0.800   0.659
 Orph      U07.2          0.913      0.955   0.933    0.720      0.590   0.649
 PS        U04.2          0.898      0.800   0.846    0.610      0.590   0.600
 Ref       U04.2          0.835      0.736   0.783    0.570      0.560   0.565
 Unref     U01.1          0.955      0.964   0.959    0.630      0.630   0.630
 Wiki      U08.3          0.967      0.791   0.870    0.640      0.580   0.609
 AVG.                                        0.879                       0.629
                                Conclusions
        RQ#1: NB + SVM
        RQ#2: Some unlabelled sets are more promising
                 RBF kernel: U6 sub-sample → 60% of the flaws.
                 Linear kernel: U4 sub-sample → 60% of the flaws.
                 In general, Ui.j, i=1..10, j=2 or j=3 → best results.
        RQ#3: Method for selecting RNs as true negatives
                 1 > 2 > 3 > 0, where “>” means “better than”.
                                     Conclusions
           RQ#4: SVM kernels and parameters
                    The best results of both the Linear and RBF kernels are
                       statistically significantly better than the benchmarks.[8]
                    RBF is statistically better than the Linear kernel.
                    The optimistic setting in [8] → not statistically
                       significantly different from the best Linear and RBF results.
                    High penalty value for the error term (C) and very
                       low γ values.
           Semi-supervised methods seem very promising.

Questions?




Thanks very much for your attention!

 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 

Quality flaw prediction in Wikipedia

  • 1. A PU Learning Approach to Quality Flaw Prediction in Wikipedia. Edgardo Ferretti†, Web-Technology and Information Systems Group, Bauhaus-Universität Weimar. June 26th, 2012. † Universidad Nacional de San Luis, Departamento de Informatica.
  • 2. Outline: Motivation; State of the Art; PU Learning; Research questions; Results; Conclusions.
  • 3. Motivation
      • Web Information Quality → critical task.
          • Increasing popularity of user-generated Web content.
          • Unavoidable divergence of the content's quality.
      • Wikipedia is a paradigmatic undertaking.
          • Content contributed by millions of users: main strength and main challenge.
  • 4. State of the Art
      • Featured articles identification:
          • Number of edits and editors.[1]
          • Character trigram distributions.[2]
          • Number of words.[3]
          • Factual information.[4]
      [1] D. Wilkinson and B. Huberman. Cooperation and quality in Wikipedia. In Proceedings of the 3rd international symposium on wikis and open collaboration (WikiSym'07), ACM, 2007.
      [2] N. Lipka and B. Stein. Identifying featured articles in Wikipedia: writing style matters. In Proceedings of the 19th international conference on World Wide Web, ACM, 2010.
      [3] J. Blumenstock. Size matters: word count as a measure of quality on Wikipedia. In Proceedings of the 17th international conference on World Wide Web (WWW'08), pages 1095–1096. ACM, 2008.
      [4] E. Lex, M. Völske, M. Errecalde, E. Ferretti, L. Cagnina, C. Horn, B. Stein, and M. Granitzer. Measuring the quality of web content using factual information. In Proceedings of the 2nd joint WICOW/AIRWeb workshop on Web quality (WebQuality'12), pages 7–10. ACM, April 2012.
  • 5. State of the Art
      • Development of quality measurement metrics.[5-7]
      • Quality flaws detection.[8-10]
      [5] D. Dalip, M. Gonçalves, M. Cristo, and P. Calado. Automatic quality assessment of content created collaboratively by Web communities: a case study of Wikipedia. In Proceedings of the 9th ACM/IEEE-CS joint conference on digital libraries (JCDL'09), pages 295–304. ACM, 2009.
      [6] M. Hu, E. Lim, A. Sun, H. Lauw, and B. Vuong. Measuring article quality in Wikipedia: models and evaluation. In Proceedings of CIKM'07, ACM, 2007.
      [7] B. Stvilia, M. Twidale, L. Smith, and L. Gasser. Assessing information quality of a community-based encyclopedia. In Proceedings of ICIQ'05, MIT, 2005.
      [8] M. Anderka, B. Stein, and N. Lipka. Detection of text quality flaws as a one-class classification problem. In Proceedings of CIKM'11, ACM, 2011.
      [9] M. Anderka, B. Stein, and N. Lipka. Towards Automatic Quality Assurance in Wikipedia. In Proceedings of the 20th international conference on World Wide Web (WWW'11), pages 5–6. ACM, 2011.
      [10] M. Anderka, B. Stein, and N. Lipka. Using Cleanup Tags to Predict Quality Flaws in User-generated Content. In Proceedings of SIGIR'12, ACM, 2012.
  • 6. State of the Art
      • Quality flaws detection.[8-10]
          • One-class classification problem: it is impossible to properly characterize the "class" of documents not containing a particular flaw.
          • Supervised learning approaches.
      [8] M. Anderka, B. Stein, and N. Lipka. Detection of text quality flaws as a one-class classification problem. In Proceedings of CIKM'11, ACM, 2011.
      [9] M. Anderka, B. Stein, and N. Lipka. Towards Automatic Quality Assurance in Wikipedia. In Proceedings of the 20th international conference on World Wide Web (WWW'11), pages 5–6. ACM, 2011.
      [10] M. Anderka, B. Stein, and N. Lipka. Using Cleanup Tags to Predict Quality Flaws in User-generated Content. In Proceedings of SIGIR'12, ACM, 2012.
  • 7. PU Learning
      • This method takes as input a small labelled set of the positive class to be predicted and a large unlabelled set to help learning.[11]
      [Figure: two-stage PU learning pipeline. Labels: P, U, P', U', Classifier 1 (1st stage), RNs, Classifier 2, Training, Test, Precision, Recall.]
      [11] B. Liu, Y. Dai, X. Li, W.S. Lee, and P. Yu. Building text classifiers using positive and unlabeled examples. In Proceedings of the 3rd IEEE International Conference on Data Mining, 2003.
  • 8. #1 What classifier in each stage?
      • Liu's benchmark system on 20-Newsgroups and Reuters-21578:[11]
          • 1st stage: Spy, 1-DNF, Rocchio, NB.
          • 2nd stage: EM, SVM, SVM-I, SVM-IS.
      • kNN as 1st stage classifier.[12]
      • Our choice: NB + SVM.
      [11] B. Liu, Y. Dai, X. Li, W.S. Lee, and P. Yu. Building text classifiers using positive and unlabeled examples. In Proceedings of the 3rd IEEE International Conference on Data Mining, 2003.
      [12] B. Zhang and W. Zuo. Reliable Negative Extracting Based on kNN for Learning from Positive and Unlabeled Examples. Journal of Computers, 4(1):94–101, 2009.
  • 9. #1 PAN@CLEF training release
      Flaw name          # Articles  Description
      Unreferenced       37572       The article does not cite any references or sources.
      Orphan             21356       The article has fewer than three incoming links.
      Refimprove         23144       The article needs additional citations for verification.
      Empty section       5757       The article has at least one section that is empty.
      Notability          3150       The article does not meet the notability guideline.
      No footnotes        6068       The article lacks inline citations.
      Primary sources     3682       The article relies on references to primary sources.
      Wikify              1771       The article needs to be wikified (links and layout).
      Advert              1109       The article is written like an advertisement.
      Original research    507       The article contains original research.
      Untagged articles  50000
  • 10. #2 Untagged sampling strategy
      [Figure: sub-samples U1 to U10 and their extensions Ui.1, Ui.2, Ui.3.]
      |Ui| = 5000, for i = 1..10
  • 11. #2 Untagged sampling strategy
      1-sample            2-sample            …   10-sample
      U1.0 = U1           U2.0 = U2               U10.0 = U10
      U1.1 = U1 + U2      U2.1 = U2 + U3          U10.1 = U10 + U1
      U1.2 = U1.1 + U3    U2.2 = U2.1 + U4        U10.2 = U10.1 + U2
      U1.3 = U1.2 + U4    U2.3 = U2.2 + U5        U10.3 = U10.2 + U3
      (P + Ui.j), i = 1..10, j = 0..3 ⇒ 40 different training sets

      Training               Test
      P size  Proportions    P size  Proportions
      1000    1:5            110     1:1
      2500    1:10
              1:15
              1:20
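The construction of the 40 unlabelled sets can be sketched in a few lines of Python. This is only an illustration of the naming scheme on the slide (Ui.j = Ui.(j-1) plus the next sub-sample, wrapping around after U10); the function name `build_unlabelled_sets` and the list-of-lists input are assumptions, not part of the original system.

```python
def build_unlabelled_sets(subsamples, max_extra=3):
    """Build the sets Ui.j from the base sub-samples U1..Un.

    subsamples: list of lists, subsamples[0] is U1, subsamples[1] is U2, ...
    Returns a dict mapping names such as "U10.1" to the merged document list.
    (Hypothetical helper; names and data layout are illustrative assumptions.)
    """
    n = len(subsamples)
    sets = {}
    for i in range(n):                      # i is 0-based, slide names are 1-based
        merged = []
        for j in range(max_extra + 1):
            # Ui.j adds the j-th following sub-sample, wrapping around after Un.
            merged = merged + subsamples[(i + j) % n]
            sets[f"U{i + 1}.{j}"] = merged
    return sets


if __name__ == "__main__":
    # Toy data: 10 sub-samples of 5 documents each (the slides use |Ui| = 5000).
    toy = [[f"doc{i}_{k}" for k in range(5)] for i in range(1, 11)]
    sets = build_unlabelled_sets(toy)
    print(len(sets), len(sets["U1.3"]))
```

Each returned set would then be paired with the positive set P to form one training set.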
  • 12. #3 Strategies to select the negative set from the RNs
      0. Selecting all RNs as the negative set.[11]
      1. Selecting |P| documents at random from the RNs set.
      2. Selecting the |P| best RNs (those assigned the highest confidence prediction values by classifier 1).
      3. Selecting the |P| worst RNs (those assigned the lowest confidence prediction values by classifier 1).
      [11] B. Liu, Y. Dai, X. Li, W.S. Lee, and P. Yu. Building text classifiers using positive and unlabeled examples. In Proceedings of the 3rd IEEE International Conference on Data Mining, 2003.
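The four selection strategies can be expressed as a small Python function. This is a minimal sketch under stated assumptions: the RNs are given as (document id, confidence) pairs, where the confidence is the value classifier 1 assigned to its prediction; `select_negatives` and its parameters are hypothetical names, not the authors' code.

```python
import random


def select_negatives(rns, p_size, strategy, seed=0):
    """Pick the negative training set from the reliable negatives (RNs).

    rns:      list of (doc_id, confidence) pairs; the confidence is assumed to
              be classifier 1's prediction confidence for that document.
    p_size:   |P|, the size of the positive set.
    strategy: 0 = all RNs, 1 = |P| at random, 2 = |P| best, 3 = |P| worst.
    """
    if strategy == 0:
        chosen = list(rns)
    elif strategy == 1:
        rng = random.Random(seed)           # fixed seed for reproducibility
        chosen = rng.sample(rns, min(p_size, len(rns)))
    elif strategy == 2:
        chosen = sorted(rns, key=lambda pair: pair[1], reverse=True)[:p_size]
    elif strategy == 3:
        chosen = sorted(rns, key=lambda pair: pair[1])[:p_size]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [doc_id for doc_id, _ in chosen]


if __name__ == "__main__":
    rns = [("a", 0.9), ("b", 0.6), ("c", 0.8), ("d", 0.7)]
    for strategy in range(4):
        print(strategy, select_negatives(rns, p_size=2, strategy=strategy))
```

Strategies 1 to 3 all yield a negative set of size |P|, so classifier 2 is trained on balanced classes, whereas strategy 0 keeps every RN.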
  • 13. #4 SVM: Which kernel?
      • Linear SVM (WEKA's default parameters)
      • RBF SVM
          • γ ∈ {2^-15, 2^-13, 2^-11, …, 2^1, 2^3}
          • C ∈ {2^-5, 2^-3, 2^-1, …, 2^13, 2^15}
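For reference, the RBF parameter grid listed above can be enumerated as follows. The sketch only generates the candidate (C, γ) pairs; how each pair was evaluated is not stated on the slide, and the variable names are illustrative.

```python
# Exponents step by 2, as in the sets given on the slide.
gammas = [2.0 ** e for e in range(-15, 4, 2)]   # 2^-15, 2^-13, ..., 2^1, 2^3
costs = [2.0 ** e for e in range(-5, 16, 2)]    # 2^-5, 2^-3, ..., 2^13, 2^15

# Every (C, gamma) combination that a grid search would have to try.
grid = [(c, g) for c in costs for g in gammas]

if __name__ == "__main__":
    print(len(gammas), len(costs), len(grid))
```

The grid contains 10 values for γ and 11 for C, that is, 110 candidate settings per training set.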
  • 14. Results for RBF SVM: m1 for RNs
                              SVM (Param.)                             Benchmarks[8]
      Flaw     Training Set   C      γ       Precision  Recall  F1     Precision  Recall  F1
      Advert   U06.2          2^15   2^-7    0.802      0.955   0.871  0.650      0.580   0.613
      Empty    U06.2          2^15   2^-7    1.000      0.991   0.995  0.740      0.700   0.719
      No-foot  U04.2          2^15   2^-5    0.911      0.927   0.919  0.590      0.590   0.590
      Notab    U06.2          2^15   2^-11   1.000      0.991   0.995  0.660      0.610   0.634
      OR       U06.1          2^15   2^-9    0.681      0.991   0.807  0.560      0.800   0.659
      Orph     U10.2          2^15   2^-9    0.991      1.000   0.995  0.720      0.590   0.649
      PS       U04.2          2^15   2^-5    0.842      0.918   0.878  0.610      0.590   0.600
      Ref      U06.3          2^15   2^-9    1.000      0.991   0.995  0.570      0.560   0.565
      Unref    U09.3          2^15   2^-9    1.000      0.991   0.995  0.630      0.630   0.630
      Wiki     U06.3          2^15   2^-9    0.991      0.991   0.991  0.640      0.580   0.609
      AVG.                                                      0.944                     0.629
      [8] M. Anderka, B. Stein, and N. Lipka. Detection of text quality flaws as a one-class classification problem. In Proceedings of CIKM'11, ACM, 2011.
  • 15. Results for linear SVM: m1 for RNs
                                                             Benchmarks[8]
      Flaw     Training Set   Precision  Recall  F1     Precision  Recall  F1
      Advert   U04.2          0.862      0.909   0.885  0.650      0.580   0.613
      Empty    U04.1          0.936      0.927   0.932  0.740      0.700   0.719
      No-foot  U04.3          0.860      0.782   0.819  0.590      0.590   0.590
      Notab    U07.2          0.908      0.900   0.904  0.660      0.610   0.634
      OR       U04.1          0.877      0.845   0.861  0.560      0.800   0.659
      Orph     U07.2          0.913      0.955   0.933  0.720      0.590   0.649
      PS       U04.2          0.898      0.800   0.846  0.610      0.590   0.600
      Ref      U04.2          0.835      0.736   0.783  0.570      0.560   0.565
      Unref    U01.1          0.955      0.964   0.959  0.630      0.630   0.630
      Wiki     U08.3          0.967      0.791   0.870  0.640      0.580   0.609
      AVG.                                       0.879                     0.629
      [8] M. Anderka, B. Stein, and N. Lipka. Detection of text quality flaws as a one-class classification problem. In Proceedings of CIKM'11, ACM, 2011.
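As a reading aid for the two result tables, the sketch below recomputes F1 as the harmonic mean of precision and recall and averages the per-flaw F1 values reported on the slides. The function and variable names are illustrative. Because the tables report rounded precision and recall, a recomputed F1 can differ from the reported one in the last digit for some rows.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-flaw F1 values copied from the two result tables
# (order: Advert, Empty, No-foot, Notab, OR, Orph, PS, Ref, Unref, Wiki).
rbf_f1 = [0.871, 0.995, 0.919, 0.995, 0.807, 0.995, 0.878, 0.995, 0.995, 0.991]
linear_f1 = [0.885, 0.932, 0.819, 0.904, 0.861, 0.933, 0.846, 0.783, 0.959, 0.870]

rbf_avg = round(sum(rbf_f1) / len(rbf_f1), 3)
linear_avg = round(sum(linear_f1) / len(linear_f1), 3)

if __name__ == "__main__":
    # "Empty" row of the RBF table: precision 1.000, recall 0.991.
    print(round(f1_score(1.000, 0.991), 3))
    print(rbf_avg, linear_avg)
```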
  • 16. Conclusions
      • RQ#1: NB + SVM
      • RQ#2: Some unlabelled sets are more promising
          • RBF kernel: U6 sub-sample → 60% of the flaws.
          • Linear kernel: U4 sub-sample → 60% of the flaws.
          • In general, Ui.j, i = 1..10, j = 2 or j = 3 → best results.
      • RQ#3: Method for selecting RNs as true negatives
          • 1 > 2 > 3 > 0, where ">" means "better than".
  • 17. Conclusions
      • RQ#4: SVM kernels and parameters
          • The best results of the linear and RBF kernels are statistically significantly better than the benchmarks.[8]
          • The RBF kernel is statistically better than the linear kernel.
          • The optimistic setting in [8] → no statistically significant difference from the linear and RBF best results.
          • High penalty value for the error term (C) and very low γ values.
      • Semi-supervised methods seem very promising.
      [8] M. Anderka, B. Stein, and N. Lipka. Detection of text quality flaws as a one-class classification problem. In Proceedings of CIKM'11, ACM, 2011.
  • 18. Questions? Thanks very much for your attention!