A collaboration graph for E-LIS

             Thomas Krichel
Long Island University & Novosibirsk State
    University & Open Library Society
            3 November 2011
Introduction
• Thanks
  – To Ángel Sánchez Villegas, for the use of the e-lis
    domain.
  – To Tomas Baiget, who has encouraged me to
    present here.
• Warnings
  – Data shown here were correct as of 1 November
    2011.
  – I am glossing over some technical details.
  – Over 30 slides
overview
• Introduction to AuthorClaim
• Introduction to a co-authorship network
  based on restricting AuthorClaim to E-LIS
  documents
• Web interface and campaign
a known problem
• In publishing systems such as E-LIS, the
  authors are usually entered by name.
• It is well known that the name of an author
  does not identify an author
  – multiple ways to express the name of the same
    person
  – multiple people sharing one expression of their
    names
a tried solution
• One way to partially solve this problem is to
  have a system where authors can
  – claim papers that they have written
  – disclaim papers written by their homonyms
• The first system of this kind was the RePEc
  Author Service
  – created by Thomas Krichel in 1999
  – has now registered over 30,000 economists
AuthorClaim
• AuthorClaim is an interdisciplinary version of
  the RePEc Author Service.
• It was created by Thomas Krichel in 2008.
• It lives at http://authorclaim.org.
• Over 100,000,000 authorships of over
  35,000,000 documents can be claimed.
• Among the documents are the E-LIS papers.
445 E-LIS papers claimed …
•   72 Tomas Baiget
•   61 Ulrich Herb
•   43 Antonella De Robbio
•   39 Thomas Krichel
•   26 Andrea Marchitelli & fernanda peset
•   20 Ross MacIntyre
•   16 Dirk Lewandowski
•   15 Bożena Bednarek-Michalska
•   14 Lidia Derfert-Wolf
•   11 Zeno Tajoli & Imma Subirats
by 36 authors
• 9 Derek Law & Emma McCulloch & Philipp Mayr
• 8 Jeffrey Beall
• 7 nuria Lloret Lloret Romero
• 6 Benjamin John Keele
• 5 Adrian Pohl & Maria Francisca Abad-Garcia
• 4 Walther Umstaetter
• 3 Andrea Scharnhorst & Jose Manuel Barrueco &
  Thomas Hapke & Christian Hauschke & Klaus Graf
• 2 Frank Havemann & Eberhard R. Hilf & Bhojaraju
  Gunjal & Chris L. Awre
• 1 Loet Leydesdorff & Peter Bolles Hirtle & Alexei
  Botchkarev & Christina K. Pikas & Oliver Flimm &
  Sridhar Gutam
so far so good
• I don’t really want to talk about AuthorClaim
  but about the services that we can build once
  we have identified authors.
• When we have this data, we can find out who
  has been writing papers with whom.
• In other words we can study the co-authorship
  network.
co-authorship
• When two registered authors claim to have
  authored the same paper, we say that they are
  co-authors.
• The co-authorship relationship creates a link
  between the two authors.
• The link is symmetric, meaning that the fact
  that Thomas is a co-author of Imma means
  that Imma is a co-author of Thomas.
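A minimal sketch of how co-claimed papers turn into symmetric links. The claim records, paper identifiers and author labels A, B, C are hypothetical placeholders, not AuthorClaim data, and the function name is an assumption for illustration.

```python
from itertools import combinations

# Hypothetical claim records: paper identifier -> registered authors who claimed it.
claims = {
    "paper-1": {"A", "B"},
    "paper-2": {"B"},            # claimed by one author only: no link
    "paper-3": {"A", "B", "C"},
}

def coauthor_links(claims):
    """Return the set of undirected co-authorship links implied by the claims."""
    links = set()
    for authors in claims.values():
        # Every pair of authors who claimed the same paper is linked.
        for a, b in combinations(sorted(authors), 2):
            # A frozenset has no order, so the link A-B is the same as B-A.
            links.add(frozenset((a, b)))
    return links

links = coauthor_links(claims)
```

Storing each link as an unordered pair is one way to make the symmetry automatic: there is nothing to keep in sync between "A is a co-author of B" and "B is a co-author of A".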
58 papers have been co-claimed …
•   12 fernanda peset
•   10 Tomas Baiget
•   8 Imma Subirats
•   6 Antonella De Robbio
•   4 nuria Lloret Lloret Romero
by 16 co-authors
• 2 Andrea Marchitelli & Ulrich Herb & Ross
  MacIntyre & Bożena Bednarek-Michalska &
  Thomas Krichel & Dirk Lewandowski & Lidia
  Derfert-Wolf
• 1 Derek Law & Emma McCulloch & Sridhar
  Gutam & Philipp Mayr
network and components
• When we start with one co-author and
  move to her co-authors, what other authors
  can we reach?
• The set of all authors we can reach from any one
  of them by following co-authorship
  relationships is called a component of the
  network.
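A minimal sketch of finding components by following links outward from each unvisited author. The toy network and the labels A to E are hypothetical; this is one common way to do it (breadth-first search), not necessarily what the software behind this talk does.

```python
from collections import deque

def components(neighbours):
    """Split an undirected graph (author -> set of co-authors) into components."""
    seen = set()
    result = []
    for start in neighbours:
        if start in seen:
            continue
        # A breadth-first walk collects everyone reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            author = queue.popleft()
            for other in neighbours[author]:
                if other not in component:
                    component.add(other)
                    queue.append(other)
        seen |= component
        result.append(component)
    # Largest component first.
    return sorted(result, key=len, reverse=True)

# Hypothetical toy network with placeholder author names.
toy = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A"},
    "D": {"E"},
    "E": {"D"},
}
parts = components(toy)
```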
components in the network
• “Scottish”: Derek Law & Emma McCulloch
• “Polish”: Bożena Bednarek-Michalska & Lidia
  Derfert-Wolf
• “German”: Dirk Lewandowski & Sridhar Gutam
  & Philipp Mayr
• “Giant”: Andrea Marchitelli & Ulrich Herb &
  Thomas Krichel & Antonella De Robbio &
  fernanda peset & Imma Subirats & Ross
  MacIntyre & nuria Lloret Lloret Romero &
  Tomas Baiget
the giant component
• The size of the giant component is larger than
  the combined size of all other components.
• It is very common in real-world networks
  that there is a giant component.
• As the network grows, older small
  components join the giant component and
  new small components are created.
• We therefore study the giant component.
centrality
• Who is at the center of the E-LIS author
  network, i.e. the most central author in E-LIS?
• The answer is that it depends on how we
  measure centrality.
• Two measures are commonly used
  – closeness centrality
  – betweenness centrality
• Both depend on a measure of distance.
distance
• To understand centrality, we need a measure of
  distance.
  – We say that two authors have distance one if they
    are co-authors.
  – We say that two authors have distance two if they
    are not co-authors, but have a common co-author.
  – etc
distances for Imma Subirats
•   Tomas Baiget 1
•   Antonella De Robbio 1
•   Ulrich Herb 2
•   Thomas Krichel 1
•   nuria Lloret Lloret Romero 2
•   Andrea Marchitelli 2
•   Ross MacIntyre 2
•   fernanda peset 1
•   Imma Subirats 0
distances for Ulrich Herb
•   Tomas Baiget 1
•   Antonella De Robbio 3
•   Ulrich Herb 0
•   Thomas Krichel 2
•   nuria Lloret Lloret Romero 3
•   Andrea Marchitelli 4
•   Ross MacIntyre 4
•   fernanda peset 2
•   Imma Subirats 2
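A sketch of how such distance tables can be computed with a breadth-first search. The edge list is an assumption: it is inferred from the shortest-path listings in these slides, which never state the links directly. The function names are illustrative.

```python
from collections import deque

# Assumed links in the giant component, inferred from the shortest-path
# listings in these slides; the slides do not give this list themselves.
EDGES = [
    ("Tomas Baiget", "Thomas Krichel"),
    ("Tomas Baiget", "fernanda peset"),
    ("Tomas Baiget", "Imma Subirats"),
    ("Tomas Baiget", "Ulrich Herb"),
    ("fernanda peset", "nuria Lloret Lloret Romero"),
    ("fernanda peset", "Imma Subirats"),
    ("Imma Subirats", "Antonella De Robbio"),
    ("Imma Subirats", "Thomas Krichel"),
    ("Antonella De Robbio", "Thomas Krichel"),
    ("Antonella De Robbio", "Ross MacIntyre"),
    ("Antonella De Robbio", "Andrea Marchitelli"),
    ("Ross MacIntyre", "Andrea Marchitelli"),
]

def build_graph(edges):
    """Turn a list of undirected links into author -> set of co-authors."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def distances_from(graph, source):
    """Breadth-first search: distance from `source` to every reachable author."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        author = queue.popleft()
        for other in graph[author]:
            if other not in dist:
                dist[other] = dist[author] + 1
                queue.append(other)
    return dist

graph = build_graph(EDGES)
imma = distances_from(graph, "Imma Subirats")
herb = distances_from(graph, "Ulrich Herb")
```

Under this assumed edge list, the computed distances agree with the two tables above.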
closeness centrality
• The average distance of Imma is much smaller
  than the average distance of Ulrich.
• In fact, we can calculate the average distance
  of every author from all other authors.
• This is what we call closeness centrality of an
  author.
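A worked example of the average, using only the distances shown on the two distance slides. The author's own entry (distance 0) is left out, since the average is taken over all other authors. The function name is illustrative.

```python
# Distances copied from the slides "distances for Imma Subirats" and
# "distances for Ulrich Herb", without the author's own zero entry.
imma = {
    "Tomas Baiget": 1, "Antonella De Robbio": 1, "Ulrich Herb": 2,
    "Thomas Krichel": 1, "nuria Lloret Lloret Romero": 2,
    "Andrea Marchitelli": 2, "Ross MacIntyre": 2, "fernanda peset": 1,
}
ulrich = {
    "Tomas Baiget": 1, "Antonella De Robbio": 3, "Thomas Krichel": 2,
    "nuria Lloret Lloret Romero": 3, "Andrea Marchitelli": 4,
    "Ross MacIntyre": 4, "fernanda peset": 2, "Imma Subirats": 2,
}

def closeness(distances):
    """Average distance to all other authors; smaller means more central."""
    return sum(distances.values()) / len(distances)

imma_closeness = closeness(imma)      # 12 / 8
ulrich_closeness = closeness(ulrich)  # 21 / 8
```

Note that some graph libraries define closeness as the reciprocal of this average, so that larger values mean more central; here, as in the talk, smaller is better.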
shortest paths
• In order to find the distance between two
  authors, we have to consider the paths
  between them.
• We need to find the shortest paths between them.
  There are well-known algorithms to find them.
• The distance is the length of the shortest path.
diameter
• When we have found all shortest paths, we
  can find the length of the longest of the shortest
  paths between any two authors.
• This is called the diameter.
• In our network the diameter is four.
• This is much smaller than the number of authors
  in the giant component (9 of the 16 co-authors).
• We say that our network has the “small
  world” property.
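A minimal sketch of the diameter as the longest shortest path, computed by running a breadth-first search from every author. The toy network and labels A to E are hypothetical, and the sketch assumes the network is connected (a single component).

```python
from collections import deque

def eccentricity(graph, source):
    """Length of the longest shortest path starting at `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for other in graph[node]:
            if other not in dist:
                dist[other] = dist[node] + 1
                queue.append(other)
    return max(dist.values())

def diameter(graph):
    """Longest shortest path over all pairs; assumes a connected graph."""
    return max(eccentricity(graph, node) for node in graph)

# Hypothetical chain A-B-C-D-E with an extra link B-D as a shortcut.
toy = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}
toy_diameter = diameter(toy)
```

In the toy network the shortcut B-D brings A and E to distance three instead of four, so the diameter is three.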
shortest paths from Tomas Baiget
•   → Thomas Krichel
•   → fernanda peset → nuria Lloret Lloret Romero
•   → fernanda peset
•   → Imma Subirats → Antonella De Robbio → Ross
    MacIntyre
•   → Ulrich Herb
•   → Imma Subirats → Antonella De Robbio
•   → Imma Subirats → Antonella De Robbio → Andrea
    Marchitelli
•   → Imma Subirats
shortest paths from Antonella De Robbio
• → Imma Subirats → fernanda peset → nuria Lloret
  Lloret Romero
• → Imma Subirats
• → Imma Subirats → Tomas Baiget → Ulrich Herb
• → Imma Subirats → Tomas Baiget
• → Imma Subirats → fernanda peset
• → Andrea Marchitelli
• → Ross MacIntyre
• → Thomas Krichel
shortest paths from Ross MacIntyre
• → Antonella De Robbio → Imma Subirats →
  fernanda peset → nuria Lloret Lloret Romero
• → Antonella De Robbio → Imma Subirats →
  fernanda peset
• → Antonella De Robbio → Imma Subirats → Tomas
  Baiget → Ulrich Herb
• → Antonella De Robbio → Thomas Krichel
• → Antonella De Robbio → Imma Subirats → Tomas
  Baiget
• → Antonella De Robbio → Imma Subirats
• → Antonella De Robbio
• → Andrea Marchitelli
what do the paths tell us?
• We find that some authors appear more
  often as intermediaries than other authors.
• In fact, we can evaluate the number of times
  an author appears as an intermediary in the
  paths.
• This is what we call the betweenness centrality
  of an author.
• A large number of authors have a
  betweenness of zero. They are called marginal
  authors.
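A sketch of counting intermediaries, assuming one shortest path is kept per ordered pair of authors, as the path listings in this talk suggest. It produces raw counts only: the scaling behind the betweenness values reported in this talk is not described here, so the sketch does not reproduce them. The textbook betweenness measure also differs in that it shares credit among all tied shortest paths. The toy network and function names are hypothetical.

```python
from collections import deque

def one_shortest_path(graph, source, target):
    """Return one shortest path from source to target (assumes one exists)."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        # Sorted so that the choice among tied paths is reproducible.
        for other in sorted(graph[node]):
            if other not in parent:
                parent[other] = node
                queue.append(other)
    # Walk back from the target to the source along the parent pointers.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def intermediary_counts(graph):
    """Count how often each author lies strictly inside a chosen shortest path."""
    counts = {node: 0 for node in graph}
    for source in graph:
        for target in graph:
            if source == target:
                continue
            for node in one_shortest_path(graph, source, target)[1:-1]:
                counts[node] += 1
    return counts

# Hypothetical star: B is linked to A, C and D, who are not linked to each other.
toy = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B"}}
counts = intermediary_counts(toy)
```

In the star, every path between two outer authors passes through B, while A, C and D never appear as intermediaries and would be marginal.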
summary
• We build a network.
• We find two ways to evaluate authors
  – closeness
  – betweenness
• Now let us look at the results.
ranking for closeness
 rank   name                         closeness
• 1     Imma Subirats                1.5
• 2     Antonella De Robbio          1.75
• 2     Tomas Baiget                 1.75
• 2     Thomas Krichel               1.75
• 5     fernanda peset               1.875
• 6     Andrea Marchitelli           2.5
• 6     Ross MacIntyre               2.5
• 8     Ulrich Herb                  2.625
• 9     nuria Lloret Lloret Romero   2.75
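A sketch of how the rank column can be derived from the closeness values, assuming tied authors share a rank and the next rank skips ahead (1, 2, 2, 2, 5, ...). The values are copied from the table above; the function name is illustrative.

```python
# Closeness values copied from the ranking table.
closeness = {
    "Imma Subirats": 1.5,
    "Antonella De Robbio": 1.75,
    "Tomas Baiget": 1.75,
    "Thomas Krichel": 1.75,
    "fernanda peset": 1.875,
    "Andrea Marchitelli": 2.5,
    "Ross MacIntyre": 2.5,
    "Ulrich Herb": 2.625,
    "nuria Lloret Lloret Romero": 2.75,
}

def rank_ascending(scores):
    """Rank names by score, smallest first; ties share the earliest position."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    ranks = {}
    previous_score = None
    previous_rank = 0
    for position, (name, score) in enumerate(ordered, start=1):
        # A tie keeps the previous rank; otherwise the rank is the position.
        rank = previous_rank if score == previous_score else position
        ranks[name] = rank
        previous_score, previous_rank = score, rank
    return ranks

ranks = rank_ascending(closeness)
```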
ranking for betweenness
 rank   name                   betweenness
• 1     Antonella De Robbio    2.7
• 1     Imma Subirats          2.7
• 3     Tomas Baiget           2.025
• 4     fernanda peset         1.575
• Andrea Marchitelli, Ross MacIntyre, nuria
   Lloret Lloret Romero, Thomas Krichel, Ulrich
   Herb are all marginal.
web service
• E-LIS and AuthorClaim data are readily
  available in bulk.
• There is software called icanis, developed by
  yours truly, that can calculate and visualize
  results. It is configurable via XSLT.
• Almost instantaneous updates are in principle
  possible, but not implemented.
coll.e-lis.org
• This is a site that I have set up.
• I think we need a site in the rclis domain but I
  am not sure what the name should be.
• coll.e-lis.org is a bad name too.
• So this is meant as a prototype.
features
• Rankings for closeness.
• Full path searching from author pages
  – with support for partial name entry
  – but with no highlighting of the matching parts
• Unclear documentation
ranking
• Ranking is the way forward with populating
  scholarly communication services. RePEc has
  shown this time and again.
• Co-authorship ranking is particularly
  interesting because authors have to convince
  their co-authors to publish papers in E-LIS and
  to claim them in AuthorClaim.
campaign
• We need to do some work on the site.
• Then we can have a campaign and award a cash
  prize.
• I am thinking about donating $200 to the top
  author in each category, or $300 to a joint winner.
• The competition would be time-limited, say
  about three months next summer.
• During that time we would do frequent
  updates of the site.
Thank you for your attention!

http://openlib.org/home/krichel

 write to krichel@openlib.org
