Man vs Machine



Main theme: Web 2.0 is as much about machine-consumable data as human-consumable data.
Web 1.0                        Web 2.0
DoubleClick                    Google AdSense
Ofoto                          Flickr
Akamai                         BitTorrent
mp3.com                        Napster
Britannica Online              Wikipedia
personal websites              blogging
evite                          upcoming.org and EVDB
domain name speculation        search engine optimization
page views                     cost per click
screen scraping                web services
publishing                     participation
CMS                            wikis
directories (taxonomy)         tagging (folksonomy)
stickiness                     syndication



The meme of Web 2.0 was influenced by comparing companies from before the dot-com bubble
with companies from after it.

What is the difference between the list on the left and the list on the right?

Let’s take the example of Britannica vs Wikipedia.

The information in Britannica is centrally controlled. It has a relatively small number of contributors,
and the workload per contributor is high.

Wikipedia is open to anyone to contribute. A collaboration of thousands can lead to a work of equal
quality to one produced by a more centrally controlled method.

Britannica’s revenues decreased from 650M to 50M over a 10-year period!

The new sites make it easy to add information and use that information to
answer or solve problems for people.
[Figure: ease of contributing (vertical axis, hard to easy) against ease of mining (horizontal axis, hard to easy)]

Two key parts of Web 2.0 are the easy addition of information into
the system (user-generated content), followed by ways of mining
that information.

One thesis that we are following in this context is that, by
understanding the flow of information and the available ways of
mining it, we can create useful solutions to real problems.

Companies that find ways to do this should succeed.
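[Figure: ease of contributing (vertical axis, hard to easy) against ease of mining (horizontal axis, hard to easy)]
  easy to contribute, hard to mine:  plain text, emails
  easy to contribute, easy to mine:  hyperlinks, views, tags, citations?
  hard to contribute, hard to mine:  academic papers
  hard to contribute, easy to mine:  semantic web
  in between:                        microformats
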
The kind of information that we can capture with Connotea is typical of many sites.
For Connotea we have:
- citation information
- usage patterns (when did an item get added to our DB, how many times has it been added)
- user-generated meta-data such as tags
- potentially, social network information (how many of my friends have added this item?)
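The four kinds of information listed above could sit together in one record per bookmarked item. What follows is a minimal sketch only: the class name `BookmarkRecord` and all of its fields are hypothetical and are not Connotea's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class BookmarkRecord:
    """Hypothetical record for one bookmarked item (illustrative, not a real schema)."""
    title: str                      # citation information
    authors: list[str]              # citation information
    doi: Optional[str]              # citation information, if known
    first_added: date               # usage pattern: when the item entered the DB
    times_added: int = 1            # usage pattern: how many times it has been added
    tags: set[str] = field(default_factory=set)       # user-generated meta-data
    added_by: set[str] = field(default_factory=set)   # users who bookmarked the item

    def friends_who_added(self, friends: set[str]) -> int:
        """Social signal: how many of my friends have added this item?"""
        return len(self.added_by & friends)


record = BookmarkRecord(
    title="An example paper",
    authors=["A. Author"],
    doi=None,
    first_added=date(2008, 1, 15),
    times_added=3,
    tags={"web2.0", "tagging"},
    added_by={"ann", "bob", "cy"},
)
```

Keeping the set of users on the record makes the "how many of my friends" question a simple set intersection.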
Gathering    Trusting    Integrating    Analyzing    Triangles

del.icio.us




Many Web 2.0 sites have created islands of data.
Some key technologies for bridging these islands include Fire Eagle, OpenID and OAuth.
- RFID and Fire Eagle point the way to merging these islands with the real world
What’s the process?




• Gathering the data
• Trusting the data
• Integrating / disambiguating the data
• Understanding and analyzing the data
DOI




In the publishing world, DOIs are a key technology for bridging these islands.
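One reason a DOI works as a bridge is that the same paper can be recognized on different sites once the identifier is written in a single canonical form. Here is a minimal sketch, assuming DOIs arrive as bare strings, with a `doi:` prefix, or as resolver URLs. The function names are hypothetical and the DOI used in the example is a placeholder, not a real article.

```python
RESOLVER = "https://doi.org/"

# Prefixes commonly seen in front of a DOI (an illustrative list, not exhaustive).
_PREFIXES = (
    "doi:",
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def normalize_doi(raw: str) -> str:
    """Strip a known prefix and lowercase the rest (DOI names are case-insensitive)."""
    doi = raw.strip()
    for prefix in _PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.strip().lower()


def doi_url(raw: str) -> str:
    """Build a resolver URL from any of the accepted spellings."""
    return RESOLVER + normalize_doi(raw)


# Three spellings of the same placeholder DOI collapse to one key.
keys = {
    normalize_doi("doi:10.1000/XYZ123"),
    normalize_doi("http://dx.doi.org/10.1000/xyz123"),
    normalize_doi(" 10.1000/xyz123 "),
}
```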
[Figure: two diagrams compared (cf.), with the labels "Internet", "Site or Application" and "Internet Site"]


OpenID cf. OAuth

OpenID allows a single person to interact with multiple web sites using one log-in mechanism.
OAuth allows both desktop and web applications to share data using one authorization mechanism.
[Figure: two tag clouds]
Rated 5/5: Redemption, Android, Spacecraft, Time-Travel, Soldier, Alien, Blockbuster, Space, War, Futuristic, Artificial-Intelligence
Rated 1/5: Based-on-Play, Love, Refugee, Famous-Score, Hope, Broken-Heart, Blockbuster, Based-on-Novel, Racism, Hero, Melodrama




Once you merge the data, you have to understand it.

The tags that a person uses across different services can give you a more holistic picture of their interests.
However, tags can be ambiguous.

Some of the technologies addressing this are semantic web technologies. Look at projects such as:
Tagora http://www.tagora-project.eu/
DBpedia http://dbpedia.org/
SIOC http://sioc-project.org/
FOAF http://www.foaf-project.org/
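Merging one person's tags from several services can be sketched as a simple count. This is a minimal illustration with made-up service names and tags; it does not use any of the projects listed above, and it deliberately includes a tag ("apple") whose meaning may differ from service to service, which is the ambiguity problem in miniature.

```python
from collections import Counter


def merge_tag_profiles(profiles: dict[str, list[str]]) -> Counter:
    """Combine per-service tag lists into one interest profile (tag -> count)."""
    merged = Counter()
    for tags in profiles.values():
        # Normalize spelling a little so "Python" and "python" count together.
        merged.update(tag.strip().lower() for tag in tags)
    return merged


def services_using(tag: str, profiles: dict[str, list[str]]) -> list[str]:
    """List the services where a tag appears; several services hint at possible ambiguity."""
    wanted = tag.strip().lower()
    return sorted(
        service
        for service, tags in profiles.items()
        if wanted in (t.strip().lower() for t in tags)
    )


# Hypothetical data: the same person on three different services.
profiles = {
    "bookmarks": ["Python", "semantic-web", "apple"],
    "photos": ["apple", "orchard"],
    "music": ["python"],
}
profile = merge_tag_profiles(profiles)
```

Counting alone cannot tell whether "apple" on the bookmarking site and "apple" on the photo site mean the same thing, which is where the semantic web projects come in.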
Open Science    Web 2.0    Semantic Web



Though not exactly the same, Web 2.0, open science and the semantic web work well together,
and they share some common traits, namely sharing, openness and minability of information.
Growth in submissions to the arXiv demonstrates growth in scientific output,
or certainly growth in the amount of data available online in electronic form.
There is some discussion about whether there is an information overload: the main journals
are still the important ones, but reading habits have changed.
Discussion groups and mailing lists contain a huge amount of information,
from snippets of computer code to long discussions about topics.

MarkMail, from MarkLogic, is a site that mines this information. Here we see
a comparison of a search for FORTRAN vs a search for Java.

At the moment these kinds of archives are mainly relevant in the computer science area, but
these kinds of conversations are going on all the time in every field.

http://markmail.org/
Amazon uses page views and a database of user purchases to find things you might like.

Again, here they are using data that they get for free from people using their site.

Google’s PageRank is another canonical example.
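The idea of finding "things you might like" from purchase data can be sketched with simple co-occurrence counting. This is not Amazon's actual algorithm, only a minimal illustration of mining data that users generate for free; the baskets and item names are made up.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets: list[list[str]]) -> dict[str, Counter]:
    """For every item, count how often each other item was bought in the same basket."""
    counts: dict[str, Counter] = {}
    for basket in baskets:
        # Each unordered pair of distinct items in a basket counts once, in both directions.
        for a, b in combinations(sorted(set(basket)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def recommend(item: str, counts: dict[str, Counter], n: int = 3) -> list[str]:
    """Return up to n items most often bought together with the given item."""
    return [other for other, _ in counts.get(item, Counter()).most_common(n)]


# Hypothetical purchase history.
baskets = [
    ["book_a", "book_b"],
    ["book_a", "book_b", "book_c"],
    ["book_a", "book_b"],
    ["book_b", "book_d"],
]
counts = co_purchase_counts(baskets)
top_for_a = recommend("book_a", counts, n=1)
```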
CrystalEye

Social/Knowledge Networking

Examples of two types of use in science:

CrystalEye http://wwmm.ch.cam.ac.uk/crystaleye/
example bond length for a structure: http://wwmm.ch.cam.ac.uk/crystaleye/bondlengths/H-Rb.svg


Nature Network: human-human interaction
Nature Web Publishing Group




                                                      OTMI

The main products that we have developed so far are:

-   database gateways
-   OTMI (Open Text Mining Interface)
-   podcasts
-   Scintilla
-   Nature Network
-   Nature Precedings
-   Connotea
There are also other tools out there that are doing the same kind of thing, but I’m partial.
Repository




Discuss how social silos can be interchange locations between repositories,
and also between repositories and applications that might be built on top
of the social silos.
[Figure: five boxes labelled "Repository", with three applications underneath]
Citation Management    Pubmed Integration    Activity Listing

Connotea citation parsing modules

This model was quick and easy to implement, but it uses the URL as the unique key.
Amazon.pm          DOI.pm             LivingReviews.pm
PLoS.pm            RIS.pm             SpamDNSBL.pm
autodiscovery.pm   BibTeX.pm          Dlib.pm
NASA.pm            PMC.pm             Scitation.pm
Springer.pm        blog.pm            Blackwell.pm
Highwire.pm        NPG.pm             PNAS.pm
Self.pm            Wiley.pm           ePrints.pm
BmcPdf.pm          Hubmed.pm          OUP.pm
Pubmed.pm          Simple.pm          arXiv.pm




We have a bunch of citation modules.

They currently have to be written in Perl, and this is a problem:
there is nothing similar to the Scaffold infrastructure that Zotero has.
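The general shape of a per-source citation module system can be sketched as a registry that asks each module whether it understands a URL. The real modules are Perl and their interface is not shown in these notes, so everything below (names, the decorator, the returned fields) is a hypothetical Python illustration of the pattern rather than a port.

```python
from typing import Callable
from urllib.parse import urlparse

Parser = Callable[[str], dict]

# Ordered list of (understands, parser) pairs; earlier modules win.
_REGISTRY: list[tuple[Callable[[str], bool], Parser]] = []


def citation_module(understands: Callable[[str], bool]):
    """Register a parser together with a test for the URLs it can handle."""
    def decorator(parser: Parser) -> Parser:
        _REGISTRY.append((understands, parser))
        return parser
    return decorator


@citation_module(lambda url: urlparse(url).netloc in ("arxiv.org", "www.arxiv.org"))
def arxiv_module(url: str) -> dict:
    # A real module would fetch and parse metadata; this one only extracts the identifier.
    identifier = urlparse(url).path.rsplit("/", 1)[-1]
    return {"source": "arXiv", "id": identifier, "url": url}


@citation_module(lambda url: True)  # fallback: accepts anything, so it is registered last
def simple_module(url: str) -> dict:
    return {"source": "Simple", "url": url}


def parse_citation(url: str) -> dict:
    """Hand the URL to the first registered module that claims it."""
    for understands, parser in _REGISTRY:
        if understands(url):
            return parser(url)
    raise ValueError(f"no citation module understands {url}")


arxiv_result = parse_citation("http://arxiv.org/abs/0801.0001")
other_result = parse_citation("http://example.org/some/page")
```

Adding support for a new source then means adding one decorated function, which is the kind of low barrier the notes above are asking for.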
[Figure, built up step by step: Title, Date, Author, PMID/DOI]
Getting data in, part 2

The meta-data from the paper has been captured.

When you begin to add tags, suggested tags are presented based on
tags you have already used.

A paper by Huberman et al. shows that displaying all tags drives tag-onomies to a stable state (Polya-Renyi urn model).
You need to display the full community tags, which we don’t do ... yet.
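The urn idea can be illustrated with a basic Polya urn: each new tagging event copies a tag drawn at random from all the tags used so far, so popular tags are reinforced and the proportions settle down over time. This is a generic textbook simulation under that assumption, not code from the cited paper or from Connotea, and the starting tags are made up.

```python
import random
from collections import Counter


def simulate_tagging(initial_tags: list[str], steps: int, seed: int = 0) -> dict[str, float]:
    """Run a basic Polya urn and return the final proportion of each tag."""
    rng = random.Random(seed)      # seeded so that runs are repeatable
    urn = list(initial_tags)
    for _ in range(steps):
        # Draw one existing tag at random and add another copy of it to the urn.
        urn.append(rng.choice(urn))
    counts = Counter(urn)
    total = len(urn)
    return {tag: counts[tag] / total for tag in counts}


# Compare the proportions after a short and a long run from the same start.
start = ["physics", "biology", "web2.0"]
after_100 = simulate_tagging(start, steps=100)
after_10000 = simulate_tagging(start, steps=10000)
```

Printing `after_100` and `after_10000` for a few different seeds shows the pattern: each run settles on its own stable mix of tags, and which mix it settles on depends on the early draws.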
user home page,
toolbox, on right
user tags
related tags
related users, groups
Getting data out

Open Data, important


Export only gets out the citation data, and not extra meta-data that the user
has added, such as comments or tags.

Formats: txt, RDF, BibTeX, RIS, EndNote. An API??
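As an illustration of one of these formats, a citation can be written out as an RIS record with a handful of standard tags. This is a minimal sketch, not Connotea's export code. The input dictionary and its keys are hypothetical, and the optional `KW` lines show one way an export could carry user tags, which the current export does not do.

```python
def to_ris(citation: dict, include_tags: bool = False) -> str:
    """Serialize a citation dictionary as a single RIS record."""
    lines = ["TY  - JOUR"]                         # record type: journal article
    lines.append(f"TI  - {citation['title']}")
    for author in citation.get("authors", []):
        lines.append(f"AU  - {author}")            # one AU line per author
    if "year" in citation:
        lines.append(f"PY  - {citation['year']}")
    if "doi" in citation:
        lines.append(f"DO  - {citation['doi']}")
    if "url" in citation:
        lines.append(f"UR  - {citation['url']}")
    if include_tags:
        for tag in citation.get("tags", []):
            lines.append(f"KW  - {tag}")           # assumption: user tags exported as keywords
    lines.append("ER  - ")                         # end of record
    return "\n".join(lines)


# Hypothetical citation with placeholder values.
example = {
    "title": "An example paper",
    "authors": ["Doe, J.", "Roe, R."],
    "year": 2008,
    "url": "http://example.org/paper",
    "tags": ["web2.0", "tagging"],
}
ris_plain = to_ris(example)
ris_with_tags = to_ris(example, include_tags=True)
```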
perl
       mod_perl
       Template Toolkit
       MySQL
       Open Source, GPL2.5 v 1.8.1
       web1.75 application


Discuss reasons for open source, discuss web1.8.1
- Hope for community involvement.
- The code is not MVC structured, and this has led to some problems with adoption.
- We do have some people running their own instances, with some feedback,
but we would like to eventually make the code easier to work with.
- Why not port it? That’s a big can of worms, and someone needs to convince me of
the benefits.
- If for some reason we choose to no longer support Connotea, then the data and the code could be
hosted by someone else.
- Someone asked me how they know we don’t cheat and preferentially
return NPG articles in searches. Well, the code is open, so if you are that paranoid
you can go and run an instance yourself and check up on us.
http://www.connotea.org/user/IanMulvany
http://www.connotea.org/users/tag/scifoo
http://www.connotea.org/user/IanMulvany/tag/scifoo
http://www.connotea.org/user/IanMulvany/tag/science
http://www.connotea.org/user/IanMulvany/tag/science2.0+citation

Example of calls to query the data, html output
http://www.connotea.org/data/user/IanMulvany
http://www.connotea.org/data/users/tag/scifoo
http://www.connotea.org/data/user/IanMulvany/tag/scifoo
http://www.connotea.org/data/user/IanMulvany/tag/science
http://www.connotea.org/data/user/IanMulvany/tag/science2.0+citation

Example of API calls
(you don’t have to type them in green when making the call)
http://www.connotea.org/rss/user/IanMulvany
http://www.connotea.org/rss/users/tag/scifoo
http://www.connotea.org/rss/user/IanMulvany/tag/scifoo
http://www.connotea.org/rss/user/IanMulvany/tag/science
http://www.connotea.org/rss/user/IanMulvany/tag/science2.0+citation

Example of RSS calls
(you don’t have to type them in green when making the call)

We create an RSS feed of everything.
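The three sets of example URLs share one pattern: an optional `/data` or `/rss` prefix, then the user, then an optional list of tags joined with `+`. A minimal sketch that builds such URLs from the pattern seen in the examples follows. The function name is hypothetical, it only constructs strings and makes no request, and it assumes user names and tags are already URL-safe.

```python
BASE = "http://www.connotea.org"

# Path prefix for each kind of output shown in the examples.
VIEWS = {"html": "", "data": "/data", "rss": "/rss"}


def user_library_url(user: str, tags: tuple[str, ...] = (), view: str = "html") -> str:
    """Build the URL for a user's library, optionally filtered by one or more tags."""
    if view not in VIEWS:
        raise ValueError(f"unknown view {view!r}; expected one of {sorted(VIEWS)}")
    url = f"{BASE}{VIEWS[view]}/user/{user}"
    if tags:
        url += "/tag/" + "+".join(tags)   # several tags are combined with "+"
    return url


html_url = user_library_url("IanMulvany")
data_url = user_library_url("IanMulvany", ("scifoo",), view="data")
rss_url = user_library_url("IanMulvany", ("science2.0", "citation"), view="rss")
```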
[Figure: Bookmark Growth in Connotea. Entries in all libraries, in thousands (axis from 0 to 600), from Jan-05 to Mar-08]

Growth in Connotea bookmarks
Mirko Gontek at the University of Cologne:
information visualization of links in Connotea

These social links can create networks of information on top of the basic
information.

This is what we want to use to start building collaborative intelligence into
these systems.

AWS Community Day CPH - Three problems of Terraform
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governance
 
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Navigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern EnterpriseNavigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern Enterprise
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Manvsmachinewithnotes

  • 1. Man vs Machine. Main theme: Web 2.0 is as much about machine-consumable data as human-consumable data.
  • 2. Web 1.0 -> Web 2.0:
      DoubleClick -> Google AdSense
      Ofoto -> Flickr
      Akamai -> BitTorrent
      mp3.com -> Napster
      Britannica Online -> Wikipedia
      personal websites -> blogging
      evite -> upcoming.org and EVDB
      domain name speculation -> search engine optimization
      page views -> cost per click
      screen scraping -> web services
      publishing -> participation
      CMS -> wikis
      directories (taxonomy) -> tagging (folksonomy)
      stickiness -> syndication
    The meme of Web 2.0 was influenced by comparing pre-dot-com-bubble companies with post-dot-com-bubble companies. What is the difference between the list on the left and the list on the right? Let’s take the example of Britannica vs Wikipedia. The information in Britannica is centrally controlled. It has a relatively small number of contributors, and the workload per contributor is high. Wikipedia is open to anyone to contribute. A collaboration of thousands can lead to a work of equal quality to one produced by a more centrally controlled method. Britannica’s revenues decreased from 650M to 50M over a 10 year period! The new sites make it easy to add information and to use that information to answer or solve problems for people.
  • 3-8. [Diagram built up over six slides: ease of contributing (hard to easy) plotted against ease of mining (hard to easy). Items placed on it: plain text and emails, hyperlinks, views, tags, citations(?), microformats, academic papers, semantic web.] Two key parts of Web 2.0 are the easy addition of information into the system (user-generated content), followed by ways of mining that information. One thesis we are following by working in this context is that, by recognizing how information flows and which ways of mining it are available, we can create useful solutions to real problems. Companies that find ways to do this should succeed.
  • 9-13. The kind of information that we can capture with Connotea is typical of many sites. For Connotea we have: citation information; usage patterns (when did an item get added to our DB, how many times has it been added); user-generated metadata such as tags; and, potentially, social network information (how many of my friends have added this item?).
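The kinds of information listed above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the bookmark records, user names, and URLs are invented, and the `usage_summary` helper is hypothetical rather than anything Connotea actually provides.

```python
from datetime import date

# Hypothetical bookmark log: (user, item URL, date added). All values invented.
bookmarks = [
    ("alice", "http://example.org/paper-a", date(2007, 1, 5)),
    ("bob", "http://example.org/paper-a", date(2007, 2, 1)),
    ("carol", "http://example.org/paper-b", date(2007, 2, 3)),
    ("dave", "http://example.org/paper-a", date(2007, 3, 9)),
]

def usage_summary(bookmarks, url, friends=()):
    """Summarise usage patterns and social information for one item."""
    dates = [added for _, item, added in bookmarks if item == url]
    users = {user for user, item, _ in bookmarks if item == url}
    return {
        "first_added": min(dates) if dates else None,  # when it entered the DB
        "times_added": len(dates),                     # how often it was added
        "friends_who_added": len(users & set(friends)),
    }

summary = usage_summary(bookmarks, "http://example.org/paper-a",
                        friends=["bob", "carol"])
print(summary)
```

None of these values requires extra effort from users; they fall out of ordinary use of the site.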
  • 14. Gathering, Trusting, Integrating, Analyzing, Triangles. del.icio.us. Many Web 2.0 sites have created islands of data. Some key technologies for bridging these islands include Fire Eagle, OpenID and OAuth. RFID and Fire Eagle point the way to merging these islands with the real world.
  • 15. What's the process? Gathering the data; trusting the data; integrating / disambiguating; understanding and analyzing the data.
  • 16. DOI. Some key technologies for bridging these islands include Fire Eagle, OpenID and OAuth. In the publishing world, DOIs are a key technology.
  • 17. OpenID cf. OAuth. [Diagram labels: Internet, Site; Internet, Site or Application.] OpenID allows a single person to interact with multiple web sites using one log-in mechanism. OAuth allows both desktop and web applications to share data using one authentication mechanism.
  • 18. [Tag clouds for items rated 5/5 and rated 1/5: Redemption, Based-on-Play, Android, Love, Refugee, Spacecraft, Time-Travel, Soldier, Famous-Score, Hope, Alien, Blockbuster, Alien, Broken-Heart, Blockbuster, Space, War, Futuristic, Based-on-Novel, Racism, Artificial-Intelligence, Hero, Melodrama.] Once you merge the data, you have to understand it. The tags that a person uses across different services can give you a more holistic picture of their interests.
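As a rough illustration of merging one person's tags across services, here is a minimal Python sketch. The tag names come from the slide, but the two service names and the way the tags are split between them are invented for the example.

```python
from collections import Counter

# Hypothetical tag lists for one person on two services. Tag names are from
# the slide; the services and the split between them are assumptions.
tags_by_service = {
    "film_site": ["Alien", "Blockbuster", "Time-Travel", "Space"],
    "bookmark_site": ["Alien", "Artificial-Intelligence", "Space", "Futuristic"],
}

def merge_tags(tags_by_service):
    """Count on how many services each tag appears."""
    merged = Counter()
    for tags in tags_by_service.values():
        merged.update(set(tags))  # set(): count each tag once per service
    return merged

profile = merge_tags(tags_by_service)
# Tags used on more than one service hint at a stable interest.
shared = sorted(tag for tag, n in profile.items() if n > 1)
print(shared)
```

This only works when the same string means the same thing on both services, which is exactly the ambiguity problem with tags.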
  • 19. However, tags can be ambiguous. Some of the technologies addressing this are semantic web technologies; look at projects such as Tagora http://www.tagora-project.eu/ DBpedia http://dbpedia.org/ SIOC http://sioc-project.org/ FOAF http://www.foaf-project.org/
  • 20. Open Science, Web 2.0, Semantic Web. Though not exactly the same, Web 2.0, open science and the semantic web work well together and share some common traits, namely sharing, openness and minability of information.
  • 21. Growth in submissions to the arXiv, demonstrating growth in scientific output, or certainly growth in the amount of data available online in electronic format. There is some discussion about whether there is an information overload, as the main journals are still the important ones, but reading habits have changed.
  • 22. Discussion groups and mailing lists contain a huge amount of information, from snippets of computer code to long discussions about topics. MarkMail, from MarkLogic, is a site that mines this information. Here we see a comparison of a search for FORTRAN vs a search for Java. At the moment these kinds of archives are mainly relevant in the computer science area, but these kinds of conversations are going on all the time in every field. http://markmail.org/
  • 23. Amazon use page views and a database of user purchases to find things you might like. Again, they are using data that they get for free from people using their site. Google PageRank is another canonical example.
  • 24. CrystalEye; Social/Knowledge Networking. An example of two types of use in science. CrystalEye: http://wwmm.ch.cam.ac.uk/crystaleye/ Example bond length for a structure: http://wwmm.ch.cam.ac.uk/crystaleye/bondlengths/H-Rb.svg Nature Network: human-human interaction.
  • 25. Nature Web Publishing group. The main products that we have developed so far are: database gateways, OTMI (Open Text Mining Interface), podcasts, Scintilla, Nature Network, Nature Precedings, Connotea.
  • 26-31. There are also other tools out there that are doing the same kind of thing, but I’m partial.
  • 32-39. [Diagram built up over eight slides: several repositories, with Citation Management, Pubmed Integration and Activity Listing.] Discuss how social silos can be interchange locations between repositories, and also between repositories and applications that might be built on top of the social silos.
  • 40. Connotea citation parsing modules. This model was quick and easy to implement, but it uses the URL as the unique key.
  • 41. Amazon.pm, DOI.pm, LivingReviews.pm, PLoS.pm, RIS.pm, SpamDNSBL.pm, autodiscovery.pm, BibTeX.pm, Dlib.pm, NASA.pm, PMC.pm, Scitation.pm, Springer.pm, blog.pm, Blackwell.pm, Highwire.pm, NPG.pm, PNAS.pm, Self.pm, Wiley.pm, ePrints.pm, BmcPdf.pm, Hubmed.pm, OUP.pm, Pubmed.pm, Simple.pm, arXiv.pm. We have a bunch of citation modules. They currently have to be written in Perl, and this is a problem; there is nothing similar to the scaffold infrastructure that Zotero has.
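To make the per-source module idea concrete, here is a minimal sketch of how such a dispatch might look. It is written in Python purely for illustration; the real modules are Perl, and the class names, the `understands` method, and the host lists below are assumptions that do not reflect Connotea's actual interface.

```python
from urllib.parse import urlparse

class CitationPlugin:
    """Hypothetical base class: one plugin per source, keyed on the URL."""
    name = "base"
    hosts = frozenset()

    def understands(self, url):
        # Decide from the URL alone whether this plugin can handle the page.
        return urlparse(url).hostname in self.hosts

class ArxivPlugin(CitationPlugin):
    name = "arXiv"
    hosts = frozenset({"arxiv.org", "www.arxiv.org"})

class DoiPlugin(CitationPlugin):
    name = "DOI"
    hosts = frozenset({"doi.org", "dx.doi.org"})

PLUGINS = [ArxivPlugin(), DoiPlugin()]

def pick_plugin(url):
    """Return the name of the first plugin that claims the URL, else None."""
    for plugin in PLUGINS:
        if plugin.understands(url):
            return plugin.name
    return None

print(pick_plugin("http://arxiv.org/abs/0704.0001"))
print(pick_plugin("http://example.com/some-page"))
```

Because everything hangs off the URL, two different URLs for the same paper would be treated as two different items in a scheme like this.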
  • 42-50. [Screenshots built up with labels: Title, Author, Date, PMID/DOI.]
  • 51-53. Getting data in, part 2. The metadata from the paper has been captured. When you begin to add tags, suggested tags are presented based on tags you have already used. A paper by Huberman et al. shows that displaying all tags drives tag-onomies to a stable state (Polya-Renyi urn model). You need to display the full community tags, which we don’t do ... yet.
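The stabilising effect mentioned above can be explored with a small simulation. This is a sketch that assumes the plain Polya urn rule (each new tag is chosen with probability proportional to how often it has already been used); whether that matches the exact model in the cited paper is an assumption, and the starting tags are invented.

```python
import random

def polya_urn(initial_counts, draws, seed=0):
    """Simulate a Polya urn: a tag's chance of being picked grows with its count."""
    rng = random.Random(seed)
    counts = dict(initial_counts)
    for _ in range(draws):
        tags = list(counts)
        weights = [counts[tag] for tag in tags]
        chosen = rng.choices(tags, weights=weights, k=1)[0]
        counts[chosen] += 1  # reinforcement: a used tag becomes more likely
    return counts

def proportions(counts):
    """Convert raw counts to each tag's share of the total."""
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}

start = {"physics": 1, "biology": 1, "web2.0": 1}  # invented starting tags
after_1k = proportions(polya_urn(start, 1_000))
after_10k = proportions(polya_urn(start, 10_000))
print(after_1k)
print(after_10k)
```

Comparing the two printed dictionaries shows how much each tag's share still moves between 1,000 and 10,000 additions of the same simulated run.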
  • 54-56. User home page: toolbox on the right, user tags, related tags, related users, groups.
  • 57-58. Getting data out. Open data is important. Export only gets out the citation data, not the extra metadata that the user has added, such as comments or tags. Formats: txt, RDF, BibTeX, RIS, EndNote. An API??
  • 59. perl, mod_perl, Template Toolkit, MySQL. Open Source, GPL2.5 v 1.8.1 web1.75 application. Discuss reasons for open source, discuss web1.8.1. We hope for community involvement. The code is not MVC structured, and this has led to some problems with adoption. We do have some people running their own instances, with some feedback, but we would like eventually to make the code easier to work with. Why not port it? That’s a big can of worms, and someone needs to convince me of the benefits. If for some reason we choose to no longer support Connotea, then the data and the code could be hosted by someone else. Someone asked me how they can know that we don’t cheat and preferentially return NPG articles in searches. Well, the code is open, so if you are that paranoid you can run an instance yourself and check up on us.
  • 60. http://www.connotea.org/user/IanMulvany http://www.connotea.org/users/tag/scifoo http://www.connotea.org/user/IanMulvany/tag/scifoo http://www.connotea.org/user/IanMulvany/tag/science http://www.connotea.org/user/IanMulvany/tag/science2.0+citation Example of calls to query the data, HTML output.
  • 61. http://www.connotea.org/data/user/IanMulvany http://www.connotea.org/data/users/tag/scifoo http://www.connotea.org/data/user/IanMulvany/tag/scifoo http://www.connotea.org/data/user/IanMulvany/tag/science http://www.connotea.org/data/user/IanMulvany/tag/science2.0+citation Example of API calls (you don’t have to type them in green when making the call).
  • 62. http://www.connotea.org/rss/user/IanMulvany http://www.connotea.org/rss/users/tag/scifoo http://www.connotea.org/rss/user/IanMulvany/tag/scifoo http://www.connotea.org/rss/user/IanMulvany/tag/science http://www.connotea.org/rss/user/IanMulvany/tag/science2.0+citation Example of RSS calls (you don’t have to type them in green when making the call). We create an RSS feed of everything.
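The three sets of example URLs share one pattern, with an optional `/data` or `/rss` prefix in front of the same path. The Python sketch below simply rebuilds those URLs from the examples on the slides; the `connotea_url` helper and its parameter names are hypothetical, and it makes no request to the service.

```python
BASE = "http://www.connotea.org"

# Output prefix as seen in the slide examples: none for HTML, /data, /rss.
PREFIXES = {"html": "", "data": "/data", "rss": "/rss"}

def connotea_url(user=None, tags=(), output="html"):
    """Build a URL following the pattern of the examples on the slides."""
    if output not in PREFIXES:
        raise ValueError(f"unknown output: {output!r}")
    if not user and not tags:
        raise ValueError("need a user, at least one tag, or both")
    path = PREFIXES[output]
    if user:
        path += f"/user/{user}"
        if tags:
            path += "/tag/" + "+".join(tags)  # tags joined with '+', as on the slide
    else:
        path += "/users/tag/" + "+".join(tags)  # no user: the 'users/tag' form
    return BASE + path

print(connotea_url(user="IanMulvany"))
print(connotea_url(tags=["scifoo"], output="data"))
print(connotea_url(user="IanMulvany", tags=["science2.0", "citation"], output="rss"))
```

Keeping the path identical across HTML, data and RSS output means that any page a person can browse to has an obvious machine-readable counterpart.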
  • 63. [Chart: Bookmark Growth in Connotea. Entries in all libraries, in thousands (0 to 600), by month from Jan-05 to Mar-08.]
  • 64. Mirko Gontek at the University of Cologne: information visualization of links in Connotea. These social links can create networks of information on top of the basic information. This is what we want to use to start building collaborative intelligence into these systems.