Comet project
Ed Chamberlain
Cambridge University Library
 Cambridge Open Metadata



 Funded by the JISC Infrastructure for Resource
  Discovery Project
 Cambridge, back in 2010 …

 OKFN - Open Bibliography project
  (2010-2011)

 Debate around re-use of catalogue
  records from vendors (not just
  OCLC)

 CUL already provides public APIs

 Increasing interest in linked data

 FAST / VIAF

 Lorcan
 “The initial aim of this project will
  be to identify and release a
  substantial record set to an
  external platform under an open
  license” …

 “For OCLC-derived bibliographic
  records data will be released in a
  fashion compliant with their
  WorldCat Rights and
  Responsibilities for the OCLC
  Cooperative” …

 “The project aims to then deploy
  and test a number of
  technologies and methodologies
  for releasing open bibliographic
  data including XML, RDF, SPARQL,
  and JSON” …
 Cambridge University Library
        Metadata conversion
        Development
        Project management


       CARET
        Infrastructure support


       OCLC
        Licensing consultancy
        FAST / VIAF enrichment
 Value for money – Taxpayers

 Open data = affiliate marketing for
  our collections

 Drive innovation - vital buy-in from
  non library developer communities

 One of many open data projects at
  the time
 “Library catalogues have imposed on them librarian or supplier-made
  decisions about what can/can’t be searched and in what way. Some of these
  decisions are limited by current cataloguing rules, but not all; often the data is
  recorded, but not in a usable way, or is there but isn’t tapped by the
  interface. For example, in most catalogues you can limit by publication type to
  newspapers, but you can’t limit by frequency of the issues.”

 “Releasing data means that people can start to use it in the way they want to.”
 Most of the catalogue (3 million +)
     Bulk downloads of RDF triples
     Query-able ‘endpoints’
     FAST / VIAF enriched
     Snapshot


 RDF conversion tools

 Working model and code to decide
  on Marc21 record origin

 Codebase for ‘library centric’ RDF
  publishing website

 SPARQL tutorial

 Verbose blog
                                        Data and code at http://data.lib.cam.ac.uk
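
A query-able endpoint of this kind is normally addressed over HTTP with the query passed in a `query` parameter, as the SPARQL protocol describes. The sketch below only builds such a request URL. The endpoint address, the choice of the Dublin Core `title` property and the function name are assumptions for illustration, not details taken from the project.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address: the slides give only the site root.
ENDPOINT = "http://example.org/sparql"

# Ask for ten titles. Using dct:title is an assumed vocabulary choice.
QUERY = """
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?book ?title
WHERE { ?book dct:title ?title }
LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

Fetching that URL with any HTTP client, and asking for a JSON or XML result format, would be the next step against a live endpoint.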
 Examine contracts with major
  vendors

 Contact them and decide on re-use
  conditions

 Deduce record origin from Marc21
  fields
 Several places in Marc21 where
  this data could be held
  (015,035,038,994 …)

 Logic and hierarchy for
  examination

 Attempt at scripted analysis

 Marc21 fails at ‘IPR’

 Potential down the line for
  problem to persist if attribution is
  not handled correctly in future
  formats
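
The "logic and hierarchy for examination" could be sketched as an ordered list of checks where the first match wins. This is a minimal illustration, not the project's script: the order of the checks, the record layout (a dict of tag to field strings) and the function name are assumptions. The field meanings follow MARC 21 as commonly documented (038 record content licensor, 994 OCLC-defined local field, 035 system control number, 015 national bibliography number).

```python
def guess_record_origin(record: dict) -> str:
    """Guess where a record came from.

    `record` maps a MARC tag such as '035' to a list of field value strings.
    The examination order below is an assumption made for this sketch.
    """
    # 038 names the licensor of the record content explicitly.
    if record.get("038"):
        return "licensor: " + record["038"][0].strip()
    # 994 is a local field defined by OCLC, so its presence suggests OCLC.
    if record.get("994"):
        return "OCLC"
    # OCLC control numbers in 035 conventionally carry an (OCoLC) prefix.
    for value in record.get("035", []):
        if value.strip().startswith("(OCoLC)"):
            return "OCLC"
    # A national bibliography number points to a national library source.
    if record.get("015"):
        return "national bibliography"
    return "unknown"


print(guess_record_origin({"035": ["(OCoLC)12345"]}))
```

Because several fields can be present at once, the order of the checks decides the outcome, which is why the hierarchy matters as much as the individual rules.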
Need the right
license!
 Most vendors happy with
  permissive license for ‘non-
  marc21’ formats

 RLUK / BL B.N.B. – Public Domain
  Data License

 OCLC – ODC-By Attribution license
  with community norms
 RDF allows you to freely mix
  vocabularies

 Emerging consensus on
  bibliographic description

 BL and others leading the way

 Victory for pragmatism?
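
As a small illustration of mixing vocabularies, the sketch below describes one record with terms from both Dublin Core Terms and the Bibliographic Ontology (BIBO) and prints it as N-Triples. The record URI, the literal values and the helper name are invented for the example. The vocabulary choice is an assumption, not necessarily the one the project settled on.

```python
DCT = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Format one N-Triples statement; `obj` is a URI unless literal=True."""
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = '"' + escaped + '"'
    else:
        rendered = "<" + obj + ">"
    return "<%s> <%s> %s ." % (subject, predicate, rendered)


book = "http://example.org/entry/1"  # hypothetical record URI
lines = [
    triple(book, RDF_TYPE, BIBO + "Book"),                          # BIBO class
    triple(book, DCT + "title", "An example title", literal=True),  # Dublin Core
    triple(book, BIBO + "isbn", "0000000000", literal=True),        # BIBO property
]
print("\n".join(lines))
```

Nothing in the data model objects to the two namespaces sitting side by side on the same subject, which is the point of the first bullet above.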
 Punctuation as a function

 Binary encoding

 Numbers for field names

 Bad characters

 Replication of data in fields
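
"Punctuation as a function" refers to the separator marks that Marc records carry at the end of a subfield to introduce the next one. A conversion step has to strip them before the values become RDF literals. The sketch below shows one possible approach: the sample title field is invented, and the set of marks handled is an assumption that real data would certainly stretch.

```python
import re

# Trailing marks that introduce the next subfield: / : ; = and a closing comma.
_TRAILING = re.compile(r"\s*[/:;=,]\s*$")


def strip_trailing_punctuation(value: str) -> str:
    """Remove one trailing separator mark and surrounding whitespace."""
    return _TRAILING.sub("", value).strip()


# Invented title field as (subfield code, text) pairs.
subfields = [
    ("a", "Open data :"),
    ("b", "a practical guide /"),
    ("c", "by A. N. Author."),
]
clean = {code: strip_trailing_punctuation(text) for code, text in subfields}
print(clean)
```

The final full stop is left alone on purpose, since it cannot be told apart from an abbreviation without more context.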
 PHP script to match text against
  LOC subject headings – enrich with
  LOC GUID

 FAST / VIAF enrichment courtesy
  of OCLC
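
The project used a PHP script for the heading match. The same idea can be sketched in Python as a normalised string lookup. Everything specific here is hypothetical: the authority table, its URIs, the normalisation rules and the function names are stand-ins, not the project's data or code.

```python
import unicodedata


def normalise(heading: str) -> str:
    """Fold case, drop a trailing full stop and collapse whitespace."""
    text = unicodedata.normalize("NFKC", heading).casefold()
    text = text.rstrip(" .")
    return " ".join(text.split())


# Hypothetical authority table: heading label -> identifier URI.
AUTHORITY = {
    normalise("Library catalogs"): "http://example.org/subjects/0001",
    normalise("Metadata"): "http://example.org/subjects/0002",
}


def enrich(headings):
    """Split headings into (heading, uri) matches and unmatched leftovers."""
    matched, skipped = [], []
    for heading in headings:
        uri = AUTHORITY.get(normalise(heading))
        if uri:
            matched.append((heading, uri))
        else:
            skipped.append(heading)
    return matched, skipped


matched, skipped = enrich(["Library  Catalogs.", "Metadata", "Unknown topic"])
print(matched, skipped)
```

Exact matching after normalisation is the simplest option. It trades recall for the assurance that an identifier is never attached to the wrong heading.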
No. of records:               3,658,384
No. of records with LCSH
headings:                     2,709,878
Percentage with LCSH
headings:                     74%

No. of subject headings found: 5,889,048
No. of subject headings
skipped:                      45

Valid FAST subjects:          8,134,230
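
The percentage above can be checked directly from the two record counts. The second ratio (headings found per record that has LCSH headings) is an added reading of the figures, and it assumes "headings found" counts headings within those records.

```python
records = 3_658_384
records_with_lcsh = 2_709_878
headings_found = 5_889_048

# Share of records carrying at least one LCSH heading.
share = 100 * records_with_lcsh / records

# Assumed interpretation: average headings per record that has any.
per_record = headings_found / records_with_lcsh

print(f"{share:.1f}% of records have LCSH headings")
print(f"{per_record:.2f} headings per such record")
```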
 Marc / AACR2 cannot translate
  easily to semantically rich
  formats

 Libraries need to better utilise
  modern container / transfer
  standards (not necessarily RDF)

 No ‘one size fits all’ approach for
  future
Karen Coyle criticises the Marc21 Bibliographic Framework Transition Initiative
for not including museums, publishing, and IT professionals …

She argues that our data is not just for us to consume alone …

  “The next data carrier for libraries needs to be developed as a truly open effort.
 It should be led by a neutral organization (possibly ad hoc) that can bring
 together the wide range of interested parties and make sure that all voices are
 heard. Technical development should be done by computer professionals with
 expertise in metadata design. The resulting system should be rigorous yet flexible
 enough to allow growth and specialization.”

Steering for RDA and Marc replacement needs non-librarian input or ownership


http://kcoyle.blogspot.com/2011/08/bibliographic-framework-transition.html
Open Bibliography 2
Lightweight approach to sharing
bibliography now it's open …
   Bottom up, community led software
    called Bibserver
   Wikimedia for bib data
   JSON as a container format – flexible, able
    to cope with different structures,
    vocabularies etc.
   Engagement with UK PubMed Central
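
To illustrate why JSON suits records with different structures, the sketch below puts a book and an article with different keys and nesting into one list and round-trips them. The field names and values are invented for the example and are not claimed to follow the Bibserver schema.

```python
import json

# Two invented records with deliberately different shapes.
records = [
    {
        "type": "book",
        "title": "An example monograph",
        "author": [{"name": "Author, A. N."}],
        "identifier": [{"type": "isbn", "id": "0000000000"}],
    },
    {
        "type": "article",
        "title": "An example article",
        "journal": {"name": "Example Journal", "volume": "1"},
        "subject": ["Metadata"],
    },
]

payload = json.dumps(records, indent=2)  # serialise for sharing
restored = json.loads(payload)           # read back without a fixed schema
print(payload)
```

No schema has to be agreed before the two records can travel together, which is the flexibility the bullet above refers to.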



C.L.O.C.K. (Cambridge/ Lincoln open
cataloguing knowledgebase)
New approaches to traditional library
workflows (copy cataloguing) using
open data
    Using rich open data to enrich bare bones
     data
    NoSQL database technology
    APIs as key deliverables
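
Enriching bare-bones data from a richer open record can be pictured as a merge that fills only the gaps. The sketch below is one possible rule under stated assumptions: records are plain dicts, they have already been matched (for instance on ISBN), and locally held values always win. The field names and function name are invented.

```python
def enrich_record(bare: dict, rich: dict) -> dict:
    """Copy values from `rich` only where `bare` has nothing."""
    merged = dict(bare)
    for key, value in rich.items():
        if not merged.get(key):  # missing, empty string or empty list
            merged[key] = value
    return merged


bare = {"isbn": "0000000000", "title": "Example", "subjects": []}
rich = {
    "isbn": "0000000000",
    "title": "Example : a fuller title",
    "subjects": ["Metadata"],
    "language": "eng",
}
result = enrich_record(bare, rich)
print(result)
```

Keeping the local title while borrowing subjects and language is a policy choice. A real copy-cataloguing workflow might prefer the richer value in some fields.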
[Diagram: linked data entities. Bibliographic, Holdings, Libraries, Librarians, Archives, Special collections, Course lists, Transactions, Creator / entity, Language, Place of publication, FAST subject headings, LCSH subject headings]
 Anonymous usage data from
  circulation systems

 Aggregated from several University
  Libraries

 API feed

 Available openly (CC-BY)
 It becomes (even) easier to go to
  Amazon

 Our status as authoritative data
  providers will be (further) eroded

 Assume we can

 Assume we should (where we can)
 http://www.discovery.ac.uk -
  Discovery

 Ncg4lib mailing list

 http://okfn.org - Open Knowledge
  Foundation

 http://data.lib.cam.ac.uk
 Ed Chamberlain

   @edchamberlain
   emc59@cam.ac.uk
   http://www.slideshare.net/EdmundChamberlain/

Comet project

  • 2.  Cambridge Open Metadata  Funded by the JISC Infrastructure for Resource Discovery Project
  • 3.  Cambridge, back in 2010 …  OKFN - Open Bibliography project (2010-2011)  Debate around re-use of catalogue records from vendors (not just OCLC)  CUL already provides public APIs  Increasing interest in linked data  FAST / VIAF  Lorcan
  • 4.  “The initial aim of this project will be to identify and release a substantial record set to an external platform under an open license” …  “For OCLC-derived bibliographic records data will be released in a fashion compliant with their WorldCat Rights and Responsibilities for the OCLC Cooperative” …  “The project aims to then deploy and test and number of technologies and methodologies for releasing open bibliographic data including XML, RDF, SPARQL, and JSON” …
  • 5.  Cambridge University Library  Metadata conversion  Development  Project management  CARET  Infrastructure support  OCLC  Licensing consultancy  FAST / VIAFF enrichment
  • 6.  Value for money – Taxpayers  Open data = affiliate marketing for our collections  Drive innovation - vital buy-in from non library developer communities  One of many open data projects at the time
  • 7.  “Library catalogues have imposed on them librarian or supplier-made decisions about what can/can’t be searched and in what way. Some of these decisions are limited by current cataloguing rules, but not all; often the data is recorded, but not in a usable way, or is there but isn’t tapped by the interface. For example, in most catalogues you can limit by publication type to newspapers, but you can’t limit by frequency of the issues.”  “Releasing data means that people can start to use it in the way they want to.”
  • 8.  Most of the catalogue (3 million +)  Bulk downloads of RDF triples  Query-able ‘endpoints’  FAST / VIAF enriched  Snapshot  RDF conversion tools  Working model and code to decide on Marc21 record origin  Codebase for ‘library centric’ RDF publishing website  SPARQL tutorial  Verbose blog Data and code at http://data.lib.cam.ac.uk
  • 9.
  • 10.
  • 11.  Examine contracts with major vendors  Contact them and decide on re-use conditions  Deduce record origin from Marc21 fields
  • 12.  Several places in Marc21 where this data could be held (015,035,038,994 …)  Logic and hierarchy for examination  Attempt at scripted analysis  Marc21 fails at ‘IPR’  Potential down the line for problem to persist if attribution is not handled correctly in future formats
  • 13.
  • 14. Need the right license!  Most vendors happy with permissive license for ‘non-marc21’ formats  RLUK / BL B.N.B. – Public Domain Data License  OCLC – ODC-By Attribution license with community norms
  • 15.
  • 16.  RDF allows you to freely mix vocabularies  Emerging consensus on bibliographic description  BL and others leading the way  Victory for pragmatism?
  • 17.  Punctuation as a function  Binary encoding  Numbers for field names  Bad characters  Replication of data in fields
  • 18.  PHP script to match text against LOC subject headings – enrich with LOC GUID  FAST / VIAF enrichment courtesy of OCLC
  • 19. No. of records: 3,658,384 No. of records with LCSH headings: 2,709,878 Percentage with LCSH headings: 74% No. of subject headings found: 5,889,048 No. of subject headings skipped: 45 Valid FAST subjects: 8,134,230
  • 20.  Marc / AACR2 cannot translate easily to semantically rich formats  Libraries need to better utilise modern container / transfer standards (not necessarily RDF)  No ‘one size fits all’ approach for future
  • 21. Karen Coyle criticises the Marc21 Bibliographic Framework Transition Initiative for not including museums, publishing, and IT professionals … She argues that our data is not just for us to consume alone … “The next data carrier for libraries needs to be developed as a truly open effort. It should be led by a neutral organization (possibly ad hoc) that can bring together the wide range of interested parties and make sure that all voices are heard. Technical development should be done by computer professionals with expertise in metadata design. The resulting system should be rigorous yet flexible enough to allow growth and specialization.” http://kcoyle.blogspot.com/2011/08/bibliographic-framework-transition.html Steering or ownership for RDA and Marc replacement needs non-librarian input
  • 22.
  • 23. Open Bibliography 2: Lightweight approach to sharing bibliography now it’s open …  Bottom up, community led software called Bibserver  Wikimedia for bib data  JSON as a container format – flexible, able to cope with different structures, vocabularies etc.  Engagement with UK PubMed Central. C.L.O.C.K. (Cambridge / Lincoln open cataloguing knowledgebase): New approaches to traditional library workflows (copy cataloguing) using open data  Using rich open data to enrich bare bones data  NOSQL database technology  APIs as key deliverables
  • 24. Diagram labels: FAST subject headings; Language; Place of publication; LCSH subject headings; Special collections; Archives; Bibliographic; Creator / entity; Holdings; Libraries; Librarians; Course lists; Transactions
  • 25.  Anonymous usage data from circulation systems  Aggregated from several University Libraries  API feed  Available openly (CC-BY)
  • 26.  It becomes (even) easier to go to Amazon  Our status as authoritative data providers will be (further) eroded  Assume we can  Assume we should (where we can)
  • 27.  http://www.discovery.ac.uk - Discovery  NGC4Lib mailing list  http://okfn.org - Open Knowledge Foundation  http://data.lib.cam.ac.uk
  • 28.  Ed Chamberlain  @edchamberlain  emc59@cam.ac.uk  http://www.slideshare.net/EdmundChamberlain/
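
Slides 11 and 12 describe deducing record origin by examining several Marc21 fields (015, 035, 038, 994 …) in a fixed hierarchy, and the notes mention listing bib_ids by record vendor. The following is only a minimal sketch of that idea, not the project's actual script. The order of the checks, the vendor labels, and the simplified record shape (a dict of tag to list of field strings) are all assumptions made for illustration; `deduce_origin` and `list_bib_ids_by_vendor` are hypothetical names.

```python
from collections import defaultdict


def deduce_origin(record):
    """Return a vendor label for one simplified record, or 'unknown'.

    `record` maps a Marc21 tag (e.g. "035") to a list of field values.
    The priority order below is an illustrative assumption.
    """
    # 038 (Record Content Licensor) names the licensor directly: check first.
    for value in record.get("038", []):
        if value.strip():
            return value.strip()
    # 035 (System Control Number) values from OCLC carry an "(OCoLC)" prefix.
    for value in record.get("035", []):
        if value.strip().startswith("(OCoLC)"):
            return "OCLC"
    # 994 is an OCLC-defined local field; treat its presence as OCLC origin.
    if record.get("994"):
        return "OCLC"
    # 015 holds a national bibliography number.
    if record.get("015"):
        return "national bibliography"
    return "unknown"


def list_bib_ids_by_vendor(records):
    """Group bib_ids by deduced origin; `records` maps bib_id to record."""
    grouped = defaultdict(list)
    for bib_id, record in records.items():
        grouped[deduce_origin(record)].append(bib_id)
    return dict(grouped)
```

Records that match none of the checks fall into an "unknown" group, which reflects the slides' point that origin is practically quite hard to determine from Marc21 alone.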

Editor's Notes

  1. Respond to academic / national demand for Open Data – previously given some to the Open Bibliography project. Get our data to non-librarians and provide tax-payer value for money. CUL already provides public APIs. Gain in-house experience of RDF. Move library services forward.
  2. This is my colleague Katie’s write-up of a talk led by Owen Stephens; it really sums it all up …
  3. See if there were any express contractual clauses saying we could not redistribute.
  4. Where does a record come from? Practically quite hard to determine … Several places in Marc21 where this data could be held … Logic for examination. Attempt at scripted analysis – list bib_ids by record vendor.
  5. Most vendors happy with permissive license for ‘non-marc21’ formats. The non-Marc point is not an issue in this context: no one outside of library land cares about a load of binary encoded numbers … we are re-purposing Marc-originated data for a wider audience. RLUK / BL BNB – PDDL. OCLC – ODC-By Attribution license. No good reason not to re-publish – need the right license!
  6. RDF allows you to freely mix vocabularies – choices of fields to describe your data. Emerging consensus on bibliographic description: thankfully no-one is attempting to recreate Marc; it is mainly a use of Qualified Dublin Core, FOAF and other relevant bibliographic-focused vocabularies. There may never emerge such a … Our conversion script is CSV customisable. BL and others leading the way on vocab choice – they did some great data modelling, which we stayed clear of. It’s my personal hope that we never see a heavyweight approach of the style of Marc again. As we move forward with new container formats, pragmatism needs to rule over completionism if we are to successfully share our valuable data with a wider user base.
  7. PHP script to match text against LOC subject headings – enrich with LOC GUID. FAST / VIAF enrichment courtesy of OCLC. FAST – next generation subject headings – very exciting. VIAF – Virtual International Authority File. OCLC want to develop these as linked services, keen to help.
  8. Marc / AACR2 cannot translate well to semantically rich formats. Need better container / transfer standards (not necessarily RDF).
  9. So despite the change, it’s my worry that those in charge of Marc21 and RDA developments are not thinking widely enough about the new open ecosystem that our data must inhabit.
  10. Two projects, focused less on data release and license and more about exploiting its value in an open environment
  11. If we don’t try and shift … It becomes easier to go to Amazon, who have awesome APIs, or even Google Books (theirs are rubbish). Our status as authoritative data providers will be further eroded. No-one will want to play with us if we do not share.
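
Note 6 says RDF lets you freely mix vocabularies, with Dublin Core and FOAF as the main choices. A rough sketch of what that mixing looks like, using only the Python standard library to emit N-Triples lines: the `http://example.org` base URI, the `/entry/` and `/entity/` path pattern, and the function name `record_to_ntriples` are assumptions for illustration, not the URIs or conversion code the project actually used.

```python
# Vocabulary namespaces: Dublin Core terms and FOAF.
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslash, quote, and newlines."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def record_to_ntriples(base, bib_id, title, creator_name):
    """Describe one record with Dublin Core and its creator with FOAF."""
    book = iri(f"{base}/entry/{bib_id}")  # hypothetical URI pattern
    person = iri(f"{base}/entity/{bib_id}-creator")  # hypothetical URI pattern
    triples = [
        (book, iri(DCTERMS + "title"), literal(title)),
        (book, iri(DCTERMS + "creator"), person),
        (person, iri(RDF_TYPE), iri(FOAF + "Person")),
        (person, iri(FOAF + "name"), literal(creator_name)),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


if __name__ == "__main__":
    for line in record_to_ntriples(
        "http://example.org", "123", "An example title", "Jane Doe"
    ):
        print(line)
```

The record itself is described with Dublin Core properties while the creator is typed and named with FOAF, so two vocabularies sit side by side in one small graph without either having to model everything.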