2. www.sti-innsbruck.at
• BBC working to integrate data and link documents across BBC domains
• Collaboration with Freie Universität Berlin, Rattle Research (and Ontotext)
• Semantic Web context: usage of Linked Data from MusicBrainz and DBpedia
3. Problem
• BBC publishes large amounts of online content: text, video and audio
• Mostly data for broadcast brands and domain-specific microsites
• Services are divided by domain, e.g. food, music, news
No interlinking between these domain-specific sites, so the full potential of the available data is not used
4. Objectives
• DBpedia to provide a common “controlled” vocabulary and equivalency service, which in turn is used to add “topic badges” to existing, legacy web pages
• Soft transition from the old system to the new one
– Developing a new service that supports the branding of our radio stations, TV channels and programmes (bbc.co.uk/programmes)
– Developing a new music offering (bbc.co.uk/music/beta) that builds on existing open web standards and is fully integrated with the programme support service
– Simple navigational elements (i.e. topic badges and term extraction) to support contextual, semantic navigation
– Common set of web-scale identifiers to help classify all BBC online content (and external URLs) and to help create equivalency between multiple vocabularies
5. Cross-Linking Legacy Content with Legacy Systems
• Desire to link to further BBC domains (apart from programmes and music)
– Through an “about” relationship between programmes, people, places and subjects
• Data was created with a legacy auto-categorization system called CIS.
• CIS holds a hierarchy of terms in five main top-level classes:
– Proper names
– Subjects
– Brands
– Time periods
– Places
Objects identified with /programmes and /music are also found within other domains, so a mechanism is needed to map between equivalent terms
Linking CIS Concepts to DBpedia
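The CIS hierarchy and the equivalence mapping described above can be pictured with a small data structure. This is only a sketch: the five class names come from the slide, but `CisClass`, `CisTerm`, `EQUIVALENTS`, `dbpedia_uri` and the sample terms are hypothetical names chosen for illustration, not part of the BBC system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CisClass(Enum):
    """The five top-level CIS classes listed on the slide."""
    PROPER_NAMES = "Proper names"
    SUBJECTS = "Subjects"
    BRANDS = "Brands"
    TIME_PERIODS = "Time periods"
    PLACES = "Places"


@dataclass(frozen=True)
class CisTerm:
    """A term in the CIS hierarchy (hypothetical shape)."""
    label: str
    top_class: CisClass
    parent: Optional[str] = None  # label of the broader term, if any


# Hypothetical equivalence table: CIS term label -> DBpedia resource URI.
# The entries are illustrative samples, not data from the BBC.
EQUIVALENTS = {
    "London": "http://dbpedia.org/resource/London",
    "Jazz": "http://dbpedia.org/resource/Jazz",
}


def dbpedia_uri(term: CisTerm) -> Optional[str]:
    """Return the DBpedia URI mapped to a CIS term, or None if unmapped."""
    return EQUIVALENTS.get(term.label)


london = CisTerm("London", CisClass.PLACES)
print(dbpedia_uri(london))
```

Keeping the mapping separate from the term hierarchy means the legacy CIS data stays untouched while equivalences are added or corrected.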
7. Linking BBC Domains
• DBpedia weighted label lookup, using Wikipedia inter-article links as the weight indicator
– links(redirect) * log2(weight(article))
• Context-based disambiguation
– Disambiguate possible concept matches: identify similarity contexts of CIS terms by clustering matches and finding the corresponding contexts in DBpedia
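The weighting formula and the disambiguation step above can be sketched as follows. The slide does not define its terms precisely, so this sketch assumes that links(redirect) is the number of inter-article links arriving through the matched label or redirect, and that weight(article) is the total number of inbound links to the article. The disambiguation here uses a simple context-overlap count in place of the clustering the slide mentions. `Candidate`, `score`, `rank` and `disambiguate` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A possible DBpedia match for a CIS term (hypothetical shape)."""
    uri: str
    redirect_links: int   # assumed meaning of links(redirect)
    article_weight: int   # assumed meaning of weight(article)
    context: frozenset = frozenset()  # assumed: related DBpedia resources


def score(c: Candidate) -> float:
    """links(redirect) * log2(weight(article)), as given on the slide."""
    if c.article_weight < 1:
        return 0.0  # log2 is undefined for non-positive weights
    return c.redirect_links * math.log2(c.article_weight)


def rank(candidates):
    """Order candidates from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)


def disambiguate(candidates, sibling_context: frozenset) -> Candidate:
    """Pick the candidate sharing the most context with neighbouring
    CIS terms, breaking ties by score. This overlap count is a
    simplification standing in for the clustering step."""
    return max(candidates,
               key=lambda c: (len(c.context & sibling_context), score(c)))


popular = Candidate("a", redirect_links=10, article_weight=1024,
                    context=frozenset({"x"}))
niche = Candidate("b", redirect_links=5, article_weight=4,
                  context=frozenset({"y", "z"}))
print(rank([niche, popular])[0].uri)
print(disambiguate([popular, niche], frozenset({"y"})).uri)
```

The logarithm damps the influence of very heavily linked articles, so a strong label match can still outrank a popular but weakly matching article.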
8. Linking Documents to Concepts
• Named entity extraction system: Muddy Boots
– Chosen over solutions from OpenCalais, Twine and Zemanta because it reuses existing web identifiers, i.e. Wikipedia/DBpedia URIs
• Recognize entities in BBC News articles
• Use DBpedia identifiers for those entities
• Content Link Tool to add or remove DBpedia identifiers from any given BBC URL
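The add/remove behaviour of the Content Link Tool can be pictured as a small store that maps each BBC URL to a set of DBpedia identifiers. This is a minimal in-memory sketch, not the BBC implementation: the class `ContentLinks`, its methods and the example article URL are all hypothetical.

```python
class ContentLinks:
    """Hypothetical store of DBpedia identifiers attached to BBC URLs."""

    def __init__(self):
        self._links = {}  # BBC URL -> set of DBpedia URIs

    def add(self, url: str, dbpedia_uri: str) -> None:
        """Attach a DBpedia identifier to a URL (idempotent)."""
        self._links.setdefault(url, set()).add(dbpedia_uri)

    def remove(self, url: str, dbpedia_uri: str) -> None:
        """Detach an identifier; unknown URLs or identifiers are ignored."""
        uris = self._links.get(url)
        if uris is not None:
            uris.discard(dbpedia_uri)

    def identifiers(self, url: str) -> list:
        """Return the identifiers for a URL in sorted order."""
        return sorted(self._links.get(url, ()))


links = ContentLinks()
article = "http://news.bbc.co.uk/example-article"  # hypothetical URL
links.add(article, "http://dbpedia.org/resource/London")
links.add(article, "http://dbpedia.org/resource/Jazz")
links.remove(article, "http://dbpedia.org/resource/Jazz")
print(links.identifiers(article))
```

Using a set per URL keeps repeated additions harmless, which suits a tool where editors correct the output of automatic entity extraction by hand.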
9. Create User Journeys: Topic Pages and Navigation Badges
• Topic pages
– Creation of aggregation pages of unstructured and structured content
– Pull together the modeled world of BBC programmes (CIS identifiers mapped to DBpedia) and the unstructured world of BBC News articles
• Navigation badges
– Once a user has entered an area of BBC content, there are few links through to other related content
– Providing this link is the role of the navigation badge
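The aggregation behind a topic page can be sketched as grouping both kinds of content under a shared DBpedia identifier. The sketch assumes programmes arrive as (URL, DBpedia URI) pairs from the CIS mapping and news articles as (URL, list of DBpedia URIs) pairs from entity extraction. `build_topic_pages` and the sample URLs are hypothetical.

```python
from collections import defaultdict


def build_topic_pages(programmes, articles):
    """Group structured and unstructured content by DBpedia URI.

    programmes: iterable of (programme_url, dbpedia_uri) pairs
    articles:   iterable of (article_url, [dbpedia_uri, ...]) pairs
    Returns a dict: dbpedia_uri -> {"programmes": [...], "articles": [...]}
    """
    pages = defaultdict(lambda: {"programmes": [], "articles": []})
    for url, uri in programmes:
        pages[uri]["programmes"].append(url)
    for url, uris in articles:
        for uri in uris:
            pages[uri]["articles"].append(url)
    return dict(pages)


jazz = "http://dbpedia.org/resource/Jazz"
london = "http://dbpedia.org/resource/London"
pages = build_topic_pages(
    programmes=[("bbc.co.uk/programmes/example-1", jazz)],   # hypothetical
    articles=[("news.bbc.co.uk/example-2", [jazz, london])],  # hypothetical
)
print(sorted(pages))
```

Each key of the result corresponds to one topic page, and the same lookup could supply the targets for a navigation badge on any page tagged with that identifier.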
10. Conclusions
• User experience at the center of BBC efforts
• Semantics as enabler
• What we can learn from the BBC
– The user should be at the center of efforts
– Pages are not strictly structured according to the domain model
– Semantics primarily enable smart interlinking to additional content
– Well-hidden magic
– Simplicity of domain models is beauty
• For more information, refer to the “Beyond the Polar Bear” presentation
– http://www.slideshare.net/reduxd/beyond-the-polar-bear