This presentation was provided by Marydee Ojala of Information Today during the NISO event "The Impact of the Interface: Traditional and Non-Traditional Content," held on November 20, 2019.
This document discusses best practices for content delivery platforms to support artificial intelligence projects. It recommends that platforms (1) accept that they do not have all the data needed and should integrate third-party sources, (2) provide consistent tagging of content, (3) offer a lightweight programmatic interface, (4) embrace allowing large amounts of content to be taken offline for analysis, and (5) enable complex filtering and selection of data. The document also suggests platforms could consider offering preprocessed datasets or AI tools as new products.
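Recommendations (3) through (5) can be illustrated with a minimal sketch of a lightweight programmatic interface that supports complex filtering and bulk export for offline analysis. All names here (`Record`, `fetch`, the corpus contents) are invented for illustration, not any real platform's API.

```python
# Minimal sketch of a filterable bulk-export interface for a content platform.
# A real platform would page through a REST API; here the corpus is in memory.
from dataclasses import dataclass, field

@dataclass
class Record:
    id: str
    subject: str
    year: int
    tags: list = field(default_factory=list)

CORPUS = [
    Record("a1", "chemistry", 2018, ["peer-reviewed"]),
    Record("a2", "physics", 2020, ["preprint"]),
    Record("a3", "chemistry", 2021, ["peer-reviewed", "open-access"]),
]

def fetch(subject=None, year_from=None, require_tags=()):
    """Select records matching all given filters, for offline analysis."""
    out = []
    for r in CORPUS:
        if subject and r.subject != subject:
            continue
        if year_from and r.year < year_from:
            continue
        if not set(require_tags) <= set(r.tags):
            continue
        out.append(r)
    return out

batch = fetch(subject="chemistry", require_tags=["peer-reviewed"])
print([r.id for r in batch])
```

The point of the sketch is that an AI project rarely wants one document at a time; it wants "everything matching these criteria" in a form it can take away and process.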
This document discusses best practices for supporting open science. It recommends adopting existing solutions where possible rather than developing new ones. It also suggests engaging with researchers, incentivizing open practices, allowing for innovation and failure, collaborating with peers, and keeping service delivery options open. The document concludes by inviting attendees to a workshop on delivering research data management services.
About the Webinar
The development and rising popularity of the massive open online course (MOOC) presents a new opportunity for libraries to be involved in the education of patrons, to highlight the resources libraries provide and to further demonstrate the value of the library to administrators. There are, of course, a host of logistics to be considered when deciding to organize or support a MOOC. Diminished library budgets and staffing levels challenge libraries both monetarily and administratively. Marketing the course, mounting it on a site, securing copyright permissions and negotiating licensing for course materials, managing the course while in progress and troubleshooting technical problems add to the issues that have caused some libraries to hesitate in joining the MOOC movement. On the other hand, partnerships such as that between Georgetown University and edX, itself an initiative of Harvard and MIT, allow a pooling of resources thereby easing the burden on any one library. In some cases price breaks for certain course materials used in MOOCs can help draw students to the course, though the pricing must still be negotiated by the course organizer. A successful MOOC, such as the RootsMOOC, created by the Z. Smith Reynolds Library at Wake Forest University and the State Library of North Carolina, can bring awareness of library resources to a broad audience.
In the end, libraries must ask whether the advantages of participating in a MOOC outweigh the challenges. The speakers for this webinar will consider these issues surrounding MOOCs and libraries and try to answer the question of whether the impact of libraries on MOOCs has been realized or is still brewing.
Agenda
Introduction
Todd Carpenter, Executive Director, NISO
MOOCS: Assessing the Landscape and Trends of Open Online Learning
Heather Ruland Staines, Director, Publisher and Content Strategy, ProQuest SIPX
The RootsMOOC Project or: that time we threw a genealogy party and 4,000 people showed up
Kyle Denlinger, eLearning Librarian, Wake Forest University Z. Smith Reynolds Library
Rebecca Hyman, Reference and Outreach Librarian, Government and Heritage Library, State Library of North Carolina
MOOCS and Me: Georgetown's Experience with MOOC Production
Barrinton Baynes, Multimedia Projects Manager, Gelardin New Media Center, Georgetown University Library
The Evidence Hub: Harnessing the Collective Intelligence of Communities to Bu... (Anna De Liddo)
The Evidence Hub is a tool that harnesses collective intelligence to build evidence-based knowledge. It allows communities to gather and debate evidence for ideas and solutions. Users can easily add evidence, counter-evidence, and have conversations to share knowledge. Visual analytics show social dynamics like key players and agreements/disagreements. Future research focuses on defining participation roles and processes, and developing reporting, discourse analytics, and geo-deliberation analytics.
This document discusses the FAIR data principles and increasing adoption of FAIR. It begins by explaining the 15 FAIR principles for findable, accessible, interoperable and reusable data. It then discusses how adoption is increasing through funder requirements, the role of FAIR within EOSC, and related projects. However, it notes that most data is still not managed or shared according to FAIR principles due to barriers like time and effort required as well as lack of incentives and rewards. The document argues that both cultural and technical aspects must be addressed to fully implement FAIR.
Software Repositories for Research -- An Environmental Scan (Micah Altman)
This document provides a summary of the state of software curation based on an environmental scan of research software repositories and related practices. The summary finds:
1) There are no comprehensive indices of software archives and orders of magnitude fewer software archives than data archives. Institutional repositories offer little functionality for software archiving.
2) Very few funders have policies addressing software curation. There is little available advice for researchers who wish to curate, cite, and preserve software.
3) Substantial reproducibility failures continue to be reported due to a lack of software preservation. In summary, software curation looks a lot like data curation did a decade ago, with no universal standards for citing and archiving software.
A presentation on how the Swedish-speaking department of the Finnish public broadcaster Yle has used semantic metadata to tie its services together, link content, make recommendations, and prepare for personalization.
The document discusses three options for libraries to adopt linked data: BIBFRAME 2.0, Schema.org, and Linky MARC. BIBFRAME 2.0 is a library standard that allows standardized RDF interchange but is not recognized outside libraries. Schema.org is the de facto web standard that improves discovery on the web but lacks detail for library needs. Linky MARC adds URIs to MARC without changing its format. The document evaluates the pros and cons of each and who may want to adopt each standard.
This document discusses three options for libraries to implement linked data: BIBFRAME 2.0, Schema.org, and Linky MARC. BIBFRAME 2.0 is a library standard for linked data but is not recognized outside the library community. Schema.org is the main standard for structured data on the web and could increase library discoverability, but lacks detail for library cataloging. Linky MARC adds HTTP URIs to existing MARC records to preserve entity identifiers without converting to linked data. The document also proposes a new open project called "bibframe2schema.org" to map BIBFRAME to Schema.org and promote its adoption for libraries.
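The "Linky MARC" idea can be sketched in a few lines: the MARC record is left unchanged except that a subfield carrying an HTTP URI for the real-world entity is appended. The helper function, the field strings, and the placeholder URI below are all illustrative, not a standard API.

```python
# Hypothetical sketch of the "Linky MARC" approach: keep the existing MARC
# field and append a $1 subfield holding an entity URI, so linked-data
# consumers can follow the link while MARC workflows are undisturbed.
def add_entity_uri(marc_field: str, uri: str) -> str:
    """Append a $1 (real-world-object URI) subfield to a text-form MARC field."""
    return f"{marc_field} $1 {uri}"

# A text rendering of a personal-name heading; the URI is a placeholder.
field_100 = "100 1# $a Austen, Jane, $d 1775-1817"
linked = add_entity_uri(field_100, "https://example.org/entity/austen-jane")
print(linked)
```

The design choice is the one the document describes: entity identifiers are preserved inside MARC itself, deferring any wholesale conversion to BIBFRAME or Schema.org.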
The HathiTrust Research Center: Enabling New Knowledge Through Shared Infrastructure
Robert McDonald - HathiTrust Research Center Executive committee member; Associate Dean for Library Technologies, Indiana University
This document provides an introduction to the Semantic Web and Linked Open Data. It discusses how standards like RDF, XML, and OWL allow machines to better understand the meaning of data on the web. It describes how ontologies provide a vocabulary to define relationships between resources. The document outlines the benefits of publishing data as Linked Open Data using these standards, including making data more interoperable and accessible to both humans and machines. Examples are given of biomedical research projects that use Semantic Web technologies to integrate and link different types of data.
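The core RDF idea the introduction describes, data as subject-predicate-object triples that machines can traverse, can be shown with a toy graph. Real systems use libraries such as rdflib and serializations like Turtle; the URIs and the `objects` helper below are invented for illustration.

```python
# Toy triple store: each fact is a (subject, predicate, object) triple.
# Meaning lives in the links, so a program can chain them to answer questions.
EX = "https://example.org/"

triples = {
    (EX + "paper42", EX + "about", EX + "diabetes"),
    (EX + "paper42", EX + "author", EX + "dr_lee"),
    (EX + "diabetes", EX + "subClassOf", EX + "disease"),
}

def objects(subject, predicate):
    """Follow one link in the graph, like a single SPARQL triple pattern."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}

# "What is paper42 about, and what kind of thing is that?"
topic = objects(EX + "paper42", EX + "about").pop()
print(objects(topic, EX + "subClassOf"))
```

This is the interoperability benefit in miniature: because every node is a URI, a second dataset using the same identifiers can be merged by simply unioning the triple sets.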
Organization identifiers are a key part of the scholarly communications infrastructure. At the beginning of 2017, Crossref, DataCite and ORCID formed a working group to establish principles and specifications for an open, independent, non-profit identifier registry focused on the disambiguation of researcher affiliations. The group published a set of recommendations and a Request for Information (RFI) to solicit comment and interest from the broader scholarly community in developing the registry. This session will give an overview of the work and an update on current progress.
FAIR Data Experiences - Kees van Bochove, The Hyve
Talk at Bio IT World 2018 FAIR Data for Genomic Applications track.
Implementation of the FAIR Data Principles is a crucial step for all organizations pursuing a (biomedical) data-driven strategy, both to improve the effectiveness of scientists and doctors as well as computerized aides and autonomous programs. This talk will provide a number of concrete examples of how various customers of The Hyve, including large pharma companies, biobanks and registries and national health data sharing initiatives, have employed data FAIRification strategies to improve the (re)usability of their healthcare and biology data, and of the open source software tools and standards that are used and being further developed for that purpose.
FAIR webinar, Ted Slater: Progress towards commercial FAIR data products and ... (Pistoia Alliance)
Elsevier is a global information analytics business that helps institutions and professionals advance healthcare and open science to improve performance for the benefit of humanity.
In this webinar, we discuss how Elsevier is increasingly leveraging the FAIR Guiding Principles to improve its products and services to better serve the scientific community.
American Art Collaborative Planning Grant Educational Briefings
Linked Data and Tools
Pedro Szekely - USC/Information Sciences Institute
September 30, 2014
Schema.org Structured Data: the What, Why, & How (Richard Wallis)
This document discusses Schema.org structured data, including its origins in the Semantic Web and Linked Open Data movements. Schema.org was created in 2011 to provide a common vocabulary for structured data markup on web pages. It allows search engines and other applications to understand the intended meaning and relationships of information on web pages. The document provides examples of using Schema.org structured data and microdata, and recommends applying it across various page types to help search engines better understand websites.
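The markup the document describes is commonly written as JSON-LD embedded in a page. A minimal sketch, with item values invented for illustration:

```python
# Build Schema.org JSON-LD for a simple catalogue item. Embedding the output
# in a <script type="application/ld+json"> tag lets search engines read the
# page's intended meaning (type, author, date) rather than guessing from text.
import json

book = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "An Example Catalogue Record",
    "author": {"@type": "Person", "name": "A. Librarian"},
    "datePublished": "2016",
}

print(json.dumps(book, indent=2))
```

The same vocabulary can also be expressed as microdata attributes directly on HTML elements; JSON-LD simply keeps the structured data in one block.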
Building a collaborative RDM community, Research Data Network (Jisc RDM)
This document summarizes Dr. Marta Teperek's presentation on building a collaborative research data management (RDM) community. The presentation covered how not to start RDM services by mandating data sharing, and instead focusing on the benefits of sharing. It discussed Cambridge University's democratic approach to developing RDM services by empowering researchers, and the positive feedback received. Collaboration, open communication, and shaping services and policies with researchers were emphasized as key to success.
The document discusses solutions to overcoming the tragedy of the data commons through shared metadata. It describes how large scientific projects can share data at low cost by starting from overlapping common metadata terms and having their metadata teams work together. Reusing shared metadata leads to increased reusability of data across projects. The document advocates for developing metadata as evolving, linked resources rather than predefined standards, and provides examples of how this approach has helped scientific collaborations and government data sharing initiatives succeed.
Access to biomedical data is increasingly important to enable data driven science in the research community.
The Linked Open Data (LOD) principles (by Tim Berners-Lee) have been suggested as a way to judge the quality of data by its accessibility (open data access), its format and structure, and its interoperability with other data sources.
The objective is to use interoperable data sources across the Web with ease.
The FAIR (findable, accessible, interoperable, reusable) data principles have been introduced for similar reasons with a stronger emphasis on achieving reusability.
In this presentation we assess the FAIR principles against the LOD principles to determine to which degree the FAIR principles reuse the LOD principles and to which degree they extend them.
This assessment helps to clarify the relationship between the two schemes and gives a better understanding of what FAIR adds in comparison to LOD.
We conclude that LOD gives a clear mandate for the openness of data, whereas FAIR asks for a stated license for access and thus includes the concept of reusability under consideration of the license agreement.
Furthermore, FAIR makes strong reference to the contextual information required to improve reuse of the data, e.g., provenance information.
According to the LOD principles, such metadata would be considered interoperable data as well; however, the requirement to extend data with metadata indicates that FAIR is an extension of LOD (rather than the inverse).
American Art Collaborative Linked Open Data presentation to "The Networked Cu..." (American Art Collaborative)
An August 2017 presentation by Eleanor Fink to "The Networked Curator: Association of Art Museum Curators Foundation Digital Literacy Workshop for Art Curators"
Ken Chad presented the keynote at the EDS (EBSCO Discovery Service) conference at Regent's University London in July 2016. He reviewed future trends for Google and enterprise search, including factors such as voice (‘conversational’) search, the ‘ultimate assistant’, entities (‘things not strings’), visual search, and the role of big data, context and intention. He then looked at some trends in library discovery services. There will continue to be a multiplicity of approaches open to users, and Ken recommended that libraries do more to focus on the needs of users – the ‘jobs’ they were trying to do – in order to acquire and/or innovate new approaches to library discovery services.
This document summarizes a discussion on using Linked Open Data (LOD) for museums. It discusses:
1) The American Art Collaborative (AAC), a consortium of US museums working to implement LOD within their collections to provide open access and interconnect data.
2) The benefits of LOD include telling fuller stories, augmenting collection information by connecting to other institutions, and making data more usable for developers.
3) Challenges include mastering ontologies, data inconsistencies, maintaining accuracy of tools, and understanding implications of different data models.
4) The AAC is developing best practices guides, apps, and open source tools from their experience implementing an LOD initiative over
Biomedical Research as an Open Digital Enterprise (Philip Bourne)
The document discusses the challenges and opportunities facing biomedical research as it transitions to becoming a fully digital open enterprise. It notes issues around reproducibility, limited funding, and the need to better connect different elements of the research lifecycle like data capture, analysis, and publication. The author proposes the "Commons" as a conceptual framework to help address these issues by providing shared resources like cloud-based storage and computing, tools to discover and access data and software, and standards to improve reproducibility. The goal is to foster an ecosystem that maximizes the benefits of digital technologies for biomedical research.
Broad introduction to information retrieval and web search, used for teaching at the Yahoo Bangalore Summer School 2013. Slides are a mash-up from my own and other people's presentations.
This 2-hour lecture was held at Amsterdam University of Applied Sciences (HvA) on October 16th, 2013. It presents a basic overview of core technologies used by ICT companies such as Google, Twitter or Facebook. The lecture does not require a strong technical background and stays at a conceptual level.
This document discusses digital discoverability strategies for performing arts organizations. It defines discoverability as the ability for something to be discovered online through search and recommendation engines. It outlines various methods of online discovery like advertising, publicity, niche communities, social networks and search/recommendation engines. It provides best practices for search engine optimization, including strategic language use, backlinking, semantic optimization, localization and structured data. It also discusses the benefits of using linked open data sources like Wikidata to enrich arts discoverability.
Card Sorting Your Way to Meaningful MetadataRob Bogue
Card sorting is one of the most powerful techniques for improving the information architecture and taxonomy that you create. In this session we'll put card sorting in context and show you how to use them to create meaningful metadata.
You can download this presentation now by visiting https://www.thorprojects.com/connect/gifts/presentations/card-sorting-your-way-to-meaningful-metadata.
This document discusses information architecture and taxonomy design. It begins by outlining the problem of information overload. It then defines information architecture as creating a structure and tools to efficiently store, retrieve, and manage information. Key aspects of information architecture discussed include people, processes, technology, collaboration, taxonomy, orientation, and perspective. The document provides examples of taxonomy types and approaches. It emphasizes the importance of prototyping, reducing cognitive load, and first designing a taxonomy before creating navigation. Methods like content inventory, metadata, card sorting, and taxonomy validation are presented for building an effective information architecture.
Slides for the iDB summer school (Sapporo, Japan) http://db-event.jpn.org/idb2013/
Typically, Web mining approaches have focused on enhancing or learning about user seeking behavior, from query log analysis and click through usage, employing the web graph structure for ranking to detecting spam or web page duplicates. Lately, there's a trend on mining web content semantics and dynamics in order to enhance search capabilities by either providing direct answers to users or allowing for advanced interfaces or capabilities. In this tutorial we will look into different ways of mining textual information from Web archives, with a particular focus on how to extract and disambiguate entities, and how to put them in use in various search scenarios. Further, we will discuss how web dynamics affects information access and how to exploit them in a search context.
This document summarizes a panel discussion on technology skills needed for 21st century law librarians. The panelists discussed skills such as coding, social media use, customizing library websites, creating digital repositories, mobile access, and keeping current on new technologies. They emphasized the importance of collaboration between librarians and IT staff. Panelists also noted that librarians need strong research skills in addition to technical skills, and that willingness to learn and adapt is key for career development in this changing environment.
This document discusses key factors to consider when evaluating a search engine, including:
1) Understanding the type of search engine (e.g. free text, directory, meta search) and its search functionality/operators.
2) Benchmarking a search engine by running sample searches and comparing results to preferred engines.
3) Analyzing how search results are ranked and algorithms are evaluated/updated.
4) Noting difficulties in evaluating search results due to ambiguity in search intents.
This document provides information about internet use and finding information online. It discusses the growth of internet hosts from 1977 to 2022. It also summarizes different ways to search for information online including search engines, subject directories, the invisible web, meta-search engines, and specialized search engines. The document gives examples of specific search tools and services and provides tips on how to effectively search for information on the internet through simple and advanced search techniques.
Information Discovery and Search Strategies for Evidence-Based ResearchDavid Nzoputa Ofili
This event was on May 2, 2017 at Wesley University, Ondo State, Nigeria. I trained the university's staff (academic and non-academic) on "Information Discovery and Search Strategies for Evidence-Based Research" in an information/digital literacy session.
Marjorie M.K. Hlava, President and founder of Access Innovations, Inc. and the Data Harmony suite of indexing software, gives the Miles Conrad Memorial Lecture at the 2014 Annual Conference for the National Federation of Advanced Information Services (NFAIS).
The Miles Conrad Award and accompanying lecture was established in 1965 in commemoration of NFAIS founder, G. Miles Conrad. Hlava earned the Miles Conrad Award this January for her past and continuing services to NFAIS and the Information and Knowledge Management industries.
1. SharePoint 2010 introduces a new Managed Metadata Service that allows for centralized storage and management of terms across sites and site collections. This provides a consistent way to organize content.
2. The Managed Metadata Service supports both taxonomies for structured terms as well as folksonomies for user-generated keywords and tags. It integrates with other features like Business Connectivity Services.
3. While powerful, the Managed Metadata Service requires planning to set up terms and administer the term store. Considerations include importing structures metadata, separating terms with commas, and preventing misspellings.
The document provides an overview of research data management and the importance of avoiding a "DATApocalypse" or data disaster. It discusses the definition of research data, why data management is important, questions to consider, best practices for data management planning, documentation, and long-term preservation. The goal is to help researchers and institutions properly manage data to enable sharing and preservation, as required by most major funders.
Trendspotting: Helping you make sense of large information sourcesMarieke Guy
This document provides an overview of a presentation on trendspotting and making sense of large information sources. The presentation introduces qualitative data analysis and thematic coding. It discusses collecting and organizing qualitative data, identifying themes and patterns through coding, and presenting findings through reports, visualizations and infographics. Practical exercises are included to have participants analyze text data by identifying codes and themes in small groups. Resources on qualitative analysis techniques are also provided.
How to crack Big Data and Data Science rolesUpXAcademy
How to crack Big Data and Data Science roles is the flagship event of UpX Academy. This slide was used for the event on 10th Sept that was attended by hundreds of participants globally.
This document provides an introduction to data science, including definitions of data science, its impact and importance. It discusses how data science affects organizations and provides competitive advantages. Examples of data science applications are given across various domains like banking, healthcare, transportation and more. The document also outlines the road to becoming a data scientist and what skills are required, such as learning to code, mathematics, machine learning techniques and software engineering. In summary, data science uses scientific methods to extract knowledge and insights from data, it benefits society in areas like healthcare, transportation and environment, and becoming a data scientist requires strong coding and analytical skills.
The document provides an overview of data science, big data, data mining, and data mining techniques. It defines data science as a multi-disciplinary field that uses scientific methods to extract knowledge from structured and unstructured data. Big data is described as large, diverse datasets that are too large for traditional databases to handle. Common data mining tasks like prediction, classification, clustering and association rule mining are summarized. Finally, specific techniques like decision trees, k-means clustering, and association rule mining are overviewed.
This presentation was provided by Racquel Jemison, Ph.D., Christina MacLaughlin, Ph.D., and Paulomi Majumder. Ph.D., all of the American Chemical Society, for the second session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session Two: 'Expanding Pathways to Publishing Careers,' was held June 13, 2024.
This presentation was provided by Rebecca Benner, Ph.D., of the American Society of Anesthesiologists, for the second session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session Two: 'Expanding Pathways to Publishing Careers,' was held June 13, 2024.
This presentation was provided by Steph Pollock of The American Psychological Association’s Journals Program, and Damita Snow, of The American Society of Civil Engineers (ASCE), for the initial session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session One: 'Setting Expectations: a DEIA Primer,' was held June 6, 2024.
This presentation was provided by William Mattingly of the Smithsonian Institution, during the closing segment of the NISO training series "AI & Prompt Design." Session Eight: Limitations and Potential Solutions, was held on May 23, 2024.
This presentation was provided by William Mattingly of the Smithsonian Institution, during the seventh segment of the NISO training series "AI & Prompt Design." Session 7: Open Source Language Models, was held on May 16, 2024.
This presentation was provided by William Mattingly of the Smithsonian Institution, during the sixth segment of the NISO training series "AI & Prompt Design." Session Six: Text Classification with LLMs, was held on May 9, 2024.
This presentation was provided by William Mattingly of the Smithsonian Institution, during the fifth segment of the NISO training series "AI & Prompt Design." Session Five: Named Entity Recognition with LLMs, was held on May 2, 2024.
This presentation was provided by William Mattingly of the Smithsonian Institution, during the fourth segment of the NISO training series "AI & Prompt Design." Session Four: Structured Data and Assistants, was held on April 25, 2024.
This presentation was provided by William Mattingly of the Smithsonian Institution, during the third segment of the NISO training series "AI & Prompt Design." Session Three: Beginning Conversations, was held on April 18, 2024.
This presentation was provided by Kaveh Bazargan of River Valley Technologies, during the NISO webinar "Sustainability in Publishing." The event was held April 17, 2024.
This presentation was provided by Dana Compton of the American Society of Civil Engineers (ASCE), during the NISO webinar "Sustainability in Publishing." The event was held April 17, 2024.
This presentation was provided by William Mattingly of the Smithsonian Institution, during the second segment of the NISO training series "AI & Prompt Design." Session Two: Large Language Models, was held on April 11, 2024.
This presentation was provided by Teresa Hazen of the University of Arizona, Geoff Morse of Northwestern University. and Ken Varnum of the University of Michigan, during the Spring ODI Conformance Statement Workshop for Libraries. This event was held on April 9, 2024
This presentation was provided by William Mattingly of the Smithsonian Institution, during the opening segment of the NISO training series "AI & Prompt Design." Session One: Introduction to Machine Learning, was held on April 4, 2024.
This presentation was provided by William Mattingly of the Smithsonian Institution, for the eight and final session of NISO's 2023 Training Series on Text and Data Mining. Session eight, "Building Data Driven Applications" was held on Thursday, December 7, 2023.
This presentation was provided by William Mattingly of the Smithsonian Institution, for the seventh session of NISO's 2023 Training Series on Text and Data Mining. Session seven, "Vector Databases and Semantic Searching" was held on Thursday, November 30, 2023.
This presentation was provided by William Mattingly of the Smithsonian Institution, for the sixth session of NISO's 2023 Training Series on Text and Data Mining. Session six, "Text Mining Techniques" was held on Thursday, November 16, 2023.
This presentation was provided by William Mattingly of the Smithsonian Institution, for the fifth session of NISO's 2023 Training Series on Text and Data Mining. Session five, "Text Processing for Library Data" was held on Thursday, November 9, 2023.
This presentation was provided by Todd Carpenter, Executive Director, during the NISO webinar on "Strategic Planning." The event was held virtually on November 8, 2023.
2. About me
• Editor-in-Chief, Online Searcher
• Successor title to ONLINE, which merged with Searcher in 2013, published by
Information Today, Inc. (infotoday.com/onlinesearcher)
• Write Dollar Sign column about business resources
• Conference Program Director
• Internet Librarian International (internet-librarian.com)
• Enterprise Search & Discovery Summit (enterprisesearchanddiscovery.com)
• Data Summit (dbta.com/datasummit)
• Contributor to WebSearch University, Computers in Libraries,
Internet Librarian, other library/technology conferences
3. Sophisticated user
• A sophisticated user of what?
• Library services
• Library collections
• Digital information
• Sophisticated searcher
• Super searcher
• How much sophistication will be needed going forward?
4. What makes a super searcher?
• Lessons from Super Searcher books
• Joy of the chase
• Intrigued by intricacies of search strategizing
• Understands information architecture
• Willing to try almost anything to get to the answer
• Enjoys finding needles in haystacks
• Tenacious and curious
• Likes language
5. Brief history of search
• Designed for info pros, intermediary searchers
• Extreme Boolean
• Structured data
• Textual
• Expensive
• Some of that search legacy still exists
6. Search today from users’ perspective
• Search = Google
• Google Scholar can replace library subscription databases
• Research requires more than Google
• Googley expectations
• Simplify, simplify, simplify
• TL;DR
• Google does not do Boolean
7. Search strategies
• Sophisticated searchers still use Boolean to good effect in searching
bibliographic databases
• Strategies depend on intent
• A few good articles for an undergraduate paper
• A business decision for an MBA student
• A search to determine patentability
• A systematic review
• Examples from business, patent, medical
8. Business
• ((pet OR dog OR cat OR fish) ADJ food) AND (market ADJ3 (share OR
size OR trends)) AND (us OR united states OR canada OR britain OR uk
OR united kingdom)
• Factiva syntax
• ((pet OR dog OR cat OR fish) PRE/2 food) AND (market NEAR/3 (share
OR size OR penetration)) AND (us OR united states OR uk OR united
kingdom OR britain)
• ProQuest syntax
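The ADJ and NEAR/n operators in the queries above can be illustrated with a toy evaluator. This is a minimal sketch of the proximity idea only, not Factiva or ProQuest internals; the function name and sample document are invented for illustration:

```python
# Toy illustration of proximity operators like Factiva's ADJ3 or
# ProQuest's NEAR/3: test whether two terms occur within n words
# of each other in a tokenized document. Not vendor code.

def near(tokens, term_a, term_b, n):
    """True if term_a and term_b appear within n words of each other."""
    pos_a = [i for i, t in enumerate(tokens) if t == term_a]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)

doc = "the pet food market share grew in the united states".split()
print(near(doc, "market", "share", 3))  # True: one word apart
print(near(doc, "pet", "states", 3))    # False: eight words apart
```

A real platform evaluates this against a positional inverted index rather than scanning tokens, but the contract is the same: proximity is a constraint on word positions, which is exactly what plain web search engines no longer expose.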
9. Patent
• Patbase search (courtesy Tom Wolff, “Mistakes Happen: A Patentability
Case Study”, Online Searcher, Nov/Dec 2019) for trigger-activated
animal nail clippers
• tac=((trigger* or activat* or actuat* or releas*) w5 (clipper* or
trimmer* or cutter* or nipper*))
• tac=((nail* or claw*) w5 (clip* or trim* or cut* or nip*))
• sc=A45D29/02 – CPC/IPC patent class on Nail clippers or cutters
• uc=30/28 – US patent class on Manicure… nippers
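The truncation wildcards in the Patbase query (clip*, trim*, and so on) behave like stem matches. A rough sketch of the idea, assuming nothing about Patbase's actual implementation, is a conversion to a word-boundary regular expression:

```python
import re

def wildcard_to_regex(pattern):
    """Convert a truncation pattern like 'clip*' to a regex matching
    whole words that share the stem (illustration only)."""
    return re.compile(r"\b" + re.escape(pattern.rstrip("*")) + r"\w*\b")

rx = wildcard_to_regex("clip*")
print(bool(rx.search("nail clippers")))   # True: clippers shares the stem
print(bool(rx.search("nail trimmers")))   # False: no clip- stem
```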
10. Medical
• Metformin AND (“adverse drug reaction” OR “drug overdose” OR
“drug misuse” OR “drug abuse” OR “substance abuse” OR pregnancy
OR “drug efficacy” OR “drug withdrawal” OR “drug tolerance” OR
“medication error” OR death OR “drug interaction” OR
carcinogenicity OR “off label drug use” OR “occupational exposure”
OR toxicity OR intoxication OR “drug contraindication” OR “congenital
disorder” OR “drug treatment failure” OR lactation OR “case report”
OR “environmental exposure” OR “treatment
contraindication”)
• PubMed syntax
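Long OR lists like the Metformin query are usually generated from a maintained term list rather than typed by hand. A small helper, with an invented name and no connection to PubMed's own tooling, might quote the multiword phrases and join the rest:

```python
def or_clause(terms):
    """Join terms with OR, quoting multiword phrases as in the
    PubMed-style query above (helper name is invented)."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

terms = ["adverse drug reaction", "drug overdose", "pregnancy", "toxicity"]
query = "Metformin AND " + or_clause(terms)
print(query)
# Metformin AND ("adverse drug reaction" OR "drug overdose" OR pregnancy OR toxicity)
```

Generating the clause keeps a systematic-review hedge string consistent across searches, which is part of what separates the sophisticated searcher from the one-box Googler.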
11. Unsophisticated searcher
• Pet food
• Clippers
• Metformin
• Assumption is that Google (or another web search engine) will intuit
what the searcher wants to know
• Single interface to all knowledge
12. Web search is different
• Boolean doesn’t really work
• Long search queries with Boolean logic, nested terms, bound phrases
don’t work
• The NOT command is problematic
• Proximity operators are close to non-existent except for exact phrase
• Relevancy is determined by machine learning
• The interface is increasingly voice
• This sets expectations for library database platforms
13. Who is searching
• Everybody
• Do they want to learn how to search?
• No, they just want to search
• And they all think they are expert, sophisticated searchers
• Browsing versus searching
• Web search is grounded in shopping, transactional, quick answers
16. What students want
• Fast response time, convenience
• Relevant results
• Complete answers
• Unshelved.com – July 29, 2010 – Complicated question
• Visualization – show me
• Analytics – explain what it means
17. Disambiguating
• Civil War
• Whose Civil War?
• What countries?
• When?
• Did they even call it a Civil War?
• Could new technology answer these questions without a librarian
doing a reference interview?
• Geolocation, knowing which courses the student is enrolled in
• Previous search history, personalization
18. Evolution of search technology
• From text to multimedia
• From command language to AI-driven
• Graph databases, Predictive analytics, Natural Language Processing, Semantic
search
• Machine learning
• Thesauri, Metadata, Controlled vocabulary
• Big data
• What happens when everything is digitized and machines can read all of it?
19. Multimedia
• Default for web search
• Art students want images
• Search by color, technique?
• Music students want sound
• Search by tune?
• Theater students want actual performances
• Search by stage, by costume?
• We’re not quite there yet
20. AI technologies
• Knowledge graphs – semantic technology, network of things we want
to describe and how they are related – used extensively by web
search engines
• Predictive analytics – text analytics – analyzes documents to
determine what actions to take, identify outliers
• Natural Language Processing – disambiguating language in full text
search
• Semantic search – context not just content
• Machine learning – Indicator of relevance based on prior search
behaviors
22. Big data
• Pattern matching in millions of documents
• Unstructured information
• Overwhelming amount of available information
• Legal contracts
• Digital humanities
• Predicting recidivism
• Possibility of bias
23. Search platforms
• Content dictates information architecture
• One search box to rule them all won’t happen
• Intent remains critical component
• What is intuitive to you may not be intuitive to me
• Relevancy is in the eye of the beholder
• Digital transformation is ongoing
24. Sophisticated user
• How much sophistication will be needed going forward?
• When will search strings become obsolete?
• Will search results be totally non-textual?
• Can computers take over research?
• Who checks for bias?
• Information professionals must be flexible and willing to unlearn
techniques they swore by in the past