Presented by Bob Kasenchak of Access Innovations, Inc. at the 2014 Special Libraries Association (SLA) annual meeting in Vancouver, British Columbia on June 7, 2014.
An all-day version of Access Innovations' Taxonomy Fundamentals workshop, presented by Marjorie M.K. Hlava and Bob Kasenchak at the 2014 Special Libraries Association (SLA) annual meeting in Vancouver, British Columbia on June 7, 2014.
Opening presentation for Track 1 of the 2012 Taxonomy Boot Camp, October 16, 2012.
Presented by Marjorie M.K. Hlava of Access Innovations and Heather Hedden of Hedden Information Management.
On the uses and implementation of taxonomy on the Web, with a particular focus on the taxonomy as part of an enterprise information environment. Presented by Marjorie M.K. Hlava during Content Week 2005 in Miami, Florida.
Presentation given on March 12, 2013 by Marjorie M.K. Hlava of Access Innovations, Inc. as a webinar for the San Francisco chapter of the Special Libraries Association.
How to make your content users more productive using Access Innovations, Inc.'s Navtree and Machine Aided Indexer (M.A.I.™), parts of the Data Harmony® software suite.
Semantic search helps business people find answers to pressing questions by wading through oceans of information to find nuggets of meaningful information. In this presentation we’ll discuss how semantic search and content analysis technologies are starting to appear in the marketplace today. We’ll provide a recap of what semantic search is and what the key benefits are, then we’ll answer the following questions:
• Is semantic search a feature, an application, or enterprise system?
• How can I add semantic search to my existing work processes?
• Will I need to replace my existing content technologies?
• What will I need to do to prepare my content for semantic search?
• Is semantic search just for documents or can I search my data too?
• Can I use semantic search to find information on the internet and other public data sources?
• Are there standards to consider?
Improve your Searches, Get Trained up on Expernova!
Access the Best Experts Worldwide and Manage your Company's Networks thanks to Expernova.
Discover in this presentation helpful tips and examples on how to carry out more complex searches using the operators available with the solution.
Obtain even more relevant results!
Open science can contribute to AI trustworthiness. This talk is a categorization of scientific data platforms, and a framing of AI trustworthiness with pointers to open science contributions.
Should We Expect a Bang or a Whimper? Will Linked Data Revolutionize Scholar Authoring and Workflow Tools?
Jeff Baer, Senior Director of Product Management, Research Development Services, ProQuest
Information Extraction and Linked Data Cloud
Dhaval Thakker
In the media industry there is great emphasis on providing descriptive metadata to consumers as part of media assets. Information extraction (IE) is considered an important tool in the metadata generation process, and its performance largely depends on the knowledge base it utilizes. Advances in “Linked Data Cloud” research provide a great opportunity to generate such a knowledge base, one that benefits from the participation of a wider community. In this talk, I will discuss our experiences using the Linked Data Cloud in conjunction with a GATE-based IE system.
Funding For Research!
Carol Anne Meyer, @meyercarol, who is responsible for Business Development and Marketing at CrossRef, describes CrossRef's FundRef funder identification service, which correlates funding organizations with the scholarly articles and other documents that result from their research expenditures. The FundRef taxonomy allows researchers to choose from a controlled vocabulary of thousands of funder names when they submit papers for publication. FundRef Search and other tools help funders demonstrate and measure the impact of their activities. CrossRef member publishers participating in FundRef will be able to serve the author/researcher community by helping them meet their funder compliance and reporting requirements and by displaying funding information through the CrossMark service. Carol will also introduce CrossRef services that allow researchers and publishers to reduce the time and effort necessary to arrange permissions for text and data mining. She will also explain the relationship between these services and initiatives to increase public access to scholarly content.
The presentation discusses the following topics:
- What Is ORCID?
- Why Is ORCID Important?
- ORCID Features
- Create an ORCID Account
- ORCID Researcher Profile
Cape Town - Bioschemas workshop before the Bioinformatics Education Summit.
Explains schema.org, Bioschemas, TeSS Case study, and the tools and implementation techniques adopters can use
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Spark Summit
Elasticsearch provides native integration with Apache Spark through ES-Hadoop. However, especially during development, it is at best cumbersome to have Elasticsearch running in a separate machine/instance. Leveraging Spark Cluster with Elasticsearch Inside it is possible to run an embedded instance of Elasticsearch in the driver node of a Spark Cluster. This opens up new opportunities to develop cutting-edge applications. One such application is Dataset Search.
Oscar will give a demo of a Dataset Search Engine built on Spark Cluster with Elasticsearch Inside. Motivation is that once Elasticsearch is running on Spark it becomes possible and interesting to have the Elasticsearch in-memory instance join an (existing) Elasticsearch cluster. And this in turn enables indexing of Datasets that are processed as part of Data Pipelines running on Spark. Dataset Search and Data Management are R&D topics that should be of interest to Spark Summit East attendees who are looking for a way to organize their Data Lake and make it searchable.
Life Science Database Cross Search and Metadata
Maori Ito
Life science databases are sometimes difficult to understand due to lack of information. I'd like to add metadata into databases and improve search results.
WOTS2E: A Search Engine for a Semantic Web of Things
Andreas Kamilaris
A Semantic Web of Things (SWoT) brings together the Semantic Web and the Web of Things (WoT), associating semantically annotated information with web-enabled physical devices, services, and their data, towards seamless data integration and better understanding of real-world information. A missing element needed to realize the SWoT is a standardized, scalable, and flexible way to globally discover web-connected embedded devices, as well as their semantic data, in (near) real time. To address this gap, we propose the WoT Semantic Search Engine (WOTS2E), a search engine for the SWoT based on web crawling that is able to discover Linked Data endpoints and, through them, WoT-enabled devices and their services. In this presentation, we describe the design, development, and implementation of WOTS2E, as well as an evaluation showing its operation and performance across the web.
Lesson 7 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/educaiton-modules. Released under a CC0 license; attribution and citation requested.
Royal Society of Chemistry activities to develop a data repository for chemis...
Ken Karapetyan
The Royal Society of Chemistry publishes many thousands of articles per year, the majority of them containing rich chemistry data that, in general, is limited in its value when isolated in the HTML or PDF form of the articles commonly consumed by readers. RSC also has an archive of over 300,000 articles containing rich chemistry data, especially in the form of chemicals, reactions, property data, and analytical spectra. RSC is developing a platform integrating these various forms of chemistry data. The data will be aggregated both during the manuscript deposition process and as the result of text mining and extraction of data from across the RSC archive. This presentation will report on the development of the platform, including our success in extracting compounds, reactions, and spectral data from articles. We will also discuss our developing process for handling data at manuscript deposition and the integration and support of electronic lab notebooks (ELNs) in terms of facilitating data deposition and sourcing data. Each of these processes is intended to ensure long-term access to research data, with the intention of facilitating improved discovery.
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
Juliet Hougland
Users are constantly searching for new content and to stay competitive organizations must act immediately based on up-to-date data. Outdated recommendations decrease the likelihood of presenting the right offer and make it harder to maintain customer loyalty. In order to provide the most relevant recommendations and increase engagement, organizations must track customer interactions and re-score recommendations on the fly.
Data sources have expanded dramatically to include a wealth of historical data and a constant influx of behavior data. The key to moving from predictive models, applied in batch, to models that provide responses in real time, is to focus on the efficiency of model application. The speed that recommendations can be served is influenced by:
Architecture of the recommendation serving platform
Choice of recommendation algorithm
Datastore access patterns
In this presentation, we’ll discuss how developers can use open source components like HBase and Kiji to develop low-latency recommendation models that can be easily deployed by e-commerce companies. We will give practical advice on how to choose models and design data stores that make use of the architecture and quickly serve new recommendations.
The Research Data Alliance (RDA) has developed a Catalogue of Metadata Standards and tools aimed at researchers and those who support them. In its new version, the Metadata Standards Catalog will provide much greater detail about metadata standards and tools, and, through its new API, it will be usable within other applications. It will also provide a platform for furthering the work of the RDA Metadata Interest Group, which is seeking to improve the interoperability of metadata in different standards by working towards semi-automatically generated converters.
Data science is a multidisciplinary field that combines scientific methods, algorithms, and systems to extract knowledge and insights from structured and unstructured data. It involves analyzing, interpreting, and deriving actionable information from large and complex datasets to support decision-making and solve problems in various domains.
Key components of data science include:
Data Collection and Preparation: Data scientists gather and collect data from various sources, which may include databases, websites, sensors, social media, or other digital platforms. They clean, transform, and preprocess the data to ensure its quality and suitability for analysis.
Data Exploration and Visualization: Data scientists explore and visualize the data using statistical techniques and visualization tools. They look for patterns, trends, and relationships within the data to gain a deeper understanding of the underlying insights and potential correlations.
Machine Learning and Predictive Modeling: Data scientists apply machine learning algorithms and predictive modeling techniques to build models that can make predictions or classifications based on the available data. This involves training models on historical data and evaluating their performance on new or unseen data.
Statistical Analysis: Statistical analysis is a fundamental aspect of data science. Data scientists use statistical methods to analyze data, test hypotheses, identify significant variables, and quantify uncertainties to make informed decisions.
Data Interpretation and Communication: Data scientists interpret the results of their analysis and communicate their findings to stakeholders in a clear and meaningful way. They use data visualization techniques, storytelling, and data-driven insights to convey complex information and facilitate decision-making.
Domain Knowledge: Data scientists often work in specific domains or industries and require domain knowledge to understand the context and interpret the results effectively. This allows them to identify relevant variables, apply appropriate techniques, and generate actionable insights.
Data science has applications across various sectors, including finance, healthcare, marketing, retail, telecommunications, and more. It helps organizations gain a competitive advantage, optimize processes, identify trends, improve customer experiences, and drive data-informed decision-making.
To work in data science, proficiency in programming languages (such as Python or R), statistical knowledge, data manipulation skills, and experience with machine learning algorithms are typically required. Data scientists also need critical thinking, problem-solving abilities, and effective communication skills to effectively analyze data and communicate insights to both technical and non-technical stakeholders.
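As a toy illustration of the workflow described above, the sketch below walks through preparation, exploration, a trivial predictive "model", and evaluation using only the Python standard library. All data and names are invented; a real project would reach for pandas and scikit-learn.

```python
# Toy end-to-end data science workflow using only the standard library.
from statistics import mean

# 1. Collection and preparation: drop records with missing values.
raw = [(1.2, 0), (3.8, 1), (None, 1), (0.9, 0), (4.1, 1), (1.1, 0)]
clean = [(x, y) for x, y in raw if x is not None]

# 2. Exploration: compare the mean feature value per class.
mean_pos = mean(x for x, y in clean if y == 1)
mean_neg = mean(x for x, y in clean if y == 0)

# 3. "Model": classify by which class mean the value is closer to,
#    i.e. a single threshold halfway between the two means.
threshold = (mean_pos + mean_neg) / 2

def predict(x):
    return 1 if x > threshold else 0

# 4. Evaluation: accuracy on the (tiny) training set.
accuracy = sum(predict(x) == y for x, y in clean) / len(clean)
print(accuracy)
```

The same four steps scale up to any of the components listed above; only the tooling changes.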
Have you ever wondered how search works while visiting an e-commerce site, internal website, or searching through other types of online resources? Look no further than this informative session on the ways that taxonomies help end-users navigate the internet! Hear from taxonomists and other information professionals who have first-hand experience creating and working with taxonomies that aid in navigation, search, and discovery across a range of disciplines.
Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
Access Innovations, Inc.
In today's highly charged atmosphere of anxiety and anticipation about AI, and especially LLMs, one of the biggest concerns is how to ensure that these systems return accurate results (meaning both true and pertinent to their audience). This is particularly important to scholarly, scientific, and other technical organizations, whose constituents often work in very specific domains, such as medicine, engineering, history, biology, or chemistry. One extremely useful tool to incorporate in an AI-based process in such cases is a comprehensive and well-structured knowledge domain based on a controlled vocabulary.
Smart Submit and Client Support
Michael Millar, Junior Software Developer, and Frank Coates, Client Support Manager
Get a peek at the new and improved Smart Submit and learn about new, easier ways to contact the support team at Access Innovations.
How a Good Taxonomy Can Provide Valuable Business Insights
Kristen Monahan, Public Library of Science (PLOS)
Kristen is a business analyst, and she won't be talking about the PLOS taxonomy itself but rather how she uses that taxonomy to drill down into the massive amount of content, metadata, and usage and process data that PLOS generates, for deep, detailed analysis and to drive business decisions. Much of this work involves trend analysis. For example, trend analysis of submissions can look at the time it takes from submission to decision by subject (narrow subjects like Covid, broad subjects like biotechnology), by institution, by country, etc., to see not just the overall big picture but where in their submission and peer review workflows the bottlenecks might be. A trend analysis of topics over time can prompt them to issue a call for papers for a topic they think needs to be better covered, and then to look at both the short-term and long-term trends resulting from that call for papers. Their taxonomy doesn't just make their content smarter; it makes how they publish that content smarter too.
Editor and Peer Reviewer Assignments Using Data Harmony
Andrew Smeall, Hindawi Publishing
Andrew will show how Hindawi, an open access publisher, applies their taxonomy to make editor and reviewer assignments for incoming submissions to their journals.
Cloud Deployment of Data Harmony
Jeffrey Gordon, Lead Developer, Access Innovations, Inc.
Jeffrey will describe the cloud deployment of the Data Harmony software.
Marjorie M. K. Hlava, President, Chair of the Board, and Chief Scientist, Access Innovations, Inc.
During this annual highlight of the DHUG meetings, Margie will discuss the exciting new changes and additions to the Data Harmony software. She will be joined by some members of our software development team to talk about specific initiatives we have worked on over the past year.
Access Innovations and Atypon: Beyond Content Tagging
Hong Zhou and Gerasimos Razis, Atypon
Gerasimos and Hong will discuss the changes to the Atypon platform since DHUG 2020.
Getting to the Point: Using AI and Taxonomies to Craft Meta-Titles
Travis Hicks, American Society of Clinical Oncology (ASCO)
Looking to better leverage SEO and include key terms in the URL construct for research abstracts, ASCO is working with Access Innovations to evaluate how to programmatically create short titles for abstracts. The idea is to index titles against existing taxonomies as a way of producing a short title that succinctly identifies what an abstract is about, for purposes of constructing a new URL configuration. Travis will discuss the need, challenges, and early results of the project.
Expanding the Use of MAIstro at ASCE
Xi Van Fleet, American Society of Civil Engineers
Using MAIstro, ASCE created the subject/topic taxonomies for their publications to enhance content discovery and business insight. After achieving their primary goal, they have been expanding its use for other applications.
Lessons Learned From Building a Taxonomy and Indexing 140 Years of Content
Michael Darr, Project Manager, D33 – American Chemical Society Pubs IT
Michael will talk about the things they would do differently if they were to build a new taxonomy and index a legacy file, and the things they did right the first time.
Bill’s talk is entitled “WHAT’S IN A NAME? How Kew helps drug regulators disambiguate the messy welter of medicinal plant names to shore up regulation and save lives”. It’s really eye-opening to realize how complicated and imprecise names can get, with multiple scientific, pharmaceutical and popular names for the same thing or with one name used for completely different things.
This has real-world consequences. For example, the EU mistakenly banned a useful plant we use every day when intending to ban a poisonous one because of a naming problem. How Kew is using semantic and taxonomic tools and technologies to bring order to this complexity (I almost said chaos) is really fascinating. They’re also helping to disambiguate nomenclature and provide links to authoritative information for botanical terms for use in journal articles, among other things.
3. OUTLINE
• Data
• Structured Data
• Unstructured Data
• Metadata
• Subject Metadata
• Entity (author, institution) Metadata
• Document Type Metadata
• Automating Metadata
• Heuristic/Statistical/Inferential
• Rule-based
I Don’t Have Time for Metadata!
5. STRUCTURED VS. UNSTRUCTURED DATA
Present different problems – and possible solutions – for automatically adding metadata
6. STRUCTURED VS. UNSTRUCTURED DATA
Association, in view of abuses and lack of consistency in published reports, has asserted that the all-inclusive income statement, containing all income items recognized as determinants of net income, is the answer to these questions.2 The Securities and Exchange Commission has also strongly favored this solution.3
1 Committee on Accounting Procedure, American Institute of Accountants, "Income and Earned Surplus," Accounting Research Bulletin No. 32 (December, 1947). 2 (1) "A Tentative Statement of Accounting Principles Affecting Corporate Reports," THE ACCOUNTING REVIEW, June, 1936, pp. 187-191; (2) Accounting
7. STRUCTURED VS. UNSTRUCTURED DATA
<volume>325</volume>
<issue>5945</issue>
<fpage seq="c">1206</fpage>
<lpage>1206</lpage>
<history><date date-type="received"><day>26</day><month>02</month><year>2009</year></date>
<date date-type="accepted"><day>11</day><month>08</month><year>2009</year></date></history>
<permissions>
<copyright-statement>Copyright © 2009</copyright-statement>
<copyright-year>2009</copyright-year>
<copyright-holder>Your name here</copyright-holder>
</permissions>
<abstract>
<p>Our extended ontogenetic growth model is a theoretical model based on conservation
of energy and general biological mechanisms underlying ontogenetic growth. We do not
believe that the comments of Makarieva <italic>et al</italic>. and Sousa <italic>et al
</italic>. expose substantive problems with our model. Nevertheless, they raise
interesting, still unresolved questions and point to philosophical differences about the role
of theory and of simple, general models as opposed to complicated, specific models.</p>
</abstract>
8. STRUCTURED VS. UNSTRUCTURED DATA
• Just extracting basic information
• Author
• Institution
• Title
• Document type
• Accession number(s)
…can be a challenge.
However…
9. STRUCTURED VS. UNSTRUCTURED DATA
• Predictability
• Positionality
[Screenshot: a journal article page with labeled regions: journal name/issue/vol./etc., article title, copyright info, author info, abstract]
10. UNSTRUCTURED DATA => STRUCTURED DATA!
<journal>Transactions on Vehicular Technology</journal>
<article-title>Relationship of Average Transmitted and Received Energies in Adaptive Transmission</article-title>
<authors><author-surname>Kotelba</author-surname><author-firstname>Adrian</author-firstname><affiliation>Member, IEEE</affiliation></authors>
<copyright-info><copyright-date>2009</copyright-date></copyright-info>
<abstract><p>This paper studies the…</p></abstract>
NOTE: Some cleanup may be required
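The unstructured-to-structured step on the slide above can be sketched roughly as follows. The patterns and field names here are illustrative only, and, as the slide notes, real extractions usually need cleanup.

```python
# Minimal sketch: turn plain-text article front matter into a
# structured metadata record. The layout assumption (journal on the
# first line, title on the second, author on the third) is invented
# for illustration; production parsers are far more robust.
import re

text = """Transactions on Vehicular Technology
Relationship of Average Transmitted and Received Energies in Adaptive Transmission
Adrian Kotelba, Member, IEEE
Copyright 2009"""

lines = text.splitlines()
record = {
    "journal": lines[0],
    "article-title": lines[1],
    "author": lines[2].split(",")[0],          # drop the affiliation
    "copyright-date": re.search(r"\b(19|20)\d{2}\b", text).group(0),
}
print(record)
```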
11. STRUCTURED VS. UNSTRUCTURED DATA
• Basic information already tagged, labeled, and easy to extract
• Author info
• Title
• Journal/Volume/Issue etc.
• We can add semantic (or subject) metadata
• Targeting only those parts of the text we require
• Title
• Abstract
• Full text body
• Exclude references, etc.
12. SEMANTIC METADATA
Uncontrolled
Automatic keyword extraction
Crowdsourced/folksonomic tags
Controlled – from a Thesaurus (or Taxonomy…)
Inferential (Heuristic; Statistical)
Rule-based
13. SEMANTIC METADATA: HOW?
Controlled – from a Thesaurus (or Taxonomy…)
Inferential (Heuristic; Statistical)
Rule-based
Manual tagging
Automatic tagging
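A toy sketch of the rule-based, automatic tagging approach listed above: controlled terms from a thesaurus are assigned when their trigger phrases appear in the text. The two-term vocabulary and its rules are invented; production rule bases are far richer.

```python
# Each controlled term is triggered by one or more phrases.
rules = {
    "Electron microscopes": ["electron microscope", "electron microscopy"],
    "Microscopes": ["microscope"],
}

def tag(text):
    """Return the set of controlled terms whose triggers match."""
    text = text.lower()
    return {term for term, triggers in rules.items()
            if any(t in text for t in triggers)}

print(tag("Imaging the sample with an electron microscope"))
```

Note that naive substring matching assigns both the broad and the narrow term here; the specification slides later in the deck address exactly this problem.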
15. SEMANTIC METADATA: MANUAL ENTRY
A Thought Experiment
• Let’s say a manual indexer can index 10 records/hour
• Let’s say the manual indexers are perfectly consistent (they’re not)
• Let’s say your manual indexers are paid $10/hour (good luck with that)
If you have 10,000 articles/pieces of content:
It would take a manual indexer 1000 hours (25 weeks) and cost $10,000
If you have 100,000 articles:
It would take a manual indexer 10,000 hours (250 weeks, or almost 5 years) and cost $100,000
If you have 1,000,000 articles:
It would take a manual indexer 100,000 hours (~48 years) and $1,000,000
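Under the thought experiment's stated assumptions (10 records per hour, $10 per hour, a 40-hour work week), the arithmetic can be checked directly:

```python
RATE = 10   # records indexed per hour
WAGE = 10   # dollars per hour
WEEK = 40   # working hours per week

for articles in (10_000, 100_000, 1_000_000):
    hours = articles / RATE
    print(f"{articles:>9} articles: {hours:,.0f} hours, "
          f"{hours / WEEK:,.0f} weeks, ${hours * WAGE:,.0f}")
```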
17. SEMANTIC METADATA: WHY?
Disambiguate the ambiguous
Specify most specific topics
Improve information retrieval
Search
Browse
Enable advanced analytics
20. SEMANTIC METADATA: SPECIFICATION
Beyond exact string matches: Context. Matters.
Indexing to most specific term
- Microscopes
- Electron microscopes
- Scanning electron microscopes
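Indexing to the most specific term can be sketched as follows, using the microscope hierarchy above. The broader-term lookup table is an invented toy structure standing in for a real thesaurus.

```python
# Child term -> its broader term (None at the top of the hierarchy).
hierarchy = {
    "Scanning electron microscopes": "Electron microscopes",
    "Electron microscopes": "Microscopes",
    "Microscopes": None,
}

def most_specific(matched):
    """Drop any matched term that is an ancestor of another match."""
    broader = set()
    for term in matched:
        parent = hierarchy[term]
        while parent:
            broader.add(parent)
            parent = hierarchy[parent]
    return matched - broader

matches = {"Microscopes", "Electron microscopes", "Scanning electron microscopes"}
print(most_specific(matches))
```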
22. SEMANTIC METADATA: WHY?
Improving information retrieval: Search
Allows user to search by tags
Ensures consistent and reliable retrieval
Speeds electronic search
24. SEMANTIC METADATA: WHY?
Improving information retrieval: Search
[Screenshot: a metadata-based search, with results based on metadata]
25. SEMANTIC METADATA: WHY?
Improving information retrieval: Browse
[Screenshot: a taxonomy browse interface, with results based on metadata]
26. SEMANTIC METADATA: WHY?
Improving information retrieval: Browse
[Screenshot: a taxonomy browse interface with additional search filters]
27. SEMANTIC METADATA: WHY?
Improving information retrieval: Analytics
Combine subject metadata with metadata about
Authors
Institutions
Publications (Journals, Magazines, etc.)
Publication Types
…to create detailed informatics about your data, users,
authors, and whatever else is relevant or useful
28. SEMANTIC METADATA: WHY?
Improving information retrieval: Analytics
[Screenshot: a taxonomy term with its broader and narrower terms, and the authors who publish on this topic]
29. I DON’T HAVE TIME FOR METADATA!
Since metadata allows you to do things you already have to, want to, and need to do:
It's always time for metadata.