The document summarizes a research project conducted by the Cataloging and Metadata Services unit at Utah State University to analyze user search behavior and the performance of MARC records in search results. The project involved analyzing web logs of searches, scraping search results pages, and coding records and fields in Airtable. Key findings included that MARC records make up around 20% of search results on average, vendor records appear more often than locally created records, and the 245 and 505 fields were most important for retrieving records while the 505, 520 and 650 fields had the greatest impact if missing from records. Guidelines for cataloging practice were proposed based on the findings.
Is what's 'trending' what's worth purchasing? | NASIG
Presenters:
Stacy Konkiel, Outreach & Engagement Manager, Altmetric
Rachel Miles, Kansas State University Libraries
Sarah Sutton, Assistant Professor in the School of Library and Information Management at Emporia State University
New forms of usage data like altmetrics are helping librarians to make smarter decisions about their collections. A recent nationwide study administered to 13,000+ librarians at R1 universities sheds light on exactly how these metrics are being applied in academia. This presentation will share survey results, including as-yet-unknown rates of technology and metrics uptake among collection development librarians, the most popular citation databases and altmetrics services being used to make decisions, and surprising factors that affect attitudes toward the use of metrics. This presentation will also offer actionable insights on how altmetrics are being paired with bibliometrics and usage statistics to form a more complete picture of “trending” scholarship that’s worth purchasing. By sharing the survey results and opening up a discussion about the potential altmetrics hold for informing collection development, the presenters aim to provide a learning opportunity that will enhance attendees' competencies for e-resource management, specifically core competencies for e-resource librarians 3.5, use of bibliometrics for collection assessment, and 3.7, identify and analyze emerging technologies.
Merging Traffic: A Combined Reference and Access Services Desk | FuWaye Bender
Presentation by Fu Zou, Melanie Church, and Tom Burns at the Chinese American Librarians Association (CALA) Midwest Chapter Annual Conference on May 21, 2011, at Indiana University Southeast (IUS) Library
Presenters:
Patricia Cleary, Global eProduct Development Manager, Springer
Kristen Garlock, ITHAKA/JSTOR
Denise D Novak, Acquisitions Librarian, Carnegie Mellon University
Ethen Pullman, Carnegie Mellon University
Academic libraries and publishers are fielding an increasing number of faculty/researcher text mining requests. This program will address these needs and offer some best practices. Specific examples from academic libraries will highlight the administrative and technical issues, while the resource provider perspective will focus on the challenges of rights management clearance and how to deliver the information, as well as the publisher philosophy on supporting digital scholarship efforts. The session will capture the issues from both sides and provide attendees with a framework for handling requests at their own institutions. In keeping with the theme "Embracing New Horizons," we will use this time to explore possibilities for better communication around digital scholarship issues, and the development of best practices, through appropriate channels.
Learn about preliminary results of research into how the Core Competencies for Electronic Resource Librarians, adopted by NASIG in July 2013, have affected the qualifications for and responsibilities of electronic resources librarians as depicted in job ads posted between 2012 and 2014.
15 Student Data Secrets that Could Change Your Library, Number 5 Will Shock You | Tiffany Garrett
For two years librarians at Nevada State College have been collecting student-level data on library resource use and matching it to student success outcomes like retention and GPA. This presentation will share what we’ve learned about collecting, storing, and securing student-level data sets.
Escape the data dungeon: Shedding light on strategies to share your findings | Kimberly Vardeman
Presentation by Kimberly Vardeman and Jessica van Haaften at Designing for Digital on March 5, 2018.
We explore how to share user research results with library colleagues, students, and faculty. Our goal is to report information in a meaningful and useful way, while providing transparency and presenting the Library positively. We want to communicate how we take action—that user feedback doesn’t disappear into a data dungeon. We offer a review of how university library websites report findings publicly. We present successes and failures of sharing our work internally and externally.
Any questions or feedback, please contact me.
Resources in uct libraries is_hons_masters_2017 | Susanne Noll
An introduction to University of Cape Town (UCT) Libraries resources, including navigating the website, understanding print and digital resources, getting to know a reference management tool, and enabling students to evaluate resources.
Crossref webinar: Anna Tolwinska - Crossref Participation Reports Metadata 09... | Crossref
Online discovery portals are providing information about your content to researchers and linking to your site via Crossref. A richer record can result in significantly more traffic from places you weren’t expecting.
Learn about where publisher metadata goes, how it is used, and the importance of depositing rich metadata in making the most of these downstream services.
Our speakers include Stephanie Dawson of ScienceOpen; Pierre Mounier of OPERAS, OpenEdition, and the HIRMEOS project; and Laura J. Wilkinson and Anna Tolwinska of Crossref.
Webinar held September 11, 2018
Academic Library Impact: Improving Practice and Essential Areas to Research | Lynn Connaway
Connaway, Lynn Silipigni, William Harvey, Vanessa Kitzie, and Stephanie Mikitish. 2017. “Academic Library Impact: Improving Practice and Essential Areas to Research.” Presented at the Update on Value of Academic Libraries Initiative (ACRL) at the ALA Annual Conference, Chicago, Illinois, June 25.
Presentation for my co-authored paper "Open University Data" at the CIIT conference in 2012. It describes the process and benefits of opening parts of the Faculty of Computer Science and Engineering data in a structured format.
How are MARC records performing in our search environment? This presentation will look at the process and results of a research project that analyzed how users’ search terms matched up with MARC fields, as well as how and where MARC records were displayed in search results lists. Presenters will discuss the process, the results of the project, and outline how attendees can implement similar research projects at their institutions, including tools and techniques they can use to analyze how their own records are surfacing in a search environment.
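The term-to-field matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' method: the flat tag-to-text dictionaries, the sample records, and the helper names `matching_fields` and `tally_fields` are all hypothetical, and real MARC records carry indicators and subfields that this sketch ignores.

```python
from collections import Counter

# Hypothetical, simplified records: MARC tag -> field text.
# Real MARC fields have indicators and subfields; the flat dict is an
# assumption made to keep the sketch short.
RECORDS = [
    {"245": "Birds of Utah : a field guide",
     "505": "Raptors -- Waterfowl -- Songbirds",
     "650": "Birds Utah Identification"},
    {"245": "Desert ecology",
     "520": "An overview of plants and birds in arid regions",
     "650": "Desert ecology"},
]


def matching_fields(query, record):
    """Return the sorted MARC tags whose text contains any query term."""
    terms = query.lower().split()
    hits = []
    for tag, text in record.items():
        # Whole-word comparison, so "birds" does not match "songbirds".
        words = set(text.lower().replace("--", " ").split())
        if any(term in words for term in terms):
            hits.append(tag)
    return sorted(hits)


def tally_fields(queries, records):
    """Count how often each MARC tag matched across all queries and records."""
    counts = Counter()
    for query in queries:
        for record in records:
            counts.update(matching_fields(query, record))
    return counts


if __name__ == "__main__":
    print(tally_fields(["birds", "desert ecology"], RECORDS))
```

A tally like this only shows which fields contain the searched words; where a record lands in a results list depends on the discovery system's ranking, which the sketch does not model.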
A Close Look at the Four Million Archival MARC Records in WorldCat | OCLC
Standards for archival description have been in place for more than thirty years, but what does actual practice look like? In this OCLC Research Library Partners Works in Progress webinar presented 3 December 2015, OCLC Research Program Officer Jackie Dooley gave an overview of her deep dive into the four million records for archival materials in WorldCat.
This is an archive of a webinar delivered on January 12, 2012. Description: If you’re really new to cataloging, this session is for you. In this 90-minute online session, facilitated by NEKLS technology librarian Heather Braum, you will:
learn the basic principles behind cataloging,
discover why librarians catalog,
learn to read a basic MARC record,
see what a good MARC record looks like,
learn basic cataloging terminology,
and practice describing different materials.
Special thanks to Robin Fay for allowing me to use a couple of the ideas shared in this webinar and presentation. See her outstanding slides: http://www.slideshare.net/robinfay/cataloging-basics-presentation.
Search is now normal behaviour: what do we do about that? November 2009 | Caroline Jarrett
An industry case study presented to OzCHI 2009: 21st Annual Conference of the Australian Computer-Human Interaction Special Interest Group (CHISIG) of the Human Factors and Ergonomics Society of Australia (HFESA), Melbourne, Australia
Search & Recommendation: Birds of a Feather? | Toine Bogers
In just a little over half a century, the field of information retrieval has experienced spectacular growth and success, with IR applications such as search engines becoming a billion-dollar industry in the past decades. Recommender systems have seen an even more meteoric rise to success with wide-scale application by companies like Amazon, Facebook, and Netflix. But are search and recommendation really two different fields of research that address different problems with different sets of algorithms in papers published at distinct conferences?
In my talk, I want to argue that search and recommendation are more similar than they have been treated in the past decade. By looking more closely at the tasks and problems that search and recommendation try to solve, at the algorithms used to solve these problems and at the way their performance is evaluated, I want to show that there is no clear black and white division between the two. Instead, search and recommendation are part of a much more fluid continuum of methods and techniques for information access.
(Keynote at "Mind The Gap '14" workshop at the iConference 2014 in Berlin, Germany)
Discovery Systems: Connecting the 21st Century Academic User to Content | Athena Hoeppner
Describes three projects using Discovery to serve academic users: Bibliometric studies of discovery content for graduate and faculty papers; Exposing Open Access content in the Discovery service; Integrating Discovery into the course page editor in a Learning Management System.
Athena Hoeppner. "Discovery Systems: Connecting the 21st Century Academic User to Content." II Seminario Bibliotecas Universitarias del siglo XXI, Bogota, Colombia, 24 March 2015.
OA in the Library Collection: The Challenge of Identifying and Managing Open ... | NASIG
Librarians, researchers, and the general public have largely embraced the concept of open access (OA). Yet, incorporating OA resources into existing discovery and tracking systems is often a complicated process. Open access material can be delivered through a variety of publishing or archival mechanisms, creating certain challenges, particularly for those managing e-resources. Although an increasing proportion of research output is becoming open access each year, organization and discovery of these resources remains imperfect.
The relative merits of Green and Gold OA are regularly debated in academic circles, but less attention is devoted to Hybrid OA and the challenges inherent in this model. Most major publishers offer open access through one or more of these models, but open access metadata standards seem to be lacking among these content providers. The presenters will discuss some of the challenges identified in the literature and through other mechanisms, including data gathered by NISO and an original survey. By identifying these issues, the scholarly communication community can work together to improve discovery for end users.
Chris Bulock
Electronic Resources Librarian, SIUE Lovejoy Library
Chris is an Electronic Resources Librarian and NASIG member from the St. Louis area. His research and work are focused on improving the library user's experience. Chris is the recipient of the 2012 HARRASSOWITZ Charleston Conference Scholarship.
Nathan Hosburgh
Discovery & Systems Librarian, Rollins College
Nate Hosburgh is currently the Discovery & Systems Librarian at Rollins College in Winter Park, Florida, as part of a revamped Collections & Systems department that includes ILL, collection development, acquisitions, systems, and technical services. Previously, he held positions managing e-resources at Montana State University and managing interlibrary loan & document delivery at Florida Institute of Technology in Melbourne.
SharePoint Search out of the box for a word or two isn't that powerful. When combined with powerful properties and operators, search can really sing. To the informed user there are simple ways of getting the search results you're looking for by learning some KQL, the Keyword Query Language. In this session we spend most of the time in demo in the search interface, but these slides contain lots of tips and tricks for better search for users.
Toward an automated student feedback system for text based assignments - Pete... | Blackboard APAC
As blended learning environments and digital technologies become integrated into the higher education sector, rich technologies such as analytics can help teaching staff identify students at risk, learning material that is not proving effective, and learning site designs that aid and facilitate improved learning. More recently, consideration has been given to automated essay scoring. Such systems can be used formatively, such as by providing feedback on initial assignment drafts, or summatively, through the analysis of final assignment submissions. Further, providing students with quick formative feedback on written assignments opens the opportunity for improved learning outcomes.
This presentation details a current project developing a system to analyse text-based assignments. The project is being developed for broad application, but the findings focus on an undergraduate pilot subject: ‘Ideas that Shook the World’ (a compulsory first year Bachelor of Arts subject taught on 5 campuses to more than 1000 students by 15 staff). Preliminary results of a first scan of assignments are presented, together with the issues raised in developing the system and an outline of additional work planned for the project. It is believed the work will have wide application where text-based assignments are utilised for assessment.
From Exploration to Construction - How to Support the Complex Dynamics of In... | TimelessFuture
Search engines on the Web provide a world of information at our fingertips, and the answers to many of our common questions are just one click away. However, for the complex and multifaceted tasks involving a process of knowledge construction, various information seeking models describe an intricate set of cognitive stages (Kuhlthau, 2004; Vakkari, 2001). These stages influence the interplay of users’ feelings, thoughts and actions. Despite the evidence of the models, common search engines, nowadays the prime intermediaries between information and user, still feature a streamlined set of 'ten blue links'. While efficient for lookup tasks, this approach may not be beneficial for supporting sustained information-intensive tasks and knowledge construction. Would there be other approaches to support the complex dynamics of these ventures? Based on previous experiments, this talk discusses how the utility of search functionality during different stages of complex tasks is essentially dynamic. This provides opportunities for designing 'stage-aware' search systems, which may evolve along with a user's information journey.
Workshop presented at Webdagene 2013 (http://webdagene.no/en/) September 9, 2013; UX Lisbon (http://www.ux-lx.com), May 12, 2011; UX Hong Kong (http://www.uxhongkong.com/), February 17, 2011.
Presentation made during the Intelligent User-Adapted Interfaces: Design and Multi-Modal Evaluation (IUadaptME) workshop, conducted as part of UMAP 2018
Avoiding a Level of Discontent in Finding Aids: An Analysis of User Engagemen... | Andrea Payant
As part of a multi-faceted research project examining user engagement with various types of descriptive metadata, Utah State University Libraries Cataloging and Metadata Services unit (CMS) investigated the discoverability of local Encoded Archival Description (EAD) finding aids. The research team put two versions of the same finding aid online, one described at the file (box or folder) level and the other at the item level. Over a year later, the team pulled the analytics for each guide and assessed which descriptive level was most frequently accessed. The research team also looked at the types of search terms patrons used and where in the finding aid they were located. Usage data shows that personal names are the most common type of search term, that search terms are most commonly found in the Collection Inventory, and that the availability of item-level description improves discovery by an average of 6,100% over file-level descriptions.
At Utah State University, a pilot project is under development to evaluate the benefits of tracking data sets and faculty publications using the online catalog and the Library’s institutional repository.
With federal mandates to make publications and data open, universities look for solutions to track compliance. At Utah State University, the Sponsored Programs Office follows up with researchers to determine where data has been or will be deposited, per the terms of their grant.
Interested in making this publicly discoverable, the Library, Sponsored Programs, and Research Office are working together to pilot a project that enables the creation of publicly accessible MARC and Dublin Core records for data deposited by USU faculty. This project aims to make data sets, as well as publications, visible in research portals such as WorldCat, as well as through Google searches.
This presentation will describe the project and anticipated benefits, as well as outline the roles of the cataloging staff and data librarian, and the involvement of the Research Office.
Mitigating the Risk: identifying Strategic University Partnerships for Compli... | Andrea Payant
Payant, A., Rozum, B., Woolcott, L. (2016). Mitigating the Risk: Identifying Strategic University Partnerships for Compliance Tracking of Research Data and Publications. International Federation of Library Associations (IFLA) Satellite Conference: Data in Libraries: The Big Picture
Just Keep Cataloging: How One Cataloging Unit Changed Their Workflows to Fit ... | Andrea Payant
Utah State University Libraries Cataloging and Metadata Services (CMS) unit, including student workers, transitioned to remote cataloging in March 2020 due to the COVID-19 pandemic. The presentation will outline the process undertaken by supervisors to evaluate and modify services and workflows to continue cataloging materials through the different phases of library capacity from shutting down most of the library, to a hybrid limited staff capacity, through staff back in the library full-time.
But Were We Successful: Using Online Asynchronous Focus Groups to Evaluate Li... | Andrea Payant
USU launched a program in 2016 to connect researchers seeking federal funding with librarians to assist them with data management. This program assisted over 100 researchers, but was it successful? Our presentation will discuss how we evaluated the success of this program using online asynchronous focus groups (OAFG) in conjunction with a traditional survey. Our cross-institutional research team will share our findings as well as the challenges and successes of using OAFGs to assess library services.
Assessment and Visualization Tools for Technical Services | Andrea Payant
A survey and demonstration of open source, freely available tools to help technical services units assess their work, collect and analyze data, create infographics, and visually demonstrate their impact on the library and their patrons.
liwalaawiiloxhbakaa (How We Lived): The Grant Bulltail Absáalooke (Crow Natio... | Andrea Payant
USU was selected to host a unique collection of oral histories from Grant Bulltail, Crow Storyteller and 2019 NEA National Heritage Fellow, representing the stories and knowledge of the Crow Nation as passed down by his ancestors. The collection spans 20+ years of field work and collaboration across library departments and regional partners.
Crowdsourcing Metadata Practices at USU | Andrea Payant
USU Libraries’ Cataloging and Metadata Unit has successfully investigated several methods of engaging the public in the creation of metadata for USU’s Digital History Collections. Most, if not all, of the techniques we have tested have yielded positive results and have improved the relevancy and accuracy of our descriptive metadata.
Homeward Bound: How to Move an Entire Cataloging Unit to Remote Work | Andrea Payant
Utah State University Libraries Cataloging and Metadata Services (CMS) unit, including student workers, transitioned to remote cataloging in March 2020 due to the COVID-19 pandemic. This presentation will outline the process undertaken by supervisors to evaluate and modify services and workflows to continue cataloging service during the time when the library was shut down.
Outlines the development of the two single-service point and education initiatives, describes feedback gathered from a survey, and discusses how the Cataloging and Metadata Services unit plans to adapt services based on findings
Charting Communication: Assessment and Visualization Tools for Mapping the Co... | Andrea Payant
Outlines the methodologies and tools used for analyzing communication patterns to better inform cataloging decisions, increase communication opportunities, and enhance awareness of cataloging and metadata contributions to librarianship
Memes of Resistance, Election Reflections, and Voices from Drug Court: Social... | Andrea Payant
Folklorists and librarians have long championed social justice and advocacy issues. Today, the skills garnered through principled academic discourse, community based ethnographic fieldwork, and ethical librarianship are being utilized to collect, preserve, present, and educate around social themes and issues. USU folklorists and librarians are working to create robust digital collections that focus on timely social issues with informed and ethical metadata.
Giving Credit Where Credit is Due: Author and Funder IDs | Andrea Payant
A process to include standardized funder and author identifiers into institutional repository and ILS records which are associated with funded research data
Wisdom of the Crowd: Successful Ways to Engage the Public in Metadata Creation | Andrea Payant
Utah State University Libraries’ Cataloging and Metadata Unit has successfully used several methods to engage the public in metadata creation for USU’s Digital History Collections.
Adjusting primitives for graph : SHORT REPORT / NOTES | Subhajit Sahu
Graph algorithms, like PageRank... Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is...
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
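The CSR representation mentioned in this report's description can be illustrated with a small sketch. It is a generic construction under stated assumptions, not code from the report: the function names `build_csr` and `neighbors` and the sample edge list are hypothetical, and plain Python lists stand in for the arrays a CUDA or OpenMP implementation would use.

```python
def build_csr(num_vertices, edges):
    """Build CSR arrays (offsets, targets) from a list of (src, dst) edges."""
    # Count the out-degree of every vertex.
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1

    # offsets[v] is where vertex v's neighbor list starts in `targets`;
    # offsets[num_vertices] equals the total number of edges.
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + degree[v]

    # Place each destination into its source vertex's slice.
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets, targets, v):
    """Return the out-neighbors of vertex v as a slice of `targets`."""
    return targets[offsets[v]:offsets[v + 1]]


if __name__ == "__main__":
    # Hypothetical 3-vertex graph.
    offsets, targets = build_csr(3, [(0, 1), (0, 2), (1, 2), (2, 0)])
    print(offsets, targets, neighbors(offsets, targets, 0))
```

Storing all neighbor lists in one contiguous array is what makes CSR a compact form of the adjacency list: each vertex's neighbors are a slice bounded by two consecutive offsets.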
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... | pchutichetpong
M Capital Group (“MCG”) expects demand and supply to keep evolving, facilitated by institutional investment rotating out of offices and into work from home (“WFH”), while the need for data storage keeps growing as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
1. MARC-y MARC and the Coding Bunch
Anna-Maria Arnljots
Metadata Assistant
anna-maria.arnljots@usu.edu
Paul Daybell
Archival Cataloging Librarian
paul.daybell@usu.edu
Kurt Meyer
Government Information and E-Resource Cataloger
kurt.meyer@usu.edu
Andrea Payant
Metadata Librarian
andrea.payant@usu.edu
Becky Skeen
Special Collection Cataloging Librarian
becky.skeen@usu.edu
Liz Woolcott
Cataloging and Metadata Services Unit Head
liz.woolcott@usu.edu
Utah Library Association Annual Conference
May 21, 2021
2. Background
• Multi-year research into user search behavior for all metadata
standards employed by the unit
First phase: MARC
Next phases: EAD, Dublin Core
• Project started just as the library moved everyone to work from
home
• Whole unit was able to participate in the coding project
3. Problem Statement
What is the correlation between
user search terms, the placement
of MARC records in search results
lists, and the performance of
individual MARC fields in a search
process?
4. Research Questions
• What is the frequency and placement of MARC records in the search results list?
• Where are search terms located in MARC records?
6. Web Log Analysis
• Focused on the Discovery Layer (Encore) because it was the primary search portal used by patrons
• Pulled list of all URLs accessed on three days
• Put into Airtable and coded
7. Web Scraping
• Filtered for URLs that lead to search results pages
• Fed URLs into Octoparse, a web-scraping tool
• Scraped the list of search results, URLs, pagination, and results #
• Numbered the results and put into Airtable, linked to originating URL
8. Airtable
• Search Results List and URLs
  Extracted bib #
  Created formula to link to MARC view of bib
  Unit members pulled up bib record and copy/pasted it into Airtable
  Assigned codes for:
    o Creator of record
    o Material type
    o MARC fields where term was found
    o Fields that were not present
  Automated formula examined word count of record
9. Airtable (continued)
• Web Log URLs
  Coded for basic search features:
    o Page Types
    o Advanced Search fields used
    o Facets used
    o Page Number
  Coded the queries (search terms) for:
    o Search term construction
    o Search categories (known item, topical)
    o User Path
    o Known Item Titles
10. Airtable (continued)
• Known Items pulled out specifically and coded (mostly for a separate project looking at the discovery layer)
  Format/Genre
  Availability
  Physical or Electronic
  Location
  Steps to access
  Listed by
  Final Content Provider
  Checkouts
  Discoverability in Google Scholar
    o Steps to Access
12. Analysis 1.1:
How frequently are MARC records showing up in search results?
                             | Batch 1 | Batch 2 | Batch 3 | Combined
MARC-based catalog records   | 5264    | 3299    | 4749    | 13312
Records from other platforms | 20326   | 17560   | 16811   | 54697
Total Records                | 25603   | 20859   | 21560   | 68022
Percent MARC records         | 20.56%  | 15.82%  | 22.03%  | 19.57%
13. Analysis 1.2:
Is there a difference between locally created records and vendor-supplied records in the frequency of listing in search results?
Record Creator | # Records in results list | % Total records in results list | # Records accessed | % Total records accessed
Vendor | 7,727 | 58.05% | 163 | 39.00%
Cataloging and Metadata Services | 5,066 | 38.06% | 239 | 57.18%
Distance Campus Libraries | 410 | 3.08% | 5 | 1.20%
Record unavailable at time of coding | 52 | 0.39% | 2 | 0.48%
Patron Services, Library Media Collections, or Resource Sharing and Document Delivery | 33 | 0.25% | 8 | 1.91%
Acquisitions | 16 | 0.12% | 0 | 0.00%
Unknown | 5 | 0.04% | 1 | 0.24%
Natural History Library | 3 | 0.02% | 0 | 0.00%
Total | 13,312 | | 418 |
14. Analysis 1.3:
How are MARC records ranked in the search results list?
• The most common position for MARC records in a search result set of 25 items is position 4
• MARC records appear in the top five search results 25.35% of the time
15. Analysis 1.4:
Where do MARC records for known items rank in the search results list?
Percentage of Times Available Whole Object Appeared in Search Results by Position Number
Position     | Result 1 | Result 2 | Result 3 | Result 4 | Result 5 | Results 6-10 | Results 11-15 | Results 16-20 | Results 21-25
Total #      | 125      | 107      | 61       | 49       | 37       | 104          | 67            | 56            | 35
% in results | 18.7%    | 16.0%    | 9.1%     | 7.3%     | 5.5%     | 15.6%        | 10.0%         | 8.4%          | 5.2%
17. Analysis 2.1:
What fields are used most in retrieving records?
[Bar chart: MARC Fields Where Search Terms Were Located (Top 5). X-axis: MARC Fields; Y-axis: Number of Records. Bar values: 245: 9100; 505: 4998; 650: 4806; 520: 3700; 600: 1328]
18. Analysis 2.2:
For records accessed by the patron, is there a difference in where search terms are
located?
• The 245 Title statement remained highest, appearing 64% more
often than the next most utilized field
• Instead of the 505 Formatted Contents Note being in second
place, the 650 Subject Added Entry is the next most used field
• The 505 Formatted Contents Note and 520 Summary fields
retained a spot in the top four fields
19. Analysis 2.3:
For locally created records and vendor-supplied records, is there a difference in where search terms are located?
Percentage of fields used in record retrieval (top 5 most frequent)
Field | Field Description | CMS Records | Vendor Records
245 | Title Statement | 43.80% | 51.64%
505 | Formatted Contents Note | 28.13% | 69.65%
650 | Subject Added Entry – Topical | 40.89% | 56.58%
520 | Summary, etc. | 23.41% | 76.03%
600 | Subject Added Entry – Personal Name | 59.94% | 32.68%
20. Analysis 2.4:
What fields are not present in the records?
Field category | CMS: Not Present | CMS: Present | Vendor: Not Present | Vendor: Present
Author (both 1xx and 7xx) | 0.75% | 99.25% | 1.18% | 98.82%
Subject (any authorized) | 4.46% | 95.54% | 6.73% | 93.27%
505 Formatted Contents Note | 63.96% | 36.04% | 45.54% | 54.46%
520 Summary Note | 75.60% | 24.40% | 50.45% | 49.55%
All Categories Present | | 14.86% | | 33.26%
21. Analysis 2.5:
Which fields would make the greatest impact if not included in the record?
• The top four fields with the greatest impact on retrieval, if not
found in a record: 505, 245, 520, and 650
• Without the 505 or 520, 16.86% of all records appearing in
results would not have shown up
• In contrast, without 650 and 600 fields, only 0.66% of records
would not have appeared in the search results
23. Analysis
• Non-MARC records have advantage over MARC
    o 80% of all records in search results are Non-MARC
    o A quarter of MARC records place in the top 5 search results
• MARC vendor records appear more often than locally created MARC records
    o 505 and 520 fields occur more frequently in vendor records
    o 1xx and 6xx fields occur at the same rate in vendor and locally created records
24. Analysis
Title fields are most important overall, but…
The 505 Formatted Contents Note:
• Ranked higher than 245 for records where search terms matched only one field
• Consistently in the top 4 fields that retrieved a record (along with 520)
• If missing, 12% of all MARC results would not have been displayed
25. Analysis
Subject fields are important, but…
• 3rd most important field for matching search terms
• 2nd most important field for records viewed by patrons
• 0.55% of records would not have been displayed if field were missing
• 1 instance of subject fields being “clicked on”
1xx fields were much more likely to be “clicked on”
26. Take-Aways
▫ Cataloger will retain ability to make best judgment for each record, but will be asked to consider the following guidelines:
- More emphasis on creating 505 and 520 notes in local records
- Less emphasis on 6xx fields as an entry point
- More emphasis on 1xx fields as an entry point
27. MARC-y MARC's Coding Bunch
• Anna-Maria Arnljots
• Josee Butler
• Ryan Bushman (Stats)
• Paul Daybell
• Barbara Fleming
• Maddie Gardner
• Alisha Grant
• Bryn Larsen
• Sabrina Leatham
• Rachel Olsen
• Andrea Payant
• Kurt Meyer
• Jessica Mills
• Abby Rodabough
• MaKayla Roundy
• Melanie Shaw
• Becky Skeen
• Sara Skindelien
• Seth Westenburg
• Liz Woolcott
29. Full Procedures: https://usulibrary.atlassian.net/l/c/8H7jgU98
Article with final results:
Liz Woolcott, Andrea Payant, Becky Skeen & Paul Daybell (2021) Missing the
MARC: Utilization of MARC Fields in the Search Process, Cataloging &
Classification Quarterly, 59:1, 28-52, DOI: 10.1080/01639374.2021.1881010
Related articles
Robert Heaton & Liz Woolcott. Unraveling the (Search) String: Assessing Library
Discovery Layers Using Patron Queries. Library Assessment Conference, January
2021
• Presentation: https://www.libraryassessment.org/program/2020-schedule/#jan21
• Paper: https://www.libraryassessment.org/2020-proceedings/
30. Questions?
I will now give you a quick overview of the methodology for our project.
In order to determine how MARC records interacted with the user search process, the research team examined the logs of URLs that were generated by Encore, our library’s discovery layer.
Each search session in Encore generates a combination of static and dynamic URLs. Dynamic URLs capture a user’s search terms and any facets selected, advanced search categories used, additional search result pages accessed, and bibliographic record numbers for MARC record pages.
Google Analytics was used to gather reports of time-stamped URL logs generated over the course of multiple days.
The resulting data was put into Airtable, a relational database, for further analysis.
The Google Analytics report of URL logs was downloaded, and dynamic URLs that led to a search results page were isolated from the main report and fed into Octoparse, a web scraping tool. Each page resulting from a dynamic URL was scraped by Octoparse to gather the search terms used, the number of results on the page, the total number of results available to the user, and the title and link of each item in the list of results presented to the user on that page.
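A minimal Python sketch of the isolation step described above. It is not the procedure the team actually used (that relied on the Google Analytics report and Octoparse). The `/search/` path marker, the example host, and the function names are assumptions made purely for illustration.

```python
from urllib.parse import urlparse

# Assumption: search results pages can be recognized by "/search/" in the URL path.
SEARCH_PATH_MARKER = "/search/"


def is_search_results_url(url: str) -> bool:
    """Return True if the URL path looks like a search results page."""
    return SEARCH_PATH_MARKER in urlparse(url).path


def isolate_search_urls(urls):
    """Keep only URLs that lead to search results pages, preserving log order."""
    return [url for url in urls if is_search_results_url(url)]


# Hypothetical log entries, not real data from the project.
sample_log = [
    "https://discovery.example.edu/search/C__Swater%20rights",
    "https://discovery.example.edu/home",
    "https://discovery.example.edu/record/C__Rb1234567",
    "https://discovery.example.edu/search/C__Sgroundwater",
]
search_urls = isolate_search_urls(sample_log)
```

The filtered list is what would then be handed to a scraping tool. Repeated URLs are kept on purpose, since each log entry represents a separate page view.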
The results were numbered and added to our Airtable database and then linked to the originating URL.
Search results lists and URLs were coded to identify the bibliographic record number.
A formula was created within the system to link out to the MARC view, which was used to access and copy the full text of the MARC record into Airtable.
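The bib-number extraction and link formula described above could look roughly like the following in Python. The team did this inside Airtable. The pattern of `b` followed by seven digits, the example hosts, and the `MARC_VIEW_TEMPLATE` URL are all assumptions for illustration only.

```python
import re

# Assumption: a bib record number is the letter "b" followed by seven digits.
BIB_PATTERN = re.compile(r"b\d{7}")

# Hypothetical URL template for the MARC view of a record.
MARC_VIEW_TEMPLATE = "https://catalog.example.edu/marc/{bib}"


def extract_bib_number(url):
    """Return the first bib number found in the URL, or None if there is none."""
    match = BIB_PATTERN.search(url)
    return match.group(0) if match else None


def marc_view_link(url):
    """Build a link to the MARC view for the record referenced by the URL."""
    bib = extract_bib_number(url)
    return MARC_VIEW_TEMPLATE.format(bib=bib) if bib else None


link = marc_view_link("https://discovery.example.edu/record/C__Rb1234567__Swater")
```

A URL with no bib number yields `None`, so it can be flagged for manual review rather than producing a broken link.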
Codes were assigned for record creator (whether generated by library personnel or vendor supplied) and material type.
Codes also identified where the search terms appeared in the MARC record and recorded prominent categories of fields that were not present in the record.
For every instance where the search term appeared in the field, that field was copied into a separate column for further analysis.
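As a rough illustration of this coding step, the sketch below checks which fields of a record contain any of the search terms. The project's coding was done by unit members in Airtable, so this is not their method. The simple case-insensitive word matching, the `(tag, text)` record layout, and the sample record are assumptions.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fields_matching(record, query):
    """Return the MARC tags (in record order) whose text shares a word with the query."""
    terms = set(tokenize(query))
    matched = []
    for tag, text in record:
        if terms & set(tokenize(text)) and tag not in matched:
            matched.append(tag)
    return matched


# Hypothetical record: a list of (MARC tag, field text) pairs.
record = [
    ("245", "Water rights in the American West"),
    ("505", "Prior appropriation -- Groundwater law -- Tribal water claims"),
    ("650", "Water rights -- West (U.S.)"),
]

only_contents = fields_matching(record, "groundwater")
several = fields_matching(record, "water rights")
```

A query such as "groundwater" that matches only the 505 is the kind of case where the record depends on a single field for retrieval.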
Also, an automated formula examined the word count of each record.
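A word count like the one described can be approximated in a line of Python. The actual formula lived in Airtable and is not shown in the source, so splitting on whitespace is an assumption about how words were counted.

```python
def word_count(record_text: str) -> int:
    """Count whitespace-separated words in the pasted text of a record."""
    return len(record_text.split())


# Hypothetical snippet of pasted record text.
count = word_count("Water rights in the American West")
```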
Web log URLs were also coded for basic search features, including page types, advanced search fields, facets used, and search result page numbers.
Queries, or search terms, were coded as well to parse out how search terms were constructed, search categories (either known item or topical), user paths, and known item titles.
Finally, known item searches were pulled out and coded. The search terms entered by the user were analyzed through a multi-step process that reran the same terms in a browser to ascertain if the search terms reasonably matched the title or identifier of a known item.
When found, the corresponding URLs were tagged as Known Items and coded for format, availability, medium, location, keywords used, etc.
Following this coding, each known item was double checked by a research team member to determine if the library provided access to it, either physically or in electronic format.
Paul will now go over the results of our data and coding.
So, just to summarize what Paul said: non-MARC records have a clear advantage over MARC records in our discovery layer. 80% of all results came from non-MARC sources, despite non-MARC records making up 60% of the database. And MARC records only place in the top 5 results a quarter of the time.
If we just look at MARC records by themselves, though, we see that Vendor records appear more often than locally created records and are more likely to include the 505 and 520 fields. They have the same frequency of author and subject fields as records cataloged locally, though, so 1xx and 6xx fields are not making a difference between the two types of records.
We suspect that full-text search in non-MARC records and the greater presence of 505 and 520 fields in vendor records provide more words and phrases for the index to search against, and that our own work is less visible because we aren’t putting our emphasis in these places.
In fact, if we look further into how the 505 functions, we find that while title fields were the most important fields overall, the 505 ranked higher than the 245 for records where search terms matched only one field (meaning those search terms weren’t found anywhere else in the record). The 505 Formatted Contents Note and the 520 Summary Note were consistently in the top 4 fields that retrieved a record.
Most telling of all was that in 12% of all records, if the 505 had not been present, the record would not have been displayed in the search results list AT ALL. The only field more significant than this was the Title field.
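One way to read this kind of "impact if missing" figure is as the share of records whose search-term matches fall entirely within the field or fields being removed. The Python sketch below works under that assumption. The function name and the sample data are illustrative only, and the project's real figures come from the coded Airtable data, not from this code.

```python
def share_lost_without(matched_fields_per_record, removed_fields):
    """Share of records whose every matching field is among the removed fields."""
    removed = set(removed_fields)
    total = len(matched_fields_per_record)
    if total == 0:
        return 0.0
    lost = sum(
        1
        for fields in matched_fields_per_record
        if fields and set(fields) <= removed
    )
    return lost / total


# Hypothetical coding results: for each record, the fields where terms matched.
sample = [
    ["245", "505"],
    ["505"],
    ["245"],
    ["520"],
    ["505", "520"],
]

without_505 = share_lost_without(sample, ["505"])
without_505_or_520 = share_lost_without(sample, ["505", "520"])
```

In the sample, one record in five matched only on the 505, and three in five matched only on the 505, the 520, or both. A record that also matches on the 245 is not counted as lost, because it would still be retrieved through the title.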
Let’s take a look now at how authorized fields like the subject and author fields interact with search terms. Subject fields are important, but results on how they interact with search terms are mixed. They are the 3rd most important field for matching search terms and the 2nd most important field for records viewed by patrons, but only 0.55% of records would not have been displayed if the subject field had been missing. So, while the data demonstrated that search terms matched subject headings frequently, it also demonstrated that those same terms were frequently available elsewhere in the record already.
Additionally, it was very obvious that subject headings were rarely used as a means for finding other materials (for instance, when we envision a patron “clicking on” a subject link to find like materials). There was only one instance of subject fields being “clicked on” to bring up related records. This may be, in large part, due to the limited visibility of subject headings on the main search page. You can only access the terms through the record itself (if the patron clicks on it) or, on occasion, in a “tag” field at the bottom of the facet column. Whether this is due to interface design or to the utility of the field itself, we cannot definitively say. However, 1xx creator fields were the most likely authorized heading fields to be used, and the data displayed evidence of them being used to find related records and materials. They are also the more visible of the authorized heading fields, not only showing up in the search results list, but also being actionable from that list without having to enter the record.
In reviewing all the data, the unit developed a few "take-aways" that we could incorporate in our day-to-day work. These included taking more time to add 505 Formatted Contents Notes or 520 Summary fields to locally created records. We felt the data demonstrated that additional 505 and 520 fields would likely make our records more visible to the search algorithms. Additionally, we will place less emphasis on the subject fields as part of our workflow. This doesn't mean eliminating subject work from what we do, but rather not spending as much time developing subject headings as before. We will also continue our authority work on the 1xx creator fields, as they are the most visible of the controlled heading fields and also highly visible in the search results page. These aren't hard and fast rules, but rather guidelines to follow. Our catalogers will continue to be able to exercise their own judgment when creating records. But having this understanding of how the records are used will be imperative in that judgment-making process.
We would like to thank the following people for all of their help in making this research process possible. The whole Cataloging unit at USU Libraries, including catalogers, cataloging assistants, and student technicians, participated in this project. We would also like to thank Ryan Bushman, the assistant to our Assessment Librarian, for all his help with the statistics for this project. We are so appreciative of this whole coding bunch!
If you would like to try out this process yourself – we have put our step by step instructions online at the URL you see above. This will include all of the procedures we used to pull the data from Google Analytics, scrape the data with Octoparse, and our codebooks that all of the project contributors used. We will also put this link into the chat for you.
You can also read about this process and the results in our recently published article in Cataloging & Classification Quarterly. It is titled "Missing the MARC: Utilization of MARC Fields in the Search Process," and the DOI above is a link to the article. We will also put that into the chat for you. Note that both of these links are available on the handout for this session, too.
The data from this project was also used in a recent publication and presentation at the Library Assessment Conference, which examined how patrons used the library discovery layer, Encore. The links are available on this slide and we will put them into chat as well. Just note that the proceedings are not quite up yet, but should be soon.
Thank you for your time! Does anyone have any questions?