Jay Henry, Ringgold’s Chief Marketing Officer, presented 'Emerging Standards: Data and Data Exchange in Scholarly Publishing' at the Council of Science Editors Annual Conference in Philadelphia on Sunday 17 May 2015.
Small Data, Big Benefits - Christine Orr at SSP 2016 (Ringgold Inc)
Small Data, Big Benefits: Mining for End User Relationships
In today’s environment, publishers need more user interest and engagement in order to keep institutional subscriptions and submissions strong and growing.
Persistent Identifiers - The 5 Things You Need To Know (Ringgold Inc)
Ringgold presented at the Frankfurt Book Fair Hot Spots stage on Wednesday 19 October 2016. The use of persistent identifiers has become much more widespread in scholarly communication. Ringgold, the institutional identification experts, explained the importance of persistent identifiers and why you should be using them to your advantage whatever your role in scholarly communications.
Institutional Identifiers in Practice: Christine Orr at CESSE 2015 (Ringgold Inc)
Christine Orr, North American Sales Director for Ringgold, spoke at the CESSE 2015 annual meeting session 'Adding Value to Your Process: Supporting Researchers and Data Requirements'.
Christine Orr, Sales Director for North America, spoke at SSP on Wednesday May 27. This pre-meeting seminar addressed Implementing Next Generation ID Standards for the New Machine Age: 'The Ties That Find'.
Metadata & Standards in Scholarly Communication (Ringgold Inc)
Ringgold was excited to present at the 2015 Frankfurt Book Fair, Professional & Scientific Information Hot Spots Stage.
'Metadata & Standard Identifiers in Scholarly Publishing' showed how your organization can benefit from our data services in the ever-challenging scholarly landscape.
Persistent Identifiers in Scholarly Communications - Christine Orr at SSP 2016 (Ringgold Inc)
Persistent Identifiers in Scholarly Communications: What, Why, How, Where, and Who?
Persistent identifiers (PIDs) are vital to a strong research infrastructure. The unambiguous connections between people, places, and things that PIDs enable build trust in, improve discoverability of, and enable recognition for research contributions. And in a world where researchers and their institutions are increasingly required to report on their research contributions across multiple systems, it’s critical to be able to do so in as simple, streamlined, and accurate a way as possible. PIDs can help by enabling automated processes, validating and ensuring correct attribution of works, and facilitating discoverability across multiple platforms and systems. This session brought together representatives from organizations that create different types of PIDs with those who use them. After a brief introduction to what PIDs are and why they’re important, the panel demonstrated how they are being used in researcher systems and workflows, provided an update on recent and upcoming developments, and discussed the challenges and opportunities for widespread adoption of PIDs across the scholarly community. Speakers included a publisher, a librarian, a manuscript submission system vendor, and representatives from PID organizations. The session included a brief overview from each followed by an informal panel discussion and audience Q&A.
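The validation role PIDs play can be made concrete with a small sketch. ORCID iDs, one widely used PID for researchers, end in an ISO 7064 MOD 11-2 check digit, so systems can catch mistyped iDs before attributing works. The function names below are illustrative:

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ORCID check digit (ISO 7064 MOD 11-2)
    from the first 15 digits of the identifier."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    remainder = total % 11
    result = (12 - remainder) % 11
    return "X" if result == 10 else str(result)

def is_valid_orcid(orcid: str) -> bool:
    """Validate a formatted ORCID iD such as '0000-0002-1825-0097'."""
    digits = orcid.replace("-", "")
    if len(digits) != 16:
        return False
    return orcid_check_digit(digits[:15]) == digits[15]

# The well-known example iD from ORCID's own documentation:
print(is_valid_orcid("0000-0002-1825-0097"))  # True
```

A single-character typo changes the expected check digit, so the same function rejects near-miss iDs.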
Good metadata is critical to helping people find information. Metadata can be used to enhance search tools, drive navigation and relate documents to one another. Unfortunately, manually adding metadata to content is cumbersome for small batches of content and impractical or impossible for large content sets.
Enterprise Knowledge understands the difficulty and importance of maintaining metadata. In this session, we will share 6 different ways to simplify and/or automate metadata management even on extremely large content sets. We will share the tools and techniques we have used with our clients to make metadata management possible and provide real world examples as to how these techniques can be applied to your content.
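One common automation technique implied above is rule-based tagging, where keyword patterns assign taxonomy terms to content at scale. This is a minimal sketch with a made-up taxonomy, not Enterprise Knowledge's specific tooling:

```python
import re

# Illustrative taxonomy: each tag is triggered by keyword patterns.
TAG_RULES = {
    "persistent-identifiers": re.compile(r"\b(ORCID|ISNI|DOI|identifier)s?\b", re.I),
    "metadata": re.compile(r"\bmetadata\b", re.I),
    "gdpr": re.compile(r"\b(GDPR|personal data|privacy)\b", re.I),
}

def auto_tag(text: str) -> list[str]:
    """Return the taxonomy tags whose keyword rules match the document text."""
    return [tag for tag, pattern in TAG_RULES.items() if pattern.search(text)]

print(auto_tag("ORCID metadata improves discoverability"))
# ['persistent-identifiers', 'metadata']
```

Because the rules run programmatically, the same pass works on a handful of documents or millions.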
An introduction to the FAIR principles and a discussion of key issues that must be addressed to ensure data is findable, accessible, interoperable and re-usable. The session explored the role of the CDISC and DDI standards for addressing these issues.
Presented by Gareth Knight at the ADMIT Network conference, organised by the Association for Data Management in the Tropics, in Antwerp, Belgium on December 1st 2015.
New Initiatives - Geoffrey Bilder - London LIVE 2017 (Crossref)
Presentation by Geoffrey Bilder at Crossref London LIVE, 26th September 2017. New initiatives at Crossref including organisational and grant identifiers.
In May 2014, we introduced ProtoView to our free webinar series. With ProtoView we promote your titles through professionally created abstracts, bibliographic entries, and expanded metadata delivered to the scholarly supply chain. In this webinar, we talked about the new developments in academic markets and how to maximize your titles' presence in web scale discovery services. (Hint: It's all about discoverability.)
We discussed the metadata elements included in ProtoView, the different levels of service available for print and electronic books and journals, and custom solutions available by sending electronic data in conjunction with print review copies.
Emerging Standards: Data and Data Exchange in Scholarly Publishing - Jay Henry (Ringgold Inc)
Ringgold is one of several organizations that are putting forth ideas to standardize data and data exchange throughout scholarly publishing. This session discussed new initiatives that address such challenges as standardizing conflict of interest reporting, easily identifying funding sources, clarifying contributor roles for research papers, and managing institution disambiguation.
Using Data to Drive Discovery of New Scholarly Works (Ringgold Inc)
Jean Brodahl, Publisher and Library Relations for Ringgold's ProtoView service, presented at the Previews Session: New and Noteworthy Product Presentations at SSP on Thursday 28 May. She showed how ProtoView helps publishers increase the profile of their content within the scholarly supply chain.
Metadata Standards: A Golden Age Arrives? - Christine Orr at STM (Ringgold Inc)
Metadata standards for describing information about authors, institutions, and funders make possible a high level of precision and clarity on published research and data.
Together, standards promise even more: An interoperable world of scientific and scholarly information where end-to-end workflow solutions drive innovation and collaboration.
Recent developments suggest a Golden Age looms for STM publishers, brought on by widespread adoption of standards. Have we reached the tipping point, at last? Are publishers united in their enthusiasm? Will authors prove to be the last piece in the puzzle?
Ringgold Webinar Series: 2. Core Strength - Standard Identifiers as the Found... (Ringgold Inc)
The second session took place on Wednesday January 29 and discussed Ringgold IDs - what they are and what other identifiers can do for your business. We addressed:
- The current landscape of standard identifiers applicable to scholarly publishing including Ringgold IDs, ISNI, and ORCID. What are they, and why are they important?
- How & why to incorporate them into your internal data silos and into your supply chain activities
- Ringgold Identifiers and the Identify database: Service overview & typical use cases
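Incorporating identifiers into internal silos, as the second bullet suggests, often starts with a crosswalk table that lets records be joined on more than one identifier scheme. A minimal sketch follows; every identifier value in it is made up for illustration:

```python
# All identifiers below are fictional, for illustration only.
crm_records = [
    {"ringgold_id": "1001", "name": "Example University"},
    {"ringgold_id": "2002", "name": "Sample Institute"},
]
# Crosswalk from Ringgold ID to ISNI for institutions we have mapped.
crosswalk = {"1001": "0000-0001-2345-6789"}

def enrich_with_isni(records, crosswalk):
    """Attach the ISNI, where known, so downstream systems can link on either ID."""
    for rec in records:
        rec["isni"] = crosswalk.get(rec["ringgold_id"])
    return records

enriched = enrich_with_isni(crm_records, crosswalk)
print(enriched[0]["isni"])  # 0000-0001-2345-6789
```

Records without a mapping simply carry a null ISNI until the crosswalk grows, which keeps the enrichment step safe to rerun.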
In today’s enterprise environments, data architects are required to incorporate and integrate data from a variety of sources, such as multiple relational and unstructured databases, public-use data, information managed in cloud/hosted SaaS systems, and data absorbed from a wide variety of “open” sources. Without a process for synthesizing the format, structure and semantics of the data sources and verifying conformance among them, you run the risk of blending data in an incompatible way leading to misunderstandings in the best case and poor business decisions in the worst case.
To help you address the blossoming challenges of working with different kinds of data, David Loshin will share his thoughts and experiences and discuss how business-process-oriented data and metadata modeling can help:
+ Establish control over the interpretation of multiple data sources
+ Harmonize information concepts
+ Facilitate semantic consistency and information conformance
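The harmonization ideas above can be sketched as a canonical-model mapping: each source system declares how its field names translate to shared concepts, and anything unmapped is surfaced rather than silently blended. The mappings here are hypothetical examples, not David Loshin's specific method:

```python
# Illustrative field mappings from two source systems to one canonical model.
SOURCE_MAPPINGS = {
    "crm":     {"OrgName": "name", "Country_Code": "country"},
    "billing": {"customer": "name", "ctry": "country"},
}

def to_canonical(source: str, record: dict) -> dict:
    """Rename source-specific fields to canonical names; flag anything unmapped."""
    mapping = SOURCE_MAPPINGS[source]
    canonical, unmapped = {}, []
    for field, value in record.items():
        if field in mapping:
            canonical[mapping[field]] = value
        else:
            unmapped.append(field)
    if unmapped:
        raise ValueError(f"unmapped fields from {source}: {unmapped}")
    return canonical

print(to_canonical("billing", {"customer": "Acme", "ctry": "DE"}))
# {'name': 'Acme', 'country': 'DE'}
```

Failing loudly on unmapped fields is one way to enforce conformance before data from incompatible sources gets blended.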
Closing the data source discovery gap and accelerating data discovery comprises three steps: profile, identify, and unify. This white paper discusses how the Attivio platform executes those steps, the pain points each one addresses, and the value Attivio provides to advanced analytics and business intelligence (BI) initiatives.
GDPR BigDataRevealed Readiness Requirements and Evaluation (Steven Meister)
This GDPR methodology can evaluate your GDPR readiness. If you already feel GDPR-ready, you may uncover complex issues that are often neglected; if you have waited, you can gain knowledge that makes for a more successful GDPR outcome.
https://youtu.be/uE4Q7u0LatU https://youtu.be/R37S9mIiVAk https://youtu.be/AQf3if7DnuM
In the healthcare sector, data security, governance, and quality are crucial for maintaining patient privacy and ensuring the highest standards of care. At Florida Blue, the leading health insurer of Florida serving over five million members, there is a multifaceted network of care providers, business users, sales agents, and other divisions relying on the same datasets to derive critical information for multiple applications across the enterprise. However, maintaining consistent data governance and security for protected health information and other extended data attributes has always been a complex challenge that did not easily accommodate the wide range of needs for Florida Blue’s many business units. Using Apache Ranger, we developed a federated Identity & Access Management (IAM) approach that allows each tenant to have their own IAM mechanism. All user groups and roles are propagated across the federation in order to determine users’ data entitlement and access authorization; this applies to all stages of the system, from the broadest tenant levels down to specific data rows and columns. We also enabled audit attributes to ensure data quality by documenting data sources, reasons for data collection, date and time of data collection, and more. In this discussion, we will outline our implementation approach, review the results, and highlight our “lessons learned.”
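The column-level entitlement idea described above can be illustrated with a tiny policy check. This is a conceptual sketch of role-based column projection, not the actual Apache Ranger API or Florida Blue's implementation; all roles and fields are invented:

```python
# Illustrative policy: which columns each role may read.
COLUMN_POLICY = {
    "care_provider": {"member_id", "diagnosis", "plan"},
    "sales_agent": {"member_id", "plan"},
}

def authorized_view(role: str, rows: list[dict]) -> list[dict]:
    """Project each row down to the columns the role is entitled to see."""
    allowed = COLUMN_POLICY.get(role, set())
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]

rows = [{"member_id": "M1", "diagnosis": "J45", "plan": "PPO"}]
print(authorized_view("sales_agent", rows))
# [{'member_id': 'M1', 'plan': 'PPO'}]
```

In a real federation the policy would be propagated from the tenants' IAM systems rather than hard-coded, and row-level filters would apply the same pattern to whole records.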
Office 365: Data leakage control, privacy, compliance and regulations in the... (Edge Pereira)
If you are considering Microsoft Office 365 you need to understand what compliance features and capabilities are available. These features may replace existing compliance functionality or create new opportunities to meet compliance obligations. This session will explore features and capabilities, such as data retention, data leakage protection and discovery, and demonstrate their usage. The information presented in this session will help you navigate an organisational journey to Microsoft Office 365 by providing knowledge and examples to smooth the transition, whilst maintaining or enhancing your industry's regulatory compliance.
Balancing data democratization with comprehensive information governance: bui... (DataWorks Summit)
If information is the new oil, then governance is its “safety data sheet.” As demand for data as the raw material for competitive differentiation continues to rise, enterprises face growing challenges in identifying and valuing data and ensuring its appropriate use to extract the right information. To make effective business decisions, organizations need to trust their data so that they can impute the right value and use it for the right purposes while satisfying any organizational or regulatory mandates. A number of analytics and data science initiatives fail to reach their potential due to the lack of an information governance framework. Robust information governance capabilities can help organizations develop trust in their data and empower them to make decisions confidently.
In this session Sanjeev Mohan, Research Analyst at Gartner, and Srikanth Venkat, Sr. Director of Product Management at Hortonworks, will walk you through an end-to-end architectural blueprint for information governance and best practices for helping organizations understand, secure, and govern diverse types of data in enterprise data lakes.
Speaker
Sanjeev Mohan, Gartner, Research Analyst
Srikanth Venkat, Hortonworks, Senior Director, Product Management
Supporting GDPR Compliance through Data Classification (Index Engines Inc.)
The GDPR consists of 99 articles that mandate how data is to be handled, but how do you manage years of data on various platforms?
Index Engines gives organizations the ability to leverage metadata to enforce governance policies across their data centers. By using high-level buckets and classifying and tagging relevant data, the process of protecting that content is simplified.
http://www.indexengines.com/ediscovery-governance/solutions-for/gdpr-compliance
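The classify-and-bucket approach described above typically rests on content detectors for personal data. Here is a minimal sketch of that idea with two toy detectors; it is not Index Engines' product logic, and real detectors need far more patterns and validation:

```python
import re

# Illustrative detectors for two kinds of personal data.
DETECTORS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def classify(text: str) -> str:
    """Place a document into a high-level bucket based on detected personal data."""
    hits = [name for name, rx in DETECTORS.items() if rx.search(text)]
    return "personal-data" if hits else "unrestricted"

print(classify("Contact: jane.doe@example.org"))  # personal-data
print(classify("Quarterly totals attached."))     # unrestricted
```

Documents landing in the "personal-data" bucket can then be tagged with retention and protection policies, which is the simplification the description is pointing at.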
Unified Information Governance, Powered by Knowledge Graph (Vaticle)
As a knowledge graph database, Grakn is ideal for storing metadata and data lineage information. Many applications, such as data discovery, data governance, and data marketplaces, depend upon metadata for management. User experiences can be enhanced by leveraging a hyper-scalable graph database like Grakn, rather than traditional graph databases. Additionally, inference-driven use cases have predominantly depended on RDF triple stores, requiring additional plug-ins to derive the inferences. With Grakn, this can now be achieved natively.
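The core lineage question such systems answer is "which datasets does this one derive from?". The sketch below shows the idea with a plain Python dict graph rather than Grakn's query language, which the sketch does not attempt to reproduce:

```python
# A toy lineage graph: each dataset maps to the datasets it was derived from.
lineage = {
    "report": ["cleaned_sales"],
    "cleaned_sales": ["raw_sales"],
    "raw_sales": [],
}

def upstream(dataset: str, graph: dict) -> set[str]:
    """Return every ancestor dataset, i.e. the full upstream lineage."""
    seen = set()
    stack = list(graph.get(dataset, []))
    while stack:
        d = stack.pop()
        if d not in seen:
            seen.add(d)
            stack.extend(graph.get(d, []))
    return seen

print(sorted(upstream("report", lineage)))  # ['cleaned_sales', 'raw_sales']
```

A graph database does this traversal natively and at scale; the point here is only the shape of the query that data-governance tools run constantly.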
(SACON) Ramkumar Narayanan - Personal Data Discovery & Mapping - Challenges f... (Priyanka Aash)
To implement a privacy program in any organization, big or small, the foundational step is to understand what personal data the organization deals with, where it lies, how it flows (within and outside the organization), who does what with that data, what the underlying assets involved are, etc. Without this foundation, the organization cannot build the necessary controls required to implement and manage privacy. However, this is not an easy problem to address. This session does a deep dive into the challenges faced, the methodologies used, and the tools that can be employed to build AND sustain an organization's data map.
Why not start with data sharing? Asset sharing reduces costs, improves utilization and sustainability. Only data assets are still managed in silos.
One of the few successful examples for data sharing is the CDQ Data Sharing Community for business partner data. This talk analyzes the approach and distinguishes two levels of data sharing:
(1) data knowledge sharing (semantics, rules and reference data), and
(2) data asset sharing (peer-based sharing of validated data)
Data sharing leads to higher data quality, lower data maintenance efforts, reduced risks and higher trust in data.
This presentation starts off by discussing powerful examples of the power of data and the benefits of data-driven architectures. A data governance program is important for the success of data-driven architectures. We then discuss the challenges of implementing a data governance framework on a big data lake with open source software including DataPlane, Apache Atlas and Apache Ranger. And finally, we discuss the importance of the democratization of data and switching to a speed-of-thought framework with Hive LLAP.
Using your Data to Drive Revenue – Laura Cox at London Book Fair 2018 (Ringgold Inc)
Laura Cox, Ringgold Chief Financial and Operating Officer, chaired this session at London Book Fair 2018 on 10 April.
The session featured expert speakers across several types of publishing data and gave practical advice including:
- How to utilise the information held about customers
- How to use taxonomies to help improve search and discovery
- How to evaluate technologies that will help organisations make the most of their content through effective storage and semantic exploitation
More details: https://www.londonbookfair.co.uk/en/Sessions/58553/Use-your-Data-to-Drive-Revenue
The final session took place on Wednesday 26 February and we offered concrete, simple take-aways that will allow you to quickly improve the state of your most valuable customer and prospect records. Also, we discussed how our auditing service may help those needing a more robust solution:
- 3 Quick Tips to improve the state of your most valuable client records - all of which you can do on your own
- Ringgold’s Audit service: outsourcing the standardization of your subscriber records
Ringgold Webinar Series: 3. Lean and Mean - Publication Metadata to Enhance D... (Ringgold Inc)
The third session took place on Wednesday 15 February and covered making content easily discoverable. Well-structured and complete metadata about your published works are the key to ensuring content can be easily found, purchased, and used - particularly within the emerging Demand Driven Acquisition Model. The discussion explored:
- The changing landscape of discovery and collection development
- Current industry initiatives surrounding publication metadata
- Review of discovery platforms and discovery layers
- Ringgold's ProtoView service - supporting publishers with the creation and targeted dissemination of quality metadata
This introductory session on Wednesday 15 January covered the following:
- A review of what constitutes good data health
- Data health plan: data governance and how it can drive your business
- Overview of standard identifiers currently used in the scholarly publishing supply chain
- Introduction to Ringgold services and how we support our clients
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Adjusting primitives for graph: SHORT REPORT / NOTES (Subhajit Sahu)
Compressed Sparse Row (CSR) is an adjacency-list based graph representation commonly used by graph algorithms such as PageRank. These notes compare primitive operations over such data:
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
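As background for the experiments above, here is a minimal sketch of building the CSR arrays (offsets plus target indices) from an edge list. It is plain Python for clarity, whereas the benchmarks above use OpenMP and CUDA:

```python
def to_csr(num_vertices, edges):
    """Return (offsets, targets) in Compressed Sparse Row form.
    offsets[v]..offsets[v+1] delimit vertex v's targets slice."""
    counts = [0] * num_vertices
    for u, _ in edges:
        counts[u] += 1
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + counts[v]
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()   # next free slot per vertex
    for u, v in edges:
        targets[cursor[u]] = v
        cursor[u] += 1
    return offsets, targets

offsets, targets = to_csr(3, [(0, 1), (0, 2), (1, 2)])
print(offsets, targets)  # [0, 2, 3, 3] [1, 2, 2]
```

Because each vertex's neighbours sit in one contiguous slice, per-vertex reduces (the vector-sum pattern benchmarked above) become simple linear scans that parallelize well.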
Analysis insights about a Flyball dog competition team's performance (roli9797)
Insights from my analysis of a Flyball dog competition team's performance last year. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Global Situational Awareness of A.I. and where it's headed (vikram sood)
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag... (sameer shah)
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
The Building Blocks of QuestDB, a Time Series Database (javier ramirez)
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps just as another data type. However, when performing real-time analytics, timestamps should be first class citizens and we need rich time semantics to get the most out of our data. We also need to deal with ever growing datasets while keeping performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, or faster batch ingestion.
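The "late and unordered data" problem mentioned above boils down to keeping rows timestamp-ordered even when they arrive out of order. This toy sketch illustrates the concept only; it is nothing like QuestDB's actual storage engine, which works over partitioned columnar files:

```python
import bisect

class TimeSeriesTable:
    """Toy timestamp-ordered store that absorbs late-arriving rows."""
    def __init__(self):
        self.timestamps = []   # kept sorted at all times
        self.values = []

    def append(self, ts: int, value: float) -> None:
        """Insert a row at its correct position even if it arrives late."""
        i = bisect.bisect_right(self.timestamps, ts)
        self.timestamps.insert(i, ts)
        self.values.insert(i, value)

t = TimeSeriesTable()
for ts, v in [(10, 1.0), (30, 3.0), (20, 2.0)]:  # the ts=20 row arrives late
    t.append(ts, v)
print(t.timestamps)  # [10, 20, 30]
```

Keeping data physically ordered by timestamp is what makes time-range scans and time-based joins cheap, which is why timestamps deserve first-class treatment.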
Learn SQL from Basic Queries to Advanced Queries (manishkhaire30)
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
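The filtering-and-aggregation path in the highlights above can be tried immediately with Python's built-in sqlite3 module, so the SQL runs anywhere without a server. The table and data are invented for the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("north", 120.0), ("south", 80.0), ("north", 50.0)])

# Aggregation with a post-aggregation filter: regions whose total exceeds 100.
rows = con.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    HAVING total > 100
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('north', 170.0)]
```

Note the split between WHERE (filters rows before grouping) and HAVING (filters groups after aggregation), a distinction that trips up many beginners.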
6. In order to be effective, identifiers must be:
- Governed
- Trusted
- Transparent
- And contain appropriate metadata*
*This means data that can be linked together through unambiguous identification and exchanged with others
7. Persistent numeric or alpha-numeric designations associated with a single entity. Entities can be an institution, person, or piece of content (People, Places, & Things).
1. Disambiguate, aka enforce uniqueness
2. Enable linking, aka data integration and interoperability
In other words, they provide a simple basis for data governance.
8. ◦ Break down silos
◦ Keep data current and synchronised
◦ Enable staff to interact with data more effectively
◦ Simplify data exchange
◦ Improve overall data quality
[Diagram: institutional identifiers at the hub of a CRM, electronic document storage, usage statistics, an author database, a fulfilment system, a membership system, license validation, and a manuscript submission system]
9. • Resources & personnel required to join existing records to IDs or an authority file
• Build customized solutions mapping systems together
• Improve data capture to require an ID upon record creation
• Manual vs programmatic cost-benefit questions
• Design new reporting and analysis tools to leverage newly linked datasets
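The "join existing records to IDs or an authority file" step can be sketched as a simple lookup against a normalised name list. This is only a sketch; every name and ID below is invented, and real matching against an authority file would typically also involve fuzzy comparison and manual review of near-misses:

```python
# Sketch: joining legacy records to an authority file of institutional IDs.
# All names and IDs are hypothetical.

authority = {
    "university of oxford": 2801,   # normalised name -> institutional ID
    "harvard university": 1812,
}

legacy_records = [
    {"name": "University of Oxford", "subscriber": True},
    {"name": "HARVARD UNIVERSITY", "subscriber": False},
    {"name": "Oxfrod Univ.", "subscriber": True},   # typo: will need review
]

def attach_id(record):
    """Attach an institutional ID where the normalised name matches."""
    key = record["name"].strip().lower()
    record["inst_id"] = authority.get(key)   # None = unmatched, flag for review
    return record

linked = [attach_id(r) for r in legacy_records]
unmatched = [r for r in linked if r["inst_id"] is None]
print(f"linked: {len(linked) - len(unmatched)}, needing review: {len(unmatched)}")
```

This is the "manual vs programmatic" trade-off in miniature: exact normalised matching is cheap to run programmatically, while the unmatched remainder is where the real personnel cost sits.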
10. Researchers – create Current Research Information Systems (CRIS): one portal to figure out how best to conduct research, who to work with, who will fund it, what else has been contributed to the subject thus far, and where the best equipment is to help further the research.
Funders – want to track areas of interest, identify worthwhile pursuits, and see where their money goes.
Institutions – demonstrate research output more accurately and precisely describe the institution's contribution and who is affiliated with that work.
Publishers – facilitate transactions of all types, from content discovery to delivery of author royalties; improved market analysis and targeted advertising.
11. ISO Standard 27729
ISNI is designed to be a "bridge identifier"
Covers any type of entity
[Diagram: two parties, each holding their own Party ID plus proprietary information and/or metadata, linked to each other through a shared ISNI number]
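The bridge-identifier idea in the diagram can be illustrated with a small sketch: each party keeps its own proprietary IDs and metadata, and records are mapped between the two systems only through the shared ISNI. All identifiers below are invented for illustration:

```python
# Sketch of the "bridge identifier" pattern: two parties keep proprietary
# IDs and metadata, and map between them only via a shared ISNI.
# All identifiers below are hypothetical.

party1 = {"P1-0042": {"isni": "0000000121032683", "notes": "internal CRM data"}}
party2 = {"X-7": {"isni": "0000000121032683", "notes": "fulfilment data"}}

def bridge(source, target):
    """Yield (source_id, target_id) pairs whose records share an ISNI."""
    by_isni = {rec["isni"]: tid for tid, rec in target.items()}
    for sid, rec in source.items():
        if rec["isni"] in by_isni:
            yield sid, by_isni[rec["isni"]]

pairs = list(bridge(party1, party2))
print(pairs)   # each pair links one Party 1 ID to one Party 2 ID
```

Neither party has to expose or restructure its proprietary metadata; the ISNI is the only field they need to agree on.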
12. In cooperation with ProQuest, OCLC, and other public and commercial entities, Ringgold has been working to map ISNIs to deeper datasets for the past two years.
It's taken time due to the problems with the raw source data, and the policies for assignment of the unique ISNI identifier.
13. At the same time ISNI records are loaded into the Ringgold Identify Database, we will begin issuing ISNIs for institutions.
ProQuest (Bowker) is a Registration Agency as well, focusing on individuals.
17. It was a desire to "help" authors differentiate and disambiguate themselves that got ISNI started.
Along the way, a lot has been learned. A specific example that often doesn't get much attention is the need for privacy protection whenever an identification process is underway... this holds true for individuals and institutions.
Our industry spends a great deal of time discussing "open data", but there are many times when that data should not (or cannot) be made public (a physicist who writes romance novels, an animal tester, military applications, etc.).
18. The Semantic Web cannot exist without well structured data
Things take on a life of their own
The challenges to creating a world of content tagged with meaning:
Vastness
Vagueness
Uncertainty
Inconsistency
Deceit
Standard Identifiers can help with the middle three – Artificial Intelligence will handle Vastness and Deceit
Let’s take a moment to orient ourselves on the big picture…
Our trees are interesting! Publications, vendors, authors… all the people places and things can be described using standard taxonomies and identifiers.
This aerial view of our forest home provides a bit more perspective – but we’re really headed to a place where we can use standardized descriptions to develop new information
Here, we've virtualized our understanding of the world by using data. From this perspective, not only can we look at things far beyond our immediate sight, but we can view our surroundings in different contexts and with much deeper analysis than by simply looking around. This is where we are headed when looking at the universe of people, the world of places, or all the stuff in it. We used to look at long lists of people we thought might be customers, and those that were already customers... and now we have ways to better understand who is really using our content, who is funding the most highly accessed research, and who the individuals and institutions involved are.
You’ll note that I’ve used the term “Standard Identifiers” as opposed to just “Standards”… I’ll be focusing on using standard identifiers as the main data hooks that will allow us to aggregate information for the purpose of synthesizing knowledge.
Interoperability implies communication; how we communicate something is very different than how we describe things.
I should take a moment to clarify that the Ringgold ID is not a standard (not an ISO-certified standard, in any case), but in many cases our data has become a de facto standard through application. Some of you might be wondering what then constitutes a big "S" Standard: if any system uses a predefined taxonomy as an authority file to validate data (thereby producing identical data entries for each and every instance where they are needed), then a standard has been achieved.
How data is exchanged is quite different from the data itself, and of course standards may be applied to both. For my part, I'm going to talk about the data itself, not how it is exchanged. So, in terms of the data itself, what are we trying to standardize? Descriptions: the wrapper around highly unique content. More importantly, as an industry (as a species, really) we are creating data elements that can be interpreted by machines; I should say, easily interpreted by machines. I'll come back to this topic near the end of my presentation.
INTERNAL – Let’s look at your own ecosystem.
Linking of data: Enable staff to use your data more efficiently, and keep the same view of an institution regardless of what system they are using. See overlaps and outliers when comparing two or more datasets.
Example 1: Compare your fulfillment system's active subscriber list with your document storage system, and see which subscribers have never submitted their license agreement.
Example 2: We’ve got a client that uses 3 systems to take and fulfill institutional subscriptions: CRM, authentication, and an accounting platform. Before linking these systems up with identifiers, there were disconnects that affected their clients: sometimes it was impossible to tell why the auth system was granting journal access to a particular institution – the access seemed unconnected to the payment.
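Once both systems are keyed on the same institutional identifier, a comparison like Example 1 reduces to a set difference. A minimal sketch, with hypothetical IDs standing in for exports from the two systems:

```python
# Sketch of Example 1: compare the fulfilment system's active subscribers
# with the document-storage system's stored licences, both keyed on the
# same institutional identifier. All IDs are hypothetical.

active_subscribers = {1001, 1002, 1003, 1004}   # IDs from fulfilment system
licences_on_file   = {1001, 1003}               # IDs from document storage

missing_licence = active_subscribers - licences_on_file
print(sorted(missing_licence))   # subscribers with no licence agreement on file
```

Without a shared identifier, the same comparison requires name matching across two systems that spell institutions differently, which is exactly the disconnect described in Example 2.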
Loads of benefits: IF STANDARDS ARE INTEROPERABLE
Bridge Identifier – this is an extremely important concept: there are identifiers, and there's data... and while identifiers are data, not all identifiers operate or are maintained in the same way, and this is the important difference between an ISNI and a Ringgold ID.
Mention that we are now board members (Laura).
(ISNI straddles persons and institutions, so this will make a nice segue.)
INTERNATIONAL STANDARD NAME IDENTIFIER, ISO standard.
SCOPE: ISNI is meant to identify all things considered to be public parties, most of which are creators of content or otherwise appear in library & union catalogs (including fictional characters?). Typical records hold name variants, as you can see here. It is not limited to the scholarly or research sector, but covers all manner of popular authors, musicians, and contributors. (The original ISNI dataset was populated with VIAF records and other bibliographic sources, such as the Library of Congress and other international sources.)
Our relationship: RIN is an ISNI registration agency, which means we will be working as a conduit for new record creation within our scope, which is primarily institutions in the scholarly supply chain. It is our plan to hold ISNIs for all institutions in our Identify database, and we are now working to ensure that all ISNI records which map to RINs are correct, and that we can achieve clean one-to-one matches. We are also working with them to create new ISNI IDs for institutions that are in RIN but not yet in ISNI. By mapping our database completely to theirs, we hope to put our clients at the starting line, so that they may maximize their supply-chain linking.
To look at a few specific records: Here’s an ISNI, but an institutional record rather than the personal record we saw earlier. Again, note all the name variants as they appear in library holdings records.
I should mention that this record illustrates one of the biggest problems everyone is confronted with: the Many-to-One (one "Golden Record", as one major publisher refers to their internal authoritative record). Here we have many names for the same institution, all attributed to the one ISNI. This is not unlike what Ringgold does: we have 'alternate names' for each Ringgold ID stored within the Identify database, and by the end of June the ISNI names will also be linked to the Ringgold ID (mostly... there's not a 100% one-to-one match between ISNI and Ringgold; that's another story).
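The many-to-one "golden record" pattern can be sketched as a variant-name lookup that resolves every known alternate name to a single authoritative record, much as ISNI name variants map to one ISNI (or alternate names to one Ringgold ID). The names and IDs below are invented:

```python
# Sketch of the many-to-one "golden record" pattern: several name variants
# resolve to one identifier, which points at a single authoritative record.
# All names and IDs are hypothetical.

alternate_names = {
    "MIT": 4521,
    "Massachusetts Institute of Technology": 4521,
    "Mass. Inst. of Tech.": 4521,
}

golden_records = {4521: {"name": "Massachusetts Institute of Technology"}}

def resolve(name):
    """Resolve any known variant to its single golden record, or None."""
    inst_id = alternate_names.get(name)
    return golden_records.get(inst_id) if inst_id is not None else None

print(resolve("MIT"))   # any variant resolves to the one authoritative record
```

The hard part in practice is not the lookup but building and maintaining the variant table, which is where the gaps between ISNI names and Ringgold IDs come from.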