ROUGH EDITED COPY
ALA-SPECIAL TOPICS
APPLICATION PROFILES
AUGUST 12, 2019
CART CAPTIONING PROVIDED BY:
ALTERNATIVE COMMUNICATION SERVICES, LLC
www.CaptionFamily.com
* * * * *
This is being provided in a rough-draft
format. Communication Access Realtime
Translation (CART) is provided in order to
facilitate communication accessibility and
may not be a totally verbatim record of the
proceedings.
* * * * *
>> Hi, everyone. We have about five minutes
until we get started. If you haven't already
done so, feel free to introduce yourself or your
group in the chat space while we wait to begin.
In between sound checks there will be only
silence, thank you.
>> Hi, everyone, we are going to get started
in just under two minutes. If you haven't
already, use the chat space on the side of the
screen for yourself or your group. We'll be
starting shortly so stay tuned. Thank you.
>> It's noon Eastern Time. We are going to
go ahead and get started. Thanks for being with
us today. We're happy to bring you the final
webinar in the special topics series. Most of
you have attended other sessions in the series
already, for those joining for the first time I'm
going to cover some brief technical -- you can
use the chat area to interact with the
presenter -- already some chat in that area.
You can chat in the space at any time. If you
don't see the chat window, click on the chat
bubble icon in the bottom center of the screen.
You can view the captions in the multimedia
viewer in the lower right-hand side of the
screen.
Click continue as it is a safe site. If you
have technical questions during the event, we ask
that you private chat the host. The private chat is
simply click down the pull down window and --
we'll have a few Q&A sessions throughout today's
webinar. To ask Gordon questions, type into the
chat space and make sure your to box is set to
all participants. We have got 100 people in
attendance today. If your audio breaks up or
drops out during the presentation, reconnect by
hanging up -- communicate in the top of the
screen -- bottom menu row and selecting audio
connection. If you're listening through your
audio broadcast and you notice echo, make sure
you don't have -- simultaneously. If you do,
simply close one and please note the internet
audio quality can be affected by any number of
factors. If you're having trouble, try clicking
disconnect and reconnecting. We are recording
this presentation.
ALA Publishing Solutions offers an array of --
you can view -- on the ALA store. We are
thrilled to have Gordon Dunsire with us for
today's webinar. Gordon is the RDA Technical
Team Liaison Officer. He's a member of the
Cataloguing Section of IFLA and is a --
bibliographic standards including the Library
Reference Model -- and with all that out of the
way I'm going to turn things over to Gordon.
Welcome.
>> Thank you very much, Colton. Welcome to
everyone attending this webinar.
So this webinar is in the special topics
workshops series and the topic for today is
application profiles. Application profiles are
probably a new idea to most of the people
attending this webinar. But as I hope to show in
fact, it's a case of old wine in new bottles
because some of the concepts behind application
profiles should be very familiar to users of RDA.
So here are the stated learning objectives
for today's webinar. There are three.
An understanding of the general purpose,
components and utility of an application profile.
Practical knowledge of the methods for
implementing and using an application profile
with the RDA toolkit.
And an appreciation of issues in the
development and management of RDA application
profiles for institutions and communities and,
indeed, individuals.
I'm going to do the webinar in three
parts. There will be a pause between each of the
parts for questions and obviously questions at
the end when I've finished.
So the first part of the webinar will look at
the current provision of choice in RDA toolkit.
The second part will look at the characteristics
of an application profile. And the final part
will discuss the documentation, management, and
development of application profiles.
So choice in the original or current RDA
toolkit is everywhere, although perhaps not
obvious at first glance. So I'm going to take
you through a number of areas where there is
clear choice in the current toolkit. And the
first of these, and I think the most obvious one,
are the components of access points. The
current toolkit has a generic approach to the
construction of access points. And in
particular, the addition of qualifiers to create
authorized access points.
And part of this approach is that the data
can be recorded either as a separate element, in
this instance we're looking at associated with
corporate body, or the same data might be
recorded as part of an access point and, in fact,
there's a third choice because you can do both.
So this choice over recording or how to
record or where to record specific pieces of data
has always been there in the original toolkit.
The same can be said for recording methods.
The original toolkit has several areas where
we have a choice for recording relationships
between agents, the choice is usually in the form
of an identifier or an authorized access point.
And again, we see the choice of recording both
or, indeed, either of these. So there are three
choices present. We now recognize these as being
different recording methods.
There's also a great deal of choice in the
original RDA with relationships and the choice of
relationships. So the general guidelines on
using relationship designators state that the
level of specificity that is considered
appropriate should be chosen for purposes of the
agent creating the data. And that phrase, purposes of
the agent creating the data, is really what is in
the mind of the agent when they are creating data
for a specific application.
And RDA offers a hierarchy of what used to be
called designators and relationship elements.
And the instructions allow a choice of any
appropriate designator in the hierarchy. And
that choice depends on the application, the
policies of the agency, and the utility, the end
utility, of the data. So as this example shows,
we can choose the relationship designator
screenwriter or the more general designator, author.
There's a second layer of choice in the toolkit with
relationship granularity too, which is that if
the designators listed in the appendices are --
to indicate the nature of the relationship. So
there's quite a lot of choice involved in this
couple of paragraphs.
We also have embedded options. These are not
listed as options but they're optionalities
indicated by the use of or. In this example, we
have a condition, which is that a manifestation
consists of more than one carrier type. And
there are three options given for tackling this
situation. The first is to record the carrier
type and extent of each of the carriers present
in the manifestation. Alternatively, the carrier
type extent and other characteristics can be
recorded for each carrier. Or only the
predominant carrier type needs to be recorded.
And that's just done in general terms. Then
there's actually a latent option underneath these
fairly obvious options, which is to record additional
characteristics of particular carriers if
considered important for identification or
selection.
Although I haven't made a specific slide for
this, that phrase "if considered important for
identification and selection" is ubiquitous
throughout the current toolkit, and the
importance is a subjective judgment that's associated with
the application the data is intended to be used
for.
We also have options mentioned explicitly in
the current toolkit. These started to creep in.
And I'll explain why shortly.
Here we have an example, again, of recording
predominant carrier types in general terms. We are given
a basic instruction to record the predominant
carrier type and the extent of the manifestation
as a whole.
But there's also an optional omission, which
is to omit the number of units if it can't be
readily ascertained or approximated. And there's an
optional addition, which says if the carriers are
in a container, then name the
container and record its dimensions. I don't
know how many options we actually have here, and
are you allowed to make the omission and the
addition at the same time, I guess.
What is worth noting, however, is the
increasing number, as we've gone through the
examples, of policy statements that are being
associated with the options. Here we have LC and
LA, British Library, German-speaking countries and
[Away from mic] Canada. And I think that must be
Norway, going for the optional omission, but a
similar set also having something to say about
the optional addition. And these options were
clearly required by the agencies represented by
those policy statements.
And then finally, the final general category
of options in the current toolkit, are the
alternative instructions. These are clearly
marked as alternatives. And here the example is
a manifestation consisting of moving images and
it's to do with the preferred source of
information for describing the manifestation. So
the basic instruction is to use title frame or
title screen as the preferred source, but there's
a heavily used alternative which says use a label
that is affixed permanently to the resource.
Now, I can't remember the full details of this
alternative when it was introduced into the
toolkit, but I'm guessing that there's a mix here
of practicality and resource consideration. It's
not always practical to look at the title frame
of a moving-image manifestation, and in some
instances resources are simply not available to
do so. And the useful information can be
obtained from labels just by glancing at the
label of a VHS cassette, for example.
So what's been driving all of this choice?
I marked the beginning of this as being the
year 2011, and that was the year when the
Anglo-American agreement started to fall apart because
a non-Anglo-American community joined the RDA
governance structure. And as newbies, the German
National Library remained relatively quiet and
attentive for the first year or so of their
participation in the development of RDA.
But as their confidence grew and as the Joint
Steering Committee and the RDA Steering Committee
started to consider further development of RDA
toolkit, it became apparent that we needed to
start introducing alternatives, as we called them
at the time, in order to meet the expectations
and cataloging practices of this non-Anglo-American
community. The writing was on the wall
and as a direct result of the participation of
the German National Library in the governance
structure, the Committee of Principals took the
eventual decision a few years later to revise the
strategy for RDA toolkit as a business, as a
product.
And that strategy, which remains in operation
but is now being reviewed, identifies three
strategic markets for expanding the use of RDA.
They are the international communities as
directly stimulated by the relationship of the
DNB. The cultural heritage communities, many of
the existing national institutions involved in
RDA development were beginning to have
requirements to extend cataloging instructions to
non-library materials, archives and indeed,
museum collections.
And a third main community for the strategy
is to make RDA more compatible for linked data
communities, and indeed, that process started a
couple of years earlier with the London meeting
arranged by the Committee of Principals and the
-- the result is that 8 years later and up to
date, the 3R project, the toolkit redesign and
restructure project, now results in RDA saying
that nearly every instruction is what we used to
call an alternative in the current toolkit but we
now refer to it as an optional instruction.
So this has been the driver for this decision
that most things in RDA need to be optional.
Obviously, there are other factors going on.
There was the failure, even after 20 years of
investment, by the International Federation of
Library Associations and other international
organizations to get agreement on a top-down
approach to metadata standards and to have a kind
of single set of instructions that everybody
could agree were required for their applications.
So in the new toolkit, the restructuring has
resulted in multiple layers of modularization,
this is a deliberate design choice. There are
many reasons for it, some of which have been
explored in other webinars. But with the
modularization comes choice at multiple levels,
so there are choices of entities. An application
can say we will not make a distinction between
persons, families, and corporate bodies. Instead
we will just use agent as our entity. Some
applications may choose to use RDA entity, the
top level entity, in place of one of the entity
subtypes in RDA; the choice is there. And then
there's the elements, all the elements are
optional. There is one single exception, which
is nomen string, but nomens are very strange
entities. But virtually every other element is
optional in the sense that it is not mandatory.
Perfectly good RDA metadata can be constructed
from any subset of the elements.
Next choice in recording methods, we have
extended the applicability of the four recording
methods which are latent in the current toolkit.
They have been extended to virtually every
element within the new toolkit.
Sometimes they are not applicable, but in
many instances all four methods are perfectly
feasible. And then finally, the bottom layer,
the most granular layer of modularization and
choice are individual instructions, most of which
are marked as options.
This is designed to allow a balance, which is
governed by context. It's the context of the
application. The context can vary from a child's
library up to a nuclear weapons research
Institute. It can range from a small rural
public library to a national library, and clearly
the context is what drives the application and
the choice of which parts of RDA are to be used
for that particular application.
And balancing the context, the context colors
the choice, but the choice is made by balancing
judgment over specification. Catalogers'
judgment has always been a feature of RDA, it's
an absolute feature of RDA. But at the same
time, some catalogers are working on applications
where things are very specific, no judgment is
required, and the catalogers should feel
comfortable in just following a set of, in the
context of an application, what are now options,
they're just a set of instructions of what to do.
So to sum up, and to tackle the obvious
argument about whether all this choice leads to
inconsistency, this is in some sense a value
proposition from the new beta toolkit. The local
values of most metadata elements are going to
vary naturally. They will vary in the source of
information. They will vary for cultural factors
such as language or conventions for naming or
titling things. And they will vary by policy;
what is the audience for the value of this
element?
Should I use a long word or should I use a
simple word, et cetera.
There is no one size fits all.
But the RDA entities and elements are
consistent with the IFLA Library Reference Model.
The IFLA Library Reference Model itself is very
highly structured and it has an
entity-relationship modeling methodology behind
it, which ensures some kind of internal
consistency within the model. And RDA has taken
that consistency and built on it by applying good
practice from the semantic web communities in
particular to provide the RDA entities and
elements with coherent semantics. What I mean by
that is those semantics are there to help
machines process the data the way we want that
data to be processed to fit the particular
application.
The result of this is that despite the level
of choice in RDA, RDA can offer assured levels of
global interoperability of metadata produced
using RDA's entities, elements and instructions,
even though what is recorded may vary wildly and
widely between applications.
The underlying semantics mean whatever we
choose to record as a value for a particular
element can be interoperated at a particular
level of assurance in larger systems where
metadata's brought together from different
sources.
And some of the mechanisms that achieve this
are the element hierarchies which I discussed
already, the fact that every relationship has a
reciprocal, and the new design feature in the
toolkit, which is that we don't relegate designators
to an appendix, we give them equal treatment
because we recognize that all options are equal
in the context of the specific application.
And the end result is that provided we as a
metadata community agree on what is being
described, the things that are being described,
we are free to use local descriptions to describe
those entities and yet remain interoperable at a
global level.
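A minimal sketch of how an element hierarchy can support that kind of
interoperability, assuming a receiving system that only understands
some of the elements. The screenwriter/author pairing comes from the
example earlier in the talk; placing author under creator, and the
function name broaden, are assumptions made for illustration.

```python
from __future__ import annotations

# Hypothetical fragment of an element hierarchy: element -> broader element.
# "screenwriter" under "author" follows the example in the talk; putting
# "author" under "creator" is an assumption made for this sketch.
BROADER = {
    "screenwriter": "author",
    "author": "creator",
}


def broaden(element: str, understood: set[str]) -> str | None:
    """Walk up the hierarchy until reaching an element the receiver understands."""
    current: str | None = element
    while current is not None:
        if current in understood:
            return current
        current = BROADER.get(current)
    return None


# One agency records the specific relationship; another only knows "author".
print(broaden("screenwriter", {"author"}))
```

The locally chosen level of specificity is kept in the source data, and
a larger system can still read it at whatever broader level it supports.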
So that's the end of the first part. I'll
take any questions about what I said already.
>> MODERATOR: Thanks, Gordon. I'm taking a
look at our question list here. I'm not seeing
anything here. We'll give people a few seconds
here to finish typing anything they might be
getting in.
There is a question that came in. Gordon, I
agree that there cannot be complete consistency
but still I think about 80% consistency between
libraries in the Anglo-American and European
world can be achieved. Wasn't it worth having
this?
>> Well, it's still worth having it,
obviously, just not in the past tense. Libraries
in the Anglo-American and the European world
merely have to agree on an application profile.
If they've already agreed on 80% consistency,
then I think they are 80% of the way to
maintaining that consistency using an application
profile. So where consistency exists already,
there's no reason why it can't continue to exist.
RDA isn't saying record things differently. RDA
is saying you might want to record things
differently and we aren't putting barriers in
your way.
>> MODERATOR: Thank you, Gordon. I'm not
seeing any other questions at this time.
>> Okay. So on to the middle part.
What is an application profile?
This is the current new toolkit definition.
It's a specification of the metadata that is used
in an application. That specification may
include the entities, the elements and vocabulary
and encoding schemes that are used, and the
mandatory and repeatable status of elements. In
fact, that scope note only gives a flavor of the
kinds of things that may go into that
specification. So there's a wee bit more than
what's indicated here and I'll show that later in
the presentation. And I mentioned earlier that
the joint steering committee had already started
to interact with linked data communities roughly
12 years ago. And
that meeting was set up following a meeting
between representatives of ALA Publishing and the
Dublin Core Metadata Initiative, and DCMI at that
point was in the final stages of approving what
they called the Singapore framework, which was a
kind of model. I'm not going to show it to you.
It's highly technical. But it was a kind of
model that showed how the different bits and
pieces floating around could actually fit
together into a useful set of interacting tools.
So the idea is not new to RDA, is the point
I'm making here.
But it's been familiar to RDA for the past
decade.
What do you use a profile for?
What utility does it have?
These are the four main ones that immediately
spring to mind. There are other uses.
The most obvious one, the most up front of
these utilities, is for the profile to act as a
front-end to the RDA toolkit. A profile will
tell you what elements to use. It will tell you
a lot more about which choice to make from the
toolkit. You might think, therefore, of it as
being some kind of choice matrix that says use
this choice, use this choice, use this choice.
But the same information can be used to drive
data input, for the application profile is going
to specify, for example, enough information for a
data input form to populate a dropdown list of
controlled terminologies to be picked from. It can
specify which recording method is to be used, so the
data input form can control the input so it
conforms to a particular recording method, for
example.
Indeed, we've explored some of this utility
at the pre-conference workshop at ALA Midwinter
in Seattle earlier this year. And again, in the
pre-conference workshop at Annual in Washington,
where we were able to use RDA in Many Metadata
Formats (RIMMF) as a data input form.
That's the front-end.
At the back end, this is something systems
people get involved in.
You can use an application profile to
validate data coming from a set source of
information. If you have publisher metadata
being ingested, for example, or data being
supplied by a different community, then you can
use an application profile to validate that data.
The application profile may say this element is
mandatory. If the system doesn't find it in the
incoming data, it's flagged as bad. This can be
used for conformance and quality control
purposes. And particularly for linked data
applications, the profile can also be flipped
around as a back end tool and act as a filter for
extracting data from the cloud, as it were. So
if you've quantities of linked data in particular
out there, your application is not going to be
interested in 99% of it. It's only going to
want the bit that it needs and the application
profile can specify what is required by the
application.
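A minimal sketch of those two back-end uses, assuming a profile reduced
to a map from element name to a mandatory flag and a record held as a
plain dictionary. The element names, the sample record, and the
function names validate and extract are hypothetical.

```python
from __future__ import annotations

# Hypothetical profile: element name -> whether the element is mandatory.
PROFILE = {"title proper": True, "content type": True, "extent": False}


def validate(record: dict[str, list[str]]) -> list[str]:
    """Return the mandatory elements that are missing or empty in a record."""
    return [name for name, mandatory in PROFILE.items()
            if mandatory and not record.get(name)]


def extract(record: dict[str, list[str]]) -> dict[str, list[str]]:
    """Act as a filter: keep only the elements the profile asks for."""
    return {name: values for name, values in record.items() if name in PROFILE}


# Incoming data, for example from a publisher feed (made-up values).
incoming = {"title proper": ["Example title"], "price": ["9.99"]}
print(validate(incoming))  # mandatory elements the record lacks
print(extract(incoming))   # the part of the record the application wants
```

The same profile drives both checks, which is the point being made: one
specification, used for conformance on the way in and for filtering on
the way out.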
So how does it do this?
There are two sets of profile
characteristics. The first set are generic.
This is what you will typically find in any
application profile that is based on the Dublin
Core idea of application profile.
So the profile will typically specify what
elements are to be recorded as part of the
metadata description set for an entity. Whether
each of the elements is mandatory, required, and
whether each of them is repeatable or not.
The profile typically also specifies which
vocabulary encoding scheme should be used as a
source of data for the element.
And where there is no suitable vocabulary
encoding scheme, the profile can also specify
which string encoding scheme can be used to
assemble the data for an element using data from
other elements. For example, the assembly of an
authorized access point.
So those were the generic kinds of things
that are specified by a profile.
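A minimal sketch of those generic characteristics as a data structure,
assuming one entry per element. The class name ElementSpec, its field
names, and the sample values are hypothetical and are not taken from
any Dublin Core or RDA specification.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementSpec:
    element: str                 # which element is to be recorded
    mandatory: bool = False      # must a value be supplied?
    repeatable: bool = True      # may more than one value be supplied?
    ves: Optional[str] = None    # vocabulary encoding scheme, if any
    ses: Optional[str] = None    # string encoding scheme, if any


# Hypothetical profile for the description set of one entity.
profile = [
    ElementSpec("content type", mandatory=True, repeatable=False,
                ves="RDA content type"),
    ElementSpec("authorized access point", ses="example punctuation pattern"),
    ElementSpec("note"),
]

for spec in profile:
    print(spec.element, spec.mandatory, spec.repeatable, spec.ves, spec.ses)
```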
An RDA application profile, and that is a
controlled term, we made it up, we defined it, we
know what we mean. An RDA application profile is
a sub-type of profile and it's restricted to RDA
elements.
For the techies in the audience, I will
explain that the original idea of an application
profile from Dublin Core was to allow the
specification of elements from multiple different
namespaces, different agencies, different
metadata traditions so that you could mix them
all up and the specification would tell you how
to do that. RDA has 3,000+ elements and it's
sufficient to be able to just select from the RDA
elements. So an RDA application profile is
restricted to RDA elements only.
And in addition, there are a number of other
parameters that can be specified in the profile
that are unique to RDA, and that's a strong
claim. They're unique, at least in my knowledge,
to RDA. But for all I know, other standards may
very well have similar features.
So for RDA application profiles, we can link
to the RDA toolkit at those multiple levels of
granularity that I showed earlier.
We can specify the recording method that is
to be used for an element if a choice is
available. We can specify which specific
optional instruction is to be applied to an
element. And we can also specify the policy
statement to be applied to an element, where there will,
indeed, be a choice of policy statements as there
is in the current toolkit.
So I'll take you through each of these
parameters in turn, and to give you an
appreciation of what's involved. It's not nearly
as complicated as it may appear at first sight.
So first of all, we need to be able to specify in
an RDA application profile which RDA elements we
are going to use.
And we actually have several different
methods of specifying an RDA element. It can be
referenced by any of its toolkit label, its
registry label, its identifier or its IRI. Each
of these is unique, and therefore, each of them
can be used to reference unambiguously a specific
element. And I give examples here for
content type: the toolkit label, content type;
the registry label for content type; and the
identifier, which is the equivalent of a compact
IRI.
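A minimal sketch of referencing one element in any of those four ways,
assuming a small lookup table. Every value in it is a placeholder: the
prefix ex, the example.org namespace, the identifier ex:P0001 and the
registry label shown are invented for illustration and are not real
RDA Registry data.

```python
from __future__ import annotations

# Placeholder namespace prefix; not a real RDA Registry prefix.
PREFIXES = {"ex": "http://example.org/elements/"}

# Placeholder element entry; labels and identifier are illustrative only.
ELEMENTS = [
    {"toolkit_label": "content type",
     "registry_label": "has content type",
     "identifier": "ex:P0001"},
]


def expand(compact: str) -> str:
    """Expand a compact IRI such as 'ex:P0001' into a full IRI."""
    prefix, local = compact.split(":", 1)
    return PREFIXES[prefix] + local


def resolve(reference: str) -> dict[str, str]:
    """Find an element by toolkit label, registry label, identifier or IRI."""
    for element in ELEMENTS:
        keys = {element["toolkit_label"], element["registry_label"],
                element["identifier"], expand(element["identifier"])}
        if reference in keys:
            return element
    raise KeyError(reference)


print(expand("ex:P0001"))
print(resolve("content type") is resolve("http://example.org/elements/P0001"))
```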
How do we specify mandatory status?
There's a neat little trick involved here.
It's indicated by the minimum number of occurrences
for each element. If I specify that an element
has zero minimum occurrences, it means that it's
optional, because I can simply ignore it.
And that's my option.
If the profile specifies the minimum
occurrence of one or indeed more than one, then
the element is mandatory, a value is expected to
be supplied, to be recorded, for that element,
and the number in the minimum occurrence is the
number that you would expect to supply. It's
very difficult in the bibliographic universe to
think of mandatory numbers being greater than
one, but in an institutional repository, which,
for example, may be dealing with research papers,
with typically 50 or 60 individuals named in the
statement of responsibility, then you may say,
okay, let's make the mandatory number two or
three. You have to record the first two or three
at least.
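A minimal sketch of that trick, assuming the profile stores a minimum
number of occurrences per element. The function names and the sample
names are hypothetical.

```python
from __future__ import annotations


def is_mandatory(min_occurs: int) -> bool:
    """A minimum of zero means optional; one or more means mandatory."""
    return min_occurs >= 1


def meets_minimum(values: list[str], min_occurs: int) -> bool:
    """Check that at least the required number of values was recorded."""
    return len(values) >= min_occurs


# Repository example: the profile asks for at least three names,
# but only two were recorded (made-up values).
names = ["Name A", "Name B"]
print(is_mandatory(0), is_mandatory(3), meets_minimum(names, 3))
```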
And the trick is continued in specifying
repeatability status. This is indicated by the
maximum number of occurrences that an element is
expected to have. If it's not specified, then
you can assume that the element is repeatable.
It has no maximum number of occurrences, and you
can repeat it as often as you wish.
If you specify the maximum number of
occurrences as one, however, you are saying that
you can have one and only one value for this
element, and therefore, it's not repeatable.
So those applications that wish to
restrict the description of a manifestation to
having a single title proper could set the
maximum occurrence of that element to one, making it not
repeatable. In the example here, I set the maximum
occurrence of content type to one, and this has
implications because the
expression I am describing may very well have
more than one content type and I'm only allowed
to use one. So the application profile is going
to have to help me out somewhere else with making
the further choice.
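A minimal sketch of the repeatability side of the trick, assuming an
unspecified maximum is stored as None. The function names are
hypothetical; the two content type terms are used only as sample
values.

```python
from __future__ import annotations

from typing import Optional


def is_repeatable(max_occurs: Optional[int]) -> bool:
    """No maximum, or a maximum above one, means the element is repeatable."""
    return max_occurs is None or max_occurs > 1


def within_maximum(values: list[str], max_occurs: Optional[int]) -> bool:
    """Check that no more values were recorded than the profile allows."""
    return max_occurs is None or len(values) <= max_occurs


# An expression with two content types, under a profile that allows one.
content_types = ["text", "still image"]
print(is_repeatable(None), is_repeatable(1), within_maximum(content_types, 1))
```

When the last check fails, the profile has to say somewhere else which
single value to keep, which is the further choice mentioned above.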
The vocabulary encoding scheme in the current
toolkit has a slightly narrower definition than in the beta
toolkit, and this is the definition from the beta
toolkit. What we did during the 3R project was
expand the idea of a controlled list of values, a
controlled terminology, from restricting it to
the controlled values of attribute elements. And
we've extended it to the controlled values of
instance data. That is what you typically call
authority files. Access points, identifiers and
IRIs of individual persons, individual
manifestations, individual works, individual
places, et cetera.
So vocabulary encoding scheme is a controlled
terminology or authority control system, that's
the best way of thinking about it.
Now, RDA provides controlled vocabularies for
some attribute elements and content type is one
of them.
So there is a vocabulary for RDA content type
and there is an option in the content type
element that says to record a term from
that vocabulary.
We can specify which VES to use in the
application profile. But RDA, as the current RDA
does and it's continued into the beta toolkit,
always provides another option which is to record
a term from another suitable vocabulary encoding
scheme. So there are several that could be used
for content type, but a fairly well known one is
the Library of Congress content type scheme,
which is nearly identical to the vocabulary
encoding scheme of RDA.
And so, instead of specifying the RDA
encoding scheme, you can specify to use the LC
content type scheme instead.
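A minimal sketch of a profile naming one vocabulary encoding scheme for
an element, assuming each scheme is held as a set of terms. The scheme
keys and the function names are hypothetical, and the term lists are
small illustrative subsets rather than the full vocabularies.

```python
from __future__ import annotations

# Small illustrative subsets only; the real schemes are much longer.
SCHEMES = {
    "rda-content": {"text", "still image", "spoken word"},
    "other-content": {"text", "still image", "spoken word"},
}

# The profile names exactly one scheme for the element (hypothetical choice).
PROFILE_VES = {"content type": "rda-content"}


def check_term(element: str, term: str) -> bool:
    """Is the term drawn from the scheme the profile names for this element?"""
    return term in SCHEMES[PROFILE_VES[element]]


def dropdown(element: str) -> list[str]:
    """Terms a data input form could offer for this element."""
    return sorted(SCHEMES[PROFILE_VES[element]])


print(check_term("content type", "text"), check_term("content type", "book"))
print(dropdown("content type"))
```

Switching to another scheme is then a one-line change in the profile,
with no change to the input form or the validation code.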
If you don't find the value you want in the
vocabulary encoding scheme or you are indeed
creating data for an authority control system
where you're actually creating access points for
the instances of entities, then you may wish to
use a string encoding scheme to construct those
headings, those access points.
A string encoding scheme is an RDA take on
something called the syntax encoding scheme in
the Dublin Core application profile. And it's a
set of string values and a set of rules that
describe a mapping between the set of strings
and values of an element. What it does is
specifies the values that should be included in a
compound string. A compound string can be an
authorized access point, but it could be
something like publication statement, which is
built up from subelements.
So a string encoding scheme will specify
the values of
which elements are to be included, and any fixed
text. So when we are creating access points for
compilations, we may wish to add the fixed text
"Works" to the authorized access point for the
title, for example.
So fixed text, boilerplate, can also be
specified. And finally the sequence of strings
and punctuation, how they are distinguished,
delimited, conjoined, is also specified by the
string encoding scheme.
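A minimal sketch of a string encoding scheme along those lines,
assuming it is held as an ordered list of parts, each being either an
element value or fixed text, with the punctuation that precedes it. The
class name Part, the element name, the punctuation pattern and the
sample value are all hypothetical and are not taken from the toolkit or
from ISBD.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    element: str = ""   # element whose value is inserted here
    fixed: str = ""     # or a piece of fixed text (boilerplate)
    before: str = ""    # punctuation placed before this part


# Hypothetical scheme: a name, then the fixed text "Works".
SCHEME = [Part(element="name of agent"), Part(fixed="Works", before=". ")]


def assemble(scheme: list[Part], values: dict[str, str]) -> str:
    """Build a compound string in the order the scheme specifies."""
    out = ""
    for part in scheme:
        text = part.fixed or values.get(part.element, "")
        if text:
            # Punctuation is only added between parts, never at the start.
            out += (part.before if out else "") + text
    return out


print(assemble(SCHEME, {"name of agent": "Example, Author"}))
```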
So there's a big warning here. Currently the
way that string encoding schemes are being
presented in the beta toolkit is very much under
construction and is not how it's going to look in
the end when we finish building it.
What you see here is taken from the beta
toolkit and it appears to be two separate
options, but in fact, they are linked together.
The first option specifies the order of the
elements to be used. And the second option
specifies the punctuation pattern that is to be
used. But we need to do something about this
display. It's an active development topic.
But there may be, I say may be because RSC is
still deciding to what extent we incorporate
string encoding schemes within the toolkit and to
what extent we leave them to external agencies.
And the best way of thinking about this right now
is the international standard bibliographic
description ISBD is one gigantic set of string
encoding schemes with all the punctuation
patterns that it has. So these seem to be taken
care of well enough outside of RDA. So RSC is
looking at this whole area of string encoding
schemes.
For the time being, however, we've
transferred over from the current toolkit the
instructions from the current toolkit to the beta
toolkit and then spread them out using this fake
device.
So don't pay too much attention to what the
beta toolkit is saying about this currently.
But in our application profile, we can
specify what string encoding scheme is to be
used. One way of thinking about this is which
ISBD punctuation pattern should I apply to my
string.
Those were the classic, traditional
parameters for a Dublin Core application profile.
Now I'll move on to the RDA specifics.
One of the features of RDA, the new toolkit,
is that every instruction actually has a link to
it. And these URLs, these hyperlinks, can link
from outside, from an external document into the
toolkit.
And in particular, we have links to the
instructions for an entire element. So if we
specify content type as one of our chosen
elements in the profile, we can provide an active
link that the cataloger can click on and go
straight through to the whole page of
instructions for content type.
That's one level of granularity.
The next thing that we can specify in an RDA
application profile is which recording method to
use. Are we going to use an unstructured or
note-type value?
Are we going to use a structured description
using the vocabulary encoding scheme or string
encoding scheme?
Are we going to select identifiers from an
authority control system?
Or are we going to record an IRI, which may
or may not come from an authority control system?
So the profile can specify which recording
method is to be used within a specific
application. And this can be done in several
different ways. Here, I've just indicated a
label for the recording method IRI, but there are
other ways of doing this. So we can link to that
specific recording method. So this is the
recording an IRI section of a content-type
element. And here is the link. This means in our
application profile we could have the cataloger
click on the link and be taken straight to the
instructions for that specific recording method.
And finally, we can link directly to any of
the instructions, both options and non-optional
instructions, within the beta toolkit.
So here's an extract from the optional
instructions for the situation where there is
more than one content type in an expression. And
for this application, we're instructing our
catalogers to record only the predominant content
type. So the application profile, when they
click on the link, will take them directly to
this option. And there's no choice here. This
is what they are supposed to do. Record only the
predominant content type.
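[Editor's illustration] A minimal sketch of how the RDA-specific
parameters described above might sit together in one row of a
profile. It is an assumption for illustration only: the class and
field names are invented here, and the example.org URLs are
placeholders, not real toolkit links or an RSC format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RecordingMethod(Enum):
    # The four recording methods mentioned in the talk.
    UNSTRUCTURED = "unstructured description"
    STRUCTURED = "structured description"
    IDENTIFIER = "identifier"
    IRI = "IRI"


@dataclass
class ProfileRow:
    element: str                        # e.g. "content type"
    element_link: str                   # link to the element's whole instruction page
    recording_method: RecordingMethod   # which method this application uses
    method_link: Optional[str] = None   # link to that method's instructions
    option_link: Optional[str] = None   # link to the one option catalogers must follow


# All URLs below are placeholders, not real toolkit links.
content_type = ProfileRow(
    element="content type",
    element_link="https://example.org/toolkit/content-type",
    recording_method=RecordingMethod.IRI,
    method_link="https://example.org/toolkit/content-type#recording-an-iri",
    option_link="https://example.org/toolkit/content-type#predominant-only",
)


def cataloger_hint(row: ProfileRow) -> str:
    """Render one row as a short instruction a cataloger could read."""
    hint = f"{row.element}: record as {row.recording_method.value}"
    if row.option_link:
        hint += f"; follow the option at {row.option_link}"
    return hint


print(cataloger_hint(content_type))
```

The same row could just as well be a line in a spreadsheet; the
point is only that each parameter becomes one named field.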
So I'll take another pause for questions at
this stage. And I've outlined the general and
RDA specific parameters that can be specified
within an application profile.
I'm happy to answer any questions about that.
>> MODERATOR: All right. Thank you, Gordon.
We have had plenty of questions come in this
section. If the emphasis is on structural
interoperability -- why not just use BIBFRAME?
>> Well, why not just use BIBFRAME is the
answer. If your application is happy enough
using BIBFRAME, that's fine, but to the best of
my knowledge, BIBFRAME doesn't have any content
instructions associated with it.
So you could quite easily use BIBFRAME,
specify BIBFRAME elements, but then you would
have to say how to populate those elements,
supply instructions. It's perfectly feasible to
use RDA instructions to populate BIBFRAME
elements. And indeed, that is something that
informally the RSC is looking into. But that's
the answer to the question. And that's
indicative of the symptom of the times we live
in; that there is an enormous amount of choice
out there. It's not just a question of RDA and
BIBFRAME, but there are many, many different
schema out there. Linked data schema, some with
and some without content standards attached to
them. And we have to make this interoperable,
and that's one of the great promises of linked
data environments and application profiles.
>> MODERATOR: Thanks so much. The next
question is: up to now I have been thinking of an
application profile as a list of elements with
certain information given for these, but I find it
difficult to imagine how an application profile
could also include the options to use. Wouldn't
this rather be done by policy statements?
>> It can be done by policy statements. It
can be done by taking a PDF of the RDA toolkit
and using Post-It.
The question is what's good for you?
What's the best approach that suits you,
institutionally and individually, and indeed, the
users of the application that you are creating
your metadata for.
>> And a related question. Is OCLC's
Bibliographic Formats and Standards an example
of an application profile, or does an application
profile have to be more prescriptive, "this is how
to do this," rather than descriptive?
>> The point about an application profile is
that it helps to make those decisions on choices
and it's a support function for the people who
have to operate within that particular
environment as catalogers in the main, although
systems librarians are important audiences for
this as well.
Sorry. Repeat the question. I've just
forgotten that one.
>> MODERATOR: What we were just talking
about was OCLC's bibliographic formats and --
>> Yeah. Well, you know, you could take the
whole of RDA toolkit as is and say we don't need
an application profile, I'm an expert in RDA, I
know where to look for stuff, I know how to make
the decisions and the choices and I know how the
application works. And that's perfectly good for
some people. I think at the other end of the
spectrum are people who exist in one downs in --
they administer the library and circulate stock.
And there's a vast range of people in between.
There are all sorts of tools available, all sorts
of standards to help people make those choices.
But an application profile is specifically there
to help people make the choice, to make the
correct choice where choices exist, in order to
provide the application with a level of quality
and consistency that the application itself
requires. Now, I'll stress this: within the
application. There is no such thing as general
agreement about what happens between
applications, but within an application, an
application profile can certainly assist with
quality and consistency. So you could take the
Library of Congress set of tools and you can ask
the cataloger to get on with it and many
catalogers may find that quite comfortable and
others might not. So you may very well wish to
draw up a profile that says how to use those
tools more effectively.
>> MODERATOR: Thank you so much, Gordon.
There is some conversation in the chat about
human readability versus machine readability for
application profiles. Just a few related
comments talking about the minimum occurrence
from the machines that understand that zero and
one can define this. But for human catalogers
wouldn't it be better to say mandatory and
optional? And someone else was discussing --
understandable and have info that software can
use, correct?
>> Yes. I certainly agree with that last
comment. Application profiles are primarily
designed for machine usability. The whole
premise behind this is that there is a vast
interconnected network of incredibly powerful
computing machinery at our disposal and we should
be maximizing our use of that infrastructure.
The application profiles are a way of specifying
in machine readable terms, certainly for those
back end functions I mentioned, we expect machine
interpretation of application profiles to say the
incoming data stream doesn't have a title proper
and therefore, the application, the incoming
software would mark the data as bad. So it would
have to be a mix. The front-end stuff, the data
input form, the element selection list, these are
all, obviously, aimed at humans. It's human
minds that create the metadata. If we had
machines to catalog stuff, we'd be out of a job
and not having this webinar. But people have
been looking for that solution for decades and it
doesn't appear to be happening. So what we need
here is an interaction between what humans are
good at and what machines are good at. And the
application profile, the choice in RDA, is this
balance as I showed in an earlier slide. The
content is how that balance is achieved. It also
includes its funders. Practical questions have
to be taken into account. So RDA has a specific
goal to provide maximum flexibility and to
provide support for as wide a range of
applications as possible, the kinds of
applications that RSC has been hearing about for
years. It has to be this mix of human
readability and machine interpretability, and
there's no reason at all why a column at zero
cannot be displayed as repeatable or mandatory.
So yeah, I agree, it's a mix.
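[Editor's illustration] A minimal sketch of the mix described above:
the same minimum and maximum occurrence numbers drive a back-end
check on incoming data and a human-readable label for catalogers.
The profile contents and function names are assumptions made up for
this example, not part of RDA or any toolkit.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical profile: element -> (minimum occurrence, maximum occurrence).
# A maximum of None means "no upper limit".
PROFILE: Dict[str, Tuple[int, Optional[int]]] = {
    "title proper": (1, 1),
    "content type": (1, None),
    "note on manifestation": (0, None),
}


def human_label(minimum: int, maximum: Optional[int]) -> str:
    """Turn the machine-readable numbers into words for a cataloger."""
    obligation = "mandatory" if minimum >= 1 else "optional"
    repeat = "not repeatable" if maximum == 1 else "repeatable"
    return f"{obligation}, {repeat}"


def problems(record: Dict[str, List[str]]) -> List[str]:
    """Back-end check: list every way a record breaks the profile."""
    found = []
    for element, (minimum, maximum) in PROFILE.items():
        count = len(record.get(element, []))
        if count < minimum:
            found.append(f"missing {element}")
        if maximum is not None and count > maximum:
            found.append(f"too many values for {element}")
    return found


incoming = {"content type": ["text"]}  # an incoming record with no title proper
print(human_label(*PROFILE["title proper"]))
print(problems(incoming))
```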
>> MODERATOR: Thank you so much, Gordon.
We'll take one more question now. Are there
certain mandatory characteristics for a body of
cataloging instructions as we call them, to be
considered an application profile?
Trying to find old school examples.
>> Well, that's an interesting question.
I don't have a specific slide for this. This
has come up in this last section. But if you
read the new toolkit very carefully, you'll see
that you can actually create a minimal
description of a manifestation or any kind of
entity with one appellation element, that's the
minimum requirement.
Obviously, that tells you what the name of
something is, but it doesn't tell you much else.
Sometimes the name may be descriptive and that
may be enough, who knows?
Eventually you might want to add slightly
more data to the description of an entity.
That's where we get the concept of effective
description, but that's entirely based on an
application. We all think we know the common set
of elements. We all think we know what Core is,
as the current toolkit calls it. But we don't
really. If you take 10 catalogers from different
backgrounds and contexts into a room and try to
get them to agree on anything, you know the
answer. Where we think we agree on certain
things, we don't necessarily do that.
But generally speaking, and certainly what
we've tested in the pre-conferences, has been a
common sense approach: describing a manifestation,
you have to say carrier type, maybe extent,
maybe a simple note on manifestation, appellation
of the manifestation, maybe the title proper or
ISBN or both count as appellations. And that may
be effective. In some applications, that may be
sufficient. In other applications, you may have
to record maybe 50 or 60 different things about a
manifestation to be totally comprehensive. But
what RDA is attempting to say here is it doesn't
matter what that choice is. Your minimal data
will interact and interoperate with a national
standard metadata description set at a particular
level of interoperability. We are not saying
we'll always interoperate 100%. That's insane,
but they will interoperate at certain levels by
systems. And this is very important when we're
processing very large quantities of metadata from
multiple sources.
So I'll go on to this last section and then
we can return to some of these things at the end.
I talked about the existence of more choice
than meets the eye in the current toolkit and the
way that choice is now being made very obvious in
the beta toolkit, and what the characteristics are
of a profile that can be used to help make sense
of that choice to suit the specific set of local
circumstances, the application.
And then in this last section I'll talk about
some of the issues which need to be considered in
the development and management of profiles.
So the first thing is that an application
profile can be documented in various formats.
There is no particular format specified.
The original Dublin Core application profile
used a tabular format and, in fact, that seems
to be the generally accepted approach, but you
can write an application profile as a narrative.
I can easily imagine a 10-page pamphlet being
produced for copy catalogers working in the legal
deposit section of a national library where the
pamphlet just takes the form of a narrative,
saying stuff comes in on legal deposit and we try
to record its title proper and its publisher and
at each point there's a little hyperlink that
takes you to the relevant parts of the toolkit.
And we can go from that very unstructured
approach, which may or may not suit many
operators, and we can go from that highly human
oriented specification of a profile all the way
up to the complete machine readable
specification. There's a lot of activity going
on with application profiles right now. Some of
you may know that the PCC has a task force
looking at it, the Dublin Core Metadata
Initiative has another task force looking at
application profiles, and there are at least
another dozen metadata communities looking at
ontologies, looking at application profiles.
In RDA terms, an application profile is just
another work, but as we've seen, there are
special characteristics which are not available
as RDA elements.
So we have this huge range of methods for
documenting profiles. And it's very early days,
yeah, there's a lot of activity, some consensus
is beginning to emerge.
But we do have some things that we can use
right now in the toolkit. So we have toolkit
notes. This is something that's already in the
current toolkit and is carried over into the beta
toolkit. This can be used for very simple, I
mean very simple, application profiles, perhaps
created by somebody who doesn't catalog very
often, is working in a very, very specialized
collection and just wants to tag certain parts of
the toolkit and then run through that set of
notes as their aide-memoire as to which bits of the
toolkit they require. It's unstructured.
There is no structure present in one of these
notes, but you can specify a lot. So here I said
content type is mandatory in Gordon's
application, it's not repeatable and you should
use the RDA vocabulary encoding scheme.
In addition to that, there is also the user
generated documentation and workflows, and that
is partially still under construction. A number
of bugs are being fixed for the next release of
the toolkit. I know that the British Library,
for example, who have used these workflows
extensively before are looking to use them again
with the new toolkit.
Another way of specifying an application
profile is to use the good ol' traditional policy
statement. Again, it can be unstructured or
structured. There is a very wide variety of
approaches, even in the current toolkit, to
policy statements.
There are varying approaches to granularity.
Some agencies favor very large, extensive policy
statements that exist at broad levels. Other
agencies prefer more specific policy
statements, which can be attached to individual
options. And all of these have aspects of an
application profile attached to them.
Policy statement seems to be a reasonably
good way of specifying an application profile,
which is intended to be used at national
institution or international level and we expect
that to continue, I think, for the foreseeable
future. But there will be, and it's already
beginning to occur, an interaction between policy
statements, workflows and notes and this new
concept of application profile. So we can expect
some changes to occur as agencies start seeing
the benefits of using different approaches to
specify these things. Policy statements, as most
of you know, are definitely under construction.
This is one reason why there is a beta toolkit;
it's to allow the policy statement agencies to
start looking at the instructions and to plan
accordingly for incorporating them in the new
toolkit.
So I mentioned earlier that the communities
generally who are looking at application profiles
favor a kind of tabular approach. In this
approach, you have a table, it can be a
spreadsheet, database, a Word table. Each row is
used to specify a specific element. Generally
speaking, we expect applications to group
elements by entity, but that depends very much on
the implementation scenario that they are using.
If you're using a traditional bibliographic
authority approach, then your bibliographic
metadata description set is going to cover work,
expression, and manifestation, generally speaking.
And you may have an AP for the bibliographic
metadata description set as a whole, which mixes
work, expression, and manifestation elements. In
other applications, you would keep them separate.
Each column of the table then specifies a
specific profile parameter.
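[Editor's illustration] A minimal sketch of the tabular approach
just described: one row per element, one column per profile
parameter, with elements grouped by entity. The column names and
the three rows are made up for illustration and are not a
recommended profile.

```python
import csv
import io
from collections import defaultdict

# A made-up profile table: one row per element, one column per parameter.
# An empty "max" cell stands for "no upper limit" in this sketch.
TABLE = """entity,element,min,max,recording_method
manifestation,title proper,1,1,unstructured description
manifestation,carrier type,1,,IRI
expression,content type,1,1,IRI
"""

rows = list(csv.DictReader(io.StringIO(TABLE)))

# Group the elements by entity, as many applications are expected to do.
by_entity = defaultdict(list)
for row in rows:
    by_entity[row["entity"]].append(row["element"])

for entity, elements in by_entity.items():
    print(entity, "->", ", ".join(elements))
```

A spreadsheet saved as CSV could be read the same way, which is one
reason the tabular form sits comfortably between human and machine
use.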
The jury is out on how specific a profile
needs to be. It doesn't have to have all of the
parameters that I have described in this webinar
specified. We can imagine partial profiles,
generic profiles, and highly specific profiles.
But these things are likely to be encoded at some
point in a machine-readable format.
But we shall see because that's very much
under development.
This is a random list of the categories of
application that might require specific profiles.
So the first and obvious category is the type
of resource. So we're aware of what are
basically application profiles for descriptive
cataloging of rare materials, and for
audiovisual materials, for manuscripts, et cetera.
And these all specify which elements to use,
which values to use in those elements for those
types of material.
As I said already, we may anticipate separate
different application profiles, depending on
sector.
The public library service, generally, is not
likely to require the depth of granularity that's
offered by RDA and may choose to take a broader
approach. On the other hand, the academic
library sector may want that full width and depth
of granularity. And finally we can see
application categories, profiles, being based on
geographic communities, cultural communities,
language-based communities like the German
speaking countries, et cetera et cetera.
So there's many, many different ways of
slicing and dicing the specification of an
application.
So apologies to some of you who have seen
this for the fourth time. This is a favorite
slide that I like to recycle. And normally it's
animated but we can't do animations on WebEx.
So I'll try to describe it to you for those
of you who have not seen it before. In the
background is an extract from the guidance
chapter on the coherent description of an
information resource. And that basically
specifies which primary elements could be used to
link a work, expression, and manifestation
together for a single resource.
It also gives the cardinality of the primary
elements. And that's a fancy way of saying are
they mandatory and are they repeatable. So
there's an element of application profiles, even
within this very, very short piece of guidance.
And layered on top of that is guidance on the
minimum description of a resource entity, which
basically says the resource entity should have
an appellation element so humans can read, can
identify which entity is being described.
And then the effective description guidance
says there's this whole range of general and
specialized elements for each entity which can be
used to produce effective descriptions, that is
descriptions that suit a particular application.
So we can imagine that one way of presenting
application profiles, of designing and
administering them, is as a kind of inherited or
nested set, built up in layers. Every metadata
description set for an entity should be a
coherent description. That's pretty well
mandatory. It needs to conform to the minimum
description, but that's a different set of
elements. And then for an application, it needs
to conform to an effective description.
So this strongly suggests that there may not
be a single application profile for a particular
application, if I can put it that way. That we
may be seeing the emergence of very small
profiles that specify only three or four
elements, but which are then inherited or
combined with profiles that specify another set
of elements. So we build up a kind of coherent
profile of standard components. But these are
very much ideas at the moment. As I said, we
have been exploring some of these recently with
real life catalogers. This is a screenshot from
RIMMF, RDA in Many Metadata Formats, version four,
which is compatible with the beta toolkit.
And this is showing you part of the set up we
have for the pre-conference in Washington. What
I'm showing you here is a set of templates. We
are going to describe an expression and we can
apply one or more templates to that and we were
exploring this idea of nested profiles. So I
can't remember what 01 is, but the idea is you
could go through, you had to do 01 and 02, and
then you would select one of 03 to 07. And it
allows you to fill in one application profile and
then to add a second profile into it, which
preserves what you filled in already, but then
adds additional elements. So you can gradually
build up a kind of data input form, but you can
also build up an application profile that's
nested or inherited. So this is really proof of
concept stuff. And I think we've successfully
proved that the concept is workable and RIMMF is
looking at making this a bit more sophisticated.
That template then is used and turned into a data
entry form. Here's the data entry form in
operation. This is very embryonic. It contains
only two or three parameters from an application
profile. But one of those is that it's specifying
which vocabulary encoding scheme to use. So
RIMMF knows to populate the content type data
entry field with this specific pop-down pick
list.
And we hope that in the future, if it's an LC
content types list, then it would go and pull
that off the Library of Congress linked data
server and present that within this data entry
form. So this is showing you the front-end, the
two utilities of the front-end of a selection
device and data entry form.
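[Editor's illustration] A minimal sketch of the two front-end ideas
described above: applying a second template keeps what was already
filled in, and a vocabulary encoding scheme supplies the pick list
for a field. This is not how RIMMF is implemented; the function
names and the three-term pick list are assumptions for illustration.

```python
from typing import Dict, List, Optional

Form = Dict[str, Optional[str]]

# A small sample of terms standing in for a vocabulary encoding scheme.
PICK_LISTS: Dict[str, List[str]] = {
    "content type": ["text", "still image", "spoken word"],
}


def apply_template(form: Form, template: List[str]) -> Form:
    """Add a template's elements to the form, keeping values already filled in."""
    for element in template:
        form.setdefault(element, None)
    return form


def fill(form: Form, element: str, value: str) -> None:
    """Enter a value, rejecting terms outside the element's pick list."""
    allowed = PICK_LISTS.get(element)
    if allowed is not None and value not in allowed:
        raise ValueError(f"{value!r} is not in the pick list for {element}")
    form[element] = value


form: Form = {}
apply_template(form, ["content type"])                            # first template
fill(form, "content type", "text")
apply_template(form, ["content type", "language of expression"])  # second template
print(form)
```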
So we are testing this. This is the last
slide, by the way, so have those questions ready.
So the final thing I'll say, because this is very
much a work in progress, is that the RSC is in the
process of setting up a new working group to look
at this.
This is part of the new governance structure.
This will be task and finish. It will be given
certain tasks and it will be given a finish date.
And after that finish date, the group will indeed
be finished.
We're hoping that the group may be able to
start work this fall, but we are in the process
of drawing up terms of reference for the
group and the initial set of tasks.
So the things we've identified so far that
the group needs to look into are architectures
for profiles, is this idea of nested and combined
profiles a reasonable one and what are the issues
in putting it into practice.
What are the issues involved in profile
management?
What additional guidance and documentation do
we need to put into the beta toolkit?
And the group will also work in liaison with
the RSC technical working group to look at the
technical things that are going on with encoding
formats. These are ontologies for specifying
application profiles in machine readable formats.
So those of you who are interested, keep your
eyes and ears open for the announcement of this
group and you'll be able to track its progress
and as it delivers recommendations, I hope RSC
will be putting them into practice in the new
toolkit.
So that's the end of the webinar. Fire away
with your questions.
>> MODERATOR: Thanks so much, Gordon. We
have got a little bit of time left. Feel free to
ask more questions in the chat space, everyone.
Is the RDA prefix in the element identifier
intended to be a name space prefix in XML?
>> Yes. If you look at the RDA vocabularies
released in GitHub, you will see that that
compact IRI -- at the end of the NT and N3
formulations.
>> MODERATOR: Great. Will the links
starting with beta be permanent?
>> Well, obviously not, no. What is likely
to happen is that the base at the beginning of
the link will change, but the rest of the link,
the 27 digit random number, will not change. And
it should be relatively simple to make that
switch over.
When the beta toolkit goes live, I'm guessing
the base domain will be something like
rdatoolkit.org or something like that. But it will
be followed by exactly the same string of machine
readable digits as the beta.
>> The beta will change with access. This
is Jamie. It will change to access.toolkit.org
but still be a --
>> There will be a redirect for those
application profiles that haven't been updated
but as I say, if you create a machine readable
profile using a spreadsheet or Word or something
like that, then it's a simple global replace
operation that will take five seconds to update
when we're ready.
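[Editor's illustration] A minimal sketch of the global replace
described above, assuming the profile's links are held as plain
strings. Both base addresses and the trailing identifiers are
placeholders invented for this example, not real toolkit addresses.

```python
from typing import List

# Placeholder bases; the real ones would be whatever the toolkit uses.
OLD_BASE = "https://beta.example.org/"
NEW_BASE = "https://live.example.org/"


def switch_base(link: str) -> str:
    """Swap only the base; the identifier after it is left untouched."""
    if link.startswith(OLD_BASE):
        return NEW_BASE + link[len(OLD_BASE):]
    return link


profile_links: List[str] = [
    "https://beta.example.org/0123456789",   # made-up identifier
    "https://other.example.org/unrelated",   # not a toolkit link, left alone
]

updated = [switch_base(link) for link in profile_links]
print(updated)
```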
>> MODERATOR: Thank you so much.
Is there a beta application profile created
by anyone yet to show what it could or should be?
>> Kind of. I've mentioned the couple of
very, very embryonic attempts at pre-conferences.
EURIG has created a draft application profile
but they're busy updating it along the
structural lines I have been describing in the
webinar; they created it last year before some of
the structural issues had been decided. So EURIG
is looking at a basic application profile.
So we hope to be able to show people a kind of
working application profile certainly before the
end of this year.
But there may be more and there really,
really is nothing to prevent any of the folks
listening in on this webinar going out there and
playing around for themselves. Can you write a
simple application profile?
It's going to be a lot easier than you
actually think it is. And we're developing some
tools that we hope will help people to be able to
do this. But they're all still in development.
I really urge people to go off and play with
this. Most of the data you need is available
quite openly and downloadable from the RDA
registry. You can go to town on it. I've done
it myself. It's a lot of fun.
>> MODERATOR: Thanks Gordon. You've
mentioned this interoperability and stressed
optional aspects of most elements. Aren't they
mutually exclusive?
>> We can ensure the interoperability at
this -- it was a term coined at a liaison meeting
between ISSN and ISBD and JSC in Glasgow at least
10 years ago. We were discussing this. Was it
the case that the ISSN manual, the ISBD and RDA
would have to specify exactly the same set of
instructions for describing serials? And during
the discussion, it was pointed out that if all
three organizations specified exactly the same
instructions, then at least two sets of those
instructions were, by definition, redundant, and
why would we duplicate each other's efforts?
So we evolved this idea of functional
interoperability. And that means there are 1,000
ways of naming me, some are pleasant and some
unpleasant, but people use these words to refer
to me. I don't have any control over this and I
don't have any control over how I'm listed in
official documentation, in bibliographies, social
security indexes. But it doesn't matter because
I know who I am. My identity remains the same,
but the strings which are used to name me and to
describe me vary depending on the point of view
or the application or the source of information
that is being used to record the metadata
statement.
So functional interoperability says it
doesn't matter what the strings are so long as
you have identified the identity and we know we
are talking about the same thing, then we can
have these multiple different descriptions of the
same thing. Now, this is a dream of the linked
data movement, but it's a feasible one. But it's
only feasible if we accept that there are many,
many different strings that can be used to name
the same thing.
And that's all I can say.
We have tried to get universal agreement that
we should all use the same strings to describe
the same things. We should all use an ISNI
number, social security number; there are
enormous problems in reaching that happy state of
affairs. We have a job to do. We have got to
connect people to information. That information
is dirty. It is ambiguous, it is all over the
place, and it comes now in multiple flavors.
That's the challenge that's facing us. And RSC
and RDA thinks this is probably the only way
forward is to try to encompass the choice, try
and guarantee that within that framework every
choice you make will work with someone else's
choice to a certain degree. I think that's
probably the best we can do.
And I finished exactly on the 90 minutes.
>> MODERATOR: Yeah, wow!
Look at that. We have reached the end of our
time and the end of our questions. Thank you so
much, Gordon, and thanks to all of you for all
those fantastic questions and discussion. We are
so happy to have you here.
We really appreciate it. We are so happy to
have you all here. Gordon, any final thoughts
before we sign off?
>> No. This is a difficult topic and I know
it's kind of not what we've all been taught, but
hey, stuff changes, life goes on, and we're all
still in jobs, just. So I hope everybody has got
something positive out of this and my main
message to everybody is don't panic. This is far
easier than it appears to be and you already know
most of the answers.
>> MODERATOR: All right. Thank you very much, and we
are signing off now. I hope you all have a fantastic
week. Take care. Bye-bye.