2. A crosswalk shows where to put the data from one
scheme in a different scheme. Crosswalks are often used by
libraries, archives, museums, and other cultural institutions
to translate data to or from MARC, Dublin Core, TEI, and
other metadata schemes.
3. Crosswalks can apply to content standards, vocabularies,
or both. An automated crosswalk process may take
an instance of a metadata description that is presented in a
particular format and change the format and element
names and the values within those elements (i.e., the
vocabulary) to meet the requirements of the second
standard.
4. Crosswalking is generally done when datasets using
different metadata standards or vocabularies need to be
integrated. For example, consider a website providing a
searchable metadata directory. If the different datasets
composing the directory were described using different
standards and vocabularies, it would be difficult for a user
to search across them effectively.
5. Someone interested in wave height data might
need to know to search for “wave ht (m)” in one dataset
and “wave amplitude” in another. A crosswalk that defines
these two elements as synonymous lets a website
accept a search on either term and retrieve applicable
results from both datasets.
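The wave height case can be sketched in a few lines of Python. This is only an illustration: the dataset names, the extra field labels, and the shared concept name "wave_height" are assumptions made up for the example.

```python
# Hypothetical crosswalk: each local field label maps to one shared concept.
CROSSWALK = {
    "wave ht (m)": "wave_height",
    "wave amplitude": "wave_height",
}

# Two illustrative datasets that label the same measurement differently.
DATASETS = {
    "buoy_survey": ["wave ht (m)", "wind speed"],
    "coastal_model": ["wave amplitude", "salinity"],
}


def search(term):
    """Return the names of datasets that hold a field equivalent to `term`."""
    # Terms missing from the crosswalk are treated as their own concept.
    concept = CROSSWALK.get(term, term)
    return sorted(
        name
        for name, fields in DATASETS.items()
        if any(CROSSWALK.get(field, field) == concept for field in fields)
    )


print(search("wave ht (m)"))
print(search("wave amplitude"))
```

Both searches return the same two datasets, because the crosswalk resolves either label to the same concept before matching.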
6. Because metadata content standards are complex, there
are few automated processes to crosswalk between
them. Even where automated crosswalks exist, some
information is inevitably lost, both because of that
complexity and because the standards may cover
non-overlapping subject areas. Where subject areas do not
overlap, even manual translation between standards does
not result in complete information transfer.
7. For example, say an archive has a MARC record in its
catalog describing a manuscript. If the archive makes a
digital copy of that manuscript and wants to display it on the
web along with the information from the catalog, it will have
to translate the data from the MARC catalog record into a
different format such as MODS that is viewable in a
webpage.
8. Because MARC has different fields than MODS, decisions
must be made about where to put the data into MODS. This
type of "translating" from one format to another is often
called "field mapping" and is related to "data
mapping" and "semantic mapping."
9. Crosswalks also have several technical capabilities. They
help databases using different metadata schemes to share
information. They help metadata harvesters create union
catalogs. They enable search engines to search multiple
databases simultaneously with a single query.
10. Crosswalk tables are often employed within or in parallel
to enterprise systems, especially when multiple systems
are interfaced or when the system includes legacy
system data. In the context of interfaces, they function as a
sort of internal ETL (extract, transform, load) mechanism.
11. For example, this is a metadata crosswalk from MARC to Dublin Core:
MARC field Dublin Core element
260$c (Date of publication, distribution, etc.) → Date.Created
522 (Geographic Coverage Note) → Coverage.Spatial
300$a (Physical Description) → Format.Extent
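The three rows of this crosswalk can be expressed as a lookup table and applied to a record. The sketch below assumes a record is a plain dictionary keyed by MARC field; the sample values and the function name `crosswalk_record` are invented for illustration.

```python
# The three rows of the MARC to Dublin Core crosswalk shown above.
MARC_TO_DC = {
    "260$c": "Date.Created",
    "522": "Coverage.Spatial",
    "300$a": "Format.Extent",
}


def crosswalk_record(marc_record):
    """Rename mapped MARC fields and collect unmapped fields separately."""
    converted, unmapped = {}, {}
    for field, value in marc_record.items():
        target = MARC_TO_DC.get(field)
        if target is None:
            # No row in the crosswalk: keep the field visible instead of
            # silently dropping it.
            unmapped[field] = value
        else:
            converted[target] = value
    return converted, unmapped


# Invented sample record; a real MARC record is far richer than a dict.
record = {"260$c": "1923", "300$a": "12 leaves", "500": "Bound in vellum."}
converted, unmapped = crosswalk_record(record)
print(converted)
print(unmapped)
```

Returning the unmapped fields makes it easy to see what a crosswalk leaves behind.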
12. One of the biggest challenges for crosswalks is that no two metadata schemes are 100% equivalent.
One scheme may have a field that doesn't exist in another scheme, or it may have a field that is split
into two different fields in another scheme; this is why you often lose data when mapping from a
complex scheme to a simpler one. For example, when mapping from MARC to Simple Dublin Core, you
lose the distinction between types of titles:
MARC field Dublin Core element
210 Abbreviated Title → Title
222 Key Title → Title
240 Uniform Title → Title
242 Translated Title → Title
245 Title Statement → Title
246 Variant Title → Title
13. Simple Dublin Core has only a single "Title" element, so all of the different types
of MARC titles get lumped together without any further distinctions. This is called
"many-to-one" mapping. This is also why, once you've translated these titles into
Simple Dublin Core, you can't translate them back into MARC. Once they're in Simple
Dublin Core, you've lost the MARC information about what types of titles they are, so
when you map from Simple Dublin Core back to MARC, all the data in the "Title"
element maps to the basic MARC 245 Title Statement field.
Dublin Core element MARC field
Title → 245 Title Statement
Title → 245 Title Statement
Title → 245 Title Statement
Title → 245 Title Statement
Title → 245 Title Statement
Title → 245 Title Statement
14. This is why crosswalks are said to be "lateral" (one-way)
mappings from one scheme to another. Separate
crosswalks would be required to map from scheme A to
scheme B and from scheme B to scheme A.
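The loss in the title round trip can be demonstrated with two small functions. This is a minimal sketch that assumes a record is a list of (tag, value) pairs; the function names and sample titles are invented for illustration.

```python
# MARC title fields that all map to the single Simple Dublin Core "Title".
TITLE_TAGS = {"210", "222", "240", "242", "245", "246"}


def marc_to_simple_dc(fields):
    """Many-to-one: every MARC title field becomes a plain Title element."""
    return [("Title", value) for tag, value in fields if tag in TITLE_TAGS]


def simple_dc_to_marc(elements):
    """Reverse crosswalk: every Title can only land in MARC 245."""
    return [("245", value) for name, value in elements if name == "Title"]


# Invented sample titles.
marc = [
    ("245", "Field notes on coastal birds"),
    ("246", "Coastal birds"),
    ("210", "Field notes coast. birds"),
]

round_trip = simple_dc_to_marc(marc_to_simple_dc(marc))
print(round_trip)
```

The values survive the round trip, but every tag comes back as 245, so the two directions need separate crosswalks and the original record cannot be recovered.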
15. The Crosswalk Process
The process of mapping between content standards or
vocabularies is usually divided into the following steps:
16. 1. Harmonization of Metadata Standards
Metadata standards are often described in terms
of element names and definitions. A standard defines the rules
for how the metadata are structured and also the appropriate
content for the various elements.
However, different standards can be stated in different ways.
In other words, one standard (the source standard)
doesn’t have to use the same element labels (names) for
similar content, or allow the same terms in each
element, as another standard (the target standard).
17. In the harmonization process, the source and target metadata standards
are expressed in the same syntax or model. In the simplest case, this is
done by creating a table of fields from each standard in a common
application (e.g., a spreadsheet). Each table row would likely pair
elements from the source standard with the related
elements of the target standard. At its simplest, there would be
one-to-one relationships between source elements and target elements.
In more complex harmonization cases, there are one-to-many or
many-to-one relationships. Also, relationships among the elements within a
single standard must be thoroughly described as part of the harmonization
process. Of course, this requires that the elements be thoroughly described
in the source and target standards.
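A harmonization table can also be held as a list of (source element, target element) rows and checked for the kind of relationship each row represents. The sketch below is one possible approach, not a standard tool; the `classify` function is hypothetical, and the "address" rows are invented to show a one-to-many case.

```python
from collections import Counter

# Each row pairs a source element with a related target element.
ROWS = [
    ("245", "Title"),
    ("246", "Title"),
    ("260$c", "Date.Created"),
    ("address", "street_address"),
    ("address", "city"),
]


def classify(rows):
    """Label each row by how often its source and target elements recur."""
    source_counts = Counter(source for source, _ in rows)
    target_counts = Counter(target for _, target in rows)
    kinds = {}
    for source, target in rows:
        if source_counts[source] > 1 and target_counts[target] > 1:
            kinds[(source, target)] = "many-to-many"
        elif source_counts[source] > 1:
            kinds[(source, target)] = "one-to-many"
        elif target_counts[target] > 1:
            kinds[(source, target)] = "many-to-one"
        else:
            kinds[(source, target)] = "one-to-one"
    return kinds


kinds = classify(ROWS)
for row, kind in kinds.items():
    print(row, kind)
```

Rows that are not one-to-one are the ones that will need explicit rules later in the process.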
18. 2. Semantic Mappings
Semantic mapping, as applied to metadata, is a visual or
tabular strategy for establishing the relationships between
vocabulary terms in different data sets.
Basic Relationships
When creating mappings among vocabulary terms, the mapping
organization requires a good set of basic relationships. The most
common relationship, “is the same as,” is usually too narrow to
adequately map all terms.
19. 3. Rules for Complex Metadata Mappings
Defining rules is an essential step in most
semantic mappings between standards because of the
complex relationships that often exist.
Rules are needed whenever the mapping from a
source element to a target element is more complex than
one-to-one.
20. As an example, consider the case of a source standard having a
single element for the address. The target standard may
represent the address using multiple elements, such as street
address, city, state, zip code, and country. An automated rule
could be established to identify certain province or state names,
essentially parsing the single element address into its
components. Alternatively, a manual rule may also be created,
one that specifies that manual intervention is the only method to
properly separate the address components.
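The address rule can be sketched as a function that attempts an automated split and falls back to a manual-review flag. The comma-separated layout, the short list of state codes, and the field names are all assumptions made for this example; real addresses vary far more.

```python
# Illustrative subset of state codes the automated rule recognizes.
KNOWN_STATES = {"CA", "NY", "TX"}


def split_address(address):
    """Split 'street, city, STATE ZIP, country' or flag for manual review."""
    parts = [part.strip() for part in address.split(",")]
    if len(parts) == 4:
        street, city, state_zip, country = parts
        tokens = state_zip.split()
        # Automated rule: a recognized state code followed by a zip code.
        if len(tokens) == 2 and tokens[0] in KNOWN_STATES:
            return {
                "street_address": street,
                "city": city,
                "state": tokens[0],
                "zip_code": tokens[1],
                "country": country,
                "needs_manual_review": False,
            }
    # Manual rule: anything the pattern cannot handle goes to a person.
    return {"original": address, "needs_manual_review": True}


print(split_address("12 Main St, Springfield, NY 12345, USA"))
print(split_address("the old mill by the river"))
```

Keeping the unparsed original in the fallback result means nothing is lost while the record waits for manual intervention.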
21. 4. Transformation of Metadata Descriptions
Transformation is the process of creating a target instance of
the metadata description from the source instance. The
transformation uses semantic mapping and rules to create the
target instance.
It is important to note that the result of the transformation is a
metadata description. The created description is sometimes
referred to as a crosswalk, but this is an inappropriate usage of the
word. See the Crosswalk guide for more information about the
distinction.
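A transformation can be sketched as a function that applies an element mapping plus per-element rules to a source instance. Everything named here (the source element names, the `RULES` table, the sample values) is hypothetical and stands in for whatever the mapping and rule steps produced.

```python
# Hypothetical semantic mapping: source element name -> target element name.
ELEMENT_MAP = {
    "title": "Title",
    "pub_date": "Date.Created",
}

# Hypothetical rules: per-element functions that adjust the value.
RULES = {
    # Strip cataloging punctuation such as "[1923]." down to "1923".
    "pub_date": lambda value: value.strip("[]. "),
}


def transform(source_instance):
    """Create a target metadata description from a source description."""
    target_instance = {}
    for element, value in source_instance.items():
        if element not in ELEMENT_MAP:
            continue  # No mapping for this element, so its data is lost.
        rule = RULES.get(element, lambda v: v)
        target_instance[ELEMENT_MAP[element]] = rule(value)
    return target_instance


source = {"title": "Field notes", "pub_date": "[1923].", "shelf": "B4"}
target = transform(source)
print(target)
```

The output of `transform` is a metadata description; the crosswalk itself is the mapping and rules that drive it.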