BEA 2014--Let Common Core Power Your Publishing Accompanying Script
Metadata and the Common Core – Credibility Considerations
Narrative associated with Warzala’s Slides
Credible metadata, especially elements related to interest and reading level, is essential to promote titles in
support of the goals and objectives of the Common Core. This is especially significant given the universe of
publishing, the availability of titles in multiple formats, varying degrees of customer sophistication, and the lack of
time customers have to identify relevant titles.
In Baker & Taylor’s collection management team of 22 resident professionals, we support K-12 school (library
and classroom), public library, and academic library collection building. In addition to our Library and
Educational support, our Merchandising team works with retailers and “e-tailers”, too.
We use a database of approximately 10 million titles in all formats: print, digital (e-book and downloadable
audio), spoken word, video, and music. In addition to supporting our internal collection development work,
we present very rich metadata to customers in our web-based Title Source 360, which customers use for
collection development support and ordering.
Some of our metadata is supplied by our publisher partners in industry standard formats, other elements are
maintained by our Database Administration team, and yet other elements are licensed from third party
suppliers. [Skip to next slide Consumer Concerns]
I have violated a key rule of presentations by putting such a long quote on a slide. The more particular
consumer is skeptical. While some publishers and booksellers have gone to extremes to assure the integrity of
their work in relation to the Common Core, others have merely relabeled existing titles or packaging. In
contrast to the author’s comments, we (the collective “we” of data managers, publishers, and booksellers) want to
promote with integrity, and not be perceived as “hucksters”. [Skip to next slide Rational Considerations]
If you represent a publisher that is not going to pursue evaluation in one of the sophisticated text analysis and
leveling solutions that I will outline in a few minutes, a sound step you can take is consistent, rational
targeting of an intended audience for your titles. Our experience is that a title’s audience level can be
effectively presented as a grade level related to K-12 education or as an adult oriented audience level (i.e.,
general adult, professional, vo-tech, associates, undergraduate, graduate). The rationale behind this is that
even if you don’t have a formal textual evaluation, you provide consistent audience levels for book buyers.
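One way to keep audience targeting consistent is to record it against a fixed vocabulary rather than free text. The sketch below is only an illustration of that idea: the `AudienceLevel` names, the `TitleRecord` class, and the sample title are hypothetical, and are not taken from any Baker & Taylor system or industry code list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AudienceLevel(Enum):
    """Hypothetical controlled vocabulary built from the levels named above."""
    K12 = "K-12"
    GENERAL_ADULT = "general adult"
    PROFESSIONAL = "professional"
    VO_TECH = "vo-tech"
    ASSOCIATES = "associates"
    UNDERGRADUATE = "undergraduate"
    GRADUATE = "graduate"


@dataclass
class TitleRecord:
    title: str
    audience: AudienceLevel
    # Grade range applies only to K-12 titles; 0 stands for kindergarten.
    grade_low: Optional[int] = None
    grade_high: Optional[int] = None

    def __post_init__(self):
        if self.audience is AudienceLevel.K12:
            if self.grade_low is None or self.grade_high is None:
                raise ValueError("K-12 titles need a grade range")
            if not 0 <= self.grade_low <= self.grade_high <= 12:
                raise ValueError("grade range must fall within K (0) to 12")
        elif self.grade_low is not None or self.grade_high is not None:
            raise ValueError("grade range applies only to K-12 titles")


# Hypothetical record: a title targeted at grades 6 through 8.
record = TitleRecord("Sample Middle School Title", AudienceLevel.K12, 6, 8)
print(record.audience.value, record.grade_low, record.grade_high)
```

Rejecting incomplete or contradictory levels at the point of entry is what makes the audience level dependable for a book buyer later on.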
I am going to outline a number of external measures related to textual complexity and provide a summary
conclusion about them based on research. An additional note: any one of these approaches merits its own
in-depth analysis and discussion by a linguist, but for our purposes it is safe to say that these solutions are
very complex and that they add an element of credibility to the titles we present to our customers.
I will also briefly describe where we are as an industry, parts of which have already been represented in the
descriptions of Capstone’s approach to publishing and Booksource’s collection building efforts, along with
some ways that organizations like Bowker and Baker & Taylor present metadata, what standards organizations
are doing to facilitate data management in support of Common Core, and the challenges that all of us face in
aspects of our businesses as related to data management. [Skip to next slide Consumer Needs]
[A foundation in the Common Core Standards, “…stress the importance of being able to read complex text for
success in college and career. Perception is that the complexity of reading demands for college and career
has held steady or risen in the past half century, that the complexity of texts to which students are exposed
has steadily decreased in this same time period, and that in order to address this gap, there must be an
emphasis on increasing the complexity of texts students read as an element of reading comprehension.”]
Textual analysis (and the route to appropriate leveling) can be very complex: there are two major dimensions.
Qualitative – elements best measured by a human: levels of meaning or purpose, structure, clarity, and
knowledge demands, i.e., elements that are appreciated subjectively – the stuff of reviews.
Quantitative – the evaluation emphasized in Common Core support: elements such as word length, word
frequency, sentence length, and text cohesion; things that are difficult if not impossible for a human reader to
analyze and are typically measured by computer software. [Skip to Textual Analysis Approaches]
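To make the quantitative dimension concrete, here is a minimal sketch of the public-domain Flesch-Kincaid Grade Level formula, which combines average sentence length with average syllables per word. The formula's constants are the published ones; the vowel-group syllable counter and the simple sentence splitter are rough assumptions for illustration only, so the scores will differ from those of production tools.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count groups of consecutive vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


simple = "The cat sat on the mat. The dog ran to the cat."
dense = ("Credible metadata facilitates the identification of appropriately "
         "complex instructional materials for educational institutions.")
print(round(flesch_kincaid_grade(simple), 2))
print(round(flesch_kincaid_grade(dense), 2))
```

The short, one-syllable sentences score far lower than the long, polysyllabic one, which is the proxy relationship between length and complexity that these formulas rely on.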
ATOS – Renaissance Learning (ATOS stands for Advantage/TASA (Touchstone Applied Science Associates)
Open Standard) – formula is text based and takes into account words per sentence, grade level of words (based on
a table of values of words), and number of characters per word.
Degrees of Reading Power (DRP, Questar Assessment Inc.) – formula based on word length, sentence length,
and word familiarity, with higher values representing more complex text.
Flesch-Kincaid (public domain) – formula that considers word length and sentence length. These elements are
used as substitutes/proxies for complexity.
Lexile Framework for Reading (MetaMetrics) – represents both the complexity of a text and an individual’s reading
ability; measures include variables of word frequency and sentence length.
Reading Maturity (Pearson Education) – computational language model which estimates how much language
experience is required to achieve knowledge of meanings of words, sentences and paragraphs in a text.
TEXTEvaluator, formerly known as SourceRater (ETS) – a natural language processing technique; it takes
evidence from the text relative to syntactic complexity, vocabulary difficulty, level of abstractness, referential
cohesion, connective cohesion, degree of academic orientation, and paragraph structure, and provides three
separate measures: for informational text, literary text, and mixed texts.
Easability Indicator (Coh-Metrix) – a newer textual metric device; analyzes the ease or difficulty of texts based
on five different dimensions: narrativity, syntactic simplicity, word concreteness, referential cohesion, and
Comparison – each of these approaches does an analysis of text and produces a “score” – for example at a
grade 6-8 Common Core Band:
ATOS 7.0-9.8, DRP 57-67, Flesch-Kincaid 6.51-10.34, Lexile 925-1185, Reading Maturity 7.04-9.57, or
A non-linguist layperson’s conclusion is that there is value in the evidence from any of these methods, in
that if it is displayed with or related to title data, it can help target collections. Given multiple choices, I’m guessing
we’d all like a Consumer Reports-like rating. [skip to next slide – Textual Analysis Resource Evaluation]
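As a sketch of how such scores can help target collections once they are attached to title data, the example below checks a title's scores against the grade 6-8 band ranges quoted in the comparison above. The `measures_in_band` function and the sample scores are hypothetical; only the ranges come from this script.

```python
# Grade 6-8 band ranges as quoted in the comparison above (inclusive bounds).
GRADE_6_8_BAND = {
    "ATOS": (7.0, 9.8),
    "DRP": (57, 67),
    "Flesch-Kincaid": (6.51, 10.34),
    "Lexile": (925, 1185),
    "Reading Maturity": (7.04, 9.57),
}


def measures_in_band(scores, band=GRADE_6_8_BAND):
    """Return {measure: True/False} for each supplied score with a known range."""
    result = {}
    for measure, value in scores.items():
        if measure in band:
            low, high = band[measure]
            result[measure] = low <= value <= high
    return result


# Hypothetical title carrying two third-party scores on its record.
sample_scores = {"Lexile": 1010, "ATOS": 6.2}
print(measures_in_band(sample_scores))  # {'Lexile': True, 'ATOS': False}
```

A title can fall inside the band on one measure and outside it on another, which is one reason to display each score alongside its source rather than collapse them into a single label.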
No approach performed better in relation to student outcomes – there are variances within the approaches,
yet all climb reliably, though differently, up a text complexity ladder to college and career readiness.
So what can we conclude about these various textual analysis and leveling approaches? [Skip to 6 visually
impaired males and the elephant]
Even after a brief, high-level overview of the various textual analysis programs, this illustration of Six Visually
Impaired Males and the Pachyderm (in the spirit of political correctness) may ring true: depending on our
position in the industry, we may feel that we understand only a piece of the complex animal that is readership
leveling and the associated metadata in support of the Common Core.
We are, however, not that badly off; we’re closer to the state depicted in the caption to this picture: “…each
presenting a perspective that helps us better understand the animal…” [Skip to next slide Progress]
Many publishers are working to provide accurate and consistent leveling of titles, and some submit their
works to one or more of the noted textual analysis processes.
Booksellers are packaging quality collections of titles from publishers, and others are controlling quality
metadata that depicts leveling, and other descriptive attributes, to provide a means by which to query
databases, produce lists for librarians, teachers, and consumers, and also display this information so that one
can easily identify and select appropriate titles.
Last, there are groups representing the various parts of our industry working to accommodate and augment
data that supports the Common Core within industry standards. [skip to the first TS360 slide]
In order to display the robust metadata that is available in Title Source 360, I have to split the screen into two
slides. The first shows basic bibliographic information, some business information, and subject descriptions.
Note the audience level here; it is in this position that we control publisher supplied evaluations, or those
made by B&T bibliographers. [skip to second TS 360 slide]
Here is the point at which we’re displaying textual analysis information that we receive from 3rd party sources:
Lexile, Accelerated Reader, and Scholastic Reading Counts. Not only is this displayed, but it is indexed for
searching or for profiling carts/lists for our customers. We also focus on other robust data, including multiple
BISAC subjects and subjects from MARC data, so that we (or our customers) can identify not only an appropriate
level of material, but also appropriate topical and thematic treatments. Further, we relate our
citations to author biographies, publisher supplied annotations, and full text reviews from most of the major
[Skip to next slide Industry Efforts]
BISG has a Common Core working group, and at about this time they are in the process of making
recommendations for packaging relevant information in the industry standard envelope for data transmission.
[Skip to Conclusion Slide]
Metadata, and especially rational, consistent leveling metadata, is essential to help our customers identify,
select, and purchase titles in relation to their Common Core needs.
Publishers and Booksellers need to accommodate enhancement of metadata to provide effective Common
Core support and make it easier for customers to select supporting titles. It must also be recognized that there
is an effort and expense associated with robust data management.
Industry standards are evolving, and customers are likely to benefit from this evolution.
The summary benefit of the investments we have to make in publishing, book selling, and associated
enterprises, is that we will be able to ease the burden on our customers and be able to promote titles with
integrity in relation to the particular goals and objectives of those implementing and supporting the Common