The Centrality of Agreement - Presentation Transcript
John Bordeaux, Ph.D.
March 2005
11572
The numbers above constitute data. Some will venture a guess that they indicate
a postal or zip code. This context, upon which we would agree once an
authoritative source were consulted (is it a zip code? Yes!), provides a framework
for understanding the importance of the numbers. What town does it represent?
We can map it using any of several public sources to learn that it is the zip
code for Oceanside, New York, USA. This is not always a clean mapping, as many
towns (nearby Hempstead, for one) have multiple zip codes. There are instances
where a zip code may represent several towns (Springfield and Burke, Virginia
USA). While the zip code context tells us something about 11572, the relationship
between town and zip code is potentially many-to-many rather than the one-to-
one correlation most of us immediately assume. Initially, we believe that the data
string 11572 can be mapped, one for one, to Oceanside. But if we are building a
data system to use this information, we cannot assume that every zip code has a
one-to-one relationship with a town. I once lived in Burke, VA, and used 22015
as my zip code. I was disconcerted to learn that someone in Springfield also had
the same zip code, compromising the unique identity of my home address. The
charm of Burke faded a bit, as I realized it was a small hamlet when compared to
its neighbor, and that it could be folded into its neighbor’s code for postal
purposes.
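For the data architect, the point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pairs are the ones named in this text, the structure and function names are hypothetical, and Hempstead's several codes are left out because they are not listed here. The idea is to index the relationship in both directions, so that neither a zip code nor a town is assumed to be unique.

```python
from collections import defaultdict

# Illustrative pairs taken from the text; not a complete or authoritative list.
ZIP_TOWN_PAIRS = [
    ("11572", "Oceanside, NY"),
    ("22015", "Burke, VA"),
    ("22015", "Springfield, VA"),
]


def build_indexes(pairs):
    """Build lookups in both directions so neither side is assumed unique."""
    towns_by_zip = defaultdict(set)
    zips_by_town = defaultdict(set)
    for zip_code, town in pairs:
        towns_by_zip[zip_code].add(town)
        zips_by_town[town].add(zip_code)
    return towns_by_zip, zips_by_town


towns_by_zip, zips_by_town = build_indexes(ZIP_TOWN_PAIRS)

# A single code can return more than one town.
print(sorted(towns_by_zip["22015"]))
```

A lookup on 22015 returns both Burke and Springfield, which is exactly the case a one-to-one design would silently get wrong.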
We have some additional context for 22015, then. My framework for
understanding the relevance of 22015 involves information few others have –
until this moment. Likewise, my visceral reaction to the numbers 11572 goes far
beyond its intended use as a zip code. It evokes memories of my childhood home,
and all that entails. One might consider this highly individual and situational
framework to be my knowledge regarding that number. I know what that
number represents. But only what it represents to me, and to a slightly lesser
degree, my siblings. I guarantee my response to these numbers involves
subjective experiences that would not correspond exactly with theirs.
Therefore, what metadata can we agree upon when storing the numbers 11572 in
an information system – or in our memories? Shall we store the fact that it
represents my childhood home, the name of the town, or even that it is a zip
code? Probably the only agreed metadata would be its pattern as a U.S. postal
code. Therefore, 11572 becomes a zip code – the context plus the value becomes
information. The context enriched with my individual experience becomes my
knowledge of that number. I do not know how this is stored, and likely various
memories will be evoked when I see the number months or years hence. Such is
the fluid nature of individual knowledge.
What is the factor that moves 11572 from data, to information, to knowledge? I
argue that it is human agreement regarding its context and therefore relevance.
While I have made a strong argument for you to reconsider the intrinsic meaning
of 11572, chances are good that you will not store this framework in your long-
term memory. We simply do not agree regarding the importance of those
numbers.
What of Oriole 8-5108? Those born after 1965 or so may have a hard time
parsing this, and may have a better chance at guessing if I shorten it to OR8-
5108. Those born before 1965 or so knew immediately that it is a telephone
number, and are more likely to agree with my assertion to that end. Telephone
exchanges once had honorifics that mapped to numbers. OR8, therefore, was
678. And yes, here again I use personal experience as that was my telephone
number growing up. Oriole 8-5108 is data; the fact that it is a telephone number
represents the context that – combined with the value – constitutes information.
However, we may differ on the usefulness of storing a combined text/number
field – and the thoughtful data architect will employ a mapping algorithm to
transform the number to today’s format: 678-5108. We would lack agreement on
the relevance of the honorific, depending upon how we were to use the data. An
antiquated form of this number implies that we consider the time period in which
it was predominantly used. If we were supporting historic research into various
honorifics, we would lose valuable elements of the data by transforming the
number and storing it as 678-5108. However, if this were for use in a general
telephone database, data integrity and a reduction in the chance for error would
be served by such a transformation.
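The mapping algorithm mentioned above might look something like the following Python sketch. It assumes the two-letter convention, in which only the first two letters of the exchange name are dialed, and the letter groups of a North American rotary dial of that era (no Q or Z). The function and record names are hypothetical. The record keeps the original string alongside the transformed number, so the choice between historic research and a general telephone database does not have to be made at storage time.

```python
from typing import NamedTuple

# Letter groups on a North American rotary dial of the exchange-name era.
# Q and Z were not assigned to any digit.
DIAL_GROUPS = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PRS", "8": "TUV", "9": "WXY",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in DIAL_GROUPS.items() for letter in letters
}


class PhoneRecord(NamedTuple):
    original: str  # as written, exchange name preserved
    dialable: str  # all-digit form, e.g. 678-5108


def normalize_exchange_number(raw: str) -> PhoneRecord:
    """Convert 'Oriole 8-5108' or 'OR8-5108' to its all-digit form."""
    prefix, _, line = raw.strip().partition("-")
    letters = [c for c in prefix.upper() if c.isalpha()]
    digits = [c for c in prefix if c.isdigit()]
    if len(letters) < 2 or len(digits) != 1 or len(line) != 4 or not line.isdigit():
        raise ValueError(f"not in exchange-name form: {raw!r}")
    try:
        # Only the first two letters of the exchange name are dialed.
        exchange = "".join(LETTER_TO_DIGIT[c] for c in letters[:2])
    except KeyError as exc:
        raise ValueError(f"letter not on the dial: {raw!r}") from exc
    return PhoneRecord(original=raw, dialable=f"{exchange}{digits[0]}-{line}")


print(normalize_exchange_number("Oriole 8-5108").dialable)  # 678-5108
```

Both the long form and the shortened OR8-5108 resolve to the same dialable number, while the original field still carries the time period that the exchange name implies.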
Again, the text/number stream was simply data. The context plus its value
constitutes information. But in this case, we see the importance of agreement at
a lower level. One need not store nor access my emotional reaction to that string;
we can agree to disagree even at the “lower” level of information. Here we see,
albeit anecdotally, the difficulty in establishing common data protocols.
Although every business uses a purchase order form to conduct transactions, it is
a common aphorism in electronic commerce that we cannot agree on a universal
format for such a form. The data elements are fairly consistent, but there is no
agreement on the relevance for each, or how they are to be stored. One cannot
settle on questions such as “What should be the maximum text length for a
product description?”
One can begin to see the hazards in declaring the benefits of knowledge sharing
(much less knowledge storing, capture, repository, etc.). When engaged in
advanced academic pursuits, the student is confronted with a literature review.
As the student is expected to understand the “state of knowledge” in her chosen
field – and in the case of the Ph.D., advance that state – she must review the
canon in her field. Of course, there will be marginal disagreement regarding the
inclusion of certain works in that canon, but nevertheless the bulk of important
works are presented for review and comprehension.
For the knowledge management field, the canon is converging on an
understanding of (or agreement regarding) several types of knowledge: explicit,
implicit, tacit, and – most recently – emergent (referring more to organizational
knowledge than individual). Tacit knowledge as a term has its beginnings with
Polanyi (1967), and is understood as that which we know but cannot express.
That which we have not YET expressed (but is capable of expression) is implicit.
That which has been digitized or otherwise recorded in some form is explicit. I
consider what I am writing at this moment to be an explicit representation of my
knowledge on this topic. To you, it may be simply information; perhaps of a
trivial nature. Unless and until you agree on the truth and relevance of what I am
writing, it cannot become knowledge for you to use. Even if you agree with my
thesis, it will not be stored in the same way it is for me. We do not yet fully
understand the processes and structures of human comprehension – you could
not mirror this knowledge as I have stored it even if you were in complete
agreement with me. It will simply never be relevant to you in the way it is to me.
There is a temporal aspect to this context as well. It will never be relevant to me
in the same way that it is at this moment.
Business literature in the field of information technology is rife with calls for
common data standards, a call which is slow to be answered as corporations have
a vested interest in not agreeing to common protocols. For Web Services, as an
example, there are two camps pursuing two different sets of protocols. The lack
of agreement here has arguably held back the realization of the promise for this
emerging technology application. One can see how gaining agreement on
information standards will be that much harder. XML is an accepted data
standard, but was not an immediate silver bullet for business solutions. Why?
Yes, we can categorize HTML data beyond the initial format tags, but establishing
a common taxonomy is proving extremely difficult. What should those tags be?
Authoritative data sources are few, principally because it is difficult to agree first
on which sources should be authoritative. Also, the changing nature of most
domains means that taxonomies must adapt over time. Extremely few
authoritative data sources exist to manage a common taxonomy for a given
domain over time.
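A small Python sketch may make the taxonomy problem concrete. Everything in it is hypothetical: the two vendors, their tag names, and the mapping are invented for illustration and do not come from any real standard. Both fragments are well-formed XML describing the same purchase order line, yet a program can reconcile them only through a mapping that the parties have agreed upon.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vendors describing the same order line. Both are valid XML.
VENDOR_A = (
    "<order><item><description>Widget</description>"
    "<qty>3</qty></item></order>"
)
VENDOR_B = (
    "<purchaseOrder><line><productName>Widget</productName>"
    "<quantity>3</quantity></line></purchaseOrder>"
)

# The agreed taxonomy: each vendor tag mapped to a shared field name.
# These names are assumptions; reaching agreement on them is the hard part.
TAG_MAP = {
    "description": "product_description",
    "productName": "product_description",
    "qty": "quantity",
    "quantity": "quantity",
}


def extract_fields(xml_text, tag_map):
    """Return only the fields whose tags appear in the agreed mapping."""
    root = ET.fromstring(xml_text)
    fields = {}
    for element in root.iter():
        if element.tag in tag_map:
            fields[tag_map[element.tag]] = (element.text or "").strip()
    return fields


print(extract_fields(VENDOR_A, TAG_MAP) == extract_fields(VENDOR_B, TAG_MAP))
```

With the mapping in place the two documents yield the same record; with an empty mapping neither yields anything. The syntax was never the obstacle. The mapping also has to be maintained as the domain changes, which is where an authoritative source would be needed.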
This brings us to knowledge transfer. While understood as a key differentiator in
the “information economy,” perhaps the focus should be on reaching agreement
before assuming we can simply transfer knowledge. Data standards
bodies may arrive at common protocols, and associations within a domain may
approach the 80% solution for that domain’s taxonomy to enable information
sharing, but knowledge remains extremely individual. The context shifts over
time, as the human assigns a rich ontology to information that cannot be shared
– in part, because it is partially unknowable even to that human. This is why the
knowledge management canon is also coming to reflect that knowledge exists
primarily among individuals and is transferred across shifting communication
networks, primarily face to face. When a new employee joins a firm, she is
surrounded by information that is yet to become knowledge. The careful
employee resists a temptation to form an individual context too quickly, as she is
not yet a member in the informal networks that comprise a sort of “hive mind”
for this information. In part, she can add to the common understanding by this
new perspective. However, without a clear understanding of the state of the hive
mind, she risks marginalization. Note the confusion that can occur if this
employee is introduced to a highly developed online community of practice – she
will first spend much time discerning the various roles and therefore authority of
the information she obtains. Expediting this initial introduction to the firm’s
communal nature exclusively using online tools hampers the effectiveness of her
comprehension, as the essential context for establishing roles in an organization
is conveyed through face-to-face encounters.
Therefore, in order to truly enable the “knowledge-centric organization,” we must
invest in technologies and processes that enable the exchange of digitized
information; develop case-based reasoning or other business rule methods and
technologies to add informational context to the extent feasible; and accelerate
the immersion into the informal human networks wherein the common
understanding and context lies. At the same time, facilitating assimilation should
never be done in such a way that new views are not rationally assessed for their
potential value by these networks. This may just be the key to helping an
organization exist at the “onset of chaos”: stable enough to facilitate and
acculturate new members, but flexible enough to leverage the diversity of views
that these members bring to the organization. Understanding the role agreement
plays in mapping data, information, and knowledge is arguably the first step in
advancing relevant information technologies and methods for sharing.