Should a computer compete for language?

An exploration of a computer acquiring natural language using brain modeling, evolution and affective
computing

Why should a computer acquire natural language?
I think it is useful to pursue the goal of teaching a computer natural language, because it would
help us cope with the information overload and filter failure we face in our world. Filter failure
is a concept that Clay Shirky, a professor at the Interactive Telecommunications Program, coined
for the information explosion of the Internet era.[1] He points out that there have been more
books than anybody could read since the sixteenth century, so information overload is not the
problem we are facing with the Internet. The problem we face today is that the natural filters
that existed have disappeared. For example, an encyclopedia could only have a limited number of
pages, and a television station can only air one program at a time. Compare this with the modern
variants of these media: Wikipedia wants to collect all human knowledge and has over 3.5 million
articles, and YouTube gets 24 hours of footage uploaded every minute. The filters of television
and paper encyclopedias have ceased to exist and the floodgates are open.

Putting human filters back in doesn't seem useful, because the amount of data keeps growing and
the human brain isn't keeping up. A computer might: it is able to run for days or months just
analysing texts for relevant content. A computer then becomes a personal filter between the
enormous amount of data that is available and the material that is relevant to a certain user,
because it understands which information is relevant in an argument. A computer is able to filter
for a specific person instead of imposing the taste and opinion of human filters. I think the
quest to teach a computer natural language is relevant and even necessary to manage the enormous
flood of data.

Our Internet is more and more tailor-made for our interests. This seems to solve the
above-mentioned problems but can make them even worse. We will come to live more and more in a
filter bubble.[2] Our world is tailored to us without us knowing what is filtered out. Google will
give you different results based on some 60 parameters, even when you are not logged in to Google.
This is a major issue if you take into account that even the news you get via your social network
is filtered by Facebook. You will only see progressive news as a progressive voter and only
conservative news as a conservative voter. Your views will never be challenged by an algorithmic
gatekeeper, unless such a gatekeeper understands the content and which views oppose each other.
Then it could regularly give you a solid argument that opposes your view, or even completely
different texts on the Internet that don't share the subject you like, but do share the style of
writing.

If we don't tackle this problem we will all float in our own filter bubbles, without ever finding
anything that opposes our views. The idea of a free Internet where every voice is equal will be
gone, and a foundation will be laid for more extremist views to grow.
[1] Shirky, C. (2008)
[2] Pariser, E. (2011)

How are we going to teach a computer natural language?
The problems with natural language are big. There is an infinite number of grammatically correct
sentences in a language, but an even greater number of sentences that are incorrect. And the
words used within these languages aren't even clearly defined. Take the concept of running: a
person can run, an engine can run, a river runs and a nose can run. The same word can mean many
things depending on the sentence it appears in. If you want to describe the grammatical rules and
the meaning of every word in a language explicitly, you will get into trouble.
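
To make the "run" example concrete, here is a minimal sketch of guessing a word's meaning from the
words around it. The sense labels and clue words in SENSES are a hand-made, hypothetical inventory
invented for this illustration. The scoring is a simple word-overlap heuristic and is not taken
from any of the cited sources.

```python
# Hypothetical, hand-made sense inventory for the word "run".
SENSES = {
    "move fast on foot": {"person", "athlete", "legs", "race", "marathon"},
    "operate (machine)": {"engine", "machine", "motor", "program"},
    "flow (liquid)": {"river", "water", "nose", "tap", "stream"},
}


def guess_sense(sentence):
    """Return the sense whose clue words overlap most with the sentence."""
    context = set(sentence.lower().replace(".", "").split())
    scores = {sense: len(context & clues) for sense, clues in SENSES.items()}
    best = max(scores, key=scores.get)
    # No overlapping clue word means the heuristic has nothing to go on.
    return best if scores[best] > 0 else None


for example in ["A person can run", "An engine can run", "A river runs"]:
    print(example, "->", guess_sense(example))
```

Even this small table shows the trouble described above: every new context needs new clue words,
so a hand-written inventory never ends.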

We need breakthrough innovations to tackle the problem of natural language in computers. So how
are we going to find these breakthrough innovations?[3] First we are going to put the problem of
natural language in a new and different context, away from the idea of Turing machines, data and
building specific programs. We need to cope with input that is noisy but predictable, and look to
heuristics instead of hard-coded rules.

After explaining this context I will explain Hierarchical Temporal Memory (HTM), which can cope
with the parameters mentioned above and is modelled on the neocortex.[4] This is the brain area
where, for example, higher vision, hearing and language are processed. Next to this algorithm I
suggest some concepts of affective computing to change the state of the system according to
changes in the input.

Evolution and Memes in the context of language
Our human brains are the only devices that have been able to acquire language. The human brain is
a product of evolution, so can we learn something from evolution that makes it easier for a
computer to acquire language?[5] There are three forces that govern evolution:
1. Variation: In the context of genetic evolution these are the genetic mutations in the DNA
when it is copied. These variations make sure that novel qualities can arise that might be
beneficial to the organism the DNA is present in. Think of a mutation that makes a bacterium
resistant to penicillin.

2. Selection: Selection is the force that prevents some variations from surviving. This can be
anything in the environment that prevents the transfer of genes, either by killing an organism
or by preventing its reproduction. If we take the bacteria example, the selection criterion may
be a penicillin-rich environment. Selection would give the advantage to the bacteria that carry
the penicillin-resistant piece of DNA over the non-resistant bacteria.

3. Heredity: The traits that made a certain DNA mutation survive should pass on to the next
generation. This seems obvious but is an essential part of evolution: if a trait could not be
passed from one generation to the next, the whole process of variation and selection could not
be exploited.

[3] Baldwin, C. Y. (2009)
[4] Hawkins, J. (2007)
[5] Dawkins, R. (1978)

Why are these three processes so important? Because they also play a key role in our brains. Next
to genetic evolution, the human brain undergoes a memetic evolution. Culture, customs and language
are not transferred via genes but via so-called memes. So what is a meme? A meme is anything that
can be copied between brains. This can literally be any concept, behaviour or word. The same
process applies as with genetic evolution. There is variation in memes: a good example is the
party game where a sentence is passed along by whispering it in each other's ears. The more people
you add to the line, the more the sentence is transformed. There is selection: some ideas stick in
people's minds and others don't, and the limited capacity of a human brain and the time it would
take to pass on all the ideas you have make sure some memes are selected above others. And
heredity in the world of memetics is ensured by the sharing of ideas between brains.

So does this memetic evolution affect language? It does: words get new meanings and new words are
added. Some words become old-fashioned, and even complete languages die out because they are not
taught to a new generation. Just like a species of animal, a language must be adapted to its
environment or go the way of the dodo. So a computer that learns language must be as adaptable and
ever-changing as language itself. Language is not a data set, it is a process, so the acquisition
of language should be process-focused. It should also involve competition, just as our brains
arose from genetic and memetic evolutionary competition.
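
The three forces described in this section (variation, selection and heredity) can be sketched as a
small loop over strings. This is a toy illustration only: the target phrase, the alphabet, the
mutation rate and the function names are all assumptions made for this example, and fitness is
measured against a known target, which real evolution of course does not have.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def fitness(candidate, target):
    """Count the positions where the candidate matches the target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(parent, rng, rate=0.05):
    """Variation: copy the parent, replacing each character with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def evolve(target, rng, offspring=100, max_generations=1000):
    """Return the best string found and the number of generations used."""
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    for generation in range(max_generations):
        if parent == target:
            return parent, generation
        # Heredity: every child starts as a copy of the current parent.
        children = [mutate(parent, rng) for _ in range(offspring)]
        # Selection: only the fittest candidate survives into the next round.
        parent = max(children + [parent], key=lambda c: fitness(c, target))
    return parent, max_generations


best, generations = evolve("a river runs", random.Random(42))
print(best, generations)
```

Remove any one of the three forces and the loop stops improving: without mutation nothing new
appears, without selection good variants are not preferred, and without copying the parent good
variants are lost again.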

Imperfect data and time in the context of language
Our brains, the world and our senses are full of noise. Still, our brains are very capable of
coping with these imperfections. Compare this to the world of computers and you see that cracking
language will be hard with traditional Turing-based computing. Knock out one bit in computer
memory and a program may stop functioning. Change a bit in a file and it may be corrupted. After a
heavy night of drinking and knocking out some brain cells, our brain still functions. We can
understand people in a crowded room with a lot of other people talking. Our brains are built to
cope with imperfect data. A computer is much better at things where accuracy is required, like
calculation. Why is this?

I think this has to do with heuristics versus rules. A computer is able to cope with rules that
need to be followed in exactly the same way every time. This is great for doing calculation, but
not so great for tasks that need to be done in the natural world. You can see this, for example,
when making an algorithm that can distinguish between a cat and a dog. You can feed a computer
thousands of images of cats and dogs, but it will not be able to find a single hard rule that
separates them. A computer that learns language should be able to cope with heuristics. Heuristics
are rules of thumb, not hard-coded if-else statements, and are therefore better able to cope with
imperfect data. You could argue that this is not really necessary, because the human brain
understands text only if it is grammatically correct. But you can add a lot of noise to a text and
a brain is still able to decipher it. The text below shows how, even with a lot of noise added to
a text, a human is still able to get the meaning from it.[6]

“Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the
ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit
pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is
bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.”

We do not read every individual letter but the word as a whole. The shape and height of the
different letters also help us read much faster.

“A ALL UPPERCASE TEXT IS MUCH MORE UNPLEASANT TO READ, than text with
lowercase letters. We recognize different letters by their height.”

For a computer there is no real difference between uppercase and lowercase letters as a data
structure. As long as the data is valid, a computer has no problem interpreting it. The scrambled
text example above, however, would cause a problem. This is the fundamental difference between
humans and computers. For a computer, validity is more important than structure. For a human,
structure is much more important than validity. We can see past a spelling error, but knock out
all line breaks, tabs and spaces in a piece of computer code and it will be unreadable to us. For
a computer it is the other way around: the spelling mistake will make a piece of computer code
unusable, while in many programming languages the line breaks, spaces and tabs serve no function.
Hard-coded rules ask for validity above overall structure. Heuristics ask for overall structure
but can handle much more messy and imperfect input.
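
The contrast between a hard rule and a rule of thumb can be sketched with the scrambled words
quoted above. The exact lookup only accepts valid spellings. The heuristic lookup compares the
first letter, the last letter and the set of letters in between, which is one simple way to mimic
reading "the word as a whole". The vocabulary and the function names are assumptions made for this
illustration, not part of any cited method.

```python
def signature(word):
    """First letter, sorted middle letters, last letter (case-insensitive)."""
    w = word.lower()
    return (w[:1], "".join(sorted(w[1:-1])), w[-1:])


def build_index(vocabulary):
    """Map each signature to the known words that share it."""
    index = {}
    for word in vocabulary:
        index.setdefault(signature(word), []).append(word)
    return index


def exact_lookup(token, vocabulary):
    """Hard rule: the token must be spelled exactly like a known word."""
    token = token.lower()
    return token if token in vocabulary else None


def heuristic_lookup(token, index):
    """Rule of thumb: accept any known word with the same signature."""
    candidates = index.get(signature(token), [])
    return candidates[0] if candidates else None


vocabulary = {"according", "university", "order", "letters"}
index = build_index(vocabulary)
for token in ["Aoccdrnig", "Uinervtisy", "oredr", "ltteers"]:
    print(token, exact_lookup(token, vocabulary), heuristic_lookup(token, index))
```

The heuristic tolerates noise at a price: two different words with the same signature become
indistinguishable, so it trades validity for robustness, as argued above.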

A brain model for language
Hierarchical Temporal Memory is a way of modelling the human neocortex. The neocortex is the
place where higher vision, hearing and language are processed.[4] This is interesting because
coping with noisy input is what this part of the brain does, and it is responsible for language.
So if there is a way of using brain structures to understand language, the neocortex is where we
should look.

Hierarchical Temporal Memory (HTM) is based on concepts that are interesting for computers
understanding language. The main one is that it is temporal. This means that things are
recognized in sequence. Language is sequential: words form sentences, and sentences form
paragraphs. So a temporal system makes sense for understanding language.
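
To illustrate what "recognized in sequence" can mean for text, the sketch below learns which word
tends to follow which and uses that to predict the next word. This is a plain word-pair counter
and is much simpler than HTM's own sequence memory. It is shown only to make the idea of temporal
learning concrete, and the example sentences are invented.

```python
from collections import Counter, defaultdict


def learn_transitions(sentences):
    """Count, for every word, which words were seen directly after it."""
    transitions = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            transitions[current][following] += 1
    return transitions


def predict_next(transitions, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    followers = transitions.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


model = learn_transitions([
    "the river runs fast",
    "the engine runs",
    "the river runs deep",
])
print(predict_next(model, "river"))
print(predict_next(model, "the"))
```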

[6] Rawlinson, G. E. (1976)

The model is hierarchical, which means that it is built up like a pyramid, with a broad base and
a narrow top. See image below. This is perfect for language, because this system could first take
in words (input image), then a paragraph (level 0), and move up through the hierarchy to
eventually answer the questions “what is this text about?” (level 1) and “is it interesting for
user X?” (level 2).

So how does HTM work? It is now used mostly for computer vision systems, but because the structure
of the neocortex is the same for language processing as for vision, this is no problem. The system
takes a group of pixels and watches them. If a pattern change occurs, the group fires up to the
next layer, and it also fires within its own layer to suppress its neighbours. In this way
information travels up the hierarchy and only the most effective representation “survives”. So the
system competes with itself on each layer of the hierarchy. This is how different groups of brain
cells compete in the brain, where only the most effective paths survive. This way of representing
the world in entities and connections is how our brain works, so it is effective for interpreting
the products of these brains, for example texts.

Texts can also be represented hierarchically, with words at the bottom, sentences in the next
layer, paragraphs in the layer on top of that and the full text on top. The text gets summarized
into a few nodes or words at the top level. This is how the brain stores information: by finding a
common denominator. If you give people a list of fruits without the word “fruit” itself and quiz
them later on the content of the list, many are sure that the word “fruit” was in the list. This
is because the common label of the individual parts is fruit. With the HTM system a computer is
able to make this same “mistake”. It does not try to explain the complete data set of a text but
tries to find the common denominator, the subject of the text.
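
The "fruit" example can be sketched as a search for the most specific label that all items in a
list share. The small taxonomy in PARENT is hypothetical and hand-written for this illustration.
HTM would have to learn such groupings from data, so this is not an HTM implementation, only a
picture of what "finding the common denominator" means.

```python
# Hypothetical toy taxonomy: each entry points to its more general label.
PARENT = {
    "apple": "fruit",
    "banana": "fruit",
    "pear": "fruit",
    "carrot": "vegetable",
    "fruit": "food",
    "vegetable": "food",
}


def ancestors(item):
    """Return the labels above `item`, from most specific to most general."""
    chain = []
    while item in PARENT:
        item = PARENT[item]
        chain.append(item)
    return chain


def common_denominator(items):
    """Return the most specific label shared by every item, or None."""
    shared = None
    for item in items:
        chain = ancestors(item)
        if shared is None:
            shared = chain
        else:
            # Keep only labels that this item also has, preserving order.
            shared = [label for label in shared if label in chain]
    return shared[0] if shared else None


print(common_denominator(["apple", "banana", "pear"]))
print(common_denominator(["apple", "carrot"]))
```

Note that the returned label never occurred in the input list, which is exactly the "mistake"
described above.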

The HTM method does not search for clear-cut mathematical rules but is useful for noisy, heuristic
problems. It is well suited to the problems raised in the chapters above: it works well with noisy
data and competes with itself at every step of the hierarchy, so an “evolutionary” process is
facilitated.

Conclusion

To overcome the flood of information that comes in via the Internet and all other forms of media,
we need to install new filtering systems. A computer that could understand a text and see if it is
relevant for a user could be a solution. To do this we need a breakthrough innovation, because
traditional Turing-machine-based algorithms won't solve the problem of language. So we need to
look to the only device that has solved the problem: the human brain. The model we are going to
use should, like the human brain, be able to cope with ambiguity and noise. It should also compete
in a “Darwinian” struggle. Hierarchical Temporal Memory has these attributes and is based on the
neocortex, the part of the human brain where language resides. To solve the problem of teaching a
computer natural language, HTM is the model to use.

References
Baldwin, C. Y., von Hippel, E. A., “Modeling a Paradigm Shift: From Producer Innovation to User
and Open Collaborative Innovation”, Working Paper, Cambridge, 23 December 2009
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1502864

Dawkins, R., “The Selfish Gene”, Oxford University Press, USA, 1978

Hawkins, J., “Hierarchical Temporal Memory”, September 2011
http://www.numenta.com/htm-overview/education/HTM_CorticalLearningAlgorithms.pdf

Hawkins, J., George, D., “Hierarchical Temporal Memory: Concepts, Theory, and Terminology”,
Numenta Inc., 27 March 2007
http://www.numenta.com/htm-overview/education/Numenta_HTM_Concepts.pdf

Pariser, E., “The Filter Bubble”, Penguin Press HC, 12 May 2011

Rawlinson, G. E., “The significance of letter position in word recognition”, unpublished PhD
thesis, Psychology Department, University of Nottingham, Nottingham, UK, 1976

Shirky, C., “It's Not Information Overload. It's Filter Failure”, Web 2.0 Expo NY,
19 September 2008
http://blip.tv/web2expo/web-2-0-expo-ny-clay-shirky-shirky-com-it-s-not-information-overload-it-s-filter-failure-1283699
