Total number of words: 15,177
A dissertation submitted in partial fulfilment of the requirements for the Open
University’s Master of Science Degree in Computing for Commerce and Industry
Ian Whitby
P2207964
10th March 2009
PREFACE
I would like to thank all of those who have contributed to this work and provided the
opportunity to pursue this research.
Special thanks go to my wife, Kate, and children, Rory and Holly, for the long hours
spent away from them and for their unswerving support and encouragement
throughout. Thanks are also due to my employer, National Grid, for granting the
time to undertake this work and to the many employees who completed the
research questionnaire. My thanks also go to ADVFN for allowing the use of their
manually-written share reports within the research prototype and to my project
supervisor, Linda White, for her support.
TABLE OF CONTENTS
PREFACE
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
CHAPTER 1 INTRODUCTION
1.1 THE PROBLEM DOMAIN
1.2 DEVELOPMENTS IN NLG
1.3 RESEARCH AIMS
1.4 RESEARCH DELIVERABLES
1.5 CONTRIBUTION TO KNOWLEDGE
CHAPTER 2 A REVIEW OF THE LITERATURE
2.1 INTRODUCTION
2.2 EXPRESSIVENESS/EFFECTIVENESS OF GRAPHICS
2.3 RHETORICAL STRUCTURE THEORY
2.4 THE LINGUISTICS OF PUNCTUATION
2.5 MULTIMODAL SYSTEMS
2.6 USER CUSTOMISATION
2.7 MEANING AND MARKUP
2.8 SUMMARY
CHAPTER 3 RESEARCH STRATEGIES
3.1 INTRODUCTION
3.2 REVIEW OF RESEARCH STRATEGIES
3.3 SUMMARY
CHAPTER 4 DATA ACQUISITION
4.1 INTRODUCTION
4.2 DATA SOURCES
4.3 PROTOTYPE DESIGN
4.4 QUESTIONNAIRE
4.5 ANALYSIS OF HANDWRITTEN REPORTS
4.6 VOCABULARY OF A VOLATILE MARKET
4.7 SUMMARY
CHAPTER 5 DATA ANALYSIS
5.1 INTRODUCTION
5.2 RESPONDENTS
5.3 USER PREFERENCES
5.4 USE OF COMMERCIAL DATA
5.5 SUMMARY
CHAPTER 6 CONCLUSIONS
6.1 COMPARISON WITH THE RESEARCH AIMS
6.2 DIRECTIONS FOR FUTURE WORK
6.3 PROJECT REVIEW
REFERENCES
BIBLIOGRAPHY
INDEX
APPENDIX A ADVFN REPORTS AND DATA
APPENDIX B QUESTIONNAIRE
APPENDIX C DISPLAY STYLES
LIST OF FIGURES
FIGURE 1 STRUCTURAL ELEMENTS (POWER ET AL, 2003)
FIGURE 2 RST ELABORATION EXAMPLE
FIGURE 3 COMET ARCHITECTURE (FEINER & MCKEOWN, 1991)
FIGURE 4 WIP ARCHITECTURE (WAHLSTER ET AL, 1993)
FIGURE 5 NLG ARCHITECTURE COMPONENTS
FIGURE 6 RST CONCESSION EXAMPLE
FIGURE 7 CONCESSION TEXT STRUCTURE
FIGURE 8 ICONOCLAST TEXT STRUCTURE LEVELS
FIGURE 9 PROTOTYPE - ENTITY RELATIONSHIP DIAGRAM
FIGURE 10 PROTOTYPE - HIGH LEVEL DESIGN
FIGURE 11 COMPARISON OF NLG AND MANUAL REPORTS
FIGURE 12 PROTOTYPE - WORKFLOW
FIGURE 13 SURVEY - TRADING EXPERIENCE
FIGURE 14 SURVEY - DOMAIN EXPERTS
FIGURE 15 ADVFN REPORT - ABSTRACT DOCUMENT STRUCTURE
FIGURE 16 RST EXAMPLE FOR LSE HEADLINE
FIGURE 17 ADVFN REPORT (13TH AUG 2008)
FIGURE 18 RST ELABORATION IN HANDWRITTEN TEXTS
FIGURE 19 FTSE100 DAILY VALUES AND 5-DAY TREND LINE
FIGURE 20 PRE-TRIAL ADVFN REPORTS - VERBS
FIGURE 21 PRE-TRIAL ADVFN REPORTS - ADJECTIVES
FIGURE 22 TRIAL ADVFN REPORTS - VERBS
FIGURE 23 TRIAL/PRE-TRIAL COMPARISON - VERBS
FIGURE 24 TRIAL ADVFN REPORTS - ADJECTIVES
FIGURE 25 TRIAL/PRE-TRIAL COMPARISON - ADJECTIVES
FIGURE 26 TRIAL/PRE-TRIAL COMPARISON - NOUNS
FIGURE 27 SURVEY RESPONSES - EXPERIENCE
FIGURE 28 SURVEY RESPONSES - TRADING
FIGURE 29 SURVEY - DISPLAY STYLE
FIGURE 30 SURVEY RESPONSES - PREFERRED STYLE
FIGURE 31 SURVEY - REPORT STRUCTURE
FIGURE 32 SURVEY RESPONSES - REPORT STRUCTURE
FIGURE 33 SURVEY - TEXT BLOCKS
FIGURE 34 SURVEY RESPONSES - TEXT BLOCKS
FIGURE 35 PROPERTIES OF A NORMAL DISTRIBUTION CURVE
FIGURE 36 SURVEY - REPORT GROUPING
FIGURE 37 SURVEY RESPONSES - TEXT GROUPING
FIGURE 38 SURVEY RESPONSES - GROUPING COMPARISON
FIGURE 39 SURVEY - CUSTOMISATION
FIGURE 40 SURVEY RESPONSES - CUSTOMISATION
FIGURE 41 SURVEY RESPONSES - CUSTOMISATION (HISTOGRAM)
FIGURE 42 SURVEY - PAGE NAVIGATION
FIGURE 43 SURVEY RESPONSES - PAGE NAVIGATION
FIGURE 44 SURVEY RESPONSES - COLOUR
FIGURE 45 SURVEY - SENTENCE CONSTRUCTION
FIGURE 46 SURVEY RESPONSES - VOCABULARY COMPARISON
FIGURE 47 SURVEY RESPONSES - GRAMMAR COMPARISON
FIGURE 48 SURVEY RESPONSES - SENTENCE COMPARISON
FIGURE 49 SURVEY - MISSING DATA
FIGURE 50 SURVEY RESPONSES - MISSING DATA
LIST OF TABLES
TABLE 1 LSE DATA RECEIVED
TABLE 2 USERS TARGETED BY SURVEY
TABLE 3 FTSE100 DAILY TRENDS
TABLE 4 FTSE100 SAMPLE DAYS
TABLE 5 PAGE NAVIGATION STATISTICS
ABSTRACT
This report examines the production of web-based text reports through Natural
Language Generation (NLG) techniques. The work reviews the current body of
NLG knowledge and aims, through the use of an internet-based prototype, to
determine whether commercial data can provide the quality of information
required to automatically produce texts of comparable sophistication to human-
authored reports. The research uses the prototype to further investigate whether
inclusion of user preferences in the generated outputs leads to improvements in
the effectiveness and coherency of the resulting web pages.
The study employs an internet survey to obtain feedback on the quality of
grammar, vocabulary and sentence construction obtained through NLG production
from a commercial data source. The survey also provides quantitative and qualitative
assessment on the effectiveness of web page designs in harnessing individual user
preferences. The returns from the survey were subjected to statistical analysis and the
results extended to infer characteristics of the wider population. This work is
believed to have applicability beyond the realm of the current research and the study
concludes with recommendations for further work within the same field and across
other disciplines.
Chapter 1 Introduction
Commercial organisations are making increasing use of their institutional databases
as sources of on-line information. Such repositories provide users with round-the-
clock access to information across the internet and an ability to interact with the data
at any time. For manually maintained web sites this causes problems, with both the
content and design of the site lagging behind the information held within the
organisation’s own database. Rapid changes in the underlying data can take days or
even weeks to be hand-crafted into updated web designs and content. In addition, the
presentation of this information is often seen as impersonal and as failing to engage its
audience. To counter this, organisations are assessing whether NLG could
automatically generate the design and content of their web pages.
1.1 The Problem Domain
NLG uses knowledge of a natural language’s constructs, grammar and vocabulary to
build grammatically correct sentences and phrases from an underlying source of
nonlinguistic data. The discipline is closely allied to Natural Language Processing
(NLP) which looks to determine the meaning of prewritten sentences. Reiter and
Dale (2000) describe NLG as:
“... a subfield of artificial intelligence and computational linguistics that is
concerned with building computer software systems that can produce
meaningful texts in English or other human languages from some
underlying nonlinguistic representation of information. NLG systems
use knowledge about language and the application domain to
automatically produce documents, reports, help messages, and other
kinds of texts.”
(Reiter and Dale, 2000, p. xvii)
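To make the definition concrete, the following minimal Python sketch shows the
essence of the task: realising a grammatically correct sentence from a nonlinguistic
record. The record layout, field names and wording are illustrative assumptions, not
the research prototype described later.

    # A minimal NLG sketch: realise a sentence from a nonlinguistic record.
    # Record layout and vocabulary are illustrative assumptions only.
    record = {"index": "FTSE100", "close": 5416.7, "change_pct": -1.2}

    def realise(rec):
        verb = "fell" if rec["change_pct"] < 0 else "rose"
        return (f"The {rec['index']} {verb} {abs(rec['change_pct']):.1f}% "
                f"to close at {rec['close']:.1f}.")

    print(realise(record))  # The FTSE100 fell 1.2% to close at 5416.7.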
The application of NLG to the automatic generation of web page designs and content
represents an active research topic, both within academia and industry. The research
community is divided on whether commercial databases are of sufficient quality to
yield NLG texts comparable to those produced by research prototypes. Although
research data sets have been used to produce sophisticated outputs, the disparate
nature of commercial data sources, their lenient data validation and the intrinsic
limitations of recording language components within a relational database structure
have cast doubt over their suitability. In addition, researchers are divided on the
extent to which the user’s knowledge, preferences and prior interaction with a web
site affect his/her assimilation and understanding of its content (Matsushita et al,
2003; Mackinlay, 1987; Reiter et al, 2003). Similar arguments have been extended to
the role of graphics within a multimodal presentation (Feiner and McKeown, 1991;
Bateman et al, 2001).
Within the natural language community the suitability of commercial data
repositories as sources of sophisticated natural language content remains an active
area of research. Dale et al’s (1998) work on generating large volumes of NLG text
from commercial sources highlighted the data quality issues involved and a need for
further investigation:
“The problems that arise from noisy data in our database are likely to be
faced by any attempt to use a real database as an information source”.
(Dale et al, 1998, Section 5. Conclusion)
Current research also focuses on the extent to which improvements in the
transference of a text’s message and content to the reader may be achieved through
the inclusion of the user context. Mackinlay’s (1987) work focused on the
expressiveness and effectiveness of graphical designs and highlighted the need for:
“...choosing or adapting the dialogue specifications appropriate to the
observed skill level of the user”.
(Mackinlay, 1987, p. 139)
Matsushita et al (2003) built on Mackinlay’s (1987) work to incorporate this user
context in the visualisation of numerical values whilst Reiter et al (2003) investigated
the research techniques used in acquiring this knowledge of user preference and
interaction with the system.
The current study investigates both areas further and uses a web-based prototype to
create dynamically generated reports from a commercial data source. A daily feed of
price information for stocks listed on the London Stock Exchange (LSE) is recorded
into the prototype’s relational database without validation or alteration. The database
acts as the information source for subsequent investigation of the suitability of the
commercial data and the contribution of user preferences to the effectiveness of the
generated text.
1.2 Developments in NLG
An NLG milestone in achieving automatic text generation was Mann and Thompson’s
(1986) development of Rhetorical Structure Theory (RST) as a means of formalising
the rhetorical structure of texts. RST allows texts to be expressed as a hierarchy of
elementary propositions and as a diagrammatic representation of ordered nodes, in
which non-terminal nodes within the tree represent a relationship within the text and
terminal nodes represent text phrases. Subsequent authors (Power et al., 2003) built
on this work, advocating refinements in the separation of rhetorical structure from
both the abstract and text structures within a document. Power et al (2003) disputed
Mann and Thompson’s assertion that non-terminal nodes represent a relationship
within the text, arguing that whilst these relationships exist within the document’s
rhetorical structure (i.e. its meaning) they do not necessarily persist through to its
textual realisation. Power et al (2003) argued the case for viewing a document as
three distinct structural elements (Figure 1):
Figure 1 Structural Elements (Power et al, 2003): Rhetorical Structure (a hierarchy of
elementary propositions; rhetorical structure diagrams); Abstract Document Structure
(titles, headings, captions, lists); Text Structure (chapters, sections, paragraphs,
sentences, clauses, phrases)
The authors further postulated that the abstract document structure could be viewed
as an extension of Nunberg’s (1990) “text-grammar”, with close affinities to text
markup languages such as HTML and LaTeX. Both HTML and LaTeX are founded
on the belief that the visual appearance of the text assists in conveying its meaning
but can also be represented as a distinct element (e.g. as Cascading Style Sheets).
Research has also been undertaken into the automatic generation of graphics.
Mackinlay (1987) investigated the generation of a range of graphical
presentations (e.g. bar charts, scatter plots and connected graphs) from relational
data held within a database. The research sought to codify graphic designs by their
expressiveness (i.e. the extent to which the graphical language expressed the
underlying message) and their effectiveness (how well this message was received and
understood by its audience).
Dale et al (1998) sought to extend NLG techniques beyond carefully prepared
artificial intelligence knowledge bases to the use of real-world data repositories.
Using the PEBA-II NLG system (Milosavljevic et al, 1996) they generated hypertext
documents from both a knowledge base and an equivalent commercial database.
Their work highlighted weaknesses in both data quality and information structure
within the commercial offering which undermined its ability to generate quality texts.
Bateman et al (2001) argued that neither the function nor the nature of information
layout had been addressed in earlier works. They postulated that a document’s
message came through the overall arrangement of information on the page whilst
relationships between this information were commonly conveyed through layout.
The authors stated that insufficient research had been conducted on the informational
significance of presentation and layout. Bateman et al (2001) advocated an
integrated approach to generating layout, text and graphics under a common
framework, arguing that only by considering these elements collectively was it
possible to attain coherent presentation designs. Their research focused on the
empirical investigation of manually-generated presentations and the development of
a prototype system.
Matsushita et al (2003) extended the early work of Mackinlay (1987). The authors
contended that users were often unable to state succinctly how they wished to query the
underlying data, particularly for large datasets, but relied instead on exploratory data
analysis and a series of graphic iterations to build this understanding. Under these
conditions the effectiveness of a graphic is determined not just by tenets of
graphic presentation but also by the user’s previous interaction with the system.
Matsushita et al (2003) argued that decisions on graphic effectiveness must include the
context in which the user poses the query.
1.3 Research Aims
This research builds on the work of others and seeks to establish, through a real-
world example, whether:
• A commercial data source can provide the quality of data required to generate
sophisticated web page content.
• Inclusion of user preferences leads to improvements in the effectiveness
and coherency of web page designs and content.
1.4 Research Deliverables
This research produced an analysis of the content and structure within handwritten
reports used to inform readers of the major stock movements and events on the LSE.
This analysed and documented the vocabulary, grammar and constraint satisfaction
model necessary to generate equivalent texts through NLG techniques.
A prototype application was developed to hold the LSE data, the lexicon of words,
and the constraint satisfaction rules. The prototype allowed users to define their data
and display preferences and subsequently to assess the results against the manually-
written alternatives. The research delivers a quantitative and qualitative appraisal of
whether commercial-sector data can result in sophisticated text content and the
extent to which inclusion of user preferences improves the effectiveness of the
generated pages.
1.5 Contribution to Knowledge
The work of Dale et al (1998) suggested that commercial databases lacked the rigour
of purpose-built knowledge bases, leading to sparse, low quality data and information
structures unsuited to NLG. From both an academic and a commercial viewpoint this
assertion has far-reaching consequences - much of our collective knowledge is held
in such repositories, yet these may be poor sources of NLG content. Through use of a
specific commercial data source (London Stock Exchange data) this study looks to
challenge Dale et al’s (1998) assertion, and its findings seek to provide greater insight
into the validity of that assertion.
In addition there is evidence that the effectiveness of graphic representations improves
when user preferences and prior interactions with the system are taken into
consideration (Matsushita et al, 2003). This research attempts to discover whether
incorporation of user preferences leads to improvements in page design and
content. No theoretical basis exists for measuring the effectiveness of a text in
conveying its linguistic meaning to a human audience. Psychological variances
within the audience mean that such measures are only meaningful when obtained through
experimental studies. This study, whilst limited in scope, contributes to the wider
understanding of these factors both in research and industry.
Chapter 2 A Review of the Literature
2.1 Introduction
This chapter presents a review of the literature surrounding NLG and published
research on the generation of the paragraphs, sentences and phrases associated with
a natural language. NLG is both an established research discipline and an emerging
technology within the commercial sector. The subject is wide-ranging in scope,
using elements from linguistics, artificial intelligence, cognitive science and
human-computer interaction. This review reflects the breadth of the subject and
draws on a range of topics, from studies of document effectiveness, through text
coherency theory, to the grammar of punctuation and the automatic generation of
multimodal documents (i.e. documents which integrate several document modes,
for example text and graphics). The following synthesises the work of others and
places the subject within the context of what has already been established. From
this basis the current project focuses on its key area of research interest, adopting
principles established by previous authors to challenge current views.
2.2 Expressiveness/Effectiveness of Graphics
Early research into the automatic generation of graphic designs was undertaken by
Mackinlay (1987). This seminal work proposed that the style of graphic designs
(e.g. bar graphs, pie charts) could be chosen automatically from the underlying
data. Mackinlay argued that this was feasible if generation of possible styles was
seen as a problem which could be resolved through composition algebra.
In Mackinlay’s (1987) work an application prototype was built around artificial
intelligence techniques to generate a wide range of potential designs. The prototype
invoked composition algebra, a marriage of composition operators with simple
graphic languages, to generate a wide range of potential graphic styles.
Mackinlay (1987) proposed two measures of graphical success which would
allow the prototype to select the more appropriate of these styles:
• Expressiveness - the extent to which the graphical language can express the
underlying information.
• Effectiveness - the extent to which the language can use the
available output media and knowledge of the human
visual system to convey its message.
The syntax of each graphical language was expressed as a collection of tuples
(representing position, height, etc) which expressed the facts for each language.
Once all such tuples were generated the prototype could measure expressiveness by
the degree to which each language satisfied the available facts.
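The tuple-based test lends itself to a simple set-coverage reading. The sketch below
uses assumed encodings rather than Mackinlay’s actual formalism: a graphical
language counts as expressive for a fact set when the facts it can encode cover all of
the required facts.

    # Sketch of Mackinlay-style expressiveness checking (assumed encodings).
    facts = {("price", "quantitative"), ("stock", "nominal")}

    encodable = {
        "bar_chart":    {("price", "quantitative"), ("stock", "nominal")},
        "scatter_plot": {("price", "quantitative"), ("volume", "quantitative")},
    }

    def expressive(language, required):
        # A language is expressive iff every required fact is encodable in it.
        return required <= encodable[language]

    for name in encodable:
        print(name, expressive(name, facts))
    # bar_chart True; scatter_plot False (no encoding for the nominal fact)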
Mackinlay’s (1987) work relied upon the assumptions that:
• All users share common conventions on how graphics should be constructed
and interpreted.
• Within each graphical language it is possible to encode all facts from the set
within a single sentence.
Although Mackinlay (1987) demonstrated that expressiveness could be determined
empirically, no such measure existed for effectiveness. The author turned to
observations by Cleveland and McGill (1984) that particular properties of a graphic
(e.g. position, length) played a greater role in enabling users to complete set
tasks effectively. Mackinlay’s (1987) work used these observations to drive the
effectiveness of his system. These observations however provide a far more
subjective view than the empirical measure of expressiveness used within the same
system.
The current research project aims to quantify the sophistication of NLG texts
together with the contribution of user preferences to their effectiveness and
coherency. Mackinlay’s (1987) work suggests these measures cannot be derived
theoretically but will require subjective analysis of the research results.
2.3 Rhetorical Structure Theory
Mann and Thompson (1987, 1988) developed RST as a means of explaining the
coherence of texts. The authors realised that within all coherent texts each element of
the text had some plausible reason for its inclusions and furthermore, there were no
constituent elements omitted. RST provided a framework by which the presence of
these plausible text elements could be analysed and their structures described.
Through the study of numerous texts Mann and Thompson (1987, 1988) were able
to identify key components within coherent texts, providing both their coherence
relations (their “nuclearity” and “relations”) and any groupings of
key components (i.e. “schemas” applicable to a specific genre).
Mann and Thompson (1987, 1988) identified that for many texts a frequent
structural pattern is for two text spans, often adjoining, to be related by some relation. One
span (the “nucleus”) acted as the primary structural element, whilst the other (the
“satellite”) played a lesser role in the text. An alternative relationship arose
where no particular span had the primary role – resulting in “multinuclear”
structures.
The authors identified 29 distinct relationships which could exist between
nucleus and satellite in both nuclear and multinuclear texts and devised a
diagramming method by which these structural relationships could be documented.
The following RST diagram (Figure 2) illustrates “Elaboration” relationships
between the nucleus (the leftmost text) and its adjoining satellite.
Figure 2 RST Elaboration Example: the nucleus “Early research into the automatic
generation of graphic designs was undertaken by Mackinlay (1987).” is elaborated by
the satellite “This seminal work proposed that the style of graphic designs (e.g. bar
graphs, pie charts) could be chosen automatically from the underlying data.”
RST provides a method of stating text structure and has provided the means for
many NLG systems to structure inputs to their document planning/structuring
phases (Dilley et al, 1992; Reiter and Dale, 2000).
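The ordered-tree form that RST gives a text maps naturally onto a recursive data
structure. The following sketch is an illustrative encoding, not a published schema:
relations sit at non-terminal nodes and text spans at the leaves, mirroring Figure 2.

    # Sketch of an RST tree: relations at non-terminal nodes, text at leaves.
    # The encoding is illustrative only.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class RSTNode:
        relation: Optional[str] = None      # e.g. "Elaboration"; None for a leaf
        nucleus: Optional["RSTNode"] = None
        satellite: Optional["RSTNode"] = None
        span: Optional[str] = None          # text carried by a terminal node

    figure2 = RSTNode(
        relation="Elaboration",
        nucleus=RSTNode(span="Early research into the automatic generation of "
                             "graphic designs was undertaken by Mackinlay (1987)."),
        satellite=RSTNode(span="This seminal work proposed that the style of "
                               "graphic designs could be chosen automatically "
                               "from the underlying data."))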
Later authors (e.g. André and Rist, 1995) have argued that graphic elements may be
defined by comparable structure principles to those of text and that text and
graphics can be jointly defined through RST principles. The authors further
outline a method by which the generation process of textual and graphical content
can be coordinated.
The current research will look to generate a restricted target text corpus (Reiter and
Dale, 2000), based on a review of the corpus of input data and expected outputs.
This target text corpus will be expressed through RST and form the basis of
subsequent language generation phases.
2.4 The Linguistics of Punctuation
Nunberg (1990) analysed the linguistics of punctuation from an alternate viewpoint
to his predecessors. He argued that written text should be regarded not as a sub-
category of speech, judged by its ability to convey the spoken language, but as a
distinct grammar in its own right. Nunberg (1990) contended that punctuation used
within the written language displayed the characteristics of a true grammar worthy
of linguistic analysis.
Nunberg’s (1990) punctuation grammar included not only graphic elements, depicted
as non-alphanumeric characters (e.g. commas, semicolons, colons, periods,
parentheses, quotation marks), but also functional elements (e.g. font- and
face-alterations, capitalisation, indentation, spacing). He distinguished graphical
elements by the function they performed within a text:
• Delimiters of one or both ends of a text sequence.
• Separators of two elements of the same type (e.g. separating list elements).
• Typographic distinguishers of an element from its surroundings (e.g.
italics, font- and face-alterations).
He also categorised the forms by which graphical elements could mark text
elements and boundaries:
• As distinct characters (e.g. commas, semicolons, capitalisation).
• As font-, face- and size-alterations (e.g. italics, bold typeface).
• As “null” elements (e.g. spaces, margins, line breaks) for text separation.
As Nunberg noted:
“These formal and functional properties are not entirely independent of
one another. For example, it is in the nature of distinguishers that they
can be realized only as font-, face-, or size-alterations or by
underlining or analogous devices”.
(Nunberg, 1990, p. 53)
He proposed a set of “linearization rules” to map properties of graphic element
form onto those of function and “pouring rules” to define the layout of text
sequences on the page or screen.
By satisfying a series of grammatical constraints Nunberg was able to assign specific
punctuation attributes (e.g. capitalisation; italicisation) to elements of the lexical
grammar and demonstrate a clear separation between the rhetorical and the textual
elements of a text. These linearization rules, specifically those relating to
typographical distinguishers, show strong similarities with methods subsequently
adopted by contemporary mark-up languages (e.g. HTML, LaTeX). The current
research prototype will build on this separation of the rhetorical from the textual
elements of a text (e.g. by defining the display characteristics of the text
independently from its actual content).
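A small sketch of that separation follows. The category names and realisations are
assumptions for illustration, not Nunberg’s rule set; the point is that the functional
role of each punctuation element is kept apart from its graphical form until output
time.

    # Sketch: Nunberg-style separation of punctuation function from form.
    # The linearization table maps functional roles onto graphical realisations.
    LINEARIZATION = {
        "separator": ", ",                 # separates same-type elements
        "delimiter": ("(", ")"),           # delimits both ends of a sequence
        "distinguisher": ("<i>", "</i>"),  # typographic emphasis, HTML form
    }

    def render_list(items):
        return LINEARIZATION["separator"].join(items)

    def distinguish(text):
        open_mark, close_mark = LINEARIZATION["distinguisher"]
        return f"{open_mark}{text}{close_mark}"

    print(render_list(["commas", "semicolons", "colons"]))
    print(distinguish("italics"))   # <i>italics</i>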
2.5 Multimodal Systems
Multimodal systems use more than one presentation medium (e.g. text; graphics; video;
audio; hypertext) to convey a message. Researchers (Mackinlay, 1987; Feiner and
McKeown, 1991) have been quick to recognise the improvements this could make in a
document’s ability to effectively express its meaning. If a picture is truly “worth a
thousand words” then a document using text, graphics and other media in a
complementary fashion must surely be superior to one based on text alone?
Feiner and McKeown (1991) built on the work of Mackinlay in developing their
Coordinated Multimedia Explanation Testbed (COMET). COMET used artificial
intelligence to create a constraint satisfaction program which matched requests
from its users with the underlying data and knowledge of the user’s prior
interaction with the system. The system would subsequently generate multimedia
presentations automatically.
COMET employed separate text and graphics generators (Figure 3):
Figure 3 COMET Architecture (Feiner & McKeown, 1991): a Content Planner passes
logical forms to a Media Coordinator, which annotates them and routes them to the
Text Generator and/or Graphics Generator; media layout and final rendering and
typesetting then produce the text and illustrations.
The “Content Planner” drew on knowledge of its underlying data sources - a static
object source; a rule base to drive text generation; and a geometric knowledge base
for graphic generation. In addition the Content Planner maintained knowledge of
the user context and prior interactions with the system. Output from the Content
Planner consisted of a hierarchy of logical forms which were passed to the Media
Coordinator to determine whether they should be generated as text or graphic
representations.
Feiner and McKeown (1991) decided upon a categorisation of logical forms:
“After conducting a series of informal experiments and a survey of
literature on media effectiveness”.
(Feiner and McKeown, 1991, p.36)
Building on the work of Mackinlay (1987) the Media Coordinator employed
effectiveness criteria to derive six information categories for representation of logical
forms. Physical and locational attributes were represented entirely through graphics,
whilst abstract actions and relationships were generated as text. All other forms (e.g.
physical actions) were generated as a combination of both text and graphics.
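That category-to-medium mapping can be pictured as a simple dispatch table. The
sketch below illustrates the allocation idea only; it is not COMET’s implementation
and names only some of the six categories described in the paper.

    # Sketch of COMET-style media allocation (illustrative, partial category set).
    MEDIA_FOR_CATEGORY = {
        "physical_attribute":   {"graphics"},
        "locational_attribute": {"graphics"},
        "abstract_action":      {"text"},
        "abstract_relation":    {"text"},
        "physical_action":      {"text", "graphics"},  # realised in both media
    }

    def allocate(category):
        # Default to text for any category the table does not cover.
        return MEDIA_FOR_CATEGORY.get(category, {"text"})

    print(sorted(allocate("physical_action")))  # ['graphics', 'text']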
Annotated logical forms were forwarded to the Text Generator, the Graphics
Generator, or both for generation before layout of these elements and final
typesetting. Throughout this process the Text Generator and Graphics Generator
interact, allowing graphics to be placed within the text structure or the text to
reference properties of the graphics.
The Graphics Generator used the annotated logical forms to apply different
attributes (e.g. size, shape, material, colour, lighting) to its pre-defined visual
objects (e.g. pictures) rather than dynamically generate the graphic content. This
differed from Mackinlay’s work where the 2-dimensional graphics were built “on-
the-fly” through measures of expressiveness and effectiveness.
Wahlster et al (1993) developed a similar multimodal presentation system to
COMET. Their WIP system employed a Presentation Planner (analogous to
COMET’s Media Coordinator) and separate text and graphic generators.
Figure 4 WIP Architecture (Wahlster et al, 1993): a Presentation Planner and Layout
Manager, driven by a presentation goal and generation parameters, draw on a
knowledge base (presentation strategies, graphics design strategies, a basic ontology,
a user model and application knowledge such as RAT, covering domain objects like a
mower, an espresso machine and a modem) to perform text design and realization
and graphics design and realization, producing an illustrated document.
The WIP Presentation Planner employed a set of generation parameters (e.g. user
ability, layout preferences) to gauge the suitability of its candidate designs. These
parameters were comparable to the effectiveness criteria used by both Mackinlay
(1987) and Feiner and McKeown (1991). WIP differs from earlier research work in
its clear separation of text/graphics design from its subsequent realisation. This
separation allows opportunities for customisation of presentation elements.
Time constraints prevented the current research from exploring the potential of
multimodal displays. An early objective had been to examine the contribution such
mixed media types made to document understanding and, in fact, the prototype’s
web browser technology was chosen for its widespread support of such media.
Further research is required to quantify the contribution made by multimodal
displays.
2.6 User Customisation
Petre (1995) argued for a considered balance between the relative contributions of
text and graphics to the overall understanding of a multimodal presentation. In
studying visual programming techniques on groups of novice and expert users she
observed considerable differences in their inspection strategies for graphics.
Experts demonstrated effective navigation, a strong correlation of inspection
strategy with goals and attention to secondary notation (e.g. layout, logical flow,
colour conventions). Novices showed wide variations in strategy, some sticking
rigidly to inappropriate strategies whilst others changed strategy in an unpredictable,
chaotic manner. Petre (1995) concluded that graphical readership was an acquired
skill – experts were able to take advantage of such secondary notation cues whilst
novices struggled to accurately interpret them. Her assertions support the view
that multimodal presentations require customisation to the needs of their audience
(Matsushita et al, 2003).
Milosavljevic et al (1996) extended the work on Natural Language Generation
with their PEBA-II system to include the generation of dynamic hypertext. This
illustrated how web sites could adopt NLG to build pages dynamically at the
point of invocation. This flexibility allowed PEBA-II to tailor its output to the
user’s preferences and their previous discourse with the site. These ideas will be
examined further in the current research.
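A flavour of such tailoring is sketched below. The user-model fields are hypothetical
and this is not PEBA-II’s mechanism; it simply shows output branching on stored
expertise and discourse history at generation time.

    # Sketch of preference-tailored hypertext generation (hypothetical user model).
    user = {"expertise": "novice", "seen": {"FTSE100-overview"}}

    GLOSSARY = {"volatility": "how sharply prices swing"}

    def describe(term, user):
        # Expand jargon for novice readers; leave it bare for experts.
        if user["expertise"] == "novice" and term in GLOSSARY:
            return f"{term} ({GLOSSARY[term]})"
        return term

    def link(page, user):
        # Suppress links the reader has already followed (adaptive browsing).
        return None if page in user["seen"] else f'<a href="/{page}">{page}</a>'

    print(describe("volatility", user))    # volatility (how sharply prices swing)
    print(link("FTSE100-overview", user))  # None: page already visited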
In 1997 Calvi and De Bra described adaptive information browsing as a means of
filtering both the navigation links and, to a lesser extent, the content presented to
students at the Eindhoven University of Technology. The authors cited evidence
that comprehension of presentation material relied upon the recipient building a
conceptual model of the information and the semantic relations implicit within it
(Van Dijk and Kintsch, 1983). Calvi and De Bra (1997) argued that the network
navigational links presented in web-based documents placed a high cognitive
overhead on the reader, leading to a reduction in their comprehension of the
material itself. The authors developed an adaptive hypermedia system which
presented users with a subset of the available navigational links (and hence
content). The constituents of this subset were determined on a per-user basis by
modelling each user’s previous navigation through the system. Unlike
Milosavljevic et al’s (1996) PEBA-II system, this implementation did not
demonstrate NLG in the true sense, restricting itself to toggling the visibility of
fixed-format content within the outputs; however it did actively model the user
context.
Initial feedback on Calvi and De Bra’s (1997) system from students suggested
concerns with this approach to information hiding:
“Users seem nevertheless to complain about the impossibility of the
present formalism to provide them with a snapshot of the system’s
complete structure”.
(Calvi and De Bra, 1997, p.272)
Such comments are relevant to the current research. Calvi and De Bra’s (1997)
study demonstrates that although it is possible to modify the design, content and
navigation paths to account for a user’s previous interaction with the system this
may prove counter-productive. Initial evidence suggests that users have in mind a
conceptual model of the system and that altering the waypoints and pathways
through this model reduces their acceptance of and proficiency with the actual
system. The arena of proficiency-adapted information browsing (Calvi and De Bra,
1997) constitutes a significant area of research, centred on the psychological
aspects of human learning and understanding. The current research does not attempt
to explore these topics, stopping short of modelling the user context within the
research prototype.
2.7 Meaning and Markup
Bateman et al (1998; 2001) presented an architecture which unified the data-
aggregating methods of information visualisation and the communicative-goal
techniques of NLG. Their KOMET-PAVE experimental prototype drew on Formal
Concept Analysis (Wille, 1982) and the construction of dependency lattices to
design both text and graphics. The system generated all potential texts and
diagrams from a single pass through these dependency lattices and, through the
extensive use of heuristics, arrived at its final choice of multimodal presentation.
Reiter and Dale (2000) in their milestone textbook “Building Natural Language
Generation Systems” defined standard architectural components in the creation of
NLG systems. These are summarised in Figure 5 below.
Content Determination - The determination of which human-authored texts (the
“corpus text”) will be targeted as NLG output texts (the “target text”).
Document Structuring - The assignment of the output text into message
collections or groups and the relationships that apply to these groups (i.e. the
“rhetorical structure” of the target text).
Lexicalisation - The identification of the words or dictionary of words (the
“lexicon”) which will be applied to the target texts.
Referring Expression Generation - Determination of the expressions that will be
used to refer to entities in the target texts.
Aggregation - The conceptual organisation of these rhetorical structures into
linguistic structures (e.g. sentences, paragraphs).
Linguistic Realisation - The physical organisation of these message groups into
linguistic structures.
Surface Realisation - Mapping abstract structures (e.g. paragraphs) onto the
symbols necessary to display them in a document presentation medium.
Figure 5 NLG Architecture Components
The authors’ work shows a clear distinction between the logical construction of text
and its physical presentation and provides pointers for the current project. Reiter and
Dale’s (2000) use of markup languages (e.g. HTML and LaTeX) to provide surface
realisation is employed in the research prototype as a mechanism for customising
individual user displays.
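Read as a processing chain, the stages of Figure 5 compose naturally. The sketch
below uses trivial stand-in stage bodies (the data shapes and outputs are assumptions)
to show how the components connect; it makes no claim to reproduce Reiter and
Dale’s reference architecture.

    # Sketch of the Figure 5 stage chain; stage bodies are trivial stand-ins.
    def content_determination(data):    return [("close", "FTSE100", 5416.7)]
    def document_structuring(msgs):     return {"relation": "sequence", "msgs": msgs}
    def lexicalisation(plan):           plan["lexicon"] = {"close": "closed"}; return plan
    def referring_expressions(plan):    plan["refs"] = {"FTSE100": "the index"}; return plan
    def aggregation(plan):              return [plan]  # a single sentence group here
    def linguistic_realisation(groups): return ["The FTSE100 closed at 5416.7."]
    def surface_realisation(sents):     return "<p>" + " ".join(sents) + "</p>"

    stages = [content_determination, document_structuring, lexicalisation,
              referring_expressions, aggregation, linguistic_realisation,
              surface_realisation]

    result = {"raw": "lse_feed"}        # the nonlinguistic input
    for stage in stages:
        result = stage(result)
    print(result)                       # <p>The FTSE100 closed at 5416.7.</p>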
Power (2000) described a method by which the rhetorical structure for a text
could be realised through a text structure of sections, paragraphs and sentences
linked by discourse connectives (e.g. “since”, “however”, “whilst”) to mark the
rhetorical relations. He contended that text structuring could be formulated as a
constraint satisfaction problem whereby all potential texts were generated by
satisfying the text structure constraints and the most suitable of these texts could be
identified through the application of additional constraints. Power (2000) argued
that whilst research efforts have largely focused on building rhetorical structures to
organise elementary propositions into hierarchies, less attention has been given to
realising these rhetorical structures as text structures. He argues that such text
structures are of sufficient complexity to require consideration early in the
construction of the texts rather than on a per sentence basis.
Power illustrated how the ICONOCLAST text structuring system was able to
generate all possible candidate text structures from the rhetorical structure. Each
candidate text structure represented an ordered tree in which the non-terminal
nodes are labelled as text categories, and the terminal nodes held either discourse
connectives or propositions (i.e. the content of the assertion).
For example the following output text might be generated through NLG:
“Energy stocks closed higher due to rising prices; however, the
FTSE100 falls to record low.”
Mann and Thompson (1986, 1987) categorised the style of relational proposition
which links these phrases/discourse connectives as a “concession”, whose rhetorical
structure is represented in Figure 6 below.
concession
    NUCLEUS: falls(FTSE100, record low)
    SATELLITE: cause
        NUCLEUS: closed(energy stocks, higher)
        SATELLITE: rising(energy stocks, prices)
Figure 6 RST Concession Example
Power (2000) contended that each rhetorical structure would be applicable to one or
more text structures; for example, the following could be produced from the RST
structure above:
sentence
    text_clause
        text_phrase: closed(energy stocks, higher)
        text_phrase: "due to"
        text_phrase: rising(energy stocks, prices)
    text_clause
        text_phrase: "however"
        text_phrase: falls(FTSE100, record low)
Figure 7 Concession Text Structure
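Walking that tree left to right and splicing in the discourse connectives recovers the
example sentence. A sketch of the traversal follows; the node encoding and joining
rules are illustrative assumptions, with propositions pre-worded as strings for brevity.

    # Sketch: realising Figure 7's text structure by left-to-right traversal.
    sentence = [
        ["energy stocks closed higher", "due to", "rising prices"],   # clause 1
        ["however,", "the FTSE100 falls to a record low"],            # clause 2
    ]

    def realise(clauses):
        text = "; ".join(" ".join(phrases) for phrases in clauses)
        return text[0].upper() + text[1:] + "."

    print(realise(sentence))
    # Energy stocks closed higher due to rising prices; however, the FTSE100 ...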
The ICONOCLAST system assigned both TEXT-LEVEL attributes and
INDENTATION levels to each node within each text structure. An arbitrary
number of TEXT-LEVELs are possible, with five chosen by the designers:
TEXT_LEVEL Aggregation Level
0 text-phrase
1 text-clause
2 text-sentence
3 paragraph
4 section
Figure 8 ICONOCLAST Text Structure Levels
Nodes are also aligned by INDENTATION level, ranging from zero (no
indentation) to the chosen maximum. Each candidate text must adhere to the rules
of the text structure, namely that each TEXT-LEVEL consists of one or more nodes
of a lower TEXT-LEVEL value (e.g. sections consist of paragraphs and paragraphs
consist of text-sentences). It is argued that all candidate solutions could be
generated by the addition of two further rhetorical structure node attributes –
ORDER (the linear position of the texts relative to its peers) and CONNECTIVE
(e.g. ‘since’; ‘consequently’).
Candidate solutions are obtained through a number of steps (a toy sketch follows the list):
1. Add TEXT-LEVEL and ORDER attributes to each of the rhetorical
structure nodes.
2. Assign domains to both the TEXT-LEVEL and ORDER attributes (i.e. the
possible ranges of these attributes for each node).
3. Apply constraints (e.g. root node should have a higher TEXT-
LEVEL than its child nodes).
4. Compute all possible combinations.
5. Compute complete text structures which satisfy the text structure
formation rules. This includes adding discourse connectives to either the
nucleus or the satellite.
6. Validate all text structures (e.g. ensure each child node has a parent at the
TEXT-LEVEL directly above it).
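Steps 1 to 6 describe a finite-domain generate-and-filter search. The toy sketch below
uses invented node names and a single constraint, far simpler than ICONOCLAST’s
constraint set, to illustrate the pattern.

    # Toy generate-and-filter sketch of steps 1-6 (invented nodes, one constraint).
    from itertools import product

    nodes = ["root", "nucleus", "satellite"]
    TEXT_LEVELS = range(5)   # 0=text-phrase ... 4=section (Figure 8)

    def valid(assignment):
        # Step 3 constraint: the root must outrank its child nodes.
        return (assignment["root"] > assignment["nucleus"] and
                assignment["root"] > assignment["satellite"])

    candidates = []
    for levels in product(TEXT_LEVELS, repeat=len(nodes)):
        assignment = dict(zip(nodes, levels))
        if valid(assignment):            # steps 4-6 filter the combinations
            candidates.append(assignment)

    print(len(candidates), "candidate structures survive")   # 30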
Power also details the practical difficulties in completing all possible combinations
and the necessity to pre-filter texts before developing their rhetorical and text
structures. The current work will look at this and the subsequent works by
Bateman and Delin (2001) and Power et al (2003) in isolating the structural and
linguistic components of a document.
Bateman and Delin (2001) and Bateman et al (2002a; 2002b) investigated the genre
of multimodal documents. These authors contend that documents satisfy
communicative goals on five levels:
Content Structure - The structure of the information that is to be
communicated.
Rhetorical Structure - Rhetorical relationships which exist between
content elements (How the content is ‘argued’).
Layout Structure - The nature, appearance and position of document
elements.
Navigation Structure - Structures used to support the communication of
the document’s message (e.g. Titles, colour,
grouping).
Linguistic Structure - The structure of language used to realise the
document.
The authors believe the manner in which a document harmonises these structures
imparts to it a certain genre. This genre is further enhanced by its satisfaction of a
series of constraints:
Canvas Constraints - Constraints arising from the physical nature of the
document produced.
Production Constraints - Constraints arising from its production
technology.
Consumption Constraints - Constraints arising from the time, place and
manner in which the document’s information is
imparted.
Bateman and Delin (2001) believed all document designs could be critiqued using
these techniques to yield predictions of their usability, an approach supported by
Power et al (2003) who argued that document structure should be defined as
distinct elements. Power et al (2003) envisage three such elements: Rhetorical
Structure, Text Structure and Abstract Document Structure (Figure 1). Their work
builds on Power’s earlier research (2000) in which he identified both the rhetorical
structure and the text structure of a document (Figure 7 & Figure 8). The authors
argue that the text structure, through its arrangement of text and use of font
variations, also imparts a significant graphical component to the text which directly
contributes to its meaning. Power et al (2003) argue this third element, termed the
“Abstract Document Structure”, must be considered in isolation from the text
structure itself. The authors believe that to view the Abstract Document Structure as
a component of text structure limits choices of both layout and wording. Using
examples they illustrate how the deferral of abstract document structure (e.g. text
layout and font characteristics) until after the formalisation of the text structure
creates restrictions on the text and layout possibilities, with text layout decisions
taken at a sentence or paragraph level rather than on a document-wide level.
In part Power et al (2003) build their arguments on the earlier work of Nunberg
(1990) who identified the need to separate the mapping of syntactic structures onto
the lexical elements of a text grammar from its functional elements (e.g. font- and
face-alterations), differentiating between concrete features of text structure and the
abstract (Chapter 2.4). Power et al (2003) expanded upon the author’s earlier
research (Power, 2000) to illustrate how rhetorical structure could be transposed
into an ordered text structure tree and highlighted the problems of doing so (e.g.
with ELABORATION). The paper detailed their work on the ICONOCLAST
system and the generation of multiple texts/text structures from the same rhetorical
structure. ICONOCLAST used a five stage process to generate natural language:
Planning Module – Organises the document into a single rhetorical
structure, in which each non-terminal node is
represented by a rhetorical relationship and each
terminal node by a simple proposition.
Preliminary Module - Selects simple propositions from a knowledge
base and organises these into arguments.
Document Structurer - Distributes these arguments into sections,
paragraphs and text-sentences.
Syntax Realiser - Formulates the wording of propositions.
Formatter - Applies the abstract document structure (e.g.
graphical elements; font variations; tabbing).
This step-wise process is very relevant to the NLG tasks necessary for the current
research. It illustrates how the project might utilise data fed into a data repository to
produce text content for publication within a web site. ICONOCLAST used
constraint-resolution methods to formalise the potential valid options for solving
the rhetorical structure. Again this method has relevance to the current research and
the use of constraint-satisfaction programs, such as SICStus Prolog, to achieve this.
Power et al (2003) determined that through the use of further classifications they
would be able to generate all possible document permutations. The authors were
unable to provide a means by which the most suitable permutation might be
selected, using intuition for their simple examples. This is the same issue
encountered by Mackinlay (1987) and others in their search for the “effectiveness”
of texts. The current research will seek responses from a survey to provide an
independent measure of effectiveness for the generated outputs.
Matsushita et al (2003) extended the early work of Mackinlay (1987). The authors
considered two factors in chart selection – the type of chart display and the type of
user utterance. They reasoned that frequently users are unable to succinctly state
how to query the underlying data, particularly for large datasets, but rely instead on
exploratory data analysis. Through a series of intermediate query steps and
associated graphic representations of the results users are able to iterate towards a
final solution. Each successive query builds upon knowledge learnt from the
preceding queries. Matsushita et al (2003) argue that the resulting graphic presented
to the user must consider this user interaction in its chart generation. The authors
argue that decisions on graphic effectiveness must include the context in which the
user poses the query.
2.8 Summary
NLG has been an active area of research since the 1980s. Mann and Thompson’s
(1987, 1988) work on RST provided a framework by which discourse could be
analysed and described. Mackinlay’s (1987) work proposed a method by which
graphic designs could be selected on the basis of composition algebra. Subsequent
authors have elaborated and expanded upon these early endeavours.
The current research seeks to determine whether a commercial data source can
provide sophisticated page content and whether building user preferences into the
resultant pages improves their effectiveness. This work builds on the earlier work
of others. Mackinlay (1987) postulated that effectiveness could not be derived
theoretically whilst no theorem of human perceptual capabilities existed. Twenty
years on this remains the case and the current research adopts a survey strategy to
gauge the views of respondents.
Mann and Thompson’s (1987, 1988) work on rhetorical structure theory will be
used within the current research. Through the methods proposed by Reiter and Dale
(2000) the lexicon, style and rhetorical structure adopted in manually-authored texts
will be documented to form the basis of the target text corpus generated by the
prototype through NLG.
Petre’s studies (1995) on multimodal presentations concluded that graphical
readership was an acquired skill, with experts able to take advantage of such
secondary notation cues whilst novices struggled to accurately interpret them.
Her assertions support the view that multimodal presentations require customisation
to the needs of their audience (Matsushita et al, 2003). The current research
investigates the customisation of page presentation and, through detailed analysis of
survey responses, identifies those elements regarded as most effective by users.
Bateman and Delin (2001) and Bateman et al (2002a; 2002b) argued for the
division of text into five structural levels whilst Power et al (2003) envisaged a
separation of the abstract document structure from its text or rhetorical structure.
These divisions have affinities with current mark-up languages (e.g. HTML;
LaTeX), founded on the principle of clear separation between document
presentation and discourse structure. The current research uses web-based browser
technology as its presentation medium, employing HTML and CSS (Cascading
Style Sheets) to isolate the visual components from the underlying structure.
Chapter 3 Research Strategies
3.1 Introduction
Before moving on to the primary research it is worth reflecting on the strategies
and methods available to the study. Whilst the review of the literature placed the
current work in context it yielded little on the techniques necessary to undertake
this endeavour. This chapter reviews these strategies and assesses their
applicability.
3.2 Review of Research Strategies
Research activity is, by its very nature, a wide-ranging and varied discipline.
Methods vary with the field of study, the intended purpose, the approach adopted and
the nature of the work (Sharp et al, 2002). Despite this each study must be
repeatable, verifiable and contribute to the sum of knowledge. These demands create
a need for general principles which can be employed, singularly or collectively, to
research. Denscombe (2007) reviewed the major strategies available for small-scale
social research projects and proposed a grouping of eight categories:
1. SURVEYS
2. CASE STUDIES
3. EXPERIMENTS
4. ETHNOGRAPHY
5. PHENOMENOLOGY
6. GROUNDED THEORY
7. MIXED METHODS
8. ACTION RESEARCH
Robson’s (1983) review added a further category of “Psychological Experiments” to
this list.
3.2.1 Survey
Surveys provide established techniques by which hypotheses can be tested against
the actual views or actions of respondents and are widely used in social research
projects. Denscombe (2007) summarised these techniques as: postal questionnaires;
internet surveys/questionnaires; face-to-face interviews; telephone interviews;
documents; observations.
Surveys give wide coverage of the problem domain and are often selected to provide
a broad range of views and responses on the subject area. Inevitably, time or
resource limitations mean surveys cover only a subset of the total population (a
“sample”). Rowntree (2000) emphasised the difference between surveys which
describe or summarise their sample and those which use it to make inferences on the
wider population. In the former observations have applicability only to the sample
itself whilst in the latter there is an assumption that the sample is representative of the
total population and that inferences can be drawn from the sample for every instance
of the population. A sampling strategy which looks to make inferences about the total
population must address the issues of random sampling and skewed distributions
within its sample.
Surveys focus on a point in time, a snapshot, from which a random sample of views
or actions is observed or inferred. Further snapshots can be taken subsequently but
the random nature of sampling means these samples are unlikely to contain the same
respondents. This effect restricts the ability of surveys to analyse or describe
variations over time except at a macro level. Denscombe (2007) also noted their
empirical nature, targeting the measurement and recording of tangible attributes.
Such characteristics would prove restrictive in a study of human relationships but are
well suited to the current project, where changes in independent variables cause
quantifiable changes in the associated dependent variables.
The survey approach is applicable to the current research which focuses on a point in
time and seeks to quantify the influence of independent variables on the NLG
outputs. The study looks to infer general characteristics from its sample, both for the
effectiveness/sophistication of share reports and for web-based NLG documents in
general. Implicit in this is the need to target a representative sample, whilst
acknowledging that domain experts will be geographically dispersed and represent a
small proportion of the total UK population. The study’s focus on web-based reports
suggests an internet survey/questionnaire is well suited to its purposes.
3.2.2 Case Studies
In many ways a case study represents the antithesis of the survey, focusing on the
specific and studying few instances in detail rather than a higher level examination of
the broader population. Case studies tend to focus on relationships and processes,
making them particularly appropriate to studies of human sociology and psychology.
A case study will often involve monitoring the phenomenon in its normal setting and
allowing the researcher to observe interactions between variables. Multiple methods
are commonly employed to analyse the phenomenon and results are more qualitative
than those of a survey, relying on description rather than statistics to characterise the
phenomenon under study. As with surveys a case study may look to infer the
applicability of its results beyond the sample.
A case study approach is applicable to this research and was initially selected as a
means of obtaining feedback. The approach requires a limited number of domain
experts to make significant time available to the study (perhaps three sessions of one
hour each). However, early enquiries to share clubs within a 30-mile radius revealed
that only a handful of clubs existed and of those only one met regularly and was
willing to participate. Further investigation showed that this would not yield
sufficient domain experts, with meetings scheduled only monthly and very low
attendance figures (usually two or three and occasionally fewer members). Assuming
not all members used web-based share reports this represented an unrewarding line
of analysis. The research adopted an internet-based survey/questionnaire approach to
gain a large sample population, although this brought attendant issues of how
representative the sample was of the overall population (Chapter 4).
3.2.3 Experiments
Experiments are characterised by the control they exhibit over their environment and
their identification of causal factors. They aim to isolate the study from external
influences and manipulate input variables to understand their influence on the
resulting outputs. Experiments use empirical measurements and observations to draw
conclusions from the results of the study. These conclusions have applicability
beyond the experiment itself and, through careful isolation of competing factors, the
experiment infers the causes of phenomena observed in the real world.
The current study investigates a complex research area. In measuring and appraising
the “effectiveness” of user customisation we need to understand what domain experts
understand by “effective”. Furthermore, deciding if a “commercial data source can
provide the quality of data required to generate sophisticated web page content”
(Chapter 1.3), requires us to test the users’ understanding of “sophisticated”.
The experimental approach provides the opportunity to achieve this. It isolates
external influences, keeping the underlying data consistent across the experiments
and removing the intangible effects of market sentiment and psychological pressures
associated with share trading. By providing a prototype environment in which causal
factors are controlled, it is able to focus on the effects the remaining factors have on
the results. The internet-based survey/questionnaire forms an integral part of the
experiment (Appendix B) and probes respondents on specific cause-and-effect
relationships (e.g. “How significant was colour in identifying text groups?”; “How
similar was the grammar to that used in the ADVFN reports?”).
3.2.4 Ethnography
Ethnographic studies analyse people and groups through their lifestyles,
understandings and beliefs. This strategy has its origins in anthropological research
and the study of alien cultures. Ethnography is a detailed examination of human
society and social behaviour, often conducted from an insider’s viewpoint. This
approach offers little to the current research.
3.2.5 Phenomenology
Phenomenology is a strategy which deals with human perception, studying people’s
feelings, emotions, beliefs and attitudes. It analyses phenomena through a descriptive
approach aimed at interpreting its findings rather than quantifying them. The
approach is characterised by seeking to describe and explain phenomena without
abstracting, classifying or quantifying them.
The current research examines the coherency of web designs and the ability of
commercial data to provide sophisticated NLG content. Although elements of this
assessment are subjective (e.g. interpretation of designs as they appear to others)
phenomenology’s focus on feelings and emotions does not offer anything to the
overall aims of this research.
3.2.6 Grounded Theory
Grounded Theory is an approach aimed at the generation of new theories rather than
the verification of existing theories. The approach focuses on observations of the real
world to build theories around these observations on the basis of empirical research.
This strategy offers little to the current research which looks to challenge Dale et al’s
(1998) theory that commercial data is too noisy to generate sophisticated NLG
content and Matsushita et al’s (2003) assertion that user preferences lead to
improvements in web-page effectiveness.
3.2.7 Mixed Methods
A mixed method strategy emphasises the benefits of an approach which combines
quantitative and qualitative analysis to improve research results. A central
characteristic of the approach is that the study is not seen as an “either/or” decision
between empirical and subjective analysis but a consistent application of both. Mixed
method advocates select research techniques on their ability to clarify the problem
domain, not on the basis of their categorisation as belonging to a particular approach.
The current study is not specifically designed as a mixed method approach, although
the incorporation of a survey within an experiment shows its applicability to the
work. Furthermore the open questions within the survey require an alternative
approach to the empirical path taken for the multiple choice closed questions.
3.2.8 Action Research
Action research is a practical strategy aimed at real world problems. Both researcher
and respondents are influenced by the study and often become active collaborators in
achieving its goals. The strategy is geared towards engineering change in real world
situations and is cyclic in approach, each change feeding back into a cycle of
renewed actions.
The current research does not seek to establish real world change. It is expected that
users might freely donate a little of their time to the study, but it is unrealistic to expect
respondents to become active collaborators in addressing the issue of sub-optimal
web designs.
3.2.9 Psychological Experiments
Robson (1983) reviewed strategies for designing psychological experiments and
identified the need to differentiate between variables manipulated by the researcher
(“independent variables”) and those variables observed by the researcher to
determine the effect of this manipulation (“dependent” variables). Independent
variables usually represent study inputs whilst dependent variables are its outputs.
Within the current research user preferences will act as independent variables whilst
the effectiveness of the generated pages is the dependent variable. In this sense the
research shows the characteristics of an experimental strategy.
Robson (1983) also refers to independent variables whose value is not derived from a
numeric scale but through a subjective categorisation (e.g. “big”, “small”, “often”).
Such variables do not lend themselves to rigorous quantitative analysis but to more
qualitative analysis. The open questions of the research questionnaire are of this
nature. The questionnaire ensures respondents have an opportunity to expand on their
responses by supplementing the closed questions with a number of open ones.
Analysis of these responses shows the characteristics of a mixed method strategy.
3.3 Summary
The study looks at current practices within NLG and examines the use of constraint
satisfaction methods. An experimental approach is adopted in which the inputs to a
prototype system are carefully controlled and the factors influencing the effectiveness
and sophistication of the resulting reports carefully assessed. Existing work in this field
(Wahlster et al, 1993; Reiter and Dale, 2000; Power et al, 2003) is used as the basis for
the research prototype.
The prototype separates the report’s presentational elements from its rhetorical and
text structure through the use of cascading style sheets. Users are encouraged to
exploit this separation in customising the presentational aspects of the report. The
research also adopts a survey approach in eliciting the views of domain experts
through an internet-based survey/questionnaire (Appendix B). The prototype
dynamically generates a range of web page designs which, to varying degrees,
incorporate aspects of user preference.
Chapter 4 Data Acquisition
4.1 Introduction
This chapter extends the research strategies previously outlined to document specific
methods used for the current research. Acquisition techniques were applied both in
determining the necessary inputs for the prototype and gathering user feedback on its
outputs. Inevitably much of the input data for the handwritten reports came from
sources unavailable to the study (e.g. inflation rates, market sentiment, foreign
exchanges). The text that could be reproduced by the prototype was determined
through analysis of the rhetorical and textual structure of the handwritten reports,
together with the creation of a dictionary of terms used by human authors. An
overview of the research prototype is provided, along with the commercial data used
and the process by which the range of output texts was determined.
4.2 Data Sources
This research study focused on the structure and content of web-based reports aimed
at informing the reader of daily share price movements for companies listed on the
London Stock Exchange. Reports of this nature already exist and an internet search
yields a wealth of information, ranging from the LSE itself
(www.londonstockexchange.com) to internet-based news organisations (e.g.
http://uk.reuters.com/business/markets; http://money.uk.msn.com) and the on-line
presence of traditional broadsheets
(http://business.timesonline.co.uk/tol/business/markets/; http://www.ft.com/markets).
This project does not attempt to study each of these sources, recognising that the
differences in their reporting styles would mask any variations caused by adjustments
to the study’s independent variables. Attempts to assess the effectiveness of user
customisation or the sophistication of NLG text are unlikely to prove fruitful when
the benchmark against which they are compared represents a collection of varying
styles. The current research opted for a single information source from internet
company ADVFN (http://www.advfn.com/) to provide both daily versions of their
own share reports and access to the underlying LSE data. Although the ADVFN
reports and data are publicly available upon payment of a subscription fee their
content is clearly aimed at trading experts with their web site geared towards
monitoring, charting and reporting stock market movements for trading purposes.
The study subscribed to ADVFN’s daily market data service (“Level 1” service), to
receive a trading summary each evening for all shares listed on the LSE (Table 1).
Variable | Description
TIDM | Share identifier (Tradable Instrument Display Mnemonic)
Opening | Opening price at start of trading day
High | Highest price achieved during the day
Low | Lowest price achieved during the day
Closing | Closing price at end of trading day
Volume | Volume traded during the day
Table 1 LSE Data Received
The prototype provided access to LSE data and ADVFN reports across a three month
trial period (Sep to Nov 2008) to simulate a wide variety of actual market conditions.
Examples of an ADVFN share report and the Level 1 data are provided in Appendix
A. It should be noted that whilst not required for the aims of this study the prototype
could also have received market data in real-time (“Level 2” service), allowing share
reports to change dynamically throughout the trading day.
4.3 Prototype Design
The study employed a variety of technologies to produce a prototype capable of
addressing the aims of the research. The Level 1 data was manually downloaded
from ADVFN each evening as comma-separated files and uploaded into a MySQL
database (Figure 9). The ADVFN data values were unchanged by this process, with
no validation, error correction or data manipulation performed during committal to
the Daily_Values table. Subsequent use of this data was also read-only to ensure all
experiments were repeatable, albeit with minor variations of the actual texts
produced due to the dynamic nature of NLG. FTSE100 index values were committed
to the Index_Values table by a similar process whilst the remaining tables were
populated through a one-off data entry exercise.
Figure 9 Prototype - Entity Relationship Diagram
The database also illustrates future design considerations. Whilst the Indexes table
contained only a single entry for the FTSE100 it could easily be extended to other
market indexes. Similarly the views (labelled “VW_”) allowed market trends to be
analysed over time, a feature not implemented due to time constraints.
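As a concrete illustration of the nightly load described above, the sketch below reads one evening’s comma-separated download and commits it verbatim. The connection details, file name and Daily_Values column names are assumptions made for this sketch rather than the prototype’s actual schema (Figure 9).

<?php
// Sketch of the nightly Level 1 load: read the comma-separated download and
// commit it unchanged. Column names and CSV layout are assumed for illustration.
$pdo = new PDO('mysql:host=localhost;dbname=prototype', 'user', 'password');
$insert = $pdo->prepare(
    'INSERT INTO Daily_Values (tidm, trade_date, opening, high, low, closing, volume)
     VALUES (?, ?, ?, ?, ?, ?, ?)');

$handle = fopen('level1_2008-11-03.csv', 'r');   // hypothetical file name
while (($row = fgetcsv($handle)) !== false) {
    list($tidm, $open, $high, $low, $close, $volume) = $row;
    // No validation, error correction or manipulation, as described above.
    $insert->execute(array($tidm, '2008-11-03', $open, $high, $low, $close, $volume));
}
fclose($handle);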
The general purpose scripting language PHP provided the application “glue” between
the database, the SICStus Prolog constraint satisfaction program and the HTML web
pages (Figure 10). From the HTML interface individual display/data preferences
were recorded as transient session variables, lost on termination of the session to
ensure respondent anonymity. Share reports were invoked from the “Reports” page
(Appendix C) by simply choosing the required date, with PHP querying the database
for all share values and FTSE100 movements relating to this date.
Figure 10 Prototype – High Level Design
The script interrogated the direction of movement for the FTSE100 on the chosen
day and, based on whether the index was rising, falling or static, altered the lexicon
of terms employed by the SICStus Prolog constraint satisfaction program to build its
outline sentences. These sentence templates were subsequently merged with the data
values queried previously and the most significant movements built into complete
sentences. The PHP queried the data and display preferences of the user adjusting the
content and presentation layout of the NLG report before presenting it at the browser.
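The sketch below approximates this glue. The session keys, table and column names and the Prolog invocation are illustrative assumptions; only the overall flow (preferences held as transient session variables, the FTSE100 direction selecting the lexicon, sentences returned for display) mirrors the prototype.

<?php
// Illustrative glue only: names and the Prolog call are assumed, not actual.
session_start();
$_SESSION['font_size'] = isset($_POST['font_size'])
                       ? $_POST['font_size'] : 'medium';   // transient preference

$date = $_POST['report_date'];
$pdo  = new PDO('mysql:host=localhost;dbname=prototype', 'user', 'password');

// The FTSE100 movement on the chosen day decides which lexicon is used first.
$stmt = $pdo->prepare('SELECT opening, closing FROM Index_Values WHERE trade_date = ?');
$stmt->execute(array($date));
$idx = $stmt->fetch(PDO::FETCH_ASSOC);
if ($idx['closing'] > $idx['opening'])     { $direction = 'rising';  }
elseif ($idx['closing'] < $idx['opening']) { $direction = 'falling'; }
else                                       { $direction = 'static';  }

// Hand the date and direction to the Prolog program, which returns outline
// sentences; these are then merged with the day's share values (not shown).
$sentences = shell_exec(
    "sicstus -l report.pl --goal \"generate('$date',$direction),halt.\"");
echo "<div class=\"report {$_SESSION['font_size']}\">$sentences</div>";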
The prototype encouraged respondents to experiment with the site
(www.ianwhitby.co.uk), revise their preferences and assess the effect on the
generated reports. Each NLG web page included a hyperlink to an equivalent
handwritten report, displayed in a separate browser window, for comparison with the
prototype’s output (Figure 11).
Figure 11 Comparison of NLG and Manual Reports
The prototype design shows simple navigation between pages, engendering a sense
of workflow (Figure 12). The principal pages in this workflow were accessed through
a tabbed menu structure with the feedback questionnaire marking the culmination of
a user’s interaction with the system.
Figure 12 Prototype - Workflow
4.4 Questionnaire
The research questionnaire (Appendix B) illustrated a number of characteristics:
1. The number of questions was limited to twenty-three to encourage
respondents to fully complete the questionnaire. As Denscombe (2007)
cautioned:
“…there is, perhaps, no more effective deterrent to answering a
questionnaire than its sheer size”.
2. Formats were standardised across the questionnaire to ease comprehension.
3. Respondents were not asked to login or provide any information by which
they might be identified. The author believes the response rate would have
been considerably lower had respondents been required to identify
themselves.
4. Most questions were multiple choice and closed in style (i.e. answers were
restricted to the options provided). This format was chosen to allow
respondents to fully complete the questionnaire with minimal effort.
5. Multiple choice answers were presented as a 5-star rating system
accompanied by supporting descriptions to aid comprehension.
6. Three open questions were provided for more expansive answers.
7. All questions were optional, with the author choosing to allow partially
completed questionnaires in the results as the subject matter was deemed
likely to cause omitted answers. The author considered that forcing
respondents to complete all questions would significantly lower the response
rate. The responses support this view, with five of the seventeen
received failing to fully answer the closed questions. Rejecting
these results from the analysis would have excluded a significant proportion
of the sample (29%).
8. With no login procedure there was no limit on the number of responses an
individual could submit and the prototype did not validate the answers
received. This decision was taken to increase the response rate but did
increase the risk of accidental or deliberate contamination of the sample.
9. Responses were written to a file on the server in a format that could be
directly loaded into the database for statistical analysis, reducing the risk of
transcription errors (a minimal sketch of this capture follows the list).
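A minimal sketch of the response capture in point 9, assuming hypothetical question field names and file location:

<?php
// Sketch of the response capture; field names and the file path are
// hypothetical. One CSV line per submission, loadable straight into MySQL.
$answers = array(
    date('Y-m-d H:i:s'),                                             // submission time
    isset($_POST['years_trading'])     ? $_POST['years_trading']     : '',
    isset($_POST['trading_frequency']) ? $_POST['trading_frequency'] : '',
    isset($_POST['open_comments'])     ? $_POST['open_comments']     : '',
);
$handle = fopen('/var/www/data/responses.csv', 'a');
fputcsv($handle, $answers);   // quoting protects free-text (open) answers
fclose($handle);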
Prior to the trial period the author conducted a pilot study with work colleagues
within the Technical Assurance/Design Authority department at National Grid.
The feedback received led to improvements in both the questionnaire and the site
itself. The prototype was subsequently published and feedback sought from
individuals and organisations with appropriate share trading knowledge (Table
2).
Individual/Organisation | Potential Number of Respondents | Method of Communication
Energy Trading Department – National Grid | 10 | Face-to-face communication and follow-up meetings.
Share trading clubs (generally internet-based) | 200 | Email to the Secretary of each club requesting he/she forward the request to the members. (N.B. Twenty clubs were contacted – the figure assumes an average membership of ten.)
Family and Friends | 40 | Letters encouraging those with share trading experience to respond to the survey or to forward the link on to others.
Work Colleagues | 100 | Email encouraging those with share trading experience to respond to the survey or to forward the link on to others.
M801 Students | 30 | Message posted on the M801 Chat Forum encouraging those with share trading experience to respond to the survey.
Total | 380 |
Table 2 Users Targeted by Survey
From the target sample of 380 users it was felt 25-30 responses could be achieved.
National Grid traders and share club members would comprise a high proportion of
domain experts, so a 10% response rate was envisaged, whilst a 5% response from
other groups would give a total sample of approximately 30.
The anonymous nature of the responses was identified as a potential problem. As
Denscombe (2007) pointed out:
“Questionnaires offer little opportunity for the researcher to check the
truthfulness of the answers given by the respondents”.
The author appraised this risk and chose to accept it rather than opt for the small
number of respondents achievable through face-to-face interviews (Chapter 3.2.2). As
mitigation the following steps were taken:
1. The target sample comprised domain experts operating in environments
where the aims of the research were likely to be well received.
2. The target sample was sufficiently large for a number of erroneous responses
to be accommodated without undermining the results.
3. Specific questions were asked to identify domain experts:
How many years have you been trading shares?
> 10 years 5 - 10 years 3 - 5 years 1 - 3 years 0 years
How regularly?
Daily Weekly Monthly Rarely Never
Which media type(s) do you use for your share reports?
Newspaper/Magazine Television Teletext Internet Datafeed
Other
How regularly do you read share reports?
Daily Weekly Monthly Rarely Never
Figure 13 Survey – Trading Experience
4. Additional “expert” questions were posed, requiring domain knowledge to
answer correctly, to provide a qualified view of the respondent’s experience:
What do you understand by the "AIM" market?
A listing of the top 100 companies within the Main Market
A listing on the LSE for smaller companies which offers less regulation
A listing of defence stocks
The Asian Investment Market
Don't Know
What do you understand by "Going Long"?
Buying futures which commit you to buy the asset at a set price in the future
Selling futures which commit you to sell the asset at a set price in the future
Buying shares as the price of the share falls
Selling shares as the price of the share falls
Don't Know
Figure 14 Survey – Domain Experts
The responses received were subjected to a series of statistical and analytical tests and
displayed no evidence of malicious misuse of the prototype.
4.5 Analysis of Handwritten Reports
ADVFN share reports are published twice daily in HTML format and emailed to
subscription members. The reports are derived from knowledge of LSE share
movements and an understanding of political and economic influences. The abstract
document structure of the reports remained broadly consistent across the trial period
as shown by the wireframe below (Figure 15).
[Figure 15 is a wireframe of the evening bulletin (“ADVFN III Evening Euro Markets Bulletin”) annotated with its structural elements: Report Title; Report Date (right-hand side, bold); LSE Section Heading (bold, underlined, larger font); LSE Headline (bold); FTSE100 Content (index movements and editorial); Market Analysis Content (political and economic factors affecting global markets); Trend-following FTSE100 Shares (shares showing significant movements in the same direction as the FTSE100, plus editorial on market trends and sentiment); Trend-bucking FTSE100 Shares (significant movers in the opposite direction); Non-FTSE100 Shares (LSE stocks outside the FTSE100, generally trend-followers then trend-buckers, with a less rigid abstract structure). Body text throughout uses standard font weight and size.]
Figure 15 ADVFN Report - Abstract Document Structure
All reports began with the report title at the top of the page and the report date in the
top right-hand corner. All text was justified to align with left and right-hand margins
and, with the exception of the LSE section heading, used the same font size and style
throughout. The LSE section heading differed from the rest of the report in having
larger, underlined text in blue. The LSE headline, in common with all proper nouns,
was displayed in bold typeface and demonstrated a simple rhetorical structure aimed
at summarising the day’s share movements. Using 13th August 2008 as an example,
the LSE headline shows a simple nucleus of “London shares close lower” supported
by elaborations of additional information (Figure 16).
[Figure 16 shows this RST analysis: the nucleus “London shares close lower” with elaboration satellites “Wall Street drops” and “U.S. retail sales fall”.]
Figure 16 RST Example for LSE Headline
Beneath the LSE headline the ADVFN report positioned a paragraph detailing
movements of the FTSE100 index and followed this with several paragraphs of
market analysis. Below these, movements of FTSE100 stocks which echoed the trend
of the FTSE100 were detailed. This section generally reported the top 10 to 15
movers, providing price information and market analysis. This trend-following
section was followed by details of the top 10 to 15 movers in the opposite direction.
The report ended with a roundup of shares outside of the FTSE100. Figure 17
illustrates these characteristics for the 13th August 2008 evening report.
[Figure 17 reproduces the 13th August 2008 evening report annotated with its structural elements: Report Title; Report Date; LSE Section Heading; LSE Headline; FTSE100 Content; Market Analysis Content; Trend-following FTSE100 Shares; Trend-bucking FTSE100 Shares; Non-FTSE100 Shares.]
Figure 17 ADVFN Report (13th Aug 2008)
From this abstract document structure it is apparent that the reports cover both the
highest risers and fallers whatever the overall direction of the market; only the order
in which they are reported changes. The research prototype used two distinct
lexicons, one for rising stocks and another for fallers; only the order in which they were
used changed to reflect the direction of the FTSE100. The prototype did not have the
relevant inputs to generate the “Market Analysis Content” section of the report and,
in fact, it is questionable whether a section with such high editorial content could be
generated through constraint satisfaction methods. This section did not appear in the
resulting NLG reports.
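For illustration only, the following PHP sketch approximates the ordering behaviour the prototype implemented in its SICStus Prolog program; the verb lists are examples drawn from the corpus analysis of Chapter 4.6.

<?php
// Illustrative only: the prototype performed this in Prolog. Both groups are
// always reported; the FTSE100 direction decides which lexicon leads.
$risingVerbs  = array('gain', 'add', 'rise', 'climb', 'jump');
$fallingVerbs = array('fall', 'shed', 'drop', 'lose', 'slide');

function sectionOrder($ftseChange, $risers, $fallers)
{
    return ($ftseChange >= 0)
        ? array('trend-following' => $risers,  'trend-bucking' => $fallers)
        : array('trend-following' => $fallers, 'trend-bucking' => $risers);
}

// On a falling day the trend-following section draws on the falling lexicon.
print_r(sectionOrder(-1.6, $risingVerbs, $fallingVerbs));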
The research centred on ADVFN’s evening reports which provided information
directly comparable to the Level 1 service data. However such daily updates of share
movements prevented the research prototype from reporting intra-day share activity.
For example the ADVFN report for the 4th August 2008 stated:
“The FTSE100 index closed down 34.5 points to 5,320.2, having
retreated from a morning peak of 5,414.7” (ADVFN report, 4th August 2008)
The research prototype could not produce this type of information as the satellite text
span (“having retreated from a morning peak of 5,414.7”) required knowledge of
intra-day trading activity.
The research prototype was limited to the Level 1 data outlined in Chapter 4.2, a
much narrower range of information compared to that available to the human authors
of the ADVFN reports. The prototype was unable to draw on an understanding of
market sentiment, external markets, political or economic factors in its report
production. The target text generated for the LSE headline above (Figure 17) is
reduced to “London shares close lower”.
Similarly the grouping of stocks into market sectors is unachievable with the
information to hand. Human authors bundle the reporting of related stocks into
cohesive phrases:
“Among the casualties, HBOS was off 24 pence at 307, Royal Bank of
Scotland was down 15-3/4 pence at 229-3/4, Barclays was 27 pence
lower at 355-1/2, and Lloyds TSB shed 20-1/2 pence to 308-1/2.”
(ADVFN report – 13th Aug 2008)
The prototype, with no feel for the allocation of stocks to market sectors, is unable to
achieve this bundling and reports each stock as a separate, unrelated sentence.
Similarly, expressing price movements of a group of stocks as a range or describing
them as a list of related items is unachievable. Handwritten reports employ such
techniques to add fluency:
“Kazakhmys, Vedanta and Xstrata slid between 11.8 and 17.9 per cent”
(ADVFN report – 19th Nov 2008)
“Oil service groups Petrofac and Wood Group were up 23-1/2 at 544 and
17 at 397, respectively…”
(ADVFN report – 14th Aug 2008)
The prototype is unable to perform these associations from the data available to it,
with “respectively” and “between” becoming redundant in its lexicon of terms.
A further differentiator between the handwritten reports and the NLG output is the
latter’s lack of elaboration and supporting content. The ADVFN reports use
additional text to support primary statements in a manner which cannot be achieved
by the prototype. The following sentence is typical of those employed by human
authors:
“Upmarket residential property group Savills was the top FTSE 250
faller”
(ADVFN report – 13th Aug 2008)
In rhetorical structure terms such phrases represent an elaboration pattern for which
the prototype can supply the nucleus (“Savills was the top FTSE 250 faller”) but not
the satellite text (“Upmarket residential property group”):
Figure 18 RST Elaboration in Handwritten Texts
In fact even the nucleus of this particular construct cannot be produced by the current
prototype. Although it has knowledge of stocks constituting the FTSE100 the same is
not true for alternative indices such as the FTSE250. A decision was taken to limit
the prototype’s scope to the FTSE100 and, whilst they could readily be incorporated
at a later date, alternative indices are not included in the current implementation. The
“Non-FTSE100 Shares” of Figure 17 are thus not created by the prototype.
4.6 Vocabulary of a Volatile Market
The study period of September to November 2008 proved one of the most eventful in
LSE history, with sizeable daily swings in market values and a strong downward
trend in the FTSE100 as the UK economy headed towards recession (Figure 19).
Figure 19 FTSE100 Daily Values and 5-day Trend Line
The current research evaluated the words/terms used in handwritten reports and their
rhetorical/textual structure in August 2008 prior to the trial. This work developed a
lexicon of terms and phrases adopted in the subsequent trial period for computer
generation of reports.
August represented a stable trading period for the FTSE100, illustrated by the small
variations in daily values and steady 5-day rolling average to the left of Figure 19. By
the start of the trial market conditions had worsened significantly with increased
market volatility and the FTSE100 rapidly losing 20-25% of its value (shown to the
right of Figure 19). Critically this research needs to determine whether human
authors continued with August’s words and phrases through the trial period. The
prototype was unable to alter its lexicon of terms once the study was underway and
questionnaire results received. The following sections examine whether the changing
market conditions led to changes in the vocabulary used by human authors and the
consequences of this.
Table 3 shows that during August the FTSE100 recorded negligible changes in value
on one in five trading days. By the trial period, only 8% of days showed variations of
less than 0.5%, due to the increased market volatility.
FTSE100 Trend | Pre-Trial (Aug 2008): Total, % of Total | Trial (Sep – Nov 2008): Total, % of Total
Upward (daily rise ≥ 0.5%) | 6 days, 30% | 25 days, 38%
Stable (daily movement < 0.5%) | 4 days, 20% | 5 days, 8%
Downward (daily fall ≥ 0.5%) | 10 days, 50% | 35 days, 54%
Table 3 FTSE100 Daily Trends
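A minimal sketch of the categorisation rule behind Table 3, using the 0.5% threshold defined above:

<?php
// Categorisation used for Table 3: daily moves under 0.5% count as "stable".
function classifyDay($pctChange)
{
    if (abs($pctChange) < 0.5) {
        return 'stable';
    }
    return $pctChange > 0 ? 'upward' : 'downward';
}

echo classifyDay(8.8)  . "\n";  // upward   (19th Sep 08)
echo classifyDay(-0.3) . "\n";  // stable   (13th Nov 08)
echo classifyDay(-5.3) . "\n";  // downward (29th Sep 08)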
The research categorised trading on the LSE into days exhibiting upward, downward
or stable FTSE100 trends. For both the pre-trial and trial periods the lexicon of terms
used by handwritten reports within each category was analysed for randomly chosen
dates (Table 4).
FTSE100 Trend | Pre-Trial (Aug 2008): Chosen Date, FTSE100 Change | Trial (Sep – Nov 2008): Chosen Date, FTSE100 Change
Upward (daily rise ≥ 0.5%) | 5th Aug 08, 2.5% | 19th Sep 08, 8.8%
Upward (daily rise ≥ 0.5%) | 8th Aug 08, 0.6% | 13th Oct 08, 8.3%
Upward (daily rise ≥ 0.5%) | 14th Aug 08, 0.9% | 20th Oct 08, 5.4%
Stable (daily movement < 0.5%) | 12th Aug 08, -0.1% | 13th Nov 08, -0.3%
Stable (daily movement < 0.5%) | 21st Aug 08, -0.0% | 25th Nov 08, 0.4%
Stable (daily movement < 0.5%) | – | 26th Nov 08, -0.4%
Downward (daily fall ≥ 0.5%) | 1st Aug 08, -1.1% | 29th Sep 08, -5.3%
Downward (daily fall ≥ 0.5%) | 4th Aug 08, -0.6% | 22nd Oct 08, -4.5%
Downward (daily fall ≥ 0.5%) | 13th Aug 08, -1.6% | 19th Nov 08, -4.8%
Table 4 FTSE100 Sample Days
The handwritten reports were analysed for each of the sample days to create corpus
texts (Reiter and Dale, 2000) of the words and terms used by human authors. This
exercise was undertaken for each FTSE100 trend before and during the trial and
these corpus texts were further refined to remove elements not directly available to
the prototype or those which could not be computed from its inputs. For the purposes
of comparing the vocabulary used before and during the trial these target text corpora
were consolidated into two lexicons of terms, one pre-trial, the other spanning the
trial period. Analysis of these lexicons illustrated tangible differences between the
pre-trial vocabulary and that used during the actual trial. During the stable pre-trial
period the handwritten reports showed little variation in grammar. Reports adopted a
restricted vocabulary which, the author contends, reflected the restricted movements
within the market. Sentences proved heavily reliant upon a few verbs and a limited
range of adjectives. Figure 20 illustrates, for the target text corpus, the proportional
use of verbs across all of the sample day reports in the pre-trial period and Figure 21
the equivalent use of adjectives. Any verb or adjective whose use accounted for more
than 1% of the total verbs or adjectives within the target text corpus was plotted as an
axis. The distance from the centre illustrates the relative use of the verb or adjective
against the total used in sample days in the pre-trial period.
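A minimal sketch of the frequency analysis behind these plots, assuming the verbs have already been extracted by hand from the corpus into a flat list:

<?php
// Sketch of the corpus frequency analysis behind Figures 20-25: compute each
// word's share of all occurrences and keep those above the 1% threshold.
function plottableAxes(array $words, $threshold = 0.01)
{
    $counts = array_count_values($words);
    $total  = count($words);
    $axes   = array();
    foreach ($counts as $word => $n) {
        if ($n / $total > $threshold) {
            $axes[$word] = round(100 * $n / $total, 1);  // percentage per axis
        }
    }
    arsort($axes);   // most-used first
    return $axes;
}

print_r(plottableAxes(array('close', 'fall', 'close', 'gain', 'shed', 'close')));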
[Figure 20 is a radar plot (axes scaled 0–20%) of verb use in the pre-trial manual reports. The dominant axes are “is”, “close”, “fall”, “shed”, “add” and “gain”, with smaller contributions from “drop”, “lose”, “rise”, “take”, “jump”, “slide”, “slip”, “surge”, “firm”, “rally”, “ease”, “tick up”, “climb”, “end”, “advance”, “lead”, “leap”, “glance”, “have”, “sag”, “sink” and an “Other” category.]
Figure 20 Pre-Trial ADVFN Reports - Verbs
[Figure 21 is a radar plot (axes scaled 0–20%) of adjective use in the pre-trial manual reports. The dominant axes are “down”, “up”, “higher” and “lower”, with smaller contributions from “off”, “on”, “ahead”, “among”, “stronger”, “weaker”, “mixed”, “top”, “biggest”, “soared”, “battered”, “respectively”, “low”, “plummeting” and an “Other” category.]
Figure 21 Pre-Trial ADVFN Reports - Adjectives
Over the trial period the lexicon expanded, showing less reliance upon stock terms
and phrases. Figure 22 illustrates verbs used within the target text corpus during this
period. The heavy reliance upon “is”, “close”, “fall”, “shed”, “add” and “gain” was
balanced by a broader range of verbs and a wider vocabulary.
[Figure 22 is a radar plot of verb use in the manual reports during the trial, on the same axes as Figure 20.]
Figure 22 Trial ADVFN Reports - Verbs
The contrast between verbs used during the pre-trial period and the trial period itself
is shown in Figure 23. The pre-trial vocabulary shows little variety in describing the
FTSE100 or movements of individual shares whilst the volatile trading of the trial
period coincides with an expanded verb base and more emotive terms. Verbs such as
“jump”, “slip”, “surge”, “leap”, “sag” and “sink” gain significance within the lexicon
in an attempt to effectively capture the magnitude of changes experienced by the
markets.
[Figure 23 overlays the pre-trial and trial verb distributions on the same radar axes, contrasting “Verbs Before Trial” with “Verbs During Trial”.]
Figure 23 Trial/Pre-Trial Comparison - Verbs
The comparison (Figure 23) also illustrates that, despite a broadening of the
vocabulary, the reports continue to draw heavily on verbs used in the pre-trial
reports. For adjectives the same holds true: the early dependence on “up”, “down”,
“higher” and “lower” is maintained across the trial period (Figure 24 and Figure 25)
but more expressive terms (“battered”, “soared”, “plummeting”) begin to enter the
lexicon.
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final
P2207964_M801_Final

More Related Content

What's hot

Sada strategy january 2011
Sada strategy january 2011Sada strategy january 2011
Sada strategy january 2011elvinblankson
 
Citrus-College-NASA SL Proposal-2014-15
Citrus-College-NASA SL Proposal-2014-15Citrus-College-NASA SL Proposal-2014-15
Citrus-College-NASA SL Proposal-2014-15Joseph Molina
 
Healthcare Contingency Operations by DHHS ASPR
Healthcare Contingency Operations by DHHS ASPRHealthcare Contingency Operations by DHHS ASPR
Healthcare Contingency Operations by DHHS ASPRDavid Sweigert
 
NSTC Identity Management Task Force Report
NSTC Identity Management Task Force Report NSTC Identity Management Task Force Report
NSTC Identity Management Task Force Report Duane Blackburn
 
Brazil-Phase-3-Report-EN
Brazil-Phase-3-Report-ENBrazil-Phase-3-Report-EN
Brazil-Phase-3-Report-ENJorge Lúcio
 
Closing The Loop: the benefits of Circular Economy for developing countries a...
Closing The Loop: the benefits of Circular Economy for developing countries a...Closing The Loop: the benefits of Circular Economy for developing countries a...
Closing The Loop: the benefits of Circular Economy for developing countries a...Alexandre Fernandes
 
Health Impact Assessment of the Shell Chemical Appalachia Petrochemical Complex
Health Impact Assessment of the Shell Chemical Appalachia Petrochemical ComplexHealth Impact Assessment of the Shell Chemical Appalachia Petrochemical Complex
Health Impact Assessment of the Shell Chemical Appalachia Petrochemical ComplexMarcellus Drilling News
 
Ariadne: Towards a Web of Archaeological Linked Open Data
Ariadne: Towards a Web of Archaeological Linked Open DataAriadne: Towards a Web of Archaeological Linked Open Data
Ariadne: Towards a Web of Archaeological Linked Open Dataariadnenetwork
 
Five-Year Economic Development Strategy for the District of Columbia
Five-Year Economic Development Strategy for the District of ColumbiaFive-Year Economic Development Strategy for the District of Columbia
Five-Year Economic Development Strategy for the District of ColumbiaWashington, DC Economic Partnership
 
Green Asset Management Toolkit: for Multifamily Housing
Green Asset Management Toolkit: for Multifamily HousingGreen Asset Management Toolkit: for Multifamily Housing
Green Asset Management Toolkit: for Multifamily HousingRashard Dyess-Lane
 
Sustainable Food Truck Business Plan
Sustainable Food Truck Business PlanSustainable Food Truck Business Plan
Sustainable Food Truck Business PlanKristin McGinnis, MBA
 
20150324 Strategic Vision for Cancer
20150324 Strategic Vision for Cancer20150324 Strategic Vision for Cancer
20150324 Strategic Vision for CancerSally Rickard
 
Building special operations partnership in afghanistan and beyond
Building special operations partnership in afghanistan and beyondBuilding special operations partnership in afghanistan and beyond
Building special operations partnership in afghanistan and beyondMamuka Mchedlidze
 

What's hot (20)

Main Report
Main ReportMain Report
Main Report
 
Sada strategy january 2011
Sada strategy january 2011Sada strategy january 2011
Sada strategy january 2011
 
Citrus-College-NASA SL Proposal-2014-15
Citrus-College-NASA SL Proposal-2014-15Citrus-College-NASA SL Proposal-2014-15
Citrus-College-NASA SL Proposal-2014-15
 
Healthcare Contingency Operations by DHHS ASPR
Healthcare Contingency Operations by DHHS ASPRHealthcare Contingency Operations by DHHS ASPR
Healthcare Contingency Operations by DHHS ASPR
 
NSTC Identity Management Task Force Report
NSTC Identity Management Task Force Report NSTC Identity Management Task Force Report
NSTC Identity Management Task Force Report
 
Thesis
ThesisThesis
Thesis
 
Brazil-Phase-3-Report-EN
Brazil-Phase-3-Report-ENBrazil-Phase-3-Report-EN
Brazil-Phase-3-Report-EN
 
Closing The Loop: the benefits of Circular Economy for developing countries a...
Closing The Loop: the benefits of Circular Economy for developing countries a...Closing The Loop: the benefits of Circular Economy for developing countries a...
Closing The Loop: the benefits of Circular Economy for developing countries a...
 
Health Impact Assessment of the Shell Chemical Appalachia Petrochemical Complex
Health Impact Assessment of the Shell Chemical Appalachia Petrochemical ComplexHealth Impact Assessment of the Shell Chemical Appalachia Petrochemical Complex
Health Impact Assessment of the Shell Chemical Appalachia Petrochemical Complex
 
Ariadne: Towards a Web of Archaeological Linked Open Data
Ariadne: Towards a Web of Archaeological Linked Open DataAriadne: Towards a Web of Archaeological Linked Open Data
Ariadne: Towards a Web of Archaeological Linked Open Data
 
DCFriskpaper280215
DCFriskpaper280215DCFriskpaper280215
DCFriskpaper280215
 
Five-Year Economic Development Strategy for the District of Columbia
Five-Year Economic Development Strategy for the District of ColumbiaFive-Year Economic Development Strategy for the District of Columbia
Five-Year Economic Development Strategy for the District of Columbia
 
Food truck business plan
Food truck business plan Food truck business plan
Food truck business plan
 
Green Asset Management Toolkit: for Multifamily Housing
Green Asset Management Toolkit: for Multifamily HousingGreen Asset Management Toolkit: for Multifamily Housing
Green Asset Management Toolkit: for Multifamily Housing
 
Sustainable Food Truck Business Plan
Sustainable Food Truck Business PlanSustainable Food Truck Business Plan
Sustainable Food Truck Business Plan
 
20150324 Strategic Vision for Cancer
20150324 Strategic Vision for Cancer20150324 Strategic Vision for Cancer
20150324 Strategic Vision for Cancer
 
EvalInvStrats_web
EvalInvStrats_webEvalInvStrats_web
EvalInvStrats_web
 
Building special operations partnership in afghanistan and beyond
Building special operations partnership in afghanistan and beyondBuilding special operations partnership in afghanistan and beyond
Building special operations partnership in afghanistan and beyond
 
Rand rr2621
Rand rr2621Rand rr2621
Rand rr2621
 
U.S. Consumer Best Practices
U.S. Consumer Best PracticesU.S. Consumer Best Practices
U.S. Consumer Best Practices
 

Similar to P2207964_M801_Final

QP_PRACTICAL_GUIDE_08062018_online (1).pdf
QP_PRACTICAL_GUIDE_08062018_online (1).pdfQP_PRACTICAL_GUIDE_08062018_online (1).pdf
QP_PRACTICAL_GUIDE_08062018_online (1).pdfalbeetar11
 
20090712 commodities in the if study undp exeuctive summarywith covers
20090712 commodities in the if study undp exeuctive summarywith covers20090712 commodities in the if study undp exeuctive summarywith covers
20090712 commodities in the if study undp exeuctive summarywith coversLichia Saner-Yiu
 
Trinity Impulse - Event Aggregation to Increase Stundents Awareness of Events...
Trinity Impulse - Event Aggregation to Increase Stundents Awareness of Events...Trinity Impulse - Event Aggregation to Increase Stundents Awareness of Events...
Trinity Impulse - Event Aggregation to Increase Stundents Awareness of Events...Jason Cheung
 
Undergrad Thesis | Information Science and Engineering
Undergrad Thesis | Information Science and EngineeringUndergrad Thesis | Information Science and Engineering
Undergrad Thesis | Information Science and EngineeringPriyanka Pandit
 
Monitoring And Evaluation For World Bank Agricultural Projects
Monitoring And Evaluation For  World Bank Agricultural  ProjectsMonitoring And Evaluation For  World Bank Agricultural  Projects
Monitoring And Evaluation For World Bank Agricultural ProjectsMalik Khalid Mehmood
 
Quentative research method
Quentative research methodQuentative research method
Quentative research methodMarketing Utopia
 
Applying Machine Learning Techniques to Revenue Management
Applying Machine Learning Techniques to Revenue ManagementApplying Machine Learning Techniques to Revenue Management
Applying Machine Learning Techniques to Revenue ManagementAhmed BEN JEMIA
 
Biomass
BiomassBiomass
BiomassLieuqn
 
REGIONAL WOOD ENERGY DEVELOPMENT PROGRAMME IN ASIA
REGIONAL WOOD ENERGY DEVELOPMENT PROGRAMME IN ASIAREGIONAL WOOD ENERGY DEVELOPMENT PROGRAMME IN ASIA
REGIONAL WOOD ENERGY DEVELOPMENT PROGRAMME IN ASIAPT carbon indonesia
 
Specification of the Linked Media Layer
Specification of the Linked Media LayerSpecification of the Linked Media Layer
Specification of the Linked Media LayerLinkedTV
 
Uncertainty Reduction in Online Dating Do Satisfied Customers Communicate Mor...
Uncertainty Reduction in Online Dating Do Satisfied Customers Communicate Mor...Uncertainty Reduction in Online Dating Do Satisfied Customers Communicate Mor...
Uncertainty Reduction in Online Dating Do Satisfied Customers Communicate Mor...Lena Frenzel
 
nasa-safer-using-b-method
nasa-safer-using-b-methodnasa-safer-using-b-method
nasa-safer-using-b-methodSylvain Verly
 
Design for public services- The fourth way
Design for public services- The fourth wayDesign for public services- The fourth way
Design for public services- The fourth wayforumvirium
 

Similar to P2207964_M801_Final (20)

QP_PRACTICAL_GUIDE_08062018_online (1).pdf
QP_PRACTICAL_GUIDE_08062018_online (1).pdfQP_PRACTICAL_GUIDE_08062018_online (1).pdf
QP_PRACTICAL_GUIDE_08062018_online (1).pdf
 
Clancy95barriers geetal
Clancy95barriers geetalClancy95barriers geetal
Clancy95barriers geetal
 
20090712 commodities in the if study undp exeuctive summarywith covers
20090712 commodities in the if study undp exeuctive summarywith covers20090712 commodities in the if study undp exeuctive summarywith covers
20090712 commodities in the if study undp exeuctive summarywith covers
 
Trinity Impulse - Event Aggregation to Increase Stundents Awareness of Events...
Trinity Impulse - Event Aggregation to Increase Stundents Awareness of Events...Trinity Impulse - Event Aggregation to Increase Stundents Awareness of Events...
Trinity Impulse - Event Aggregation to Increase Stundents Awareness of Events...
 
Undergrad Thesis | Information Science and Engineering
Undergrad Thesis | Information Science and EngineeringUndergrad Thesis | Information Science and Engineering
Undergrad Thesis | Information Science and Engineering
 
Rand rr2637
Rand rr2637Rand rr2637
Rand rr2637
 
tese
tesetese
tese
 
Monitoring And Evaluation For World Bank Agricultural Projects
Monitoring And Evaluation For  World Bank Agricultural  ProjectsMonitoring And Evaluation For  World Bank Agricultural  Projects
Monitoring And Evaluation For World Bank Agricultural Projects
 
Quentative research method
Quentative research methodQuentative research method
Quentative research method
 
Master_Thesis
Master_ThesisMaster_Thesis
Master_Thesis
 
main
mainmain
main
 
Applying Machine Learning Techniques to Revenue Management
Applying Machine Learning Techniques to Revenue ManagementApplying Machine Learning Techniques to Revenue Management
Applying Machine Learning Techniques to Revenue Management
 
Knustthesis
KnustthesisKnustthesis
Knustthesis
 
Biomass
BiomassBiomass
Biomass
 
REGIONAL WOOD ENERGY DEVELOPMENT PROGRAMME IN ASIA
REGIONAL WOOD ENERGY DEVELOPMENT PROGRAMME IN ASIAREGIONAL WOOD ENERGY DEVELOPMENT PROGRAMME IN ASIA
REGIONAL WOOD ENERGY DEVELOPMENT PROGRAMME IN ASIA
 
10.1.1.866.373
10.1.1.866.37310.1.1.866.373
10.1.1.866.373
 
Specification of the Linked Media Layer
Specification of the Linked Media LayerSpecification of the Linked Media Layer
Specification of the Linked Media Layer
 
Uncertainty Reduction in Online Dating Do Satisfied Customers Communicate Mor...
Uncertainty Reduction in Online Dating Do Satisfied Customers Communicate Mor...Uncertainty Reduction in Online Dating Do Satisfied Customers Communicate Mor...
Uncertainty Reduction in Online Dating Do Satisfied Customers Communicate Mor...
 
nasa-safer-using-b-method
nasa-safer-using-b-methodnasa-safer-using-b-method
nasa-safer-using-b-method
 
Design for public services- The fourth way
Design for public services- The fourth wayDesign for public services- The fourth way
Design for public services- The fourth way
 

P2207964_M801_Final

  • 1. Total number of words 15177 A dissertation submitted in partial fulfilment of the requirements for the Open University’s Master of Science Degree in Computing for Commerce and Industry Ian Whitby P2207964 10th March 2009
  • 2. Ian Whitby (P2207964) Page ii PREFACE I would like to thank all of those who have contributed to this work and provided the opportunity to pursue this research. Special thanks go to my wife, Kate, and children, Rory and Holly, for the long hours spent away from them and for their unswerving support and encouragement throughout. Thanks are also due to my employer, National Grid, for granting the time to undertake this work and to the many employees who completed the research questionnaire. My thanks also go to ADVFN for allowing the use of their manually-written share reports within the research prototype and to my project supervisor, Linda White, for her support.
CHAPTER 4 DATA ACQUISITION
4.1 INTRODUCTION
4.2 DATA SOURCES
4.3 PROTOTYPE DESIGN
4.4 QUESTIONNAIRE
4.5 ANALYSIS OF HANDWRITTEN REPORTS
4.6 VOCABULARY OF A VOLATILE MARKET
4.7 SUMMARY
CHAPTER 5 DATA ANALYSIS
5.1 INTRODUCTION
5.2 RESPONDENTS
5.3 USER PREFERENCES
5.4 USE OF COMMERCIAL DATA
5.5 SUMMARY
CHAPTER 6 CONCLUSIONS
6.1 COMPARISON WITH THE RESEARCH AIMS
6.2 DIRECTIONS FOR FUTURE WORK
6.3 PROJECT REVIEW
REFERENCES
BIBLIOGRAPHY
INDEX
APPENDIX A ADVFN REPORTS AND DATA
APPENDIX B QUESTIONNAIRE
APPENDIX C DISPLAY STYLES
LIST OF FIGURES
FIGURE 1 STRUCTURAL ELEMENTS (POWER ET AL, 2003)
FIGURE 2 RST ELABORATION EXAMPLE
FIGURE 3 COMET ARCHITECTURE (FEINER & MCKEOWN, 1991)
FIGURE 4 WIP ARCHITECTURE (WAHLSTER ET AL, 1993)
FIGURE 5 NLG ARCHITECTURE COMPONENTS
FIGURE 6 RST CONCESSION EXAMPLE
FIGURE 7 CONCESSION TEXT STRUCTURE
FIGURE 8 ICONOCLAST TEXT STRUCTURE LEVELS
FIGURE 9 PROTOTYPE - ENTITY RELATIONSHIP DIAGRAM
FIGURE 10 PROTOTYPE - HIGH LEVEL DESIGN
FIGURE 11 COMPARISON OF NLG AND MANUAL REPORTS
FIGURE 12 PROTOTYPE - WORKFLOW
FIGURE 13 SURVEY - TRADING EXPERIENCE
FIGURE 14 SURVEY - DOMAIN EXPERTS
FIGURE 15 ADVFN REPORT - ABSTRACT DOCUMENT STRUCTURE
FIGURE 16 RST EXAMPLE FOR LSE HEADLINE
FIGURE 17 ADVFN REPORT (13TH AUG 2008)
FIGURE 18 RST ELABORATION IN HANDWRITTEN TEXTS
FIGURE 19 FTSE100 DAILY VALUES AND 5-DAY TREND LINE
FIGURE 20 PRE-TRIAL ADVFN REPORTS - VERBS
FIGURE 21 PRE-TRIAL ADVFN REPORTS - ADJECTIVES
FIGURE 22 TRIAL ADVFN REPORTS - VERBS
FIGURE 23 TRIAL/PRE-TRIAL COMPARISON - VERBS
FIGURE 24 TRIAL ADVFN REPORTS - ADJECTIVES
FIGURE 25 TRIAL/PRE-TRIAL COMPARISON - ADJECTIVES
FIGURE 26 TRIAL/PRE-TRIAL COMPARISON - NOUNS
FIGURE 27 SURVEY RESPONSES - EXPERIENCE
FIGURE 28 SURVEY RESPONSES - TRADING
FIGURE 29 SURVEY - DISPLAY STYLE
FIGURE 30 SURVEY RESPONSES - PREFERRED STYLE
FIGURE 31 SURVEY - REPORT STRUCTURE
FIGURE 32 SURVEY RESPONSES - REPORT STRUCTURE
FIGURE 33 SURVEY - TEXT BLOCKS
FIGURE 34 SURVEY RESPONSES - TEXT BLOCKS
FIGURE 35 PROPERTIES OF A NORMAL DISTRIBUTION CURVE
FIGURE 36 SURVEY - REPORT GROUPING
FIGURE 37 SURVEY RESPONSES - TEXT GROUPING
FIGURE 38 SURVEY RESPONSES - GROUPING COMPARISON
FIGURE 39 SURVEY - CUSTOMISATION
FIGURE 40 SURVEY RESPONSES - CUSTOMISATION
FIGURE 41 SURVEY RESPONSES - CUSTOMISATION (HISTOGRAM)
FIGURE 42 SURVEY - PAGE NAVIGATION
FIGURE 43 SURVEY RESPONSES - PAGE NAVIGATION
FIGURE 44 SURVEY RESPONSES - COLOUR
FIGURE 45 SURVEY - SENTENCE CONSTRUCTION
FIGURE 46 SURVEY RESPONSES - VOCABULARY COMPARISON
FIGURE 47 SURVEY RESPONSES - GRAMMAR COMPARISON
FIGURE 48 SURVEY RESPONSES - SENTENCE COMPARISON
FIGURE 49 SURVEY - MISSING DATA
FIGURE 50 SURVEY RESPONSES - MISSING DATA
LIST OF TABLES
TABLE 1 LSE DATA RECEIVED
TABLE 2 USERS TARGETED BY SURVEY
TABLE 3 FTSE100 DAILY TRENDS
TABLE 4 FTSE100 SAMPLE DAYS
TABLE 5 PAGE NAVIGATION STATISTICS
ABSTRACT

This report examines the production of web-based text reports through Natural Language Generation (NLG) techniques. The work reviews the current body of NLG knowledge and aims, through the use of an internet-based prototype, to determine whether commercial data can provide the quality of information required to automatically produce texts of comparable sophistication to human-authored reports. The research uses the prototype to investigate further whether inclusion of user preferences in the generated outputs leads to improvements in the effectiveness and coherency of the resulting web pages.

The study employs an internet survey to obtain feedback on the quality of grammar, vocabulary and sentence construction obtained through NLG production from a commercial data source. The survey also provides quantitative and qualitative assessment of the effectiveness of web page designs in harnessing individual user preferences. The returns from the survey were subjected to statistical analysis and the results extended to infer characteristics of the wider population.

This work is believed to have applicability beyond the realm of the current research and the study concludes with recommendations for further work within the same field and across other disciplines.
Chapter 1 Introduction

Commercial organisations are making increasing use of their institutional databases as sources of on-line information. Such repositories provide users with round-the-clock access to information across the internet and an ability to interact with the data at any time. For manually maintained web sites this causes problems, with both the content and design of the site lagging behind the information present within the organisation’s own database. Rapid changes in the underlying data can take days or even weeks to be hand-crafted into updated web designs and content. In addition, the presentation of this information is often seen as impersonal and failing to engage its audience. To counter this, organisations are assessing whether NLG could automatically generate the design and content of their web pages.

1.1 The Problem Domain

NLG uses knowledge of natural language constructs, grammar and vocabulary to build grammatically correct sentences and phrases from an underlying source of nonlinguistic data. The discipline is closely allied to Natural Language Processing (NLP), which seeks to determine the meaning of prewritten sentences. Reiter and Dale (2000) describe NLG as:

“.. a subfield of artificial intelligence and computational linguistics that is concerned with building computer software systems that can produce meaningful texts in English or other human languages from some underlying nonlinguistic representation of information. NLG systems use knowledge about language and the application domain to automatically produce documents, reports, help messages, and other kinds of texts.“ (Reiter and Dale, 2000, p. xvii)

The application of NLG to the automatic generation of web page designs and content represents an active research topic, both within academia and industry. The research community is divided on whether commercial databases are of sufficient quality to yield NLG texts comparable to those produced by research prototypes. Although research data sets have been used to produce sophisticated outputs, the disparate nature of commercial data sources, their leniency of data validation and the intrinsic limitations of recording language components within a relational database structure have cast doubts over their suitability. In addition, researchers are divided on the extent to which the user’s knowledge, preferences and prior interaction with a web site affect his/her assimilation and understanding of its content (Matsushita et al, 2003; Mackinlay, 1987; Reiter et al, 2003). Similar arguments have been extended to the role of graphics within a multimodal presentation (Feiner and McKeown, 1991; Bateman et al, 2001).

Within the natural language community the suitability of commercial data repositories as sources of sophisticated natural language content remains an active area of research. Dale et al’s (1998) work on generating large volumes of NLG text from commercial sources highlighted the data quality issues involved and a need for further investigation:

“The problems that arise from noisy data in our database are likely to be faced by any attempt to use a real database as an information source”. (Dale et al, 1998, Section 5. Conclusion)

Current research also focuses on the extent to which improvements in the transference of a text’s message and content to the reader may be achieved through the inclusion of the user context. Mackinlay’s (1987) work focused on the expressiveness and effectiveness of graphical designs and highlighted the need for:

“...choosing or adapting the dialogue specifications appropriate to the observed skill level of the user”. (Mackinlay, 1987, p. 139)
Matsushita et al (2003) built on Mackinlay’s (1987) work to incorporate this user context in the visualisation of numerical values, whilst Reiter et al (2003) investigated the research techniques used in acquiring this knowledge of user preference and interaction with the system.

The current study investigates both areas further and uses a web-based prototype to create dynamically generated reports from a commercial data source. A daily feed of price information for stocks listed on the London Stock Exchange (LSE) is recorded into the prototype’s relational database without validation or alteration. The database acts as the information source for subsequent investigation of the suitability of the commercial data and the contribution of user preferences to the effectiveness of the generated text.
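As an illustration only, the following minimal Python sketch shows how such a daily price feed might be recorded verbatim into a relational store. The table layout, column names and feed field names are hypothetical and do not reflect the prototype’s actual schema (described in Chapter 4); the point illustrated is that the data is stored exactly as received, with no cleansing.

```python
import csv
import sqlite3

# Hypothetical schema: one row per stock per trading day, stored exactly
# as received from the feed (no validation or alteration).
conn = sqlite3.connect("lse_prototype.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS daily_price (
        trade_date  TEXT,   -- as supplied by the feed
        epic        TEXT,   -- LSE ticker symbol
        open_price  TEXT,   -- kept as text: the feed is not validated
        close_price TEXT,
        volume      TEXT
    )
""")

def load_feed(path):
    """Append a day's feed file to the database without cleansing."""
    with open(path, newline="") as f:
        rows = [(r["date"], r["epic"], r["open"], r["close"], r["volume"])
                for r in csv.DictReader(f)]
    conn.executemany("INSERT INTO daily_price VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
```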
1.2 Developments in NLG

An NLG milestone in achieving automatic text generation was Mann and Thompson’s (1986) development of Rhetorical Structure Theory (RST) as a means of formalising the rhetorical structure of texts. RST allows texts to be expressed as a hierarchy of elementary propositions and as a diagrammatic representation of ordered nodes, in which non-terminal nodes within the tree represent a relationship within the text and terminal nodes represent text phrases. Subsequent authors (Power et al, 2003) built on this work, advocating refinements in the separation of rhetorical structure from both the abstract and text structures within a document. Power et al (2003) disputed Mann and Thompson’s assertion that non-terminal nodes represent a relationship within the text, arguing that whilst these relationships exist within the document’s rhetorical structure (i.e. its meaning) they do not necessarily persist through to its textual realisation. Power et al (2003) argued the case for viewing a document as three distinct structural elements (Figure 1):

Figure 1 Structural Elements (Power et al, 2003)
• Rhetorical Structure: hierarchy of elementary propositions; rhetorical structure diagrams
• Abstract Document Structure: titles; headings; captions; lists
• Text Structure: chapters; sections; paragraphs; sentences; clauses; phrases

The authors further postulated that the abstract document structure could be viewed as an extension of Nunberg’s (1990) “text-grammar”, given its close affinities with text markup languages such as HTML and LaTeX. Both HTML and LaTeX are founded on the belief that the visual appearance of the text assists in conveying its meaning but can also be represented as a distinct element (e.g. as Cascading Style Sheets).

Research has also been undertaken into the automatic generation of graphics. Mackinlay (1987) investigated the generation of a range of graphical presentations (e.g. bar charts, scatter plots and connected graphs) from relational data held within a database. The research sought to codify graphic designs by their expressiveness (i.e. the extent to which the graphical language expressed the underlying message) and their effectiveness (how well this message was received and understood by its audience).

Dale et al (1998) sought to extend NLG techniques beyond carefully prepared artificial intelligence knowledge bases to the use of real-world data repositories. Using the PEBA-II NLG system (Milosavljevic et al, 1996) they generated hypertext documents from both a knowledge base and an equivalent commercial database. Their work highlighted weaknesses in both data quality and information structure within the commercial offering which undermined its ability to generate quality texts.

Bateman et al (2001) argued that neither the function nor the nature of information layout had been addressed in earlier works. They postulated that a document’s message came through the overall arrangement of information on the page, whilst relationships between items of information were commonly conveyed through layout. The authors stated that insufficient research had been conducted on the informational significance of presentation and layout. Bateman et al (2001) advocated an integrated approach to generating layout, text and graphics under a common framework, arguing that only by considering these elements collectively was it possible to attain coherent presentation designs. Their research focused on the empirical investigation of manually-generated presentations and the development of a prototype system.

Matsushita et al (2003) extended the early work of Mackinlay (1987). The authors contended that users were often unable to succinctly state how they wished to query the underlying data, particularly for large datasets, but relied instead on exploratory data analysis and a series of graphic iterations to build this understanding. Under these conditions the effectiveness of a graphic is determined not just by tenets of graphic presentation but also by the user’s previous interaction with the system. Matsushita et al (2003) argued that decisions on graphic effectiveness must include the context in which the user poses the query.

1.3 Research Aims

This research builds on the work of others and seeks to establish, through a real-world example, whether:
• A commercial data source can provide the quality of data required to generate sophisticated web page content.
• Inclusion of user preferences leads to improvements in the effectiveness and coherency of web page designs and content.

1.4 Research Deliverables

This research produced an analysis of the content and structure within handwritten reports used to inform readers of the major stock movements and events on the LSE. This analysed and documented the vocabulary, grammar and constraint satisfaction model necessary to generate equivalent texts through NLG techniques.
A prototype application was developed to hold the LSE data, the lexicon of words, and the constraint satisfaction rules. The prototype allowed users to define their data and display preferences and subsequently to assess the results against the manually-written alternatives. The research delivers a quantitative and qualitative appraisal of whether commercial-sector data can result in sophisticated text content and the extent to which inclusion of user preferences improves the effectiveness of the generated pages.

1.5 Contribution to Knowledge

The work of Dale et al (1998) suggested that commercial databases lacked the rigour of purpose-built knowledge bases, leading to sparse, low-quality data and information structures unsuited to NLG. From both an academic and a commercial viewpoint this assertion has far-reaching consequences: much of our collective knowledge is held in such repositories, yet these may be poor sources of NLG content. Through use of a specific commercial data source (London Stock Exchange data) this study looks to challenge Dale et al’s (1998) assertion. The findings for this specific data source seek to provide greater insight into the validity of that assertion.

In addition there is evidence that the effectiveness of graphic representations improves when user preferences and prior interactions with the system are taken into consideration (Matsushita et al, 2003). This research attempts to discover whether incorporation of user preferences leads to improvements in page design and content.

No theoretical basis exists for measuring the effectiveness of a text in conveying its linguistic meaning to a human audience. Psychological variances within the audience mean such measures are only meaningful when obtained through experimental studies. This study, whilst limited in scope, contributes to the wider understanding of these factors both in research and industry.
Chapter 2 A Review of the Literature

2.1 Introduction

This chapter presents a review of the literature surrounding NLG and published research on the generation of the paragraphs, sentences and phrases associated with a natural language. NLG is both an established research discipline and an emerging technology within the commercial sector. The subject is wide-ranging in scope, using elements from linguistics, artificial intelligence, cognitive science and human-computer interaction. This review reflects the breadth of the subject and draws on a range of topics, from studies of document effectiveness, through text coherency theory, to the grammar of punctuation and the automatic generation of multimodal documents (i.e. documents which integrate several document modes, for example text and graphics). The following synthesises the work of others and places the subject within the context of what has already been established. From this basis the current project focuses on its key area of research interest, adopting principles established by previous authors to challenge current views.

2.2 Expressiveness/Effectiveness of Graphics

Early research into the automatic generation of graphic designs was undertaken by Mackinlay (1987). This seminal work proposed that the style of graphic designs (e.g. bar graphs, pie charts) could be chosen automatically from the underlying data. Mackinlay argued that this was feasible if the generation of possible styles was seen as a problem which could be resolved through composition algebra. In Mackinlay’s (1987) work an application prototype was built around artificial intelligence techniques to generate a wide range of potential designs. The prototype invoked composition algebra, a marriage of composition operators with simple graphic languages, to generate these candidate graphic styles.
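Purely as an illustrative sketch, and not Mackinlay’s actual formalism, the idea of enumerating candidate designs by composing primitive graphical encodings can be expressed as follows. The primitive names, the simple type system and the filtering test are all invented for this example.

```python
from itertools import permutations

# Hypothetical primitive graphical languages and the kinds of data each
# can encode; Mackinlay's real composition algebra is far richer.
PRIMITIVES = {
    "horizontal_position": {"nominal", "ordinal", "quantitative"},
    "vertical_position":   {"nominal", "ordinal", "quantitative"},
    "length":              {"quantitative"},
    "colour_hue":          {"nominal"},
}

def candidate_designs(fields):
    """Enumerate assignments of primitives to data fields, keeping only
    designs in which every field can actually be encoded.

    `fields` maps field names to measurement types, e.g.
    {"stock": "nominal", "price": "quantitative"}.
    """
    for combo in permutations(PRIMITIVES, len(fields)):
        if all(ftype in PRIMITIVES[prim]
               for (_, ftype), prim in zip(fields.items(), combo)):
            yield dict(zip(fields, combo))

for design in candidate_designs({"stock": "nominal", "price": "quantitative"}):
    print(design)   # e.g. {'stock': 'colour_hue', 'price': 'length'}
```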
Mackinlay (1987) proposed two measures of graphical success which would allow the prototype to select the more appropriate of these styles:
• Expressiveness - the extent to which the language can express the intended information.
• Effectiveness - the extent to which the language can use the available output media and knowledge of the human visual system to convey its message.

The syntax of each graphical language was expressed as a collection of tuples (representing position, height, etc.) which expressed the facts for each language. Once all such tuples were generated the prototype could measure expressiveness by the degree to which each language satisfied the available facts. Mackinlay’s (1987) work relied upon the assumptions that:
• All users share common conventions on how graphics should be constructed and interpreted.
• Within each graphical language it is possible to encode all facts from the set within a single sentence.

Although Mackinlay (1987) demonstrated that expressiveness could be determined empirically, no such measure existed for effectiveness. The author turned to observations by Cleveland and McGill (1984) that particular properties of a graphic (e.g. position, length) played a greater role in the ability of users to complete set tasks. Mackinlay’s (1987) work used these observations to drive the effectiveness of his system. These observations, however, provide a far more subjective view than the empirical measure of expressiveness used within the same system.

The current research project aims to quantify the sophistication of NLG texts together with the contribution of user preferences to their effectiveness and coherency. Mackinlay’s (1987) work suggests these measures cannot be derived theoretically but will require subjective analysis of the research results.

2.3 Rhetorical Structure Theory

Mann and Thompson (1987, 1988) developed RST as a means of explaining the coherence of texts. The authors realised that within all coherent texts each element of the text had some plausible reason for its inclusion and, furthermore, no constituent elements were omitted. RST provided a framework by which the presence of these plausible text elements could be analysed and their structures described. Through the study of numerous texts Mann and Thompson (1987, 1988) were able to identify both the coherence relations within a text (its “nuclearity” and “relations”) and any groupings of key components (i.e. “schemas” applicable to a specific genre).

Mann and Thompson (1987, 1988) identified that for many texts a frequent structural pattern consists of two text spans, often adjoining, linked by a relation. One span (the “nucleus”) acts as the primary structural element, whilst the other (the “satellite”) plays a lesser role in the text. An alternative relationship arises where no particular span has the primary role, resulting in “multinuclear” structures. The authors identified 29 distinct relationships which could exist between nucleus and satellite in both nuclear and multinuclear texts and devised a diagramming method by which these structural relationships could be documented. The following RST diagram (Figure 2) illustrates an “Elaboration” relationship between the nucleus (the leftmost text) and its adjoining satellite.
Figure 2 RST Elaboration Example
Nucleus: “Early research into the automatic generation of graphic designs was undertaken by Mackinlay (1987).”
Satellite (Elaboration): “This seminal work proposed that the style of graphic designs (e.g. bar graphs, pie charts) could be chosen automatically from the underlying data.”

RST provides a method of stating text structure and has provided the means for many NLG systems to structure inputs to their document planning/structuring phases (Dilley et al, 1992; Reiter and Dale, 2000). Later authors (e.g. André and Rist, 1995) have argued that graphic elements may be defined by structural principles comparable to those of text and that text and graphics can be jointly defined through RST principles. These authors further outline a method by which the generation of textual and graphical content can be coordinated.

The current research will look to generate a restricted target text corpus (Reiter and Dale, 2000), based on a review of the corpus of input data and expected outputs. This target text corpus will be expressed through RST and will form the basis of subsequent language generation phases.

2.4 The Linguistics of Punctuation

Nunberg (1990) analysed the linguistics of punctuation from an alternative viewpoint to his predecessors. He argued that written text should be regarded not as a sub-category of speech, judged by its ability to convey the spoken language, but as a distinct grammar in its own right. Nunberg (1990) contended that punctuation used within the written language displayed the characteristics of a true grammar worthy of linguistic analysis. Nunberg’s (1990) punctuation grammar included not only graphic elements, depicted as non-alphanumeric characters (e.g. commas, semicolons, colons, periods, parentheses, quotation marks), but also functional elements (e.g. font- and face-alterations, capitalisation, indentation, spacing). He distinguished graphical elements by the function they performed within a text:
• Delimiters of one or both ends of a text sequence.
• Separators of two elements of the same type (e.g. separating list elements).
• Typographic distinguishers of an element from its surroundings (e.g. italics, font- and face-alterations).

He also categorised the forms by which graphical elements could mark text elements and boundaries:
• As distinct characters (e.g. commas, semicolons, capitalisation).
• As font-, face- and size-alterations (e.g. italics, bold typeface).
• As “null” elements (e.g. spaces, margins, line breaks) for text separation.

As Nunberg noted:

“These formal and functional properties are not entirely independent of one another. For example, it is in the nature of distinguishers that they can be realized only as font-, face-, or size-alterations or by underlining or analogous devices”. (Nunberg, 1990, p. 53)

He proposed a set of “linearization rules” to map properties of graphic element form onto those of function, and “pouring rules” to define the layout of text sequences on the page or screen.
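The dependency between functional categories and their permitted formal realisations can be illustrated with a small sketch. The category and form names below are a loose paraphrase invented for this example, not Nunberg’s own grammar.

```python
# Hypothetical mapping from functional categories of punctuation to the
# forms that may realise them, echoing Nunberg's observation that
# distinguishers can only be realised as font/face/size alterations.
ALLOWED_FORMS = {
    "delimiter":     {"character"},                 # e.g. parentheses, quotes
    "separator":     {"character", "null"},         # e.g. commas, line breaks
    "distinguisher": {"font_face_size_alteration"}, # e.g. italics, bold
}

def linearize(element_function, proposed_form):
    """A toy 'linearization rule': accept a form only if the grammar
    permits it for the element's function."""
    return proposed_form in ALLOWED_FORMS[element_function]

assert linearize("distinguisher", "font_face_size_alteration")
assert not linearize("distinguisher", "character")  # cannot be a plain char
```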
2.5 Multimodal Systems

Multimodal systems use more than one presentation medium (e.g. text, graphics, video, audio, hypertext) to convey a message. Researchers (Mackinlay, 1987; Feiner and McKeown, 1991) have been quick to recognise the improvements this could make in a document’s ability to express its meaning effectively. If a picture is truly “worth a thousand words” then a document using text, graphics and other media in a complementary fashion must surely be superior to one based on text alone?

Feiner and McKeown (1991) built on the work of Mackinlay in developing their Coordinated Multimedia Explanation Testbed (COMET). COMET used artificial intelligence to create a constraint satisfaction program which matched requests from its users with the underlying data and knowledge of the user’s prior interaction with the system. The system would subsequently generate multimedia presentations automatically. COMET employed separate text and graphics generators (Figure 3):

Figure 3 COMET Architecture (Feiner & McKeown, 1991)
[Logical forms flow from the Content Planner through the Media Coordinator to the Text Generator and Graphics Generator; their text and illustrations are combined by the Media Layout component before rendering and typesetting.]

The “Content Planner” drew on knowledge of its underlying data sources: a static object source; a rule base to drive text generation; and a geometric knowledge base for graphic generation. In addition the Content Planner maintained knowledge of the user context and prior interactions with the system. Output from the Content Planner consisted of a hierarchy of logical forms which were passed to the Media Coordinator to determine whether they should be generated as text or graphic representations. Feiner and McKeown (1991) decided upon a categorisation of logical forms:

“After conducting a series of informal experiments and a survey of literature on media effectiveness”. (Feiner and McKeown, 1991, p. 36)

Building on the work of Mackinlay (1987), the Media Coordinator employed effectiveness criteria to derive six information categories for the representation of logical forms. Physical and locational attributes were represented entirely through graphics, whilst abstract actions and relationships were generated as text. All other forms (e.g. physical actions) were generated as a combination of both text and graphics. Annotated logical forms were forwarded to the Text Generator, the Graphics Generator, or both for generation before layout of these elements and final typesetting. Throughout this process the Text Generator and Graphics Generator interact, allowing graphics to be placed within the text structure or the text to reference properties of the graphics.

The Graphics Generator used the annotated logical forms to apply different attributes (e.g. size, shape, material, colour, lighting) to its pre-defined visual objects (e.g. pictures) rather than dynamically generate the graphic content. This differed from Mackinlay’s work, where the 2-dimensional graphics were built “on-the-fly” through measures of expressiveness and effectiveness.

Wahlster et al (1993) developed a similar multimodal presentation system to COMET. Their WIP system employed a Presentation Planner (analogous to COMET’s Media Coordinator) and separate text and graphic generators.

Figure 4 WIP Architecture (Wahlster et al, 1993)
[The Presentation Planner and Layout Manager draw on a knowledge base (presentation strategies, graphics design strategies, basic ontology, user model and application knowledge) to turn a presentation goal and generation parameters into document design plans, which pass through text design/realization and graphics design/realization to yield an illustrated document.]

The WIP Presentation Planner employed a set of generation parameters (e.g. user ability, layout preferences) to gauge the suitability of its candidate designs. These parameters were comparable to the effectiveness criteria used by both Mackinlay (1987) and Feiner and McKeown (1991). WIP differs from earlier research work in its clear separation of text/graphics design from its subsequent realisation. This separation allows opportunities for customisation of presentation elements.

Time constraints prevented the current research from exploring the potential of multimodal displays. An early objective had been to examine the contribution such mixed media types made to document understanding and, in fact, the prototype’s web browser technology was chosen for its widespread support of such media. Further research is required to quantify the contribution made by multimodal displays.

2.6 User Customisation

Petre (1995) argued for a considered balance between the relative contributions of text and graphics to the overall understanding of a multimodal presentation. In studying visual programming techniques on groups of novice and expert users she observed considerable differences in their inspection strategies for graphics. Experts demonstrated effective navigation, a strong correlation of inspection strategy with goals and attention to secondary notation (e.g. layout, logical flow, colour conventions). Novices showed wide variations in strategy, some sticking rigidly to inappropriate strategies whilst others changed strategy in an unpredictable, chaotic manner. Petre (1995) concluded that graphical readership was an acquired skill: experts were able to take advantage of such secondary notation cues whilst novices struggled to interpret them accurately. Her assertions support the view that multimodal presentations require customisation to the needs of their audience (Matsushita et al, 2003).

Milosavljevic et al (1996) extended the work on Natural Language Generation with their PEBA-II system to include the generation of dynamic hypertext. This illustrated how web sites could adopt NLG to dynamically build pages at the point of invocation. This flexibility allowed PEBA-II to tailor its output to the user’s preferences and their previous discourse with the site. These ideas will be examined further in the current research.

In 1997 Calvi and De Bra described adaptive information browsing as a means of filtering both the navigation links and, to a lesser extent, the content presented to students at the Eindhoven University of Technology. The authors cited evidence that comprehension of presentation material relied upon the recipient building a conceptual model of the information and the semantic relations implicit within it (Van Dijk and Kintsch, 1983). Calvi and De Bra (1997) argued that the network navigational links presented in web-based documents placed a high cognitive overhead on the reader, leading to a reduction in their comprehension of the material itself. The authors developed an adaptive hypermedia system which presented users with a subset of the available navigational links (and hence content). The constituents of this subset were determined on a per-user basis by modelling each user’s previous navigation through the system. Unlike Milosavljevic et al’s (1996) PEBA-II system, this implementation did not demonstrate NLG in the true sense, restricting itself to toggling the visibility of fixed-format content within the outputs; however, it did actively model the user context. Initial feedback on Calvi and De Bra’s (1997) system from students suggested concerns with this approach to information hiding:

“Users seem nevertheless to complain about the impossibility of the present formalism to provide them with a snapshot of the system’s complete structure”. (Calvi and De Bra, 1997, p. 272)
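A minimal sketch of this style of adaptive link hiding is shown below. The prerequisite structure and the visibility rule are invented for illustration and are not Calvi and De Bra’s actual mechanism.

```python
# Toy adaptive hypermedia filter: show a link only when the user's
# navigation history shows its prerequisite pages have been visited.
PREREQUISITES = {                      # hypothetical course structure
    "advanced_queries": {"intro", "basic_queries"},
    "basic_queries": {"intro"},
    "intro": set(),
}

def visible_links(all_links, visited):
    """Return the subset of links whose prerequisites are satisfied."""
    return [link for link in all_links
            if PREREQUISITES.get(link, set()) <= visited]

print(visible_links(["intro", "basic_queries", "advanced_queries"],
                    visited={"intro"}))
# -> ['intro', 'basic_queries']
```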
Such comments are relevant to the current research. Calvi and De Bra’s (1997) study demonstrates that although it is possible to modify the design, content and navigation paths to account for a user’s previous interaction with the system, this may prove counter-productive. Initial evidence suggests that users have in mind a conceptual model of the system and that altering the waypoints and pathways through this model reduces their acceptance of and proficiency with the actual system.

The arena of proficiency-adapted information browsing (Calvi and De Bra, 1997) constitutes a significant area of research, majoring on the psychological aspects of human learning and understanding. The current research does not attempt to explore these topics, stopping short of modelling the user context within the research prototype.

2.7 Meaning and Markup

Bateman et al (1998; 2001) presented an architecture which unified the data-aggregating methods of information visualisation and the communicative-goal techniques of NLG. Their KOMET-PAVE experimental prototype drew on Formal Concept Analysis (Wille, 1982) and the construction of dependency lattices to design both text and graphics. The system generated all potential texts and diagrams from a single pass through these dependency lattices and, through the extensive use of heuristics, arrived at its final choice of multimodal presentation.

Reiter and Dale (2000), in their milestone textbook “Building Natural Language Generation Systems”, defined standard architectural components in the creation of NLG systems. These are summarised in Figure 5 below.
Figure 5 NLG Architecture Components
1. Content Determination: the determination of which human-authored texts (the “corpus text”) will be targeted as NLG output texts (the “target text”).
2. Document Structuring: the assignment of the output text into message collections or groups and the relationships that apply to these groups (i.e. the “rhetorical structure” of the target text).
3. Lexicalisation: the identification of the words or dictionary of words (the “lexicon”) which will be applied to the target texts.
4. Referring Expression Generation: determination of the expressions that will be used to refer to entities in the target texts.
5. Aggregation: the conceptual organisation of these rhetorical structures into linguistic structures (e.g. sentences, paragraphs).
6. Linguistic Realisation: the physical organisation of these message groups into linguistic structures.
7. Surface Realisation: mapping abstract structures (e.g. paragraphs) onto the symbols necessary to display them in a document presentation medium.

The authors’ work shows a clear distinction between the logical construction of text and its physical presentation and provides pointers for the current project. Reiter and Dale’s (2000) use of markup languages (e.g. HTML and LaTeX) to provide surface realisation is employed in the research prototype as a mechanism for customising individual user displays.
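As a purely illustrative sketch, this staged architecture can be read as a pipeline of functions. The function names below mirror Reiter and Dale’s stage names, but the one-line bodies, the message representation and the FTSE100 example value are placeholders invented for this example.

```python
# Skeleton of the staged NLG pipeline; every stage body is a stub
# standing in for the real processing described in the text.
def content_determination(data):            # select messages to convey
    return [{"event": "close", "value": data["close"]}]

def document_structuring(messages):         # impose a rhetorical structure
    return {"relation": "sequence", "children": messages}

def lexicalisation(tree):                   # attach words from the lexicon
    tree["lexicon"] = {"close": "closed"}
    return tree

def referring_expression_generation(tree):  # decide how to name entities
    tree["children"][0]["ref"] = "The FTSE100"
    return tree

def aggregation(tree):                      # group messages into sentences
    return tree

def linguistic_realisation(tree):           # build grammatical sentences
    m = tree["children"][0]
    return [f"{m['ref']} {tree['lexicon'][m['event']]} at {m['value']}."]

def surface_realisation(sentences):         # map onto the display medium
    return "<p>" + " ".join(sentences) + "</p>"

stages = [content_determination, document_structuring, lexicalisation,
          referring_expression_generation, aggregation,
          linguistic_realisation, surface_realisation]
result = {"close": 4323.0}
for stage in stages:
    result = stage(result)
print(result)   # -> <p>The FTSE100 closed at 4323.0.</p>
```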
Power (2000) described a method by which the rhetorical structure for a text could be realised through a text structure of sections, paragraphs and sentences linked by discourse connectives (e.g. “since”, “however”, “whilst”) to mark the rhetorical relations. He contended that text structuring could be formulated as a constraint satisfaction problem, whereby all potential texts were generated by satisfying the text structure constraints and the most suitable of these texts could then be identified through the application of additional constraints.

Power (2000) argued that whilst research efforts had largely focused on building rhetorical structures to organise elementary propositions into hierarchies, less attention had been given to realising these rhetorical structures as text structures. He argued that such text structures are of sufficient complexity to require consideration early in the construction of the texts rather than on a per-sentence basis. Power illustrated how the ICONOCLAST text structuring system was able to generate all possible candidate text structures from the rhetorical structure. Each candidate text structure represented an ordered tree in which the non-terminal nodes are labelled as text categories and the terminal nodes hold either discourse connectives or propositions (i.e. the content of the assertion). For example, the following output text might be generated through NLG:

“Energy stocks closed higher due to rising prices; however, the FTSE100 falls to record low.”

Mann and Thompson (1986, 1987) categorised the style of relational proposition which links these phrases/discourse connectives as a “concession”, whose rhetorical structure is represented in Figure 6 below.
Figure 6 RST Concession Example
concession
  NUCLEUS: falls(FTSE100, record low)
  SATELLITE: cause
    NUCLEUS: closed(energy stocks, higher)
    SATELLITE: rising(energy stocks, prices)

Power (2000) contended that each rhetorical structure would be applicable to one or more text structures; for example, the following could be produced from the RST structure above:

Figure 7 Concession Text Structure
sentence
  text_clause
    text_phrase: closed(energy stocks, higher)
    text_phrase
      text_phrase: “due to”
      text_phrase: rising(energy stocks, prices)
  text_clause
    text_phrase: “however”
    text_phrase: falls(FTSE100, record low)

The ICONOCLAST system assigned both TEXT-LEVEL attributes and INDENTATION levels to each node within each text structure. An arbitrary number of TEXT-LEVELs is possible, with five chosen by the designers:
Figure 8 ICONOCLAST Text Structure Levels
TEXT-LEVEL 0: text-phrase
TEXT-LEVEL 1: text-clause
TEXT-LEVEL 2: text-sentence
TEXT-LEVEL 3: paragraph
TEXT-LEVEL 4: section

Nodes are also aligned by INDENTATION level, ranging from zero (no indentation) to the chosen maximum. Each candidate text must adhere to the rules of the text structure, namely that each TEXT-LEVEL consists of one or more nodes of a lower TEXT-LEVEL value (e.g. sections consist of paragraphs and paragraphs consist of text-sentences). It is argued that all candidate solutions could be generated by the addition of two further rhetorical structure node attributes: ORDER (the linear position of the text relative to its peers) and CONNECTIVE (e.g. ‘since’; ‘consequently’). Candidate solutions are obtained through a number of steps:
1. Add TEXT-LEVEL and ORDER attributes to each of the rhetorical structure nodes.
2. Assign domains to both the TEXT-LEVEL and ORDER attributes (i.e. the possible ranges of these attributes for each node).
3. Apply constraints (e.g. the root node should have a higher TEXT-LEVEL than its child nodes).
4. Compute all possible combinations.
5. Compute complete text structures which satisfy the text structure formation rules. This includes adding discourse connectives to either the nucleus or the satellite.
6. Validate all text structures (e.g. ensure each child node has a parent at the TEXT-LEVEL directly above it).

Power also detailed the practical difficulties in computing all possible combinations and the necessity to pre-filter texts before developing their rhetorical and text structures. The current work will look at this and the subsequent works by Bateman and Delin (2001) and Power et al (2003) in isolating the structural and linguistic components of a document.

Bateman and Delin (2001) and Bateman et al (2002a; 2002b) investigated the genre of multimodal documents. These authors contend that documents satisfy communicative goals on five levels:

Content Structure - the structure of the information that is to be communicated.
Rhetorical Structure - rhetorical relationships which exist between content elements (how the content is ‘argued’).
Layout Structure - the nature, appearance and position of document elements.
Navigation Structure - structures used to support the communication of the document’s message (e.g. titles, colour, grouping).
Linguistic Structure - the structure of language used to realise the document.

The authors believe the manner in which a document harmonises these structures imparts to it a certain genre. This genre is further enhanced by its satisfaction of a series of constraints:

Canvas Constraints - constraints arising from the physical nature of the document produced.
Production Constraints - constraints arising from its production technology.
Consumption Constraints - constraints arising from the time, place and manner in which the document’s information is imparted.

Bateman and Delin (2001) believed all document designs could be critiqued using these techniques to yield predictions of their usability, an approach supported by Power et al (2003), who argued that document structure should be defined as distinct elements. Power et al (2003) envisage three such elements: Rhetorical Structure, Text Structure and Abstract Document Structure (Figure 1). Their work builds on Power’s earlier research (2000), in which he identified both the rhetorical structure and the text structure of a document (Figure 7 & Figure 8). The authors argue that the text structure, through its arrangement of text and use of font variations, also imparts a significant graphical component to the text which directly contributes to its meaning. Power et al (2003) argue this third element, termed the “Abstract Document Structure”, must be considered in isolation from the text structure itself.

The authors believe that to view the Abstract Document Structure as a component of text structure limits choices of both layout and wording. Using examples they illustrate how the deferral of abstract document structure (e.g. text layout and font characteristics) until after the formalisation of the text structure creates restrictions on the text and layout possibilities, with text layout decisions taken at a sentence or paragraph level rather than at a document-wide level. In part Power et al (2003) build their arguments on the earlier work of Nunberg (1990), who identified the need to separate the mapping of syntactic structures onto the lexical elements of a text grammar from its functional elements (e.g. font- and face-alterations), differentiating between the concrete features of text structure and the abstract (Chapter 2.4).

Power et al (2003) expanded upon the author’s earlier research (Power, 2000) to illustrate how rhetorical structure could be transposed into an ordered text structure tree and highlighted the problems of doing so (e.g. with ELABORATION). The paper detailed their work on the ICONOCLAST system and the generation of multiple texts/text structures from the same rhetorical structure. ICONOCLAST used a five-stage process to generate natural language:
Planning Module - organises the document into a single rhetorical structure, in which each non-terminal node is represented by a rhetorical relationship and each terminal node by a simple proposition.
Preliminary Module - selects simple propositions from a knowledge base and organises these into arguments.
Document Structurer - distributes these arguments into sections, paragraphs and text-sentences.
Syntax Realiser - formulates the wording of propositions.
Formatter - applies the abstract document structure (e.g. graphical elements, font variations, tabbing).

This step-wise process is very relevant to the NLG tasks necessary for the current research. It illustrates how the project might utilise data fed into a data repository to produce text content for publication within a web site. ICONOCLAST used constraint-resolution methods to formalise the potential valid options for solving the rhetorical structure. Again this method has relevance to the current research and its use of constraint-satisfaction programs, such as SICStus Prolog, to achieve this.

Power et al (2003) determined that through the use of further classifications they would be able to generate all possible document permutations. The authors were unable to provide a means by which the most suitable permutation might be selected, relying on intuition for their simple examples. This is the same issue encountered by Mackinlay (1987) and others in their search for the “effectiveness” of texts. The current research will seek responses from a survey to provide an independent measure of effectiveness for the generated outputs.

Matsushita et al (2003) extended the early work of Mackinlay (1987). The authors considered two factors in chart selection: the type of chart display and the type of user utterance. They reasoned that users are frequently unable to succinctly state how to query the underlying data, particularly for large datasets, but rely instead on exploratory data analysis. Through a series of intermediate query steps and associated graphic representations of the results, users are able to iterate towards a final solution. Each successive query builds upon knowledge learnt from the preceding queries. Matsushita et al (2003) argue that the resulting graphic presented to the user must consider this user interaction in its chart generation. The authors argue that decisions on graphic effectiveness must include the context in which the user poses the query.

2.8 Summary

NLG has been an active area of research since the 1980s. Mann and Thompson’s (1987, 1988) work on RST provided a framework by which discourse could be analysed and described. Mackinlay’s (1987) work proposed a method by which graphic designs could be selected on the basis of composition algebra. Subsequent authors have elaborated and expanded upon these early endeavours.

The current research seeks to determine whether a commercial data source can provide sophisticated page content and whether building user preferences into the resultant pages improves their effectiveness. This work builds on the earlier work of others. Mackinlay (1987) postulated that effectiveness could not be derived theoretically whilst no theorem of human perceptual capabilities existed. Twenty years on this remains the case, and the current research adopts a survey strategy to gauge the views of respondents.

Mann and Thompson’s (1987, 1988) work on rhetorical structure theory will be used within the current research. Through the methods proposed by Reiter and Dale (2000), the lexicon, style and rhetorical structure adopted in manually-authored texts will be documented to form the basis of the target text corpus generated by the prototype through NLG.
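The constraint-satisfaction approach described in Section 2.7 can be made concrete with a toy sketch: enumerate candidate TEXT-LEVEL assignments for a tiny two-node structure and keep only those satisfying one formation constraint. This is an invented illustration, not the ICONOCLAST implementation (which used a constraint-satisfaction system such as SICStus Prolog rather than Python).

```python
from itertools import product

# Toy version of Power's (2000) scheme: assign a TEXT-LEVEL to each node
# of a two-node rhetorical structure (root plus one child) and keep only
# assignments where the root outranks its child.
TEXT_LEVELS = {0: "text-phrase", 1: "text-clause", 2: "text-sentence",
               3: "paragraph", 4: "section"}

def candidate_structures():
    for root, child in product(TEXT_LEVELS, repeat=2):
        if root > child:               # constraint: root outranks its child
            yield (TEXT_LEVELS[root], TEXT_LEVELS[child])

for structure in candidate_structures():
    print(structure)
# e.g. ('text-clause', 'text-phrase'), ('paragraph', 'text-sentence'), ...
```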
Petre’s studies (1995) on multimodal presentations concluded that graphical readership was an acquired skill, with experts able to take advantage of secondary notation cues whilst novices struggled to interpret them accurately. Her assertions support the view that multimodal presentations require customisation to the needs of their audience (Matsushita et al, 2003). The current research investigates the customisation of page presentation and, through detailed analysis of survey responses, identifies those elements regarded as most effective by users.

Bateman and Delin (2001) and Bateman et al (2002a; 2002b) argued for the division of text into five structural levels, whilst Power et al (2003) envisaged a separation of the abstract document structure from its text or rhetorical structure. These divisions have affinities with current mark-up languages (e.g. HTML, LaTeX), founded on the principle of a clear separation between document presentation and discourse structure. The current research uses web-based browser technology as its presentation medium, employing HTML and CSS (Cascading Style Sheets) to isolate the visual components from the underlying structure.
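As a minimal illustration of this separation (not the prototype’s actual templates or style sheet), generated content can carry structural markup only, while a separate, swappable style sheet controls its appearance; the class names and styles below are invented for the example.

```python
# Structure: the NLG output is emitted with class names only.
def render_report(headline, body):
    return (f'<div class="report">'
            f'<h2 class="headline">{headline}</h2>'
            f'<p class="body">{body}</p></div>')

# Presentation: a user-specific style sheet, exchangeable per preference
# without touching the generated content.
USER_CSS = """
.report   { font-family: serif; }
.headline { color: navy; font-size: 120%; }
.body     { line-height: 1.4; }
"""

html = render_report("FTSE100 falls to record low",
                     "Energy stocks closed higher due to rising prices.")
print(html)
```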
Chapter 3 Research Strategies

3.1 Introduction

Before moving on to the primary research it is worth reflecting on the strategies and methods available to the study. Whilst the review of the literature placed the current work in context, it yielded little on the techniques necessary to undertake this endeavour. This chapter reviews these strategies and assesses their applicability.

3.2 Review of Research Strategies

Research activity is, by its very nature, a wide-ranging and varied discipline. Methods vary with the field of study, the intended purpose, the approach adopted and the nature of the work (Sharp et al, 2002). Despite this, each study must be repeatable, verifiable and contribute to the sum of knowledge. These demands create a need for general principles which can be employed, singularly or collectively, to research. Denscombe (2007) reviewed the major strategies available for small-scale social research projects and proposed a grouping of eight categories:
1. SURVEYS
2. CASE STUDIES
3. EXPERIMENTS
4. ETHNOGRAPHY
5. PHENOMENOLOGY
6. GROUNDED THEORY
7. MIXED METHODS
8. ACTION RESEARCH

Robson’s (1983) review added a further category of “Psychological Experiments” to this list.
  • 37. Ian Whitby (P2207964) Page 37 3.2.1 Survey Surveys provide established techniques by which hypotheses can be tested against the actual views or actions of respondents and are widely used in social research projects. Denscombe (2007) summarised these techniques as: postal questionnaires; internet surveys/questionnaires; face-to-face interviews; telephone interviews; documents; observations. Surveys give wide coverage of the problem domain and are often selected to provide a broad range of views and responses on the subject area. Inevitably time or resources limitations mean surveys will contain a subset of the total population (a “sample”). Rowntree (2000) emphasised the difference between surveys which describe or summarise their sample and those which use it to make inferences on the wider population. In the former observations have applicability only to the sample itself whilst in the latter there is an assumption that the sample is representative of the total population and that inferences can be drawn from the sample for every instance of the population. A sample strategy which looks to make inferences on the total population must address the issues of random sampling and skewed distributions within its sample. Surveys focus on a point in time, a snapshot, from which a random sample of views or actions is observed or inferred. Further snapshots can be taken subsequently but the random nature of sampling means these samples are unlikely to contain the same respondents. This effect restricts the ability of surveys to analyse or describe variations over time except at a macro level. Denscombe (2007) also noted their empirical nature, targeting the measurement and recording of tangible attributes. Such characteristic would prove restrictive in a study of human relationships but are
  • 38. Ian Whitby (P2207964) Page 38 well suited to the current project where changes of independent variables cause quantifiable changes in the associated dependant variables. The survey approach is applicable to the current research which focuses on a point in time and seeks to quantify the influence of independent variables on the NLG outputs. The study looks to infer general characteristics from its sample, both for the effectiveness/sophistication of share reports and for web-based NLG documents in general. Implicit in this is the need to target a representative sample, whilst acknowledging that domain experts will be geographically dispersed and represent a small proportion of the total UK population. The study’s focus on web-based reports suggests an internet survey/questionnaire to be suitable to its purposes. 3.2.2 Case Studies In many ways a case study represents the antithesis of the survey, focusing on the specific and studying few instances in detail rather than a higher level examination of the broader population. Case studies tend to focus on relationships and processes, making them particularly appropriate to studies of human sociology and psychology. A case study will often involve monitoring the phenomenon in its normal setting and allowing the researcher to observe interactions between variables. Multiple methods are commonly employed to analyse the phenomenon and results are more qualitative than those of a survey, relying on description rather than statistics to characterise the phenomenon under study. As with surveys a case study may look to infer the applicability of its results beyond the sample. A case study approach is applicable to this research and was initially selected as a means of obtaining feedback. The approach requires a limited number of domain experts to make significant time available to the study (perhaps three sessions of one
hour each). However, early enquiries to share clubs within a 30-mile radius revealed that only a handful of clubs existed and of those only one met regularly and was willing to participate. Further investigation showed that this would not yield sufficient domain experts, with meetings scheduled only monthly and very low attendance figures (usually two or three and occasionally fewer members). Assuming not all members used web-based share reports, this represented an unrewarding line of analysis. The research therefore adopted an internet-based survey/questionnaire approach to gain a large sample population, although this brought attendant issues concerning how representative the sample was of the overall population (Chapter 4).

3.2.3 Experiments

Experiments are characterised by the control they exhibit over their environment and their identification of causal factors. They aim to isolate the study from external influences and manipulate input variables to understand their influence on the resulting outputs. Experiments use empirical measurements and observations to draw conclusions from the results of the study. These conclusions have applicability beyond the experiment itself and, through careful isolation of competing factors, the experiment infers the causes of phenomena observed in the real world.

The current study investigates a complex research area. In measuring and appraising the “effectiveness” of user customisation we need to understand what domain experts understand by “effective”. Furthermore, deciding if a “commercial data source can provide the quality of data required to generate sophisticated web page content” (Chapter 1.3) requires us to test the users’ understanding of “sophisticated”. The experimental approach provides the opportunity to achieve this. It isolates external influences, keeping the underlying data consistent across the experiments
and removing the intangible effects of market sentiment and the psychological pressures associated with share trading. By providing a prototype environment in which causal factors are controlled it is able to focus on the effects the remaining factors have on the results. The internet-based survey/questionnaire forms an integral part of the experiment (Appendix B) and probes respondents on specific cause-and-effect relationships (e.g. “How significant was colour in identifying text groups?”; “How similar was the grammar to that used in the ADVFN reports?”).

3.2.4 Ethnography

Ethnographic studies analyse people and groups through their lifestyles, understandings and beliefs. This strategy has its origins in anthropological research and the study of alien cultures. Ethnography is a detailed examination of human society and social behaviour, often conducted from an insider’s viewpoint. This approach offers little to the current research.

3.2.5 Phenomenology

Phenomenology is a strategy which deals with human perceptions, studying feelings, emotions, beliefs and attitudes. It analyses phenomena through a descriptive approach aimed at interpreting its findings rather than quantifying them. The approach is characterised by seeking to describe and explain phenomena without abstracting, classifying or quantifying them. The current research examines the coherency of web designs and the ability of commercial data to provide sophisticated NLG content. Although elements of this assessment are subjective (e.g. interpretation of designs as they appear to others), phenomenology’s focus on feelings and emotions does not offer anything to the overall aims of this research.
3.2.6 Grounded Theory

Grounded Theory is an approach aimed at the generation of new theories rather than the verification of existing ones. The approach focuses on observations of the real world, building theories around these observations on the basis of empirical research. This strategy offers little to the current research, which looks to challenge Dale et al’s (1998) theory that commercial data is too noisy to generate sophisticated NLG content and Matsushita et al’s (2003) assertion that user preferences lead to improvements in web-page effectiveness.

3.2.7 Mixed Methods

A mixed method strategy emphasises the benefits of an approach which combines quantitative and qualitative analysis to improve research results. A central characteristic of the approach is that the study is not seen as an “either/or” decision between empirical and subjective analysis but as a consistent application of both. Mixed method advocates select research techniques on their ability to clarify the problem domain, not on the basis of their categorisation as belonging to a particular approach. The current study is not specifically designed as a mixed method approach, although the incorporation of a survey within an experiment shows its applicability to the work. Furthermore, the open questions within the survey require an alternative approach to the empirical path taken for the multiple-choice closed questions.

3.2.8 Action Research

Action research is a practical strategy aimed at real-world problems. Both researcher and respondents are influenced by the study and often become active collaborators in achieving its goals. The strategy is geared towards engineering change in real-world
situations and is cyclic in approach, each change feeding back into a cycle of renewed actions. The current research does not seek to establish real-world change. It is expected that users might freely donate a little of their time to the study, but it is unrealistic to expect respondents to become active collaborators in addressing the issue of sub-optimal web designs.

3.2.9 Psychological Experiments

Robson (1983) reviewed strategies for designing psychological experiments and identified the need to differentiate between variables manipulated by the researcher (“independent” variables) and those variables observed by the researcher to determine the effect of this manipulation (“dependent” variables). Independent variables usually represent study inputs whilst dependent variables are its outputs. Within the current research user preferences will act as independent variables whilst the effectiveness of the generated pages is the dependent variable. In this sense the research shows the characteristics of an experimental strategy. Robson (1983) also refers to independent variables whose value is not derived from a numeric scale but through a subjective categorisation (e.g. “big”, “small”, “often”). Such variables do not lend themselves to rigorous quantitative analysis but to more qualitative analysis. The open questions of the research questionnaire are of this nature. The questionnaire ensures respondents have an opportunity to expand on their responses by supplementing the closed questions with a number of open ones. Analysis of these responses shows the characteristics of a mixed method strategy.
3.3 Summary

The study looks at current practices within NLG and examines the use of constraint satisfaction methods. An experimental approach is adopted in which the inputs to a prototype system are carefully controlled and the factors influencing the effectiveness and sophistication of the resulting reports carefully assessed. Existing work in this field (Wahlster et al, 1993; Reiter and Dale, 2000; Power et al, 2003) is used as the basis for the research prototype. The prototype separates the report’s presentational elements from its rhetorical and text structure through the use of cascading style sheets. Users are encouraged to exploit this separation in customising the presentational aspects of the report. The research also adopts a survey approach in eliciting the views of domain experts through an internet-based survey/questionnaire (Appendix B). The prototype dynamically generates a range of web page designs which, to varying degrees, incorporate aspects of user preference.
Chapter 4 Data Acquisition

4.1 Introduction

This chapter extends the research strategies previously outlined to document the specific methods used for the current research. Acquisition techniques were applied both in determining the necessary inputs for the prototype and in gathering user feedback on its outputs. Inevitably much of the input data for the handwritten reports came from sources unavailable to the study (e.g. inflation rates, market sentiment, foreign exchanges). Which text could be reproduced by the prototype was determined through analysis of the rhetorical and textual structure of the handwritten reports, together with the creation of a dictionary of terms used by human authors. An overview of the research prototype is given, along with the commercial data used and the process by which the range of output texts was determined.

4.2 Data Sources

This research study focused on the structure and content of web-based reports aimed at informing the reader of daily share price movements for companies listed on the London Stock Exchange. Reports of this nature already exist and an internet search yields a wealth of information, ranging from the LSE itself (www.londonstockexchange.com) to internet-based news organisations (e.g. http://uk.reuters.com/business/markets; http://money.uk.msn.com) and the on-line presence of traditional broadsheets (http://business.timesonline.co.uk/tol/business/markets/; http://www.ft.com/markets). This project does not attempt to study each of these sources, recognising that the differences in their reporting styles would mask any variations caused by adjustments to the study’s independent variables. Attempts to assess the effectiveness of user customisation or the sophistication of NLG text are unlikely to prove fruitful when
the benchmark against which they are compared represents a collection of varying styles. The current research opted for a single information source, internet company ADVFN (http://www.advfn.com/), to provide both daily versions of their own share reports and access to the underlying LSE data. Although the ADVFN reports and data are publicly available upon payment of a subscription fee, their content is clearly aimed at trading experts, with the web site geared towards monitoring, charting and reporting stock market movements for trading purposes. The study subscribed to ADVFN’s daily market data service (“Level 1” service) to receive a trading summary each evening for all shares listed on the LSE (Table 1).

  Variable  Description
  TIDM      Share identifier (Tradable Instrument Display Mnemonic)
  Opening   Opening price at start of trading day
  High      Highest price achieved during the day
  Low       Lowest price achieved during the day
  Closing   Closing price at end of trading day
  Volume    Volume traded during the day

Table 1 LSE Data Received

The prototype provided access to LSE data and ADVFN reports across a three-month trial period (Sep to Nov 2008) to simulate a wide variety of actual market conditions. Examples of an ADVFN share report and the Level 1 data are provided in Appendix A. It should be noted that, whilst not required for the aims of this study, the prototype could also have received market data in real-time (“Level 2” service), allowing share reports to change dynamically throughout the trading day.

4.3 Prototype Design

The study employed a variety of technologies to produce a prototype capable of addressing the aims of the research. The Level 1 data was manually downloaded from ADVFN each evening as comma-separated files and uploaded into a MySQL
database (Figure 9). The ADVFN data values were unchanged by this process, with no validation, error correction or data manipulation performed during committal to the Daily_Values table. Subsequent use of this data was also read-only to ensure all experiments were repeatable, albeit with minor variations in the actual texts produced due to the dynamic nature of NLG. FTSE100 index values were committed to the Index_Values table by a similar process, whilst the remaining tables were populated through a one-off data entry exercise.

Figure 9 Prototype - Entity Relationship Diagram

The database also illustrates future design considerations. Whilst the Indexes table contained only a single entry for the FTSE100 it could easily be extended to other market indexes. Similarly the views (labelled “VW_”) allowed market trends to be analysed over time, a feature not implemented due to time constraints.
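To make the nightly load step described above concrete, the following is a minimal sketch in PHP, the prototype’s glue language. It is illustrative only, not the prototype’s actual code: the file name, the date handling, the TradeDate column and the connection details are all assumptions, with the remaining column names taken from Table 1.

    <?php
    // Commit one evening's Level 1 download to MySQL exactly as received:
    // no validation, error correction or manipulation of the ADVFN values.
    $pdo = new PDO('mysql:host=localhost;dbname=m801', 'user', 'pass');
    $insert = $pdo->prepare(
        'INSERT INTO Daily_Values (TradeDate, TIDM, Opening, High, Low, Closing, Volume)
         VALUES (?, ?, ?, ?, ?, ?, ?)'
    );

    $tradeDate = '2008-09-19';                    // date of the downloaded file
    $fh = fopen('level1_2008-09-19.csv', 'r');    // hypothetical file name
    fgetcsv($fh);                                 // skip the header row, if present
    while (($row = fgetcsv($fh)) !== false) {
        // $row is assumed to hold TIDM, Opening, High, Low, Closing, Volume
        $insert->execute(array_merge(array($tradeDate), $row));
    }
    fclose($fh);

Because subsequent use of the table was read-only, re-running the experiments replays exactly the data committed by this step.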
The general-purpose scripting language PHP provided the application “glue” between the database, the SICStus Prolog constraint satisfaction program and the HTML web pages (Figure 10). From the HTML interface, individual display/data preferences were recorded as transient session variables, lost on termination of the session to ensure respondent anonymity. Share reports were invoked from the “Reports” page (Appendix C) by simply choosing the required date, with PHP querying the database for all share values and FTSE100 movements relating to this date.

[Figure 10: report date and display/data preferences flow from the HTML pages through the PHP website generation layer to the database and the Prolog sentence generator; sentence “templates” are merged with the share data to produce the report sentences and the final share report.]

Figure 10 Prototype – High Level Design

The script interrogated the direction of movement of the FTSE100 on the chosen day and, based on whether the index was rising, falling or static, altered the lexicon of terms employed by the SICStus Prolog constraint satisfaction program to build its outline sentences. These sentence templates were subsequently merged with the data values queried previously and the most significant movements built into complete sentences. The PHP then queried the user’s data and display preferences, adjusting the content and presentation layout of the NLG report before presenting it at the browser.
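The flow just described might be sketched as follows. This is a simplified, illustrative fragment rather than the prototype’s code: the real system delegated sentence construction to the SICStus Prolog constraint program, whereas here a single template stands in for that step, and all identifiers (the function name, session key and word lists) are assumptions.

    <?php
    // Preferences are transient session variables, lost when the session ends.
    session_start();
    $style = isset($_SESSION['display_prefs']) ? $_SESSION['display_prefs'] : 'default';

    // Choose a lexicon according to the FTSE100's direction on the chosen day.
    function chooseLexicon($ftseChange) {
        if ($ftseChange > 0) return array('verb' => 'gained', 'order' => 'risers-first');
        if ($ftseChange < 0) return array('verb' => 'shed',   'order' => 'fallers-first');
        return array('verb' => 'held at', 'order' => 'risers-first');
    }

    // Merge an outline sentence "template" with queried share values.
    $lex      = chooseLexicon(-34.5);     // e.g. the FTSE100 fell 34.5 points
    $sentence = sprintf('%s %s %.1f pence to close at %s',
                        'HBOS', $lex['verb'], 24.0, '307p');

    // Presentation preferences are applied last, here via a CSS class,
    // reflecting the style-sheet separation described in Chapter 3.3.
    echo '<p class="' . $style . '">' . htmlspecialchars($sentence) . '</p>';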
The prototype encouraged respondents to experiment with the site (www.ianwhitby.co.uk), revise their preferences and assess the effect on the generated reports. Each NLG web page included a hyperlink to an equivalent handwritten report, displayed in a separate browser window, for comparison with the prototype’s output (Figure 11).
Figure 11 Comparison of NLG and Manual Reports

The prototype design shows simple navigation between pages, engendering a sense of workflow (Figure 12). The principal pages in this workflow were accessed through a tabbed menu structure, with the feedback questionnaire marking the culmination of a user’s interaction with the system.

[Figure 12: navigation between the Home Page, User Preferences, NLG Reports and Survey/Questionnaire pages.]

Figure 12 Prototype - Workflow

4.4 Questionnaire

The research questionnaire (Appendix B) illustrated a number of characteristics:

1. The number of questions was limited to twenty-three to encourage respondents to fully complete the questionnaire. As Denscombe (2007) cautioned: “…there is, perhaps, no more effective deterrent to answering a questionnaire than its sheer size”.

2. Formats were standardised across the questionnaire to ease comprehension.

3. Respondents were not asked to log in or provide any information by which they might be identified. The author believes the response rate would have been considerably lower had respondents been required to identify themselves.
4. Most questions were multiple choice and closed in style (i.e. answers were restricted to the options provided). This format was chosen to allow respondents to fully complete the questionnaire with minimal effort.

5. Multiple-choice answers were presented as a 5-star rating system accompanied by supporting descriptions to aid comprehension.

6. Three open questions were provided for more expansive answers.

7. All questions were optional, the author choosing to allow partially completed questionnaires in the results as the subject matter was deemed likely to cause omitted answers. The author considered that forcing respondents to complete all questions would significantly lower the response rate. The responses received support this view, with five of the seventeen responses received failing to fully answer the closed questions. Rejecting these results from the analysis would have excluded a significant proportion of the sample (29%).

8. With no login procedure there was no limit on the number of responses an individual could submit, and the prototype did not validate the answers received. This decision was taken to increase the response rate but did increase the risk of accidental or deliberate contamination of the sample.

9. Responses were written to a file on the server in a format that could be directly loaded into the database for statistical analysis, reducing the risk of transcription errors (see the sketch below).
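A minimal sketch of point 9 might look like the following PHP fragment. The file path, the field naming (q1 to q23) and the convention that unanswered questions are stored as blanks are illustrative assumptions, not details taken from the prototype.

    <?php
    // Append one anonymous questionnaire response to a flat file whose column
    // order mirrors the database table, so the file can be bulk-loaded later
    // (e.g. with MySQL's LOAD DATA INFILE) without manual transcription.
    $fields = array();
    for ($q = 1; $q <= 23; $q++) {
        $key = 'q' . $q;
        $fields[] = isset($_POST[$key]) ? $_POST[$key] : '';  // blank = question skipped
    }
    $fh = fopen('/var/data/responses.csv', 'a');
    fputcsv($fh, $fields);
    fclose($fh);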
Prior to the trial period the author conducted a pilot study with work colleagues within the Technical Assurance/Design Authority department at National Grid. The feedback received led to improvements in both the questionnaire and the site itself. The prototype was subsequently published and feedback sought from individuals and organisations with appropriate share trading knowledge (Table 2).

  Individual/Organisation | Potential Number of Respondents | Method of Communication
  Energy Trading Department – National Grid | 10 | Face-to-face communication and follow-up meetings.
  Share trading clubs (generally internet-based) | 200 | Email to the Secretary of each club requesting he/she forward the request to the members. (N.B. Twenty clubs were contacted; the figure assumes an average membership of ten.)
  Family and Friends | 40 | Letters encouraging those with share trading experience to respond to the survey or to forward the link on to others.
  Work Colleagues | 100 | Email encouraging those with share trading experience to respond to the survey or to forward the link on to others.
  M801 Students | 30 | Message posted on the M801 Chat Forum encouraging those with share trading experience to respond to the survey.
  Total | 380 |

Table 2 Users Targeted by Survey

From the target sample of 380 users it was felt 25-30 responses could be achieved. National Grid traders and share club members would comprise a high proportion of domain experts and a 10% response rate was envisaged, whilst a 5% response from the other groups would give a total sample of approximately 30 (10% of 210 is 21 responses, plus 5% of 170 is roughly 9).
The anonymous nature of the responses was identified as a potential problem. As Denscombe (2007) pointed out: “Questionnaires offer little opportunity for the researcher to check the truthfulness of the answers given by the respondents”. The author appraised this risk and chose to accept it rather than opt for the low response rates achievable through face-to-face interviews (Chapter 3.2.2). As mitigation the following steps were taken:

1. The target sample comprised domain experts operating in environments where the aims of the research were likely to be well received.

2. The target sample was sufficiently large for a number of erroneous responses to be accommodated without undermining the results.

3. Specific questions were asked to identify domain experts:

  How many years have you been trading shares?
    > 10 years / 5 - 10 years / 3 - 5 years / 1 - 3 years / 0 years
  How regularly?
    Daily / Weekly / Monthly / Rarely / Never
  Which media type(s) do you use for your share reports?
    Newspaper/Magazine / Television / Teletext / Internet / Datafeed / Other
  How regularly do you read share reports?
    Daily / Weekly / Monthly / Rarely / Never

Figure 13 Survey – Trading Experience

4. Additional “expert” questions were posed, requiring domain knowledge to answer correctly, to provide a qualified view of the respondent’s experience:
  What do you understand by the "AIM" market?
    A listing of the top 100 companies within the Main Market /
    A listing on the LSE for smaller companies which offers less regulation /
    A listing of defence stocks /
    The Asian Investment Market /
    Don't Know
  What do you understand by "Going Long"?
    Buying futures which commit you to buy the asset at a set price in the future /
    Selling futures which commit you to sell the asset at a set price in the future /
    Buying shares as the price of the share falls /
    Selling shares as the price of the share falls /
    Don't Know

Figure 14 Survey – Domain Experts

The responses received were subjected to a series of statistical and analytical tests and displayed no evidence of malicious misuse of the prototype.

4.5 Analysis of Handwritten Reports

ADVFN share reports are published twice daily in HTML format and emailed to subscription members. The reports are derived from knowledge of LSE share movements and an understanding of political and economic influences. The abstract document structure of the reports remained broadly consistent across the trial period, as shown by the wireframe below (Figure 15).
[Figure 15 wireframe, placeholder (Lorem ipsum) filler text omitted. The wireframe shows a sample report header (01 Aug 2008 17:30, ADVFN III Evening Euro Markets Bulletin, “Daily world financial news from Thomson Financial News, supplied by advfn.com”) and identifies the report’s structural elements and their presentation: Report Title; Report Date (right-hand side, bold font); LSE Section Heading (bold, underlined, larger font size); LSE Headline (bold font, standard size); FTSE100 Content (index movements and editorial, standard font weight and size); Market Analysis Content (political and economic factors affecting global markets); Trend-following FTSE100 Shares (individual share movements within the LSE plus editorial content on market trends and sentiment, consisting exclusively of shares showing significant movements in the same direction as the FTSE100); Trend-bucking FTSE100 Shares (if the previous group consisted of rising stocks then this group consists of falling stocks, and vice-versa); Non-FTSE100 Shares (information on LSE stocks outside the FTSE100, generally structured as shares which followed the FTSE100 trend followed by those which bucked it, although the abstract data structure is less rigid than in earlier sections of the report).]

Figure 15 ADVFN Report - Abstract Document Structure

All reports began with the report title at the top of the page and the report date in the top right-hand corner. All text was justified to align with the left and right-hand margins and, with the exception of the LSE section heading, used the same font size and style
throughout. The LSE section heading differed from the rest of the report in having larger, underlined text in blue. The LSE headline, in common with all proper nouns, was displayed in bold typeface and demonstrated a simple rhetorical structure aimed at summarising the day’s share movements. Using 13th August 2008 as an example, the LSE headline shows a simple nucleus of “London shares close lower” supported by elaborations of additional information (Figure 16).

[Figure 16: the nucleus “London shares close lower” linked by Elaboration relations to the satellites “Wall Street drops” and “U.S. retail sales fall”.]

Figure 16 RST Example for LSE Headline

Beneath the LSE headline the ADVFN report positioned a paragraph detailing movements of the FTSE100 index and followed this with several paragraphs of market analysis. Below these, movements of FTSE100 stocks which echoed the trend of the FTSE100 were detailed. This section generally reported the top 10 to 15 movers, providing price information and market analysis. This trend-following section was followed by details of the top 10 to 15 movers in the opposite direction. The report ended with a roundup of shares outside of the FTSE100. Figure 17 illustrates these characteristics for the 13th August 2008 evening report.
[Figure 17: the 13th August 2008 evening report annotated with the structural elements of Figure 15 – Report Title, Report Date, LSE Section Heading, LSE Headline, FTSE100 Content, Market Analysis Content, Trend-following FTSE100 Shares, Trend-bucking FTSE100 Shares and Non-FTSE100 Shares.]

Figure 17 ADVFN Report (13th Aug 2008)
From this abstract document structure it is apparent that the reports cover both the highest risers and fallers whatever the overall direction of the market; only the order in which they are reported changes. The research prototype likewise used two distinct lexicons, one for rising stocks and another for fallers; only the order in which they were used changed, reflecting the direction of the FTSE100. The prototype did not have the relevant inputs to generate the “Market Analysis Content” section of the report and, in fact, it is questionable whether a section with such high editorial content could be generated through constraint satisfaction methods. This section did not appear in the resulting NLG reports.

The research centred on ADVFN’s evening reports, which provided information directly comparable to the Level 1 service data. However, such daily updates of share movements prevented the research prototype from reporting intra-day share activity. For example, the ADVFN report for the 4th August 2008 stated:

“The FTSE100 index closed down 34.5 points to 5,320.2, having retreated from a morning peak of 5,414.7” (ADVFN report – 4th August 2008)

The research prototype could not produce this type of information as the satellite text span (“having retreated from a morning peak of 5,414.7”) required knowledge of intra-day trading activity. The research prototype was limited to the Level 1 data outlined in Chapter 4.2, a much narrower range of information compared to that available to the human authors of the ADVFN reports. The prototype was unable to draw on an understanding of market sentiment, external markets, or political or economic factors in its report production. The target text generated for the LSE headline above (Figure 17) is reduced to “London shares close lower”.
Similarly, the grouping of stocks into market sectors is unachievable with the information to hand. Human authors bundle the reporting of related stocks into cohesive phrases:

“Among the casualties, HBOS was off 24 pence at 307, Royal Bank of Scotland was down 15-3/4 pence at 229-3/4, Barclays was 27 pence lower at 355-1/2, and Lloyds TSB shed 20-1/2 pence to 308-1/2.” (ADVFN report – 13th Aug 2008)

The prototype, with no knowledge of the allocation of stocks to market sectors, is unable to achieve this bundling and reports each stock as a separate, unrelated sentence. Similarly, expressing the price movements of a group of stocks as a range, or describing them as a list of related items, is unachievable. Handwritten reports employ such techniques to add fluency:

“Kazakhmys, Vedanta and Xstrata slid between 11.8 and 17.9 cent” (ADVFN report – 19th Nov 2008)

“Oil service groups Petrofac and Wood Group were up 23-1/2 at 544 and 17 at 397, respectively…” (ADVFN report – 14th Aug 2008)

The prototype is unable to perform these associations from the data available to it, with “respectively” and “between” becoming redundant in its lexicon of terms. A further differentiator between the handwritten reports and the NLG output is the latter’s lack of elaboration and supporting content. The ADVFN reports use additional text to support primary statements in a manner which cannot be achieved by the prototype. The following sentence is typical of those employed by human authors:
“Upmarket residential property group Savills was the top FTSE 250 faller” (ADVFN report – 13th Aug 2008)

In rhetorical structure terms such phrases represent an elaboration pattern for which the prototype can supply the nucleus (“Savills was the top FTSE 250 faller”) but not the satellite text (“Upmarket residential property group”):

[Figure 18: the nucleus “Savills was the top FTSE 250 faller” linked by an Elaboration relation to the satellite “Upmarket residential property group”.]

Figure 18 RST Elaboration in Handwritten Texts

In fact, even the nucleus of this particular construct cannot be produced by the current prototype. Although it has knowledge of the stocks constituting the FTSE100, the same is not true for alternative indices such as the FTSE250. A decision was taken to limit the prototype’s scope to the FTSE100 and, whilst they could readily be incorporated at a later date, alternative indices are not included in the current implementation. The “Non-FTSE100 Shares” section of Figure 17 is thus not created by the prototype.
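One way to picture this limitation is as a span whose satellite slot simply cannot be filled from Level 1 data. The following PHP representation is an illustrative assumption, not the prototype’s internal format:

    <?php
    // An RST elaboration span: the nucleus can be generated from Level 1
    // data, but the satellite requires world knowledge (sector, index
    // membership, sentiment) that the prototype does not possess.
    $span = array(
        'relation'  => 'elaboration',
        'nucleus'   => 'Savills was the top FTSE 250 faller',
        'satellite' => null,   // "Upmarket residential property group" in the handwritten text
    );

    // Render the satellite before the nucleus when present; otherwise
    // the nucleus must stand alone, as in the prototype's output.
    echo $span['satellite'] !== null
        ? $span['satellite'] . ' ' . $span['nucleus']
        : $span['nucleus'];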
4.6 Vocabulary of a Volatile Market

The study period of September to November 2008 proved one of the most eventful in LSE history, with sizeable daily swings in market values and a strong downward trend in the FTSE100 as the UK economy headed towards recession (Figure 19).

Figure 19 FTSE100 Daily Values and 5-day Trend Line

The current research evaluated the words/terms used in handwritten reports, and their rhetorical/textual structure, in August 2008 prior to the trial. This work developed a lexicon of terms and phrases adopted in the subsequent trial period for the computer generation of reports. August represented a stable trading period for the FTSE100, illustrated by the small variations in daily values and the steady 5-day rolling average to the left of Figure 19. By the start of the trial, market conditions had worsened significantly, with increased market volatility and the FTSE100 rapidly losing 20-25% of its value (shown to the
right of Figure 19). Critically, this research needed to determine whether human authors continued with August’s words and phrases through the trial period, since the prototype was unable to alter its lexicon of terms once the study was underway and questionnaire results were being received. The following sections examine whether the changing market conditions led to changes in the vocabulary used by human authors, and the consequences of this.

Table 3 shows that during August the FTSE100 showed negligible changes in value for one in five trading days. By the trial period, only 8% of days showed variations of less than 0.5%, due to the increased market volatility.

                                   Pre-Trial (Aug 2008)   Trial (Sep – Nov 2008)
  FTSE100 Trend                    Total     % of Total   Total     % of Total
  Upward (daily rise ≥ 0.5%)       6 days    30%          25 days   38%
  Stable (daily movement < 0.5%)   4 days    20%          5 days    8%
  Downward (daily fall ≥ 0.5%)     10 days   50%          35 days   54%

Table 3 FTSE100 Daily Trends

The research categorised trading on the LSE into days exhibiting upward, downward or stable FTSE100 trends. For both the pre-trial and trial periods the lexicon of terms used by handwritten reports within each category was analysed for randomly chosen dates (Table 4).
                                   Pre-Trial (Aug 2008)          Trial (Sep – Nov 2008)
  FTSE100 Trend                    Chosen Date   FTSE100 Change  Chosen Date   FTSE100 Change
  Upward (daily rise ≥ 0.5%)       5th Aug 08    2.5%            19th Sep 08   8.8%
                                   8th Aug 08    0.6%            13th Oct 08   8.3%
                                   14th Aug 08   0.9%            20th Oct 08   5.4%
  Stable (daily movement < 0.5%)   12th Aug 08   -0.1%           13th Nov 08   -0.3%
                                   21st Aug 08   -0.0%           25th Nov 08   0.4%
                                                                 26th Nov 08   -0.4%
  Downward (daily fall ≥ 0.5%)     1st Aug 08    -1.1%           29th Sep 08   -5.3%
                                   4th Aug 08    -0.6%           22nd Oct 08   -4.5%
                                   13th Aug 08   -1.6%           19th Nov 08   -4.8%

Table 4 FTSE100 Sample Days
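The categorisation behind Tables 3 and 4 reduces to a single threshold rule. A minimal sketch follows; the function name is an assumption, while the boundary conditions are taken from the trend definitions in Table 3.

    <?php
    // Classify a trading day from the FTSE100's percentage change:
    // a movement of at least 0.5% in either direction marks a trend.
    function classifyDay($pctChange) {
        if ($pctChange >= 0.5)  return 'upward';
        if ($pctChange <= -0.5) return 'downward';
        return 'stable';
    }

    echo classifyDay(8.8);    // "upward"   (19th Sep 08)
    echo classifyDay(-0.3);   // "stable"   (13th Nov 08)
    echo classifyDay(-4.8);   // "downward" (19th Nov 08)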
The handwritten reports were analysed for each of the sample days to create corpus texts (Reiter and Dale, 2000) of the words and terms used by human authors. This exercise was undertaken for each FTSE100 trend before and during the trial, and these corpus texts were further refined to remove elements not directly available to the prototype or which could not be computed from its inputs. For the purposes of comparing the vocabulary used before and during the trial, these target text corpora were consolidated into two lexicons of terms, one pre-trial, the other spanning the trial period. Analysis of these lexicons illustrated tangible differences between the pre-trial vocabulary and that used during the actual trial.

During the stable pre-trial period the handwritten reports showed little variation in grammar. Reports adopted a restricted vocabulary which, the author contends, reflected the restricted movements within the market. Sentences proved heavily reliant upon a few verbs and a limited range of adjectives. Figure 20 illustrates, for the target text corpus, the proportional use of verbs across all of the sample day reports in the pre-trial period, and Figure 21 the equivalent use of adjectives. Any verb or adjective whose use accounted for more than 1% of the total verbs or adjectives within the target text corpus was plotted as an axis. The distance from the centre illustrates the relative use of the verb or adjective against the total used in sample days in the pre-trial period.

[Figure 20: radar chart of the verbs used in manual reports before the trial (axes: is, close, fall, shed, add, gain, drop, lose, rise, take, jump, slide, slip, surge, firm, rally, ease, tick up, climb, end, advance, lead, leap, glance, have, sag, sink, other; scale 0% to 20%).]

Figure 20 Pre-Trial ADVFN Reports - Verbs
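The 1% axis rule described above amounts to a simple proportional count over the corpus. The fragment below is purely illustrative (the word list, storage format and threshold handling are assumptions; the study’s own counting formed part of the manual corpus analysis):

    <?php
    // Count each verb's share of all verb occurrences in a target text
    // corpus and keep those above the 1% threshold used for the chart axes.
    $verbs  = array('close', 'fall', 'shed', 'close', 'gain', 'fall' /* ... */);
    $counts = array_count_values($verbs);
    $total  = count($verbs);

    $axes = array();
    foreach ($counts as $verb => $n) {
        $share = $n / $total;
        if ($share > 0.01) {              // >1% of all verbs earns an axis
            $axes[$verb] = round(100 * $share, 1);
        }
    }
    arsort($axes);                        // most heavily used verbs first
    print_r($axes);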
[Figure 21: radar chart of the adjectives used in manual reports before the trial (axes: down, up, higher, lower, off, on, ahead, among, stronger, weaker, mixed, top, biggest, soared, battered, respectively, low, plummeting, other; scale 0% to 20%).]

Figure 21 Pre-Trial ADVFN Reports - Adjectives

Over the trial period the lexicon expanded, showing less reliance upon stock terms and phrases. Figure 22 illustrates verbs used within the target text corpus during this period. The heavy reliance upon “is”, “close”, “fall”, “shed”, “add” and “gain” was balanced by a broader range of verbs and a wider vocabulary.
[Figure 22: radar chart of the verbs used in manual reports during the trial, plotted on the same axes as Figure 20.]

Figure 22 Trial ADVFN Reports - Verbs

The contrast between verbs used during the pre-trial period and the trial period itself is shown in Figure 23. The pre-trial vocabulary shows little variety in describing the FTSE100 or the movements of individual shares, whilst the volatile trading of the trial period coincides with an expanded verb base and more emotive terms. Verbs such as “jump”, “slip”, “surge”, “leap”, “sag” and “sink” gain significance within the lexicon in an attempt to effectively capture the magnitude of changes experienced by the markets.
[Figure 23: radar chart overlaying the pre-trial and trial verb distributions on the same axes as Figure 20.]

Figure 23 Trial/Pre-Trial Comparison - Verbs

The comparison (Figure 23) also illustrates that, despite a broadening of the vocabulary, the reports continue to draw heavily on verbs used in the pre-trial reports. For adjectives the same holds true: the early dependence on “up”, “down”, “higher” and “lower” is maintained across the trial period (Figure 24 and Figure 25), but more expressive terms (“battered”, “soared”, “plummeting”) begin to enter the lexicon.
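This pattern of carry-over plus newcomers can be made explicit with two set operations. The abbreviated word lists below are drawn from the prose above purely for illustration and do not reproduce the study’s full lexicons:

    <?php
    // Compare the consolidated adjective lexicons: which terms persisted
    // from the stable pre-trial period, and which entered during the trial.
    $preTrial = array('up', 'down', 'higher', 'lower', 'off', 'on');
    $trial    = array('up', 'down', 'higher', 'lower', 'off', 'on',
                      'battered', 'soared', 'plummeting');

    $retained  = array_intersect($trial, $preTrial);  // up, down, higher, lower, off, on
    $newcomers = array_diff($trial, $preTrial);       // battered, soared, plummeting
    print_r($newcomers);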