An exploration of the steps necessary to prepare a corpus of (mainly) eighteenth-century correspondence and make it available for interactive exploration of linguistic and stylistic features.
Exploring rhetoric in the Electronic Enlightenment
1. Digital Stylistics in Romance Studies and Beyond
University of Würzburg, Germany
Friday 29th
February 2019
Exploring Rhetoric in the Electronic Enlightenment
Martin Wynne
martin.wynne@bodleian.ox.ac.uk
Bodleian Libraries &
Faculty of Linguistics, Philology and Phonetics,
University of Oxford
National Coordinator, CLARIN-UK
2. Exploring Rhetoric in the Electronic Enlightenment
(eventually)
1) The Electronic Enlightenment
2) The FrEE Corpus in CQPWeb
3) NLP for Eighteenth Century French
4) NLP for historical texts
5) Using FrEE in CQPWeb
6) Exploring rhetorical figures in EE
7) Future plans
6. The dataset
Electronic Enlightenment: a collection of scholarly
digital editions of correspondence.
The collection started with Voltaire and Rousseau, expanded
to other Enlightenment thinkers and writers, and has since
grown to cover other eras, languages and domains. It now
covers 80 000+ letters and 15 000+ people.
A University of Oxford project, initially part of the
Voltaire Foundation, it moved to the Bodleian Libraries in
2008. Access is via institutional subscription managed
by Oxford University Press.
8. Challenges and opportunities
Mon papa m'a fait cette grâce de me comander d'estre son secrettaire
ce premier d'année, et vous tesmoigner les humbles respects de nostre
maison, avec les veux et prières que nous faisons pour vostre
prospérité, santé, bonheur et satisfacion, qui ne sont en doutte de vostre
costé eu égard à nous. Il vous suplie, madame ma cousine, le croire
toujours bon parent et ne vous despartir de l'affecion que vous devez à
sa famille, et moy, le secrettaire, je finiray en me disant, et Zozo,
Vos très humbles et respectueux cousins,
Zozo
Arouet
https://doi.org/10.13051/ee:doc/voltfrVF0850001a1c
9. The opportunity...and the problem
We want to make new forms of research possible by
offering freely available online search and exploration
of our text collections ...
… but in order to search historical French texts
effectively, users need to be able to find inflected
forms and variant spellings. How can we make that
search possible?
10. What are we ultimately aiming for?
Ways to combine close reading with big data approaches.
11. The search for existing resources and prior art
Where are software applications to work
with French texts from these periods?
Where are digital texts in French from the
17th and 18th centuries in original
spellings?
Where are computational lexicons with
older variants?
12. Some preliminary conclusions
Most French digital historical texts
in circulation have modernized
spellings. (They fail the “Horace-
foibleſſe” test – do they have the
original “foibleſſe” or the modernized
“faiblesse” in the digital text?)
Therefore most software tools have
not had to deal with original spellings.
More research required, but it appears
that few if any tools have been created
or adapted to deal with older forms of
French.
13. THE PROBLEM:
it’s difficult to search for words across time periods
and in historical collections
THE SOLUTION:
search can be enriched by also searching for variant forms
of the search term, inflections, and parts of speech
OPTIONS
1) fix the search algorithm
2) fix the corpus
FIX THE SEARCH
Develop or customize lexically-aware
search algorithms
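As a rough illustration of what a lexically-aware search could look like, the sketch below expands a modern search term with known historical variants before matching. The mini-lexicon `VARIANTS` and the function names `expand_query` and `search` are hypothetical and chosen only for this example; a real system would need a much larger lexicon of attested forms.

```python
# Hypothetical mini-lexicon (an assumption for illustration only):
# modern form -> historical variant spellings.
VARIANTS = {
    "être": ["estre"],
    "témoigner": ["tesmoigner"],
    "faiblesse": ["foiblesse", "foibleſſe"],
}


def expand_query(term, variants=VARIANTS):
    """Return the search term followed by any known variant spellings."""
    forms = [term]
    for variant in variants.get(term, []):
        if variant not in forms:
            forms.append(variant)
    return forms


def search(tokens, term, variants=VARIANTS):
    """Return the positions of tokens matching the term or one of its variants."""
    wanted = {form.lower() for form in expand_query(term, variants)}
    return [i for i, tok in enumerate(tokens) if tok.lower() in wanted]


# Tokens taken from the letter quoted on the "Challenges and opportunities" slide.
tokens = "me comander d' estre son secrettaire".split()
hits = search(tokens, "être")  # positions where "être" or a variant occurs
```

With this approach the corpus stays untouched and all lexical knowledge lives in the search layer, which is why every query depends on the coverage of the lexicon.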
OPTIONS
1) Apply customized NLP tools to the historical
corpus
2) Normalize the corpus, then use existing NLP
tools
DOMAIN ADAPTATION
Direct application of NLP tools to historical
varieties
● good for unrestricted text
● customization, training models, or software
development are necessary
● gold standard training data required
● every tool has to be customized or
developed
MODERNIZATION
Translation of variant forms in the
corpus to a modern form, then
annotation
● good for a fixed and stable corpus
● off-the-shelf NLP tools can be applied to the
modernized corpus
OPTIONS
Two ways to do modernization:
1) Machine translation
2) Word by word annotation with
modernized forms
TRANSLATION
involves retokenization, complex
mappings, elaboration and
semantic change; difficult to
match original and
translation
MODERNIZATION
Word by word annotation with
modernized forms maintains integrity
and alignment with the original text
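A minimal sketch of word-by-word modernization, assuming a simple lookup table from historical to modern spellings. The table `MODERN` and the function `annotate` are hypothetical names introduced for this example. The point it illustrates is that each original token keeps exactly one modernized counterpart, so the two layers stay aligned.

```python
# Hypothetical lookup table (an assumption for illustration only):
# historical spelling -> modernized spelling.
MODERN = {
    "comander": "commander",
    "estre": "être",
    "tesmoigner": "témoigner",
}


def annotate(tokens, table=MODERN):
    """Pair every original token with a modernized form, one-to-one.

    Tokens that are not in the table are copied through unchanged,
    so the output always has the same length as the input.
    """
    return [(tok, table.get(tok, tok)) for tok in tokens]


original = "me comander d' estre son secrettaire".split()
pairs = annotate(original)
```

Because nothing is retokenized or reordered, position `i` in the modernized layer always refers to position `i` in the original text.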
ANNOTATION
Add further layers of
annotation with standard
NLP tools (lemma, POS,
etc.)
USING LINGUISTIC
ANNOTATION TO IMPROVE
USABILITY OF HISTORICAL
CORPORA FOR
RESEARCHERS
Martin Wynne, Electronic Enlightenment
and Oxford Text Archive, Bodleian
Libraries, University of Oxford
martin.wynne@bodleian.ox.ac.uk
FIX THE CORPUS
Put the lexical information in
the corpus
15. <text id="voltfrVF0850001a1c">
<p>
Paris Paris NAM Paris
, , PUN ,
le le DET:ART le
29 29 NUM @card@
décembre décembre NOM décembre
1704 1704 NUM @card@
</p>
<p>
Madame Madame NOM Madame
et et KON et
très très ADV très
honorée honorée ADJ honoré
cousine cousine NOM cousin
, , PUN ,
</p>
<p>
Mon Mon DET:POS mon
papa papa NOM papa
m' m' PRO:PER me
a a VER:pres avoir
fait fait VER:pper faire
cette cette PRO:DEM ce
grâce grâce NOM grâce
de de PRP de
me me PRO:PER me
comander commander VER:infi commander
d' d' PRP de
estre être VER:infi être
son son DET:POS son
secrettaire secrettaire NOM secrettaire
ce ce PRO:DEM ce
premier premier NOM premier
d' d' PRP de
année année NOM année
, , PUN ,
et et KON et
vous vous PRO:PER vous
tesmoigner témoigner VER:infi témoigner
les les DET:ART le
humbles humbles ADJ humble
respects respects NOM respect
de de PRP de
nostre nostre DET:POS notre
maison maison NOM maison
16. This annotation allows us to search by lemma, POS, original form, or modernized form
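To make the idea concrete, here is a small sketch that reads a verticalized sample like the one on the previous slide and filters it by any annotation layer. It assumes four whitespace-separated columns in the order original form, modernized form, POS tag, lemma, which is inferred from the sample rather than documented. The names `Token`, `parse_vertical` and `find` are hypothetical.

```python
from typing import NamedTuple


class Token(NamedTuple):
    word: str    # original spelling
    modern: str  # modernized spelling
    pos: str     # part-of-speech tag
    lemma: str   # lemma


# A few lines in the same shape as the sample shown on the slide.
SAMPLE = """\
<p>
me me PRO:PER me
comander commander VER:infi commander
d' d' PRP de
estre être VER:infi être
son son DET:POS son
secrettaire secrettaire NOM secrettaire
</p>
"""


def parse_vertical(text):
    """Parse one-token-per-line text, skipping blank lines and XML tags."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("<"):
            continue
        fields = line.split()
        if len(fields) == 4:  # ignore lines that do not fit the assumed layout
            tokens.append(Token(*fields))
    return tokens


def find(tokens, **criteria):
    """Return tokens whose attributes equal all the given values."""
    return [t for t in tokens
            if all(getattr(t, key) == value for key, value in criteria.items())]


toks = parse_vertical(SAMPLE)
by_lemma = find(toks, lemma="être")          # finds the original form "estre"
by_modern = find(toks, modern="commander")   # finds the original form "comander"
infinitives = find(toks, pos="VER:infi")
```

In a CQP-based tool the equivalent query would look something like `[lemma="être"]`, assuming the corpus attribute is actually named `lemma`.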
17. Rhetorical Figures (after Lanham 1991)
ellipsis: omission of a word easily supplied
parelcon: addition of superfluous words
congeries: word heaps
digestion: an orderly enumeration of points to be discussed
epexegesis: adding words or phrases to clarify or specify further
epicrisis: the speaker quotes a passage and comments on it
epimone: frequent repetition of a phrase or question, in order to dwell on a point
synonymia: amplification by synonym
antimetabole: inverting the order of repeated words
chiasmus: inverting the order of repeated words or phrases
hypophora: asking questions and answering them
isocolon: phrases of equal length and (usually) parallel structure
litotes: denial of the contrary
polysyndeton: using a conjunction between each clause
progressio: building a point around a series of comparisons
sermocinatio: the speaker answers the remarks or questions of a pretended interlocutor
taxis: distributing to every subject its proper adjunct
proverb: short pithy statement of a general truth (includes subtypes) [A is B]
zeugma: use of one word to govern several congruent words or clauses
effictio: a head-to-toe itemized description of a person
anacoenosis: asking the opinion of one's readers or hearers
ecphonesis: an exclamation expressing emotion
oraculum: "quoting" God's words or commandments
antapodosis: a simile in which the objects compared correspond in several respects
antistasis: repetition of a word in a different or contrary sense
alliteration: recurrence of an initial consonant sound, and sometimes of a vowel sound
assonance: resemblance of internal vowel sounds in neighboring words
consonance: resemblance of stressed consonant sounds where the associated vowels differ
homoioptoton: using various words with similar case endings in a sentence or verse
homoioteleuton: using various uninflected words with similar endings in a sentence or verse
paroemion: a resolute alliteration in which every word in a sentence or phrase begins with the same letter
paromoiosis: a parallelism of sounds between words of two clauses of approximately equal length
anadiplosis: repetition of the last word of one line or clause to begin the next
anaphora: repetition of the same word at the beginning of successive clauses or verses
antistrophe: repetition of a closing word or words at the end of several successive clauses, sentences or verses
epanalepsis: repetition at the end of a clause or sentence of the word or phrase with which it began
polyptoton: repetition of words from the same root but with different endings
anacoluthon: ending a sentence with a different structure from that with which it began
hyperbaton: a generic term for various forms of departure from ordinary word order
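Some of these figures can be approximated once the corpus carries lemma and POS annotation. As one hedged example, the sketch below looks for polyptoton candidates: content words that share a lemma but differ in surface form within a short window. Using the lemma as a stand-in for "same root" is an assumption, the function name `polyptoton_candidates` is hypothetical, and the tags in the example were assigned by hand for illustration.

```python
# Coarse tag prefixes treated as content words (an assumption for this sketch).
CONTENT_POS = {"NOM", "VER", "ADJ"}


def polyptoton_candidates(tokens, window=10):
    """Find pairs of differing word forms that share a lemma.

    tokens: list of (word, pos, lemma) triples.
    A pair is reported when the first word is a content word and a later
    token, at most `window` positions away, has the same lemma but a
    different surface form.
    """
    hits = []
    for i, (w1, p1, l1) in enumerate(tokens):
        if p1.split(":")[0] not in CONTENT_POS:
            continue
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            w2, _, l2 = tokens[j]
            if l1 == l2 and w1.lower() != w2.lower():
                hits.append((w1, w2))
    return hits


# From the Rousseau letter quoted later in these slides:
# "ils font tous, et que nous faisons tous" (tags assigned by hand).
example = [
    ("ils", "PRO:PER", "il"), ("font", "VER:pres", "faire"),
    ("tous", "PRO:IND", "tout"), (",", "PUN", ","),
    ("et", "KON", "et"), ("que", "KON", "que"),
    ("nous", "PRO:PER", "nous"), ("faisons", "VER:pres", "faire"),
    ("tous", "PRO:IND", "tout"),
]
found = polyptoton_candidates(example)
```

The results are only candidates for close reading: auxiliaries and common verbs will produce many matches that are not rhetorically interesting.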
19. Rhetoric in EE
Proverbs and similes
"You must excuse me for giving you a Line of
Latin now and then since I find my self in some
danger of Losing the Tongue, for I perceive a new
Language, like a New Mistress, is apt to make a
man forget all his old ones."
Joseph Addison to William Congreve, December 1699
http://www.e-enlightenment.com/item/addijoOU0010010a1c
“Mais une femme est un furieux remora; c'est
encor pis qu'un prieur”
Claude Adrien Helvétius to Léger Marie Deschamps, Sunday, 7 October 1764
https://doi.org/10.13051/ee:doc/helvclUT0030145a1c
"Now you might say — Why are you against it ? " and "Were I to say as much Mr White might be in a rage, and
Mr Long might say — Mr B. why will you propose such things? — You see it can not be done. The consequence
would be, that between Mr White, and Mr Harrison, and Mr Long, possession never would be obtained."
Jeremy Bentham to Sir Evan Nepean, Monday, 9 June 1800
https://doi.org/10.13051/ee:doc/bentjeOU0060303b1c
Sermocinatio: the speaker answers the remarks or questions of a pretended interlocutor; or hypophora: asking questions and answering them
"Je meurs d'envie de savoir de leurs
nouvelles et en quels regimens, et sous
quels generaux ils ont servi. Vous me
direz que cela s'en va sans dire, qu'ils ont
servi en Hollande dans l'armée meme du
Roy puis que le Roy se trouvoit
alternativement dans toutes les armées,
et que ce qu'on appelloit armée de Mr le
prince, de Mr de Turenne, etc. n'etoit
proprement parlant que des detachemens
de la royale."
Pierre Bayle to Jacob Bayle, Monday, 31 July 1673
http://www.e-enlightenment.com/item/baylpiVF0010218a1c
20. Rhetoric in EE
Proverbs and similes
“Mais un royaume est un grand
corps isolé, plus séparé de ses
voisins par la diversité d'intérèts, que
par la mer, les citadelles, et les
barrieres qui le renferment.”
Beaumarchais to Louis XIV 1717
https://doi.org/10.13051/ee:doc/beaupiVF2730014a1cun
"Tu n'as pas deviné la-dessus, mais bien sur
l'humeur de Blaise. Il etoit comme un
dogue ; il est comme un tigre. Tu as eu bien
du pouvoir de faire parler la machine."
Françoise Huguet de Graffigny to François Antoine Devaux
Thursday, 23 July 1744
https://doi.org/10.13051/ee:doc/graffrVF0050722a1c
“grasse comme une abbesse”
Voltaire (twice)
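Similes of this kind can be surfaced with a simple pattern over POS-tagged tokens. The sketch below searches for the sequence "comme" + determiner + noun. It is one possible pattern among many, the function name `find_similes` is hypothetical, and the tags in the example were assigned by hand in a TreeTagger-like style.

```python
def find_similes(tokens):
    """Return 'comme DET NOM' sequences found in a list of (word, pos) pairs."""
    hits = []
    for i in range(len(tokens) - 2):
        (w0, _), (w1, p1), (w2, p2) = tokens[i:i + 3]
        if w0.lower() == "comme" and p1.startswith("DET") and p2.startswith("NOM"):
            hits.append(" ".join((w0, w1, w2)))
    return hits


# From the Graffigny letter quoted above (tags assigned by hand).
sentence = [
    ("Il", "PRO:PER"), ("etoit", "VER:impf"), ("comme", "KON"),
    ("un", "DET:ART"), ("dogue", "NOM"), (";", "PUN"),
    ("il", "PRO:PER"), ("est", "VER:pres"), ("comme", "KON"),
    ("un", "DET:ART"), ("tigre", "NOM"),
]
similes = find_similes(sentence)

# The Voltaire phrase matches the same pattern.
phrase = [("grasse", "ADJ"), ("comme", "KON"), ("une", "DET:ART"), ("abbesse", "NOM")]
voltaire = find_similes(phrase)
```

A pattern this loose will also return non-figurative uses of "comme", so the output is a starting point for close reading rather than a list of confirmed similes.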
"Voilà ce que fera votre despote, ambitieux, prodigue, avare, amoureux, vindicatif, jaloux,
foible: car c' est ainsi qu' ils font tous, et que nous faisons tous. Messieurs, permettez-moi de
vous le dire; vous donnez trop de force à vos calculs, et pas assez aux penchans du coeur
humain, et au jeu des passions. Votre système est très bon pour les gens de l'Utopie, il ne vaut
rien pour les enfans d' Adam."
Jean Jacques Rousseau to Victor Riqueti, marquis de Mirabeau
Sunday, 26 July 1767 https://doi.org/10.13051/ee:doc/rousjeVF0330238a
Repetition