This document discusses applying data analysis techniques used for ancient corpora to the Quran. It presents Text-Fabric (TF), a graph model for storing textual data in plain text files, without XML or SQL. TF models a text as nodes (words, phrases, chapters, verses) connected by edges, with every component uniquely identified. As a toy example, it walks through a TF dataset containing quotations from Iain M. Banks' novel "Consider Phlebas".
Researchers working on ancient text corpora can take control of their own data. We show a way to do so by means of Text-Fabric.
A co-production of Cody Kingham and Dirk Roorda.
1. Data Analysis for Ancient Corpora
applied to the Quran
Dirk Roorda
and
Cornelis van Lit
Filosofie en Religiewetenschap, Utrecht, 2019-03-28
[Bar chart: Parts of Speech after Atnach in ETCBC Phrase; counts from 0 to 250 for conj, nmpr, subs, adjv, prep, art]
2. A. reasons
B. a solution
C. toy example of a TF datasource
D. ministudy: rings and sentiments
C'. an easter egg
B'. new ways
A'. new horizons
3.
4. • researchers in control of their own data
• researchers empowered to fully harness the data available to them
• researchers encouraged to DIY computing
5. A. reasons
B. a solution
C. toy example of a TF datasource
D. ministudy: rings and sentiments
C'. an easter egg
B'. new ways
A'. new horizons
6. Data model
• Graph model: words, phrases, etc. are “nodes”; the relationships between them are “edges”.
• Graphs model complex data structures better than other methods (e.g. XML).
• All stored in easy-to-understand, plain-text files. No messy XML, SQL, etc.
• ... and we call it Text-Fabric (TF)
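What this graph model looks like in practice: a minimal sketch, assuming the text-fabric Python package and a local copy of the banks/tf toy dataset introduced later in this deck (the path is illustrative).

# Load two word features from the toy dataset and walk the graph:
# F gives feature lookups per node, L follows containment edges.
from tf.fabric import Fabric

TF = Fabric(locations="banks/tf")   # the directory with the .tf files
api = TF.load("letters punc")       # feature names, space-separated

F, L = api.F, api.L

for line in F.otype.s("line")[:2]:      # the first two line nodes
    words = L.d(line, otype="word")     # down the edges to the word slots
    print(line, " ".join(F.letters.v(w) for w in words))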
7. Data structure of TF - the IKEA spirit
[Diagram: a node points to stacks of uniquely identified components, kept in order: words, phrases, chapters, verses]
8. A. reasons
B. a solution
C. toy example of a TF datasource
D. ministudy: rings and sentiments
C'. an easter egg
B'. new ways
A'. new horizons
9. # Consider Phlebas
$ author=Iain M. Banks
## 1
Everything about us,
everything around us,
everything we know [and can know of] is composed ultimately of patterns of nothing;
that’s the bottom line, the final truth.
So where we find we have any control over those patterns,
why not make the most elegant ones, the most enjoyable and good ones,
in our own terms?
## 2
Besides,
it left the humans in the Culture free to take care of the things that really mattered in life,
such as [sports, games, romance,] studying dead languages,
barbarian societies and impossible problems,
and climbing high mountains without the aid of a safety harness.
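The deck does not show the conversion step itself; the sketch below (not the deck's actual converter) indicates how a fragment of this source could be turned into the TF files of the next slides, using TF's walker API. The feature names match the files shown there; the text fragment and output directory are illustrative.

# Build a tiny TF dataset from a fragment of the source above.
from tf.fabric import Fabric
from tf.convert.walker import CV

SOURCE = {  # chapter number -> its lines (shortened for the sketch)
    "1": [
        "Everything about us,",
        "everything around us,",
    ],
}

def director(cv):
    book = cv.node("book")
    cv.feature(book, title="Consider Phlebas")
    for chapter_number, chapter_lines in SOURCE.items():
        chapter = cv.node("chapter")
        cv.feature(chapter, number=int(chapter_number))
        for n, line_text in enumerate(chapter_lines, 1):
            line = cv.node("line")
            cv.feature(line, number=n)
            for token in line_text.split():
                word = cv.slot()                 # words are the slot nodes
                letters = token.rstrip(",;.?")   # split off trailing punctuation
                cv.feature(word, letters=letters, punc=token[len(letters):])
            cv.terminate(line)
        cv.terminate(chapter)
    cv.terminate(book)

TF = Fabric(locations="banks/tf")
cv = CV(TF)
good = cv.walk(
    director,
    slotType="word",
    otext={
        "fmt:text-orig-full": "{letters}{punc} ",
        "sectionTypes": "book,chapter,line",
        "sectionFeatures": "title,number,number",
    },
    generic={"compiler": "Dirk Roorda"},
    intFeatures={"number"},
    featureMeta={
        "letters": {"description": "the letters of a word"},
        "punc": {"description": "the punctuation after a word"},
        "title": {"description": "the title of a book"},
        "number": {"description": "number of a chapter or a line"},
    },
)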
10. letters.tf:
@node
@compiler=Dirk Roorda
@description=the letters of a word
@name=Culture quotes from Iain Banks
@source=Good Reads
@url=https://www.goodreads.com/work/quotes/14366-consider-phlebas
@valueType=str
@writtenBy=Text-Fabric
@dateWritten=2019-01-30T22:20:19Z
Everything
about
us
everything
around
us
everything
we
know
and
can
know
of
is
composed
ultimately
of
patterns
of
nothing
that’s
the
bottom
line
the
final
truth
So

punc.tf:
@node
@compiler=Dirk Roorda
@description=the punctuation after a word
@name=Culture quotes from Iain Banks
@source=Good Reads
@url=https://www.goodreads.com/work/quotes/14366-consider-phlebas
@valueType=str
@writtenBy=Text-Fabric
@dateWritten=2019-01-30T22:20:19Z
3 ,
6 ,
20 ;
24 ,
27 .
38 ,
45 ,
51 ,
55 ?
,
75 ,
78 ,
,
,
83 ,
88 ,
99 .

The TF dataset (banks/tf/):
author.tf
gap.tf
letters.tf
number.tf
oslots.tf
otext.tf
otype.tf
punc.tf
terminator.tf
title.tf
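The layout of these files is positional: in a @node feature, a bare line is the value for the node after the previous one, and a line that starts with a node number jumps to that node explicitly (which is how punc.tf skips the unpunctuated words). A toy reader, as a sketch of the format rather than TF's actual loader:

# Parse a @node feature file into a {node: value} dict.
# (Real .tf files separate node and value with a tab; the slides show
# spaces, so we accept both here.)
def parse_node_feature(path):
    values = {}
    node = 0
    with open(path, encoding="utf-8") as fh:
        lines = fh.read().splitlines()
    i = 0
    while i < len(lines) and lines[i].startswith("@"):
        i += 1                      # skip the metadata header
    for line in lines[i:]:
        if not line:
            continue                # ignore blank separator lines
        head, sep, rest = line.replace("\t", " ").partition(" ")
        if sep and head.isdigit():
            node, value = int(head), rest   # explicit node number
        else:
            node, value = node + 1, line    # implicit: previous node + 1
        values[node] = value
    return values

# e.g. parse_node_feature("banks/tf/punc.tf")[55] == "?"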
11. otype
@node
@compiler=Dirk Roorda
@name=Culture quotes from Iain Banks
@source=Good Reads
@url=https://www.goodreads.com/work/quotes/14366-consider-phlebas
@valueType=str
@writtenBy=Text-Fabric
@dateWritten=2019-01-30T22:20:19Z
1-99 word
100 book
101-102 chapter
103-114 line
115-117 sentence
12. oslots.tf:
@edge
@compiler=Dirk Roorda
@name=Culture quotes from Iain Banks
@source=Good Reads
@url=https://www.goodreads.com/work/quotes/14366-consider-phlebas
@valueType=str
@writtenBy=Text-Fabric
@dateWritten=2019-01-30T22:20:19Z
100 1-99
1-55
56-99
1-3
4-6
7-9,14-20
21-27
28-38
39-51
52-55
56
57-75
76-77,81-83
84-88
89-99
1-27
28-55
56-99

node numbering (otype):
1-99 word
100 book
101-102 chapter
103-114 line
115-117 sentence

## 1
Everything about us,
everything around us,
everything we know [and can know of] is composed ultimately of patterns of nothing;
that’s the bottom line, the final truth.
So where we find we have any control over those patterns,
why not make the most elegant ones, the most enjoyable and good ones,
in our own terms?
## 2
Besides,
it left the humans in the Culture free to take care of the things that really mattered in life,
such as [sports, games, romance,] studying dead languages,
barbarian societies and impossible problems,
and climbing high mountains without the aid of a safety harness.
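Together, otype and oslots make the corpus queryable as one graph. A sketch using TF's search templates, assuming the api object from the loading sketch earlier:

# Find each line that contains a word whose letters are "patterns";
# indentation in the template expresses embedding via the oslots edges.
query = """
line
  word letters=patterns
"""
for line, word in api.S.search(query):
    print(line, word)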
33. A. reasons
B. a solution
C. toy example of a TF datasource
D. ministudy: rings and sentiments
C'. an easter egg
B'. new ways
A'. new horizons
34. Sharing and re-using data
Text-Fabric has been developed by a DANS employee; as a consequence:
Data export is built in ✅
Provenance tracking is built in ✅
Redistribution of newly created data is built in ✅
35. sharing #1: GitHub & NBviewer
work done in a Jupyter Notebook inside a GitHub repository is very shareable
39. sharing #4: Create new features
https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/quran/share.ipynb
• etcbc/valence/tf: the results of the verbal valence work of Janet Dyk in the SYNVAR project;
• etcbc/lingo/heads/tf: head words for phrases, work done by Cody Kingham;
• ch-jensen/Semantic-mapping-of-participants/actor/tf: participant analysis in progress by Christian Høygaard-Jensen;
• cmerwich/bh-reference-system/tf: participant analysis in progress by Christiaan Erwich;
• nino-cunei/oldbabylonian/parallels/tf: similar lines, by Dirk Roorda;
• q-ran/quran/parallels/tf: similar lines, by Dirk Roorda;
• q-ran/exercises/mining/tf: sentiments (crude), by Dirk Roorda;
• you/quran/sentiments/tf: sentiments (refined), by You;
• cvlit/quran/semantics/tf: semantic fields, by cvlit.
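Modules like these plug into a session via the mod parameter of TF's use function. A sketch (module path taken from the list above; the exact app identifier depends on the TF version):

# Start the Quran app and add the parallels feature module on top;
# mod accepts {org}/{repo}/{path}, comma-separated for several modules.
from tf.app import use

A = use("quran", mod="q-ran/quran/parallels/tf", hoist=globals())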
40. The Text-Fabric Ethos
• Open source tool for corpus annotation and analysis.
• Corpus data in a repository, with a standard license, as free as possible.
• Researchers: step out of your technological comfort zones and pave the way for the ones after you.
• Find computational inspiration across disciplines.