The document discusses developing an intelligent search system for biblical texts that goes beyond traditional concordance searches based solely on identical word forms and word orders. It aims to enable searches based on similar meanings by accounting for syntax and semantics. An example is given of a traditional concordance search for an identical phrase across passages. The system seeks to improve on this by allowing searches for passages containing phrases that are not identical but have similar meanings.
1. Andi Wu
Asia Bible Society
From Identical Strings to Similar Strings
Intelligent Search of Biblical Texts Based on Syntax and Semantics
2. Original Motivation
Systematic approach to Bible translation
To make the translation consistent, translators need to know not only the phrases that are identical but also the phrases that are not identical but similar in meaning.
3. Identical Strings vs. Similar Strings
Traditional Search:
Based on matches in form
Same words
Same word orders
Intelligent Search:
Based on matches in meaning
Words can be different
Word orders can be different
5. Example of Similar Strings:
Same words in different orders
Jeremiah 2:1
Ezekiel 24:20
6. Example of Similar Strings:
Different words in different orders
Proverbs 1:7
Psalms 111:10
7. Similar Strings
Strings that are similar in meaning
Similar words in similar syntactic
relationships
Need in Bible translation
8. The Importance of Syntactic Relations
Similar strings != strings containing similar words
The same words in different syntactic relations can
mean very different things
An old man with a dog chased a young lady with an umbrella.
vs.
An old lady with a dog chased a young man with an umbrella.
9. Semantic Units of Sentences
Triples: dependency relationships between
two words
e.g. In the beginning God created the heavens
and the earth.
God – create (subject-verb)
create – heavens (verb-object)
create – earth (verb-object)
create – in the beginning (verb-adverbial)
heavens – earth (conjunction).
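The triples listed above can be held in a small data structure. The following is a minimal sketch, assuming a hypothetical `Triple` type whose field names (`left`, `right`, `relation`) are chosen here for illustration and are not taken from the original system.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One dependency relationship between two words (hypothetical layout)."""
    left: str      # first word as written on the slide, e.g. "God"
    right: str     # second word, e.g. "create"
    relation: str  # dependency label, e.g. "subject-verb"


# Hand-written triples for the example sentence; a real system would
# read them off a syntactic analysis instead of listing them by hand.
GENESIS_1_1 = {
    Triple("God", "create", "subject-verb"),
    Triple("create", "heavens", "verb-object"),
    Triple("create", "earth", "verb-object"),
    Triple("create", "in the beginning", "verb-adverbial"),
    Triple("heavens", "earth", "conjunction"),
}

for t in sorted(GENESIS_1_1):
    print(f"{t.left} - {t.right} ({t.relation})")
```

Storing the triples in a set makes word order irrelevant, which is the property the later slides rely on.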
10. Different Strings With the Same Triples
God created the heavens and the earth.
The heavens and the earth were created by God.
God created the heavens and He created the earth.
It is God who created the heavens and the earth.
God – create (subject-verb)
create – heavens (verb-object)
create – earth (verb-object)
heavens – earth (conjunction).
11. Different Strings With Similar Triples
God created man in his own image.
Adam is the man that God created.
Man was created by God on the sixth day.
I am a man created by God.
Triples in common:
God – create (subject-verb)
create – man (verb-object)
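Finding the triples in common amounts to a set intersection. The sketch below assumes hand-written triple sets for the four sentences; only the two shared triples come from the slide, and the extra triples per sentence are illustrative guesses at what a parser might produce.

```python
# Each sentence maps to a set of (left, right, relation) triples.
# Triples other than the two shared ones are assumptions for illustration.
SENTENCES = {
    "God created man in his own image.": {
        ("God", "create", "subject-verb"),
        ("create", "man", "verb-object"),
        ("create", "image", "verb-adverbial"),
    },
    "Adam is the man that God created.": {
        ("God", "create", "subject-verb"),
        ("create", "man", "verb-object"),
        ("Adam", "be", "subject-verb"),
    },
    "Man was created by God on the sixth day.": {
        ("God", "create", "subject-verb"),
        ("create", "man", "verb-object"),
        ("create", "day", "verb-adverbial"),
    },
    "I am a man created by God.": {
        ("God", "create", "subject-verb"),
        ("create", "man", "verb-object"),
        ("I", "be", "subject-verb"),
    },
}

# Triples present in every sentence.
common = set.intersection(*SENTENCES.values())
print(sorted(common))
```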
12. Similar Triples With Different Words
His troops were annihilated.
His army was destroyed.
His forces were wiped out.
annihilate – troops (verb-object)
destroy – army (verb-object)
wipe-out – forces (verb-object)
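One way to make these three triples compare as equal is to map every word to a label for its synonym group before comparison. This is a minimal sketch under that assumption; the table `SYNONYM_CLASS` and its labels are hypothetical, and the original system is described only as using a synonym database of Hebrew and Greek, not this English table.

```python
# Hypothetical synonym table: word -> label of its synonym group.
SYNONYM_CLASS = {
    "annihilate": "DESTROY", "destroy": "DESTROY", "wipe-out": "DESTROY",
    "troops": "ARMY", "army": "ARMY", "forces": "ARMY",
}


def normalize(triple):
    """Replace each word by its synonym-group label, if it has one."""
    left, right, relation = triple
    return (SYNONYM_CLASS.get(left, left),
            SYNONYM_CLASS.get(right, right),
            relation)


triples = [
    ("annihilate", "troops", "verb-object"),
    ("destroy", "army", "verb-object"),
    ("wipe-out", "forces", "verb-object"),
]

# All three collapse to a single normalized triple.
print({normalize(t) for t in triples})
```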
13. Data Requirement
To recognize similar strings in Biblical texts,
we need
Syntactic analysis of the original Hebrew
and Greek texts
Synonym database of Hebrew and Greek
Both of them have already been developed
at Asia Bible Society
16. Triples
Extracted from the trees
Strings for comparison:
Text covered by each node/subtree
Similar strings:
Subtrees containing similar triples
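The relationship between subtrees, the text they cover, and the triples they contain can be sketched as follows. The `Node` class, its fields, and the toy tree are assumptions made for illustration; the source does not describe how its trees are stored.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Node:
    """A tree node (hypothetical layout): leaves carry a word, inner nodes
    carry the triples established at that level."""
    label: str
    word: str = ""
    triples: FrozenSet[Triple] = frozenset()
    children: List["Node"] = field(default_factory=list)


def covered_text(node: Node) -> str:
    """The string covered by a node: its own word or its children's text."""
    if not node.children:
        return node.word
    return " ".join(covered_text(child) for child in node.children)


def subtree_triples(node: Node) -> set:
    """All triples contained in the subtree rooted at this node."""
    found = set(node.triples)
    for child in node.children:
        found |= subtree_triples(child)
    return found


obj = Node("NP", triples=frozenset({("heavens", "earth", "conjunction")}),
           children=[Node("N", "the heavens"), Node("C", "and"), Node("N", "the earth")])
vp = Node("VP", triples=frozenset({("create", "heavens", "verb-object"),
                                   ("create", "earth", "verb-object")}),
          children=[Node("V", "created"), obj])
sentence = Node("S", triples=frozenset({("God", "create", "subject-verb")}),
                children=[Node("N", "God"), vp])

print(covered_text(vp))
print(len(subtree_triples(sentence)))
```

Every node yields both a candidate string and a set of triples, so phrases of any size can be compared, not only whole verses.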
19. Compute Similarities Between Subtrees
Semantic space of a subtree:
The set of triples (including their synonymous
expansions) contained in the subtree
Similar subtrees
Subtrees whose semantic spaces overlap
(set intersection)
Degree of similarity
Set Intersection / Set Union
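The degree of similarity described here is the size of the intersection divided by the size of the union. A minimal sketch, assuming semantic spaces are plain Python sets; the function name `similarity` and the choice to return 0.0 for two empty sets are assumptions.

```python
def similarity(space_a: set, space_b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = space_a | space_b
    if not union:
        return 0.0  # assumption: two empty spaces count as not similar
    return len(space_a & space_b) / len(union)


# Single letters stand in for triples.
print(similarity({"a", "b", "c"}, {"b", "c", "d", "e"}))
```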
20. Semantic Distance
= log ( |Intersection| / |Union| ) * -1
where |X| is the number of triples in X and log is the natural logarithm
Set A = { a, b, c } Set B = { b, c, d, e }
Intersection = { b, c }
Union = { a, b, c, d, e }
Distance(A,B) = log(2/5)* -1 = 0.9162907318742
Set C = { a, b, c, d } Set D = { c, e, f, g, h }
Intersection = { c }
Union = { a, b, c, d, e, f, g, h }
Distance(C,D) = log(1/8)* -1 = 2.0794415416798
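The two worked examples can be checked with a few lines of code. This is a sketch assuming the natural logarithm, which is what the printed values correspond to; returning infinity for sets with no overlap is an assumption, since the slide does not cover that case.

```python
import math


def semantic_distance(space_a: set, space_b: set) -> float:
    """Negative natural log of |intersection| / |union|."""
    shared = len(space_a & space_b)
    if shared == 0:
        return math.inf  # assumption: no overlap means unbounded distance
    return -math.log(shared / len(space_a | space_b))


A, B = {"a", "b", "c"}, {"b", "c", "d", "e"}
C, D = {"a", "b", "c", "d"}, {"c", "e", "f", "g", "h"}

print(semantic_distance(A, B))
print(semantic_distance(C, D))
```

Identical sets give a distance of 0, and the distance grows as the overlap shrinks relative to the union.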
24. Example: Semantic Spaces of Two Verses
Semantic Space of Psalms 14:12
= { repay~person(V-O), as~deed(P-O), deed~him(Poss), repay~as(V-PP) }
Semantic Space of Psalms 62:1
= { reward~everyone(V-O), as~deed(P-O), deed~him(Poss), reward~as(V-PP), you~reward(S-V) }
Intersection = { repay/reward~person/everyone(V-O), as~deed(P-O), deed~him(Poss), repay/reward~as(V-PP) }
Union = { repay/reward~person/everyone(V-O), as~deed(P-O), deed~him(Poss), repay/reward~as(V-PP), you~reward(S-V) }
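Applying the distance formula from the previous slide to these two semantic spaces gives 4 shared triples out of 5 in the union. The sketch below reproduces that count; the synonym table is a hypothetical stand-in for the synonym database, and the resulting distance is computed here rather than quoted from the slides.

```python
import math

# Hypothetical synonym groups covering the two word pairs on the slide.
SYNONYM_CLASS = {
    "repay": "repay/reward", "reward": "repay/reward",
    "person": "person/everyone", "everyone": "person/everyone",
}


def normalize(space):
    """Replace each word in every triple by its synonym-group label."""
    return {(SYNONYM_CLASS.get(left, left), SYNONYM_CLASS.get(right, right), rel)
            for left, right, rel in space}


first = normalize({("repay", "person", "V-O"), ("as", "deed", "P-O"),
                   ("deed", "him", "Poss"), ("repay", "as", "V-PP")})
second = normalize({("reward", "everyone", "V-O"), ("as", "deed", "P-O"),
                    ("deed", "him", "Poss"), ("reward", "as", "V-PP"),
                    ("you", "reward", "S-V")})

shared = first & second
combined = first | second
distance = -math.log(len(shared) / len(combined))
print(len(shared), len(combined), round(distance, 4))
```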
25. The computation
Pair-wise comparison of all phrases
Keep pairs with semantic distance < 9.0
1,607,721 in the database
More than 24 hours on a single machine
for the computation
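The pair-wise comparison can be sketched as a loop over all unordered pairs of phrases, keeping those under the threshold. The function and variable names are hypothetical, and the toy data uses single letters in place of triples; only the 9.0 cut-off comes from the slide.

```python
import math
from itertools import combinations


def semantic_distance(space_a: set, space_b: set) -> float:
    """Negative natural log of |intersection| / |union|."""
    shared = len(space_a & space_b)
    if shared == 0:
        return math.inf  # assumption: no overlap means unbounded distance
    return -math.log(shared / len(space_a | space_b))


def similar_pairs(spaces: dict, max_distance: float = 9.0) -> list:
    """Compare every pair of phrases once; keep those below the threshold."""
    kept = []
    for (id_a, a), (id_b, b) in combinations(spaces.items(), 2):
        d = semantic_distance(a, b)
        if d < max_distance:
            kept.append((id_a, id_b, d))
    return sorted(kept, key=lambda pair: pair[2])


spaces = {
    "p1": {"a", "b", "c"},
    "p2": {"b", "c", "d", "e"},
    "p3": {"x"},
}
print(similar_pairs(spaces))
```

With n phrases this loop makes n(n-1)/2 comparisons, which is consistent with the long running time reported for the full text.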
27. Linking OT and NT
Hebrew OT → Septuagint → Greek NT
Automatic alignment
Strong's number matching
Greek Strong's numbers for all words in the OT that
also occur in the NT
Match based on Greek Strong's numbers
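The matching step can be sketched as follows, assuming each Hebrew word has already been aligned to a Greek Strong's number through the Septuagint. The alignment table, the function name, and the sample phrases are illustrative only and are not drawn from the author's data.

```python
# Hypothetical alignment: Hebrew Strong's number -> Greek Strong's number,
# as it might be derived from a Hebrew OT / Septuagint alignment.
HEBREW_TO_GREEK = {
    "H430": "G2316",   # God
    "H1254": "G4160",  # create / make
    "H8064": "G3772",  # heaven
    "H776": "G1093",   # earth
}


def greek_numbers(hebrew_words, alignment):
    """Greek Strong's numbers for the Hebrew words that have an alignment."""
    return {alignment[w] for w in hebrew_words if w in alignment}


ot_phrase = ["H430", "H1254", "H8064", "H776"]  # illustrative OT phrase
nt_phrase = {"G4160", "G3772", "G1093"}         # illustrative NT phrase

# Words shared by the two phrases, compared entirely in Greek numbers.
overlap = greek_numbers(ot_phrase, HEBREW_TO_GREEK) & nt_phrase
print(sorted(overlap))
```

Once both testaments are expressed in the same numbering, the same triple comparison used within one testament can be applied across them.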
29. Search in Bible translations
Alignment between translations and original
texts
Queries in other languages → queries in
Hebrew/Greek
Search always done in Hebrew/Greek
30. Further Improvements
The results will be better if
All the references are annotated
The alignment between the Hebrew OT
and the Septuagint is improved
31. Conclusion
Rich linguistic knowledge (syntactic and
semantic knowledge) enables us to
compare linguistic units on the basis of
meaning rather than form, thus making
the search of Biblical texts more
intelligent.