“Visualizing Textual Data”
Drayton C. Benner
Founder/President, Miklal Software Solutions
PhD Candidate, Northwest Semitic Philology
University of Chicago
DraytonBenner@MiklalSoftware.com
Word alignment: uses
• Increases access to the source text (e.g. Hebrew and Greek Bible) to readers of the target (e.g. English) text
• Commonly used as an input for statistical machine translation
• Also useful to scholars
  • Analyzing literary dependence
  • Translation technique
    • Find usual patterns and deviations from usual patterns
  • Reception history
    • Ability to find pluses and minuses very easily
  • Textual criticism
    • Better identification of the source text underlying the translation
    • Better macro-understanding of translation technique avoids many mistakes
    • Input to algorithmic attempts to reconstruct tree structure of manuscripts
  • Lexicography
  • Philology
  • Linguistics: historical, contact, corpus
    • Representation of /ʕ/ and /ɣ/ by ע
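To make the "pluses and minuses" use concrete, here is a minimal sketch, not the author's implementation. It assumes an alignment is stored as a set of (source index, target index) pairs, treats a target word with no link as a plus, and treats a source word with no link as a minus. The function name `pluses_and_minuses` and the toy data are hypothetical and only illustrative.

```python
def pluses_and_minuses(source, target, links):
    """Return (pluses, minuses) for one aligned segment.

    source, target: lists of tokens.
    links: set of (source_index, target_index) pairs (assumed representation).
    """
    aligned_src = {s for s, _ in links}
    aligned_tgt = {t for _, t in links}
    # Minuses: source tokens with no counterpart in the translation.
    minuses = [w for i, w in enumerate(source) if i not in aligned_src]
    # Pluses: translation tokens with no counterpart in the source.
    pluses = [w for j, w in enumerate(target) if j not in aligned_tgt]
    return pluses, minuses


# Toy, hand-made alignment for illustration only (not real alignment data).
source = ["bereshit", "bara", "elohim", "et"]
target = ["In", "the", "beginning", "God", "created"]
links = {(0, 0), (0, 2), (1, 4), (2, 3)}

pluses, minuses = pluses_and_minuses(source, target, links)
print("pluses:", pluses)
print("minuses:", minuses)
```

Storing links as index pairs rather than word pairs keeps repeated words distinct, which matters when the same word occurs more than once in a segment.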
Past visualizations of aligned texts
• Source: Logos Bible Software
• Source: BibleWorks
• Source: esvbible.org
Past visualizations by computational linguists
• Lines (from Smith and Jahr 2000)
• Alignment matrix (from Germann 2007)
• Colors (from Merkel et al. 2003)
• Mouseover (from Germann 2008)
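For readers unfamiliar with the alignment-matrix style of display, the sketch below shows the general idea in plain text: one row per source word, one column per target word, and a mark wherever the two are linked. It is an illustrative assumption about the layout, not a reproduction of any of the cited tools; the function name `alignment_matrix` and the sample data are hypothetical.

```python
def alignment_matrix(source, target, links):
    """Render a text grid: rows are source tokens, columns are target tokens.

    links: set of (source_index, target_index) pairs (assumed representation).
    An 'x' marks a linked pair and a '.' marks an unlinked pair.
    """
    width = max(len(w) for w in source)
    # Header row: column numbers (last digit only, to keep columns one wide).
    header = " " * width + " " + " ".join(str(j % 10) for j in range(len(target)))
    lines = [header]
    for i, src_word in enumerate(source):
        cells = " ".join(
            "x" if (i, j) in links else "." for j in range(len(target))
        )
        lines.append(f"{src_word:<{width}} {cells}")
    return "\n".join(lines)


# Toy, hand-made alignment for illustration only (not real alignment data).
source = ["bereshit", "bara", "elohim"]
target = ["In", "the", "beginning", "God", "created"]
links = {(0, 0), (0, 1), (0, 2), (1, 4), (2, 3)}

print(alignment_matrix(source, target, links))
for j, word in enumerate(target):
    print(f"{j}: {word}")  # legend mapping column numbers to target words
```

A row with no marks corresponds to an unaligned source word and a column with no marks to an unaligned target word, so gaps are visible at a glance.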
Visualization
• Language helps
• Colors
• Blank rows
• Lines
