SMART-GS: A Tool for Studying
Digitized Historical Manuscripts
Yuta Hashimoto
PhD student, Department of Humanistic Informatics
Kyoto University
March 15, 2015 @ University of Michigan
Introduction
• Who am I
• A PhD student studying DH at Kyoto University
• Research interest: Digital History
• Background: History of Science
• Also an iOS/Android Developer
• Kin Digi Reader (近デジリーダー)
• A mobile reader for the Kindai Digital Library
• In this talk, I will…
• Introduce an application named SMART-GS
• And its possible contributions to Japanese studies
What is SMART-GS?
• A transcription/annotation suite for digitized
historical manuscripts
• Has been developed at Kyoto University since
2007
• An open source project
• SMART-GS is NOT
• An OCR application for handwritten texts
• A language-dependent application
A Screenshot
Project Background:
The Increase of Large-Scale Digital Archives
How Should Historians Handle Digital Images?
David Hilbert
(1862-1943)
Problems with Paper-based Research
1. Papers are heavy and require space
2. Difficult to share the “metadata” added to the
manuscripts with co-workers
3. Organizing information is also difficult
• Searching, grouping, indexing, etc.
Main Features of SMART-GS
Introducing SMART-GS
Markup Functions for Texts and Images
• Various ways of marking up
image regions:
• Enclosing a region in a rectangle or polygon
• Drawing an arrow from one
region to another
• Attaching a comment to a region
• etc.
• HTML markup for texts:
• Highlighting a certain word
or phrase
• Adding a link to an external
website
Linking Markups
• Any two markups can be
linked to each other
• These links are one-to-many
and bidirectional
• A link itself can be annotated
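
The linking model described above (one-to-many, bidirectional, annotatable links) could be sketched roughly as follows. This is only an illustration: the `Link` and `LinkStore` names and the markup ID strings are hypothetical and do not reflect SMART-GS's actual data model or file format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Link:
    """A link between two markups; `note` is the annotation on the link itself."""
    source: str
    target: str
    note: str = ""


class LinkStore:
    """Hypothetical store that indexes every link under both of its endpoints."""

    def __init__(self) -> None:
        self._by_markup: Dict[str, List[Link]] = defaultdict(list)

    def link(self, a: str, b: str, note: str = "") -> Link:
        edge = Link(a, b, note)
        # Registering the same edge on both sides makes it bidirectional.
        self._by_markup[a].append(edge)
        self._by_markup[b].append(edge)
        return edge

    def neighbours(self, markup_id: str) -> List[str]:
        # One markup may have many links, so a list is returned.
        edges = self._by_markup.get(markup_id, [])
        return [e.target if e.source == markup_id else e.source for e in edges]


store = LinkStore()
store.link("image:region-1", "text:phrase-a", note="same person")
store.link("image:region-1", "text:phrase-b")
print(store.neighbours("image:region-1"))
print(store.neighbours("text:phrase-a"))
```

Indexing each link under both endpoints lets a lookup start from either markup, which is one simple way to get the bidirectional behaviour.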
Word Spotting for Handwritten Text (DSC Search)
Search results for query “Scheler” (a German philosopher’s name)
How DSC Search indexes images
1. Separate the image into
lines
2. Divide each line into thin
slits
3. Compute a gradient vector
for each pixel in each slit
4. Accumulate these gradient
vectors (which will be used
as “feature vectors”)
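
Steps 2 to 4 might look roughly like the sketch below for a single line image that has already been cut out (step 1). Several details are assumptions, not taken from the slides: the image is a plain list of grayscale rows, gradients use simple central differences, and "accumulating" is read as summing the positive and negative parts of the horizontal and vertical gradients into a 4-element vector per slit. The function names and the `slit_width` parameter are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]  # one text line, row-major grayscale values


def gradient(img: Image, x: int, y: int) -> Tuple[float, float]:
    """Central-difference gradient at (x, y), clamped at the image border."""
    h, w = len(img), len(img[0])
    x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
    y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
    return img[y][x1] - img[y][x0], img[y1][x] - img[y0][x]


def slit_features(line_img: Image, slit_width: int = 2) -> List[List[float]]:
    """Return one accumulated gradient vector per vertical slit, left to right."""
    h, w = len(line_img), len(line_img[0])
    features = []
    for start in range(0, w, slit_width):          # step 2: thin vertical slits
        acc = [0.0, 0.0, 0.0, 0.0]                 # [+gx, -gx, +gy, -gy]
        for x in range(start, min(start + slit_width, w)):
            for y in range(h):
                gx, gy = gradient(line_img, x, y)  # step 3: per-pixel gradient
                acc[0] += max(gx, 0)               # step 4: accumulate
                acc[1] += max(-gx, 0)
                acc[2] += max(gy, 0)
                acc[3] += max(-gy, 0)
        features.append(acc)
    return features


# A tiny line image with a dark-to-light vertical edge in the middle.
edge = [[0, 0, 1, 1],
        [0, 0, 1, 1]]
print(slit_features(edge, slit_width=2))
```

Splitting positive and negative gradient components avoids opposite-signed gradients cancelling out within a slit; other accumulation schemes would fit the same description.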
How DSC Search Finds Similar Images
Query image
Candidate Image
• DSC Search calculates the
“distances” between the query
and candidate images by
comparing their feature vector
sequences
• The smaller the distance, the
more likely the two images are to have
similar shapes
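
The slides do not say how the two feature vector sequences are compared. As an illustration only, the sketch below uses dynamic time warping (DTW), a standard way to measure the distance between sequences of different lengths; it is a swapped-in technique and not necessarily what DSC Search does. The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def dtw_distance(query: List[Vector], candidate: List[Vector]) -> float:
    """Dynamic time warping distance between two feature vector sequences."""
    n, m = len(query), len(candidate)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning query[:i] with candidate[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(query[i - 1], candidate[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rank_candidates(query: List[Vector], candidates: Dict[str, List[Vector]]) -> List[str]:
    """Candidate names sorted from smallest to largest distance to the query."""
    return sorted(candidates, key=lambda name: dtw_distance(query, candidates[name]))


query = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
candidates = {
    "unrelated": [[5.0, 5.0], [5.0, 5.0]],
    "stretched": [[0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 0.0]],
}
print(rank_candidates(query, candidates))
```

Because DTW allows one slit to align with several, a word written slightly wider or narrower than the query can still come out with a small distance.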
Pros and Cons of DSC Search
• Pros
• Can be applied to any type of document, regardless of
its language and text direction
• No need to run machine learning
• Cons
• Requires users to preprocess images by separating lines
• Not accurate for manuscripts written by multiple authors
Applications of SMART-GS
to Historical Research Projects
Transcription Project of Kuratomi’s Diary
• Baron Yuzaburo Kuratomi
(1853-1943)
• An elite bureaucrat-politician of the
Meiji, Taisho, and early Showa eras
• Project goal
• to publish a complete transcription of
Kuratomi’s diary
• which consists of more than 300
notebooks
Team-based Transcription with SMART-GS
WebDAV Server
gsx file
1. Create draft transcriptions
2. Add annotations
3. Revise and finalize transcription texts
Transcription of Hajime Tanabe’s
Lecture Notebooks
• Hajime Tanabe (1885-1962)
• One of the prominent philosophers
of the Kyoto School
• Tanabe’s lecture notebooks
• Written in Japanese, German,
Latin, Greek, and English
• And written in extremely bad
handwriting
Group Reading of Tanabe’s Notebooks
Transcription of Earthquake Recordings
◀ Teibi Shinsai Roku (丁未震災録):
A recording of a large earthquake that
took place in 1847
▲Reading Group of Earthquake Recordings
(古地震研究会)
How SMART-GS can Contribute to
Japanese Studies
As a Group Learning Tool
Creating a Shared Dictionary with SMART-GS
As a Platform for International Collaboration
• NIJL’s large-scale project
• Titled “Construction of the International Collaborative
Network on Japanese Classical Books”
• 300,000 books will be digitized and published on the
web by 2024
Our Current Attempts
• To have NIJL use SMART-GS as their official
transcription tool
• And to make SMART-GS a global platform for
Japanese studies
• So that scholars all over the world can cooperate
through the network on the same platform
Ongoing Development: the Web Version
Conclusion
• More and more digital images of historical
manuscripts have become available on the web
• SMART-GS provides a set of features to handle
these digital images effectively
• And it offers ways to collaborate with other
scholars through the network
• Our next attempt is to make SMART-GS a global
platform where scholars can collaborate with each
other
Thank you for listening!
ご清聴ありがとうございました
Any questions or comments?
