SlideShare a Scribd company logo
xml2tex
The easy way to define
XML-to-LaTeX converters
Created by /Keiichiro Shikano @golden_lucky
What xml2tex is NOT.
xml2tex is NOT a markup.
xml2tex is NOT a ready-to-use converter application.
xml2tex is a framework to give XML
syntax a nice presentation layer
using LaTeX.
xml2tex is a framework to give XML
syntax a nice presentation layer
using LaTeX.
xml2tex is a framework for using
XML syntax as a source of LaTeX.
Overview of xml2tex
Accusation
”Are you idiot?
Why on earth are you going to use ugly XML syntax instead
some more concise syntax?“
Our Way of Creating a Book
First of all, we don't want to use WYSIWYG application to
create a book being sold.
We usually maintain
manuscripts using VCS
(github) to the very last.
It makes working together
smoother over the Internet.
LaTeX is one of the best tools
for typesetting in this kind of
environment.
Taking advantage of xml2tex, we could use XML as
manuscripts.
Alternative approaches
Use XSLT and XSL-FO for typesetting.
Use XSLT for getting LaTeX.
Convert them into LaTeX (once and for all).
Or better yet, convert them into other standard markups or
markdowns (and use the corresponding environment like
DocBook, Sphinx, pandoc, TeXML, and so on).
Pros of using XML
(which LaTeX also has)
We need a variety of meta data to create a book; editor's
comments, index entries, and .
We need to be able to achieve rich enough page layout for a
.
original texts
commercial book
We usually let the translator put down the corresponding
Japanese text below each original paragraphs like this.
A real example of XML source for
translating projects
<titlelang="en">Loremipsum</title>
<titlelang="en">いろはにほへと</title>
<plang="en">dolorsitamet,</p>
<plang="ja">ちりぬるを</p>
<blockquotelang="en">consecteturadipisicingelit.</blockquote>
<blockquotelang="ja">わかよたれそつねならむ。</blockquote>
A real example of unusual page
layout
<leftlang="en">
Moveoutthenewfunctionsothatweget/lengthback.
</left>
<leftlang="ja">
再び /lengthが得られるように、
この新しい関数をくくり出してください。
</left>
<rightlang="en">
<program>
((lambda(/mk-length)
(/mk-length/mk-length))
(lambda(/mk-length)
([(lambda(/length) ]
[ (lambda(/l) ]
...
[ (/add1(/length(/cdr/l)))))))]
(lambda(/x)
((/mk-length/mk-length)/x)))))
</program>
</right>
<rightlang="ja">
<program>
A real example of using meta data
Pros of using XML,
instead of LaTeX
Authors of technical books tend to have HTML literacy,
provided that there's no excessive information against human.
Easy to confirm the appearance of structured text just by
Web browsers, provided that there's an appropriate CSS.
It's technically possible to generate EPUB as well as PDF.
Cons of using XML
If there's excessive information just for machine, it becomes
hard to edit the manuscript.
If there's no appropriate CSS, you have to rely only on the
abstract structure of documents.
It's technically possible, but not trivial to generate EPUB.
More to the point...
Creating PDFs from XML often requires some proprietary
software and XSLT!
What we need is an easy way to get
XML-to-LaTeX converter, as needed.
Without any prior
information for the
structure,
Without any restriction for
the structure,
Without explicitly writing
XML parser every time,
With a profound support for generating LaTeX documents.
(XSLT won't be the best solution!)
xml2tex — our approach
How to get a LaTeX representation
of this XML?
Note that it was converted from a DTP application.
<?xmlversion="1.0"encoding="UTF-8"?>
<?xml-stylesheethref="book.css"type="text/css"charset="UTF-8"?>
<XML>
<TITLE>AGreatBook</TITLE>
<Chapter-Title>WritinginPractice</Chapter-Title>
<Body-Text-First>Beforeyoucanstartwritingarealbook...</Body-Text-First>
<Body-Text>Let&rsquo;sgetstarted!</Body-Text>
<Heading-1>IntroductiontoLaTeX</Heading-1>
<Body-Text-First>Tokeeponattractingyourreaders...</Body-Text-First>
<Code-First>$latex</Code-First>
<Code>ThisispdfTeX,Version3.14159265-2.6-1.40.15(TeXLive2014)(preloadedformat=latex)</Code>
<Code>restrictedwrite18enabled.</Code>
<Code-Last>**</Code-Last>
<Body-Text>Notethatyoucan&rsquo;tputdown%inyourmasterpiece...</Body-Text>
</XML>
Things you can tell
just by looking at the XML
It at least has the elements named <TITLE>, <Chapter-
Title>, <Body-Text-First>, <Body-Text>, <Heading-1>,
<Code-First>, <Code>, and <Code-Last>.
It at least has a XML character entity &rsquo;.
<?xml...?>is the processing instruction, which generally
seems to be useless in LaTeX.
Characters like %, $and may need to be escaped.
All you need to tell xml2tex
That's it!
Save these lines into a file, and it can be used as a kind of
specification for xml2tex.
(define-tagXML (make-latex-env'document))
(define-tagTITLE(make-latex-cmd'title))
(define-tagChapter-Title (make-latex-cmd'chapter))
(define-tagHeading-1 (make-latex-cmd'section))
(define-tagBody-Text-First(define-rule"nnoindent{}"trim"parn"))
(define-tagBody-Text (define-rule"n"trim"parn"))
(define-tagCode-First(define-rule"begin{alltt}"kick-comment""))
(define-tagCode (define-rule""kick-comment""))
(define-tagCode-Last (define-rule""kick-comment"end{alltt}"))
Converting XML with the rule file
where demo.rulesis the rule file defined before.
$xml2tex-rdemo.rulesdemo.xml>demo.tex
documentclass{book}
usepackage[T1]{fontenc}
usepackage{alltt}
begin{document}
title{AGreatBook}
chapter{WritinginPractice}
noindent{}Beforeyoucanstartwritingarealbook...par
Let'sgetstarted!par
section{IntroductiontoLaTeX}
noindent{}Tokeeponattractingyourreaders...par
begin{alltt}{symbol{36}}latex
pdfLaTeX result
Generating LaTeX syntax from the
document tree is defined as a rule.
1. Preceding string, or a thunk which returns a preceding
string. Possible example is texttt{.
2. A procedure from string to another string. trimis one of
such procedures. It takes a string and returns a string in
which special characters in LaTeX are escaped properly.
3. Following string, or a thunk which returns a following
string. Possible example is }.
(define-rule
"n" ;Putthisatthebeginning.
trim ;Itstextnodesshouldbetreatedwiththis.
"parn")) ;Putthisattheending.
Let the defined rule map the content
from XML element to LaTeX syntax,
based on the tag name.
Just putting down these definitions for each XML tags is
enough to convert the XML into LaTeX.
(define-tagBody-Text ;IftheXMLnodehasthisname,...
(define-rule ;Applythisruletothecontent.
"n"
trim
"parn"))
Supportive features in defining rules
(make-latex-cmd'cmdname)
generates a rule for creating a LaTeX command
cmdname{...}with the contents.
(make-latex-env'envname)
generates a rule for creating a LaTeX environment
begin{envname}...ent{envname}with the contents.
(through)
generates a rule for putting down all the contents with
necessary escaping.
(ignore)
generates a rule for discarding the contents.
Default rule is "through"
The elements you haven't define any explicit rule are indicated
while the conversion. It helps you try detecting the unknown
elements within the given XML file.
$xml2tex-rmy.ruleinput.xml
NotknowingtheLaTeXsyntaxfor<div>,...applyed(through).
NotknowingtheLaTeXsyntaxfor<div>,...applyed(through).
NotknowingtheLaTeXsyntaxfor<div>,...applyed(through).
NotknowingtheLaTeXsyntaxfor<div>,...applyed(through).
NotknowingtheLaTeXsyntaxfor<div>,...applyed(through).
documentclass{book}
usepackage[T1]{fontenc}
begin{document}
chapter{StartingOut}
..par
...
Supportive features in adoring tree
($parent?'(tag1tag2...))
returns True if the node is directly under tag1or tag2...
There's a lot more similar functions like $parent(takes the
parent name), $siblings?, $under?, and so on.
($@'attrname)
returns a string value if the node has an attribute of the
attrname.
An example of
$parentand $parent?
$-functions can be used within (define-rule....
Also note that (define-rule...takes a procedure instead
of a fixed string for its first argument.
(define-tagtitle
(define-rule
(lambda()
(cond
(($parent?'(chapter)) "chapter{")
(($parent?'(sect1)) "section{")
(($parent?'(sect2sect3))"subsection{")
(else(error"norulefortitle"($parent)))))
trim
"}"))
An example of
$@(getting attribute value)
Note that #`"..."is a syntax for a string interpolation. It
actually a feature of Gauche, a Scheme programming
language on which xml2tex works.
(define-tagimg
(define-rule
(list"begin{figure}n"
"includegraphics"
#`"[width=,($@'width)]"
#`"{,($@'src)}")))
trim
"end{figure}"))
More practical example —
HTML tables
<body>
<table>
<tr>
<tdbgcolor="#ff0000"width="30%">1</td>
<tdbgcolor="#33cccc"width="20%">2</td>
<tdbgcolor="#00ff00">3</td>
<tdrowspan="2"bgcolor="#ff9900">4</td>
</tr>
<tr>
<tdbgcolor="#00ffff">A</td>
<tdbgcolor="#ff00ff"align="right"colspan="2"rowspan="2"width="50%">B</td>
</tr>
<tr>
<tdbgcolor="#ffff00">C</td><tdbgcolor="#00cc33"width="20%">D</td>
</tr>
</table>
</body>
Possible rule to convert HTML table
to LaTeX's tabular environment
It requires transformation of the tree, before applying a rule
defined by define-rule.
To transform the tree, use :prekeyword within define-
rule.
(define-tagtable
(define-rule
#`"begin{tabular}{|,($@'colspec)|}n";colspecisageneratedattribute
trim
"end{tabular}"
:pre
(lambda(bodyroot)
(let*((trs((node-closure(ntype-names??'(tr)))body))
(tds(map(node-closure(ntype-names??'(tdth)))trs))
(tr-attrs(mapsxml:attr-as-listtrs))
(colspec(make-colspectds)))
(sxml:set-attr
(cons(sxml:namebody)
(append
(transform-trstdstr-attrs)
(sxml:aux-as-listbody)
(sxml:content
(filter(sxml:invert(ntype-names??'(tr)))body))))
(list'colspeccolspec))))
))
(define(transform-trstrstr-attrs)
(define(make-contentnameattrcont)
(consname(appendattrcont)))
(map
(lambda(tr)
Results
1 2 3 4
A B
C D
Practical example 2 —
Footnotes
<XML>
<Body-Text>Notethatyou<Ahref="#pgfId-1018223"CLASS="footnote">1</A>
can'tputdown%inyourmasterpiece...</Body-Text>
<FOOTNOTES>
<FOOTNOTE>
<Footnote-Text><AID="pgfId-1018223"></A>
Yes,it'syou.
</Footnote-Text>
</FOOTNOTE>
</FOOTNOTES>
</XML>
Possible rule to make LaTeX's
footnotefrom <FOOTNOTES>at the
bottom
This time you need to get hold of a subtree, and attach it to
another node.
Things has been rather messy, but :prestill works.
(define-tagA
(define-rule""trim""
:pre
(lambda(br)
(cons'A
(map-union(lambda(e)(map-union
(lambda(a)
(if(string=?
(ifstr(sxml:attr-ub'href))
#`"#,(sxml:attr-ua'ID)")
e#f))
((select-kids(ntype-names??'(A)))e)))
((node-closure(ntype-names??'(Footnote-Text)))root))))))
(define-tagFootnote-Text(make-latex-cmd'footnote))
(define-tagFOOTNOTES(ignore))
Results
Conclusion
XML is not bad for making books.
Simple markups/markdowns won't offer you a rich layout.
LaTeX will provide a concrete presentation layer for XML documents.
However, using XSLT for converting XML to LaTeX is a hard way, because XSLT is a tool
for getting another XML from a XML.
One major missing-link is a complete and lightweight way
of getting LaTeX from XML.
xml2tex would be a solution with its declarative style of defining whatever conversion rules.
ConTeXt's XML support is probably another solution.
It requires some experiences on XML, as well as LaTeX.
In addition to that, xml2texrequires Scheme experience.

More Related Content

What's hot

Training basic latex
Training basic latexTraining basic latex
Training basic latex
University of Technology
 
LaTeX Tutorial
LaTeX TutorialLaTeX Tutorial
LaTeX Tutorial
Tai Lun Tseng
 
Sax parser
Sax parserSax parser
Sax parser
Mahara Jothi
 
Introduction to Latex
Introduction to LatexIntroduction to Latex
Introduction to Latex
Mohamed Alrshah
 
Simple xml in .net
Simple xml in .netSimple xml in .net
Simple xml in .net
Vi Vo Hung
 
Parsing XML Data
Parsing XML DataParsing XML Data
Parsing XML Data
Mu Chun Wang
 

What's hot (6)

Training basic latex
Training basic latexTraining basic latex
Training basic latex
 
LaTeX Tutorial
LaTeX TutorialLaTeX Tutorial
LaTeX Tutorial
 
Sax parser
Sax parserSax parser
Sax parser
 
Introduction to Latex
Introduction to LatexIntroduction to Latex
Introduction to Latex
 
Simple xml in .net
Simple xml in .netSimple xml in .net
Simple xml in .net
 
Parsing XML Data
Parsing XML DataParsing XML Data
Parsing XML Data
 

Viewers also liked

イテレーティブでインクリメンタルな技術書の作り方
イテレーティブでインクリメンタルな技術書の作り方イテレーティブでインクリメンタルな技術書の作り方
イテレーティブでインクリメンタルな技術書の作り方
Keiichiro Shikano
 
TUG 2014 参加体験記
TUG 2014 参加体験記TUG 2014 参加体験記
TUG 2014 参加体験記
Keiichiro Shikano
 
Gaucheで本を作る
Gaucheで本を作るGaucheで本を作る
Gaucheで本を作る
Keiichiro Shikano
 
Index makes your book perfect
Index makes your book perfectIndex makes your book perfect
Index makes your book perfect
Keiichiro Shikano
 
『新装版リファクタリング ―既存のコードを安全に改善する―』 のここがすごい
『新装版リファクタリング ―既存のコードを安全に改善する―』 のここがすごい『新装版リファクタリング ―既存のコードを安全に改善する―』 のここがすごい
『新装版リファクタリング ―既存のコードを安全に改善する―』 のここがすごい
Keiichiro Shikano
 
脚注をめぐる冒険
脚注をめぐる冒険脚注をめぐる冒険
脚注をめぐる冒険
Keiichiro Shikano
 
TeX原稿からEPUBを作りたい
TeX原稿からEPUBを作りたいTeX原稿からEPUBを作りたい
TeX原稿からEPUBを作りたい
Keiichiro Shikano
 
ドキュメントシステムはこれを使え2015年版
ドキュメントシステムはこれを使え2015年版ドキュメントシステムはこれを使え2015年版
ドキュメントシステムはこれを使え2015年版
Keiichiro Shikano
 
多値で簡単パーサーコンビネーター
多値で簡単パーサーコンビネーター多値で簡単パーサーコンビネーター
多値で簡単パーサーコンビネーター
Keiichiro Shikano
 

Viewers also liked (11)

Csspage1 2014-06-22
Csspage1 2014-06-22Csspage1 2014-06-22
Csspage1 2014-06-22
 
Our docsys-pyfes-2012-11
Our docsys-pyfes-2012-11Our docsys-pyfes-2012-11
Our docsys-pyfes-2012-11
 
イテレーティブでインクリメンタルな技術書の作り方
イテレーティブでインクリメンタルな技術書の作り方イテレーティブでインクリメンタルな技術書の作り方
イテレーティブでインクリメンタルな技術書の作り方
 
TUG 2014 参加体験記
TUG 2014 参加体験記TUG 2014 参加体験記
TUG 2014 参加体験記
 
Gaucheで本を作る
Gaucheで本を作るGaucheで本を作る
Gaucheで本を作る
 
Index makes your book perfect
Index makes your book perfectIndex makes your book perfect
Index makes your book perfect
 
『新装版リファクタリング ―既存のコードを安全に改善する―』 のここがすごい
『新装版リファクタリング ―既存のコードを安全に改善する―』 のここがすごい『新装版リファクタリング ―既存のコードを安全に改善する―』 のここがすごい
『新装版リファクタリング ―既存のコードを安全に改善する―』 のここがすごい
 
脚注をめぐる冒険
脚注をめぐる冒険脚注をめぐる冒険
脚注をめぐる冒険
 
TeX原稿からEPUBを作りたい
TeX原稿からEPUBを作りたいTeX原稿からEPUBを作りたい
TeX原稿からEPUBを作りたい
 
ドキュメントシステムはこれを使え2015年版
ドキュメントシステムはこれを使え2015年版ドキュメントシステムはこれを使え2015年版
ドキュメントシステムはこれを使え2015年版
 
多値で簡単パーサーコンビネーター
多値で簡単パーサーコンビネーター多値で簡単パーサーコンビネーター
多値で簡単パーサーコンビネーター
 

Similar to xml2tex at TUG 2014

XML
XMLXML
Latex workshop: Essentials and Practices
Latex workshop: Essentials and PracticesLatex workshop: Essentials and Practices
Latex workshop: Essentials and Practices
Mohamed Alrshah
 
Xml writers
Xml writersXml writers
Xml writers
Raghu nath
 
Processing XML with Java
Processing XML with JavaProcessing XML with Java
Processing XML with Java
BG Java EE Course
 
Markup For Dummies (Russ Ward)
Markup For Dummies (Russ Ward)Markup For Dummies (Russ Ward)
Markup For Dummies (Russ Ward)
STC-Philadelphia Metro Chapter
 
Xml
XmlXml
Xml
XmlXml
Xml and xml processor
Xml and xml processorXml and xml processor
Xml and xml processor
Himanshu Soni
 
Xml and xml processor
Xml and xml processorXml and xml processor
Xml and xml processor
Himanshu Soni
 
Applied xml programming for microsoft 2
Applied xml programming for microsoft  2Applied xml programming for microsoft  2
Applied xml programming for microsoft 2
Raghu nath
 
latex document for IT workshop Lab . B.Tech
latex document for IT workshop Lab . B.Techlatex document for IT workshop Lab . B.Tech
latex document for IT workshop Lab . B.Tech
Sandhya Gandham
 
Sax Dom Tutorial
Sax Dom TutorialSax Dom Tutorial
Sax Dom Tutorial
vikram singh
 
Xml representation oftextspecifications
Xml representation oftextspecificationsXml representation oftextspecifications
Xml representation oftextspecifications
usert098
 
XML Bible
XML BibleXML Bible
XML Bible
LiquidHub
 
93 peter butterfield
93 peter butterfield93 peter butterfield
93 peter butterfield
Society for Scholarly Publishing
 
Semantic RDF based integration framework for heterogeneous XML data sources
Semantic RDF based integration framework for heterogeneous XML data sourcesSemantic RDF based integration framework for heterogeneous XML data sources
Semantic RDF based integration framework for heterogeneous XML data sources
Deniz Kılınç
 
XML Tutor maXbox starter27
XML Tutor maXbox starter27XML Tutor maXbox starter27
XML Tutor maXbox starter27
Max Kleiner
 
latex-workshop Dr: Mohamed A. Alrshah
latex-workshop Dr: Mohamed A. Alrshahlatex-workshop Dr: Mohamed A. Alrshah
latex-workshop Dr: Mohamed A. Alrshah
Abdulazim N.Elaati
 
XML Transformations With PHP
XML Transformations With PHPXML Transformations With PHP
XML Transformations With PHP
Stephan Schmidt
 
Web data management (chapter-1)
Web data management (chapter-1)Web data management (chapter-1)
Web data management (chapter-1)
Dhaval Asodariya
 

Similar to xml2tex at TUG 2014 (20)

XML
XMLXML
XML
 
Latex workshop: Essentials and Practices
Latex workshop: Essentials and PracticesLatex workshop: Essentials and Practices
Latex workshop: Essentials and Practices
 
Xml writers
Xml writersXml writers
Xml writers
 
Processing XML with Java
Processing XML with JavaProcessing XML with Java
Processing XML with Java
 
Markup For Dummies (Russ Ward)
Markup For Dummies (Russ Ward)Markup For Dummies (Russ Ward)
Markup For Dummies (Russ Ward)
 
Xml
XmlXml
Xml
 
Xml
XmlXml
Xml
 
Xml and xml processor
Xml and xml processorXml and xml processor
Xml and xml processor
 
Xml and xml processor
Xml and xml processorXml and xml processor
Xml and xml processor
 
Applied xml programming for microsoft 2
Applied xml programming for microsoft  2Applied xml programming for microsoft  2
Applied xml programming for microsoft 2
 
latex document for IT workshop Lab . B.Tech
latex document for IT workshop Lab . B.Techlatex document for IT workshop Lab . B.Tech
latex document for IT workshop Lab . B.Tech
 
Sax Dom Tutorial
Sax Dom TutorialSax Dom Tutorial
Sax Dom Tutorial
 
Xml representation oftextspecifications
Xml representation oftextspecificationsXml representation oftextspecifications
Xml representation oftextspecifications
 
XML Bible
XML BibleXML Bible
XML Bible
 
93 peter butterfield
93 peter butterfield93 peter butterfield
93 peter butterfield
 
Semantic RDF based integration framework for heterogeneous XML data sources
Semantic RDF based integration framework for heterogeneous XML data sourcesSemantic RDF based integration framework for heterogeneous XML data sources
Semantic RDF based integration framework for heterogeneous XML data sources
 
XML Tutor maXbox starter27
XML Tutor maXbox starter27XML Tutor maXbox starter27
XML Tutor maXbox starter27
 
latex-workshop Dr: Mohamed A. Alrshah
latex-workshop Dr: Mohamed A. Alrshahlatex-workshop Dr: Mohamed A. Alrshah
latex-workshop Dr: Mohamed A. Alrshah
 
XML Transformations With PHP
XML Transformations With PHPXML Transformations With PHP
XML Transformations With PHP
 
Web data management (chapter-1)
Web data management (chapter-1)Web data management (chapter-1)
Web data management (chapter-1)
 

More from Keiichiro Shikano

技術を本にして売る、という仕事
技術を本にして売る、という仕事技術を本にして売る、という仕事
技術を本にして売る、という仕事
Keiichiro Shikano
 
表とリスト(List Driven Table in LaTeX )
表とリスト(List Driven Table in LaTeX )表とリスト(List Driven Table in LaTeX )
表とリスト(List Driven Table in LaTeX )
Keiichiro Shikano
 
KindleでMathMLの現実
KindleでMathMLの現実KindleでMathMLの現実
KindleでMathMLの現実
Keiichiro Shikano
 
2018年でもEPSをTeXで使う
2018年でもEPSをTeXで使う2018年でもEPSをTeXで使う
2018年でもEPSをTeXで使う
Keiichiro Shikano
 
TeXの気持ちを理解するために知っておくと役立つかもしれないこと
TeXの気持ちを理解するために知っておくと役立つかもしれないことTeXの気持ちを理解するために知っておくと役立つかもしれないこと
TeXの気持ちを理解するために知っておくと役立つかもしれないこと
Keiichiro Shikano
 
Sphinxで売り物の書籍を作ってみた
Sphinxで売り物の書籍を作ってみたSphinxで売り物の書籍を作ってみた
Sphinxで売り物の書籍を作ってみた
Keiichiro Shikano
 
TeXは軽量マークアップの夢を見るか
TeXは軽量マークアップの夢を見るかTeXは軽量マークアップの夢を見るか
TeXは軽量マークアップの夢を見るか
Keiichiro Shikano
 
Texuser 2012-lt
Texuser 2012-lt Texuser 2012-lt
Texuser 2012-lt
Keiichiro Shikano
 
オーム社開発部がTeXを使う3つのおもな理由
オーム社開発部がTeXを使う3つのおもな理由オーム社開発部がTeXを使う3つのおもな理由
オーム社開発部がTeXを使う3つのおもな理由
Keiichiro Shikano
 

More from Keiichiro Shikano (9)

技術を本にして売る、という仕事
技術を本にして売る、という仕事技術を本にして売る、という仕事
技術を本にして売る、という仕事
 
表とリスト(List Driven Table in LaTeX )
表とリスト(List Driven Table in LaTeX )表とリスト(List Driven Table in LaTeX )
表とリスト(List Driven Table in LaTeX )
 
KindleでMathMLの現実
KindleでMathMLの現実KindleでMathMLの現実
KindleでMathMLの現実
 
2018年でもEPSをTeXで使う
2018年でもEPSをTeXで使う2018年でもEPSをTeXで使う
2018年でもEPSをTeXで使う
 
TeXの気持ちを理解するために知っておくと役立つかもしれないこと
TeXの気持ちを理解するために知っておくと役立つかもしれないことTeXの気持ちを理解するために知っておくと役立つかもしれないこと
TeXの気持ちを理解するために知っておくと役立つかもしれないこと
 
Sphinxで売り物の書籍を作ってみた
Sphinxで売り物の書籍を作ってみたSphinxで売り物の書籍を作ってみた
Sphinxで売り物の書籍を作ってみた
 
TeXは軽量マークアップの夢を見るか
TeXは軽量マークアップの夢を見るかTeXは軽量マークアップの夢を見るか
TeXは軽量マークアップの夢を見るか
 
Texuser 2012-lt
Texuser 2012-lt Texuser 2012-lt
Texuser 2012-lt
 
オーム社開発部がTeXを使う3つのおもな理由
オーム社開発部がTeXを使う3つのおもな理由オーム社開発部がTeXを使う3つのおもな理由
オーム社開発部がTeXを使う3つのおもな理由
 

Recently uploaded

How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...
DianaGray10
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
BrainSell Technologies
 
Communications Mining Series - Zero to Hero - Session 3
Communications Mining Series - Zero to Hero - Session 3Communications Mining Series - Zero to Hero - Session 3
Communications Mining Series - Zero to Hero - Session 3
DianaGray10
 
Step-By-Step Process to Develop a Mobile App From Scratch
Step-By-Step Process to Develop a Mobile App From ScratchStep-By-Step Process to Develop a Mobile App From Scratch
Step-By-Step Process to Develop a Mobile App From Scratch
softsuave
 
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and CitiesThe Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
Arpan Buwa
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
Baishakhi Ray
 
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and DisadvantagesBLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
SAI KAILASH R
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
bhumivarma35300
 
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
Zilliz
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
shyamraj55
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
ssuser1915fe1
 
Camunda Chapter NY Meetup July 2024.pptx
Camunda Chapter NY Meetup July 2024.pptxCamunda Chapter NY Meetup July 2024.pptx
Camunda Chapter NY Meetup July 2024.pptx
ZachWylie3
 
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
shanihomely
 
Using LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and MilvusUsing LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and Milvus
Zilliz
 
The Path to General-Purpose Robots - Coatue
The Path to General-Purpose Robots - CoatueThe Path to General-Purpose Robots - Coatue
The Path to General-Purpose Robots - Coatue
Razin Mustafiz
 
Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17
Bhajan Mehta
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
Steven Carlson
 
Gen AI: Privacy Risks of Large Language Models (LLMs)
Gen AI: Privacy Risks of Large Language Models (LLMs)Gen AI: Privacy Risks of Large Language Models (LLMs)
Gen AI: Privacy Risks of Large Language Models (LLMs)
Debmalya Biswas
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
SubhamMandal40
 

Recently uploaded (20)

How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
 
Communications Mining Series - Zero to Hero - Session 3
Communications Mining Series - Zero to Hero - Session 3Communications Mining Series - Zero to Hero - Session 3
Communications Mining Series - Zero to Hero - Session 3
 
Step-By-Step Process to Develop a Mobile App From Scratch
Step-By-Step Process to Develop a Mobile App From ScratchStep-By-Step Process to Develop a Mobile App From Scratch
Step-By-Step Process to Develop a Mobile App From Scratch
 
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and CitiesThe Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
 
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and DisadvantagesBLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
 
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
 
Camunda Chapter NY Meetup July 2024.pptx
Camunda Chapter NY Meetup July 2024.pptxCamunda Chapter NY Meetup July 2024.pptx
Camunda Chapter NY Meetup July 2024.pptx
 
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
 
Using LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and MilvusUsing LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and Milvus
 
The Path to General-Purpose Robots - Coatue
The Path to General-Purpose Robots - CoatueThe Path to General-Purpose Robots - Coatue
The Path to General-Purpose Robots - Coatue
 
Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
 
Gen AI: Privacy Risks of Large Language Models (LLMs)
Gen AI: Privacy Risks of Large Language Models (LLMs)Gen AI: Privacy Risks of Large Language Models (LLMs)
Gen AI: Privacy Risks of Large Language Models (LLMs)
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
 

xml2tex at TUG 2014

  • 1. xml2tex The easy way to define XML-to-LaTeX converters Created by /Keiichiro Shikano @golden_lucky
  • 2. What xml2tex is NOT. xml2tex is NOT a markup. xml2tex is NOT a ready-to-use converter application.
  • 3. xml2tex is a framework to give XML syntax a nice presentation layer using LaTeX.
  • 4. xml2tex is a framework to give XML syntax a nice presentation layer using LaTeX. xml2tex is a framework for using XML syntax as a source of LaTeX.
  • 6. Accusation ”Are you idiot? Why on earth are you going to use ugly XML syntax instead some more concise syntax?“
  • 7. Our Way of Creating a Book First of all, we don't want to use WYSIWYG application to create a book being sold. We usually maintain manuscripts using VCS (github) to the very last. It makes working together smoother over the Internet. LaTeX is one of the best tools for typesetting in this kind of environment. Taking advantage of xml2tex, we could use XML as manuscripts.
  • 8. Alternative approaches Use XSLT and XSL-FO for typesetting. Use XSLT for getting LaTeX. Convert them into LaTeX (once and for all). Or better yet, convert them into other standard markups or markdowns (and use the corresponding environment like DocBook, Sphinx, pandoc, TeXML, and so on).
  • 9. Pros of using XML (which LaTeX also has) We need a variety of meta data to create a book; editor's comments, index entries, and . We need to be able to achieve rich enough page layout for a . original texts commercial book
  • 10. We usually let the translator put down the corresponding Japanese text below each original paragraphs like this. A real example of XML source for translating projects <titlelang="en">Loremipsum</title> <titlelang="en">いろはにほへと</title> <plang="en">dolorsitamet,</p> <plang="ja">ちりぬるを</p> <blockquotelang="en">consecteturadipisicingelit.</blockquote> <blockquotelang="ja">わかよたれそつねならむ。</blockquote>
  • 11. A real example of unusual page layout <leftlang="en"> Moveoutthenewfunctionsothatweget/lengthback. </left> <leftlang="ja"> 再び /lengthが得られるように、 この新しい関数をくくり出してください。 </left> <rightlang="en"> <program> ((lambda(/mk-length) (/mk-length/mk-length)) (lambda(/mk-length) ([(lambda(/length) ] [ (lambda(/l) ] ... [ (/add1(/length(/cdr/l)))))))] (lambda(/x) ((/mk-length/mk-length)/x))))) </program> </right> <rightlang="ja"> <program>
  • 12. A real example of using meta data
  • 13. Pros of using XML, instead of LaTeX Authors of technical books tend to have HTML literacy, provided that there's no excessive information against human. Easy to confirm the appearance of structured text just by Web browsers, provided that there's an appropriate CSS. It's technically possible to generate EPUB as well as PDF.
  • 14. Cons of using XML If there's excessive information just for machine, it becomes hard to edit the manuscript. If there's no appropriate CSS, you have to rely only on the abstract structure of documents. It's technically possible, but not trivial to generate EPUB.
  • 15. More to the point... Creating PDFs from XML often requires some proprietary software and XSLT!
  • 16. What we need is an easy way to get XML-to-LaTeX converter, as needed. Without any prior information for the structure, Without any restriction for the structure, Without explicitly writing XML parser every time, With a profound support for generating LaTeX documents. (XSLT won't be the best solution!)
  • 17. xml2tex — our approach
  • 18. How to get a LaTeX representation of this XML? Note that it was converted from a DTP application. <?xmlversion="1.0"encoding="UTF-8"?> <?xml-stylesheethref="book.css"type="text/css"charset="UTF-8"?> <XML> <TITLE>AGreatBook</TITLE> <Chapter-Title>WritinginPractice</Chapter-Title> <Body-Text-First>Beforeyoucanstartwritingarealbook...</Body-Text-First> <Body-Text>Let&rsquo;sgetstarted!</Body-Text> <Heading-1>IntroductiontoLaTeX</Heading-1> <Body-Text-First>Tokeeponattractingyourreaders...</Body-Text-First> <Code-First>$latex</Code-First> <Code>ThisispdfTeX,Version3.14159265-2.6-1.40.15(TeXLive2014)(preloadedformat=latex)</Code> <Code>restrictedwrite18enabled.</Code> <Code-Last>**</Code-Last> <Body-Text>Notethatyoucan&rsquo;tputdown%inyourmasterpiece...</Body-Text> </XML>
  • 19. Things you can tell just by looking at the XML It at least has the elements named <TITLE>, <Chapter- Title>, <Body-Text-First>, <Body-Text>, <Heading-1>, <Code-First>, <Code>, and <Code-Last>. It at least has a XML character entity &rsquo;. <?xml...?>is the processing instruction, which generally seems to be useless in LaTeX. Characters like %, $and may need to be escaped.
  • 20. All you need to tell xml2tex That's it! Save these lines into a file, and it can be used as a kind of specification for xml2tex. (define-tagXML (make-latex-env'document)) (define-tagTITLE(make-latex-cmd'title)) (define-tagChapter-Title (make-latex-cmd'chapter)) (define-tagHeading-1 (make-latex-cmd'section)) (define-tagBody-Text-First(define-rule"nnoindent{}"trim"parn")) (define-tagBody-Text (define-rule"n"trim"parn")) (define-tagCode-First(define-rule"begin{alltt}"kick-comment"")) (define-tagCode (define-rule""kick-comment"")) (define-tagCode-Last (define-rule""kick-comment"end{alltt}"))
  • 21. Converting XML with the rule file where demo.rulesis the rule file defined before. $xml2tex-rdemo.rulesdemo.xml>demo.tex documentclass{book} usepackage[T1]{fontenc} usepackage{alltt} begin{document} title{AGreatBook} chapter{WritinginPractice} noindent{}Beforeyoucanstartwritingarealbook...par Let'sgetstarted!par section{IntroductiontoLaTeX} noindent{}Tokeeponattractingyourreaders...par begin{alltt}{symbol{36}}latex
  • 23. Generating LaTeX syntax from the document tree is defined as a rule. 1. Preceding string, or a thunk which returns a preceding string. Possible example is texttt{. 2. A procedure from string to another string. trimis one of such procedures. It takes a string and returns a string in which special characters in LaTeX are escaped properly. 3. Following string, or a thunk which returns a following string. Possible example is }. (define-rule "n" ;Putthisatthebeginning. trim ;Itstextnodesshouldbetreatedwiththis. "parn")) ;Putthisattheending.
  • 24. Let the defined rule map the content from XML element to LaTeX syntax, based on the tag name. Just putting down these definitions for each XML tags is enough to convert the XML into LaTeX. (define-tagBody-Text ;IftheXMLnodehasthisname,... (define-rule ;Applythisruletothecontent. "n" trim "parn"))
  • 25. Supportive features in defining rules (make-latex-cmd'cmdname) generates a rule for creating a LaTeX command cmdname{...}with the contents. (make-latex-env'envname) generates a rule for creating a LaTeX environment begin{envname}...ent{envname}with the contents. (through) generates a rule for putting down all the contents with necessary escaping. (ignore) generates a rule for discarding the contents.
  • 26. Default rule is "through" The elements you haven't define any explicit rule are indicated while the conversion. It helps you try detecting the unknown elements within the given XML file. $xml2tex-rmy.ruleinput.xml NotknowingtheLaTeXsyntaxfor<div>,...applyed(through). NotknowingtheLaTeXsyntaxfor<div>,...applyed(through). NotknowingtheLaTeXsyntaxfor<div>,...applyed(through). NotknowingtheLaTeXsyntaxfor<div>,...applyed(through). NotknowingtheLaTeXsyntaxfor<div>,...applyed(through). documentclass{book} usepackage[T1]{fontenc} begin{document} chapter{StartingOut} ..par ...
  • 27. Supportive features in adoring tree ($parent?'(tag1tag2...)) returns True if the node is directly under tag1or tag2... There's a lot more similar functions like $parent(takes the parent name), $siblings?, $under?, and so on. ($@'attrname) returns a string value if the node has an attribute of the attrname.
  • 28. An example of $parentand $parent? $-functions can be used within (define-rule.... Also note that (define-rule...takes a procedure instead of a fixed string for its first argument. (define-tagtitle (define-rule (lambda() (cond (($parent?'(chapter)) "chapter{") (($parent?'(sect1)) "section{") (($parent?'(sect2sect3))"subsection{") (else(error"norulefortitle"($parent))))) trim "}"))
  • 29. An example of $@(getting attribute value) Note that #`"..."is a syntax for a string interpolation. It actually a feature of Gauche, a Scheme programming language on which xml2tex works. (define-tagimg (define-rule (list"begin{figure}n" "includegraphics" #`"[width=,($@'width)]" #`"{,($@'src)}"))) trim "end{figure}"))
  • 30. More practical example — HTML tables <body> <table> <tr> <tdbgcolor="#ff0000"width="30%">1</td> <tdbgcolor="#33cccc"width="20%">2</td> <tdbgcolor="#00ff00">3</td> <tdrowspan="2"bgcolor="#ff9900">4</td> </tr> <tr> <tdbgcolor="#00ffff">A</td> <tdbgcolor="#ff00ff"align="right"colspan="2"rowspan="2"width="50%">B</td> </tr> <tr> <tdbgcolor="#ffff00">C</td><tdbgcolor="#00cc33"width="20%">D</td> </tr> </table> </body>
  • 31. Possible rule to convert HTML table to LaTeX's tabular environment It requires transformation of the tree, before applying a rule defined by define-rule. To transform the tree, use :prekeyword within define- rule. (define-tagtable (define-rule #`"begin{tabular}{|,($@'colspec)|}n";colspecisageneratedattribute trim "end{tabular}" :pre (lambda(bodyroot) (let*((trs((node-closure(ntype-names??'(tr)))body)) (tds(map(node-closure(ntype-names??'(tdth)))trs)) (tr-attrs(mapsxml:attr-as-listtrs)) (colspec(make-colspectds))) (sxml:set-attr (cons(sxml:namebody) (append (transform-trstdstr-attrs) (sxml:aux-as-listbody) (sxml:content (filter(sxml:invert(ntype-names??'(tr)))body)))) (list'colspeccolspec)))) )) (define(transform-trstrstr-attrs) (define(make-contentnameattrcont) (consname(appendattrcont))) (map (lambda(tr)
  • 32. Results 1 2 3 4 A B C D
  • 33. Practical example 2 — Footnotes <XML> <Body-Text>Notethatyou<Ahref="#pgfId-1018223"CLASS="footnote">1</A> can'tputdown%inyourmasterpiece...</Body-Text> <FOOTNOTES> <FOOTNOTE> <Footnote-Text><AID="pgfId-1018223"></A> Yes,it'syou. </Footnote-Text> </FOOTNOTE> </FOOTNOTES> </XML>
  • 34. Possible rule to make LaTeX's footnotefrom <FOOTNOTES>at the bottom This time you need to get hold of a subtree, and attach it to another node. Things has been rather messy, but :prestill works. (define-tagA (define-rule""trim"" :pre (lambda(br) (cons'A (map-union(lambda(e)(map-union (lambda(a) (if(string=? (ifstr(sxml:attr-ub'href)) #`"#,(sxml:attr-ua'ID)") e#f)) ((select-kids(ntype-names??'(A)))e))) ((node-closure(ntype-names??'(Footnote-Text)))root)))))) (define-tagFootnote-Text(make-latex-cmd'footnote)) (define-tagFOOTNOTES(ignore))
  • 36. Conclusion XML is not bad for making books. Simple markups/markdowns won't offer you a rich layout. LaTeX will provide a concrete presentation layer for XML documents. However, using XSLT for converting XML to LaTeX is a hard way, because XSLT is a tool for getting another XML from a XML. One major missing-link is a complete and lightweight way of getting LaTeX from XML. xml2tex would be a solution with its declarative style of defining whatever conversion rules. ConTeXt's XML support is probably another solution. It requires some experiences on XML, as well as LaTeX. In addition to that, xml2texrequires Scheme experience.