Pandoc:
The Deep Dive
All that is great
stands in the storm
● Universal markup converter == "the Swiss
Army knife of text markup formats"
● ALL HASKELL
● Example:
pandoc -o myDoc.md myDoc.html
pandoc -f html -t latex http://hackage.haskell.org
pandoc myDoc.txt -o myDoc.pdf
What is Pandoc?
● Reads:
○ Markdown (GitHub, Strict, etc.), HTML, LaTeX,
Textile, reStructuredText, JSON
● Writes:
○ Markdown, reStructuredText, HTML, DocBook
XML, OpenDocument XML, ODT, RTF, groff
man, MediaWiki markup, GNU Texinfo, LaTeX,
ConTeXt, EPUB, Textile, Emacs org-mode, Slidy,
S5
● Extensions for LaTeX math, tables, etc.
● Note to self: Pandoc in the CLI
What is Pandoc? (pt. 2)
● Performance vis-à-vis scripting languages
● Type safety
● Text.Parsec library
● Hypermuscular list processing (more
about FP in general than about
Haskell)
Why Haskell?
● One possibility: functions devoted to each
type-to-type combination
○ markdownToHTML
○ HTMLtoEPUB
○ 12 × 31 = 372 possibilities (one function per reader/writer pair)
○ FUCK THAT
● Vastly better possibility?
Reader -->
Neutral Haskell data type -->
Writer -->
Converted document
Possible approaches
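
The Reader --> neutral type --> Writer idea can be sketched in a few lines. This is only an illustration under loud assumptions: `Doc`, `readPlain`, `writeHtml`, `writeLatex` and `convert` are hypothetical names, and `Doc` is a toy stand-in, not Pandoc's actual data type. The point is that each new format costs one reader or one writer rather than one function per pair.

```haskell
-- Toy stand-in for the neutral data type (hypothetical, not Pandoc's own):
-- a document is just a list of paragraphs.
newtype Doc = Doc [String]
  deriving (Show, Eq)

type Reader = String -> Doc
type Writer = Doc -> String

-- Toy reader: paragraphs are separated by blank lines.
readPlain :: Reader
readPlain = Doc . map unwords . filter (not . null) . splitOnBlank . lines
  where
    splitOnBlank ls = case break null ls of
      (para, [])       -> [para]
      (para, _ : rest) -> para : splitOnBlank rest

-- Toy writers (no escaping of special characters, to keep the sketch short).
writeHtml :: Writer
writeHtml (Doc ps) = concatMap (\p -> "<p>" ++ p ++ "</p>\n") ps

writeLatex :: Writer
writeLatex (Doc ps) = concatMap (\p -> p ++ "\n\n") ps

-- Any reader composes with any writer through the shared type.
convert :: Reader -> Writer -> String -> String
convert reader writer = writer . reader
```

For example, `convert readPlain writeHtml "Hello\nworld\n\nBye"` yields `"<p>Hello world</p>\n<p>Bye</p>\n"`, and swapping in `writeLatex` needs no change to the reader.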
● Semi-stateful, non-opinionated parser combinator
library (think REGEX machine, but composable)
○ Accumulative: return (x:xs)
○ getParserState
○ modifyState
● Core functions
○ parse
■ parse parser filePath input
■ parse numbers "" "a,b,2,3"
○ many
○ skipMany
○ manyAccum
● type Parser t s = Parsec t s
Text.Parsec
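
A minimal sketch of the calls named on this slide, assuming the `parsec` package is installed. The slide only shows the call `parse numbers "" ...`, so the bodies of `number`, `numbers` and `counted` below are assumptions made for illustration. `counted` uses Parsec's user state via `modifyState` to show the "semi-stateful" side.

```haskell
import Text.Parsec
  ( ParseError, Parsec, char, digit, eof, getState
  , many1, modifyState, parse, runParser, sepBy1 )

-- One unsigned decimal integer, e.g. "42". Polymorphic in the user state u.
number :: Parsec String u Int
number = read <$> many1 digit

-- A comma-separated list of integers that must span the whole input.
numbers :: Parsec String u [Int]
numbers = number `sepBy1` char ',' <* eof

-- `parse parser sourceName input`; the empty source name is only used
-- in error messages.
parseNumbers :: String -> Either ParseError [Int]
parseNumbers = parse numbers ""

-- Same grammar, but the user state counts how many numbers were read.
counted :: Parsec String Int ([Int], Int)
counted = do
  ns <- (number <* modifyState (+ 1)) `sepBy1` char ','
  eof
  total <- getState
  pure (ns, total)

-- `runParser` is like `parse` but takes an initial user state (0 here).
parseCounted :: String -> Either ParseError ([Int], Int)
parseCounted = runParser counted 0 ""
```

With these definitions, `parseNumbers "1,2,3"` gives `Right [1,2,3]` and `parseCounted "4,5,6"` gives `Right ([4,5,6],3)`. An input such as `"a,b,2,3"` is rejected with a `Left` parse error, because this particular `number` only accepts digits.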
● Neutral data types
○ Pandoc = Meta + [Block]
○ Block = wraps [Inline] or nested [Block]
○ Inline
○ etc.
● Reader
○ Applies parsers to documents
○ Documents are treated as lists
● Writer
○ Converts neutral data type into document
○ Again, documents are just structured lists
Basic flow
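
To make "documents are just structured lists" concrete, here is a cut-down sketch of a Pandoc-like tree with a small HTML writer. It is a simplified assumption, not the real definition: the actual types have many more constructors and carry metadata, and `Document`, `renderHtml`, `blockToHtml`, `inlineToHtml` and `escape` are hypothetical names.

```haskell
-- Simplified, hypothetical subset of a Pandoc-like document tree.
data Inline
  = Str String
  | Space
  | Emph [Inline]
  deriving (Show, Eq)

data Block
  = Para [Inline]
  | Header Int [Inline]
  | BlockQuote [Block]   -- blocks can nest other blocks
  deriving (Show, Eq)

newtype Document = Document [Block]
  deriving (Show, Eq)

-- Escape the characters that are special in HTML text.
escape :: Char -> String
escape '<' = "&lt;"
escape '>' = "&gt;"
escape '&' = "&amp;"
escape c   = [c]

inlineToHtml :: Inline -> String
inlineToHtml (Str s)   = concatMap escape s
inlineToHtml Space     = " "
inlineToHtml (Emph xs) = "<em>" ++ concatMap inlineToHtml xs ++ "</em>"

blockToHtml :: Block -> String
blockToHtml (Para xs) = "<p>" ++ concatMap inlineToHtml xs ++ "</p>"
blockToHtml (Header n xs) =
  "<h" ++ show n ++ ">" ++ concatMap inlineToHtml xs ++ "</h" ++ show n ++ ">"
blockToHtml (BlockQuote bs) =
  "<blockquote>" ++ concatMap blockToHtml bs ++ "</blockquote>"

-- The writer is a map over the list of blocks, one line per block.
renderHtml :: Document -> String
renderHtml (Document bs) = unlines (map blockToHtml bs)
```

For instance, `renderHtml (Document [Header 1 [Str "Title"], Para [Str "a", Space, Emph [Str "b"]]])` produces `"<h1>Title</h1>\n<p>a <em>b</em></p>\n"`.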
● Readers/Markdown.hs
● Writers/HTML.hs
● Pandoc/Builder.hs
Markdown to HTML
● When doing big, complex things with FP,
you're probably going to end up thinking in
terms of lists
● Lists are infinitely flexible
● Hard to escape state entirely
○ ReaderState
○ WriterState
● Don't give up
● Force yourself to give a presentation at
PDXFunc
General lessons
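
As a sketch of why state is hard to escape, consider numbering footnotes while writing: each note needs to know how many came before it. The `WriterState` below is a hypothetical one-field record, not Pandoc's actual `WriterState`, and the state is threaded explicitly with `mapAccumL` from `Data.List` rather than with a state monad.

```haskell
import Data.List (mapAccumL)

-- Hypothetical writer state: only tracks how many notes were emitted.
newtype WriterState = WriterState { noteCount :: Int }
  deriving (Show, Eq)

data Inline = Str String | Note String
  deriving (Show, Eq)

-- Render one inline, returning the updated state alongside the output.
renderInline :: WriterState -> Inline -> (WriterState, String)
renderInline st (Str s)  = (st, s)
renderInline st (Note _) =
  let n = noteCount st + 1
  in (st { noteCount = n }, "[" ++ show n ++ "]")

-- Thread the state left to right through the whole list of inlines.
renderInlines :: [Inline] -> (WriterState, String)
renderInlines xs =
  let (st, parts) = mapAccumL renderInline (WriterState 0) xs
  in (st, concat parts)
```

Here `renderInlines [Str "a", Note "x", Str "b", Note "y"]` returns a final state with `noteCount` 2 and the text `"a[1]b[2]"`. The note bodies are ignored in this sketch; a fuller writer would collect them in the state as well.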

Pandoc: the deep dive (PDXFunc presentation)
