HTML2Presentation
IRE Major Project
- Chandan Singh
- Harsh Vardhan Shukla
- Nehal J Wani
- Rahul Patidar
Introduction
● This tool was designed to summarize the HTML versions of the papers published in
the proceedings of CHI 96 - Conference on Human Factors in Computing
Systems, 1996: http://sigchi.org/chi96/proceedings/papers.htm
● Since we wrote our own parser for the HTML source, we realize
that it's not very generic and may not work for inputs other than the ones
in these proceedings.
● Since this is just a proof-of-concept application, don't expect much error
handling. We do try to provide some basic error messages when something
fails.
How does it work?
● First, we parse the HTML of the paper to distinguish between HTML
tags and the actual text.
● Next, we divide the paper into sections and subsections based on the headings in
the paper. For instance, the text under the first <h1> becomes section 1, the text
under the first <h2> becomes section 1.1, and so on.
● Now, we extract the actual text for each subsection and ignore any other tags
like <div>, <span>, etc.
● We pass the extracted plain text for each subsection to the summarizer so
that we get a brief summary of each subsection. The summary is limited to
about 4-5 sentences.
● Along with the text, we also extract relevant images and tables from the
paper and insert them into the presentation under the relevant sections.
● Once we have the heading for each section and the content under it from the parser,
we just need to pass the appropriate arguments to Latexslides
(a Python tool), which generates the presentation in LaTeX.
● Finally, we obtain the presentation in ‘.pdf’ format from the LaTeX source
using ‘pdflatex’.
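The final step, turning the LaTeX source into a PDF, can be sketched as a small wrapper around the ‘pdflatex’ command. This is only an illustration: the function names `pdflatex_command` and `compile_pdf` and the file names are assumptions, not part of the original tool.

```python
import shutil
import subprocess
from pathlib import Path


def pdflatex_command(tex_path, out_dir):
    """Build the pdflatex command line for one .tex file (hypothetical helper)."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",  # do not stop and prompt on LaTeX errors
        f"-output-directory={out_dir}",
        str(tex_path),
    ]


def compile_pdf(tex_path, out_dir="."):
    """Run pdflatex; return the PDF path, or None if pdflatex is not installed."""
    if shutil.which("pdflatex") is None:
        return None
    subprocess.run(pdflatex_command(tex_path, out_dir), check=True)
    return Path(out_dir) / (Path(tex_path).stem + ".pdf")
```

Calling `compile_pdf("slides.tex", "build")` would write `build/slides.pdf` when pdflatex is available and the `build` directory exists. Presentations with a table of contents or cross-references may need a second run.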
Features of the parser
● Grabs all the tags and their relevant text in a well-formatted HTML page.
● Classifies the text into proper sections.
● Intelligently detects the type of each tag and assigns the proper attribute to it.
● Output is emitted as a well-formatted JSON array, so that it can be used
independently in other applications.
● It also takes care of hyperlinks and includes them in the appropriate
section/subsection.
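A minimal sketch of heading-based sectioning, assuming Python's standard `html.parser` module rather than the project's own parser. The class name `SectionParser` and the JSON keys `number`, `title`, `text`, and `links` are hypothetical; the tool's actual output format is not shown in these slides.

```python
import json
from html.parser import HTMLParser


class SectionParser(HTMLParser):
    """Group text and hyperlinks under the most recent <h1>/<h2> heading."""

    def __init__(self):
        super().__init__()
        self.sections = []       # one dict per heading, in document order
        self._counters = [0, 0]  # current <h1> and <h2> numbers
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            if tag == "h1":
                self._counters = [self._counters[0] + 1, 0]
                number = str(self._counters[0])
            else:
                self._counters[1] += 1
                number = f"{self._counters[0]}.{self._counters[1]}"
            self.sections.append(
                {"number": number, "title": "", "text": "", "links": []}
            )
            self._in_heading = True
        elif tag == "a" and self.sections:
            href = dict(attrs).get("href")
            if href:
                self.sections[-1]["links"].append(href)
        # Every other tag (<div>, <span>, ...) is ignored; only its text is kept.

    def handle_endtag(self, tag):
        if tag in ("h1", "h2"):
            self._in_heading = False

    def handle_data(self, data):
        if not self.sections:
            return  # text before the first heading is skipped in this sketch
        text = " ".join(data.split())
        if not text:
            return
        key = "title" if self._in_heading else "text"
        current = self.sections[-1]
        # Fragments are joined with single spaces, so punctuation that
        # directly follows an inline tag may pick up an extra space.
        current[key] = (current[key] + " " + text).strip()


def parse_sections(html):
    parser = SectionParser()
    parser.feed(html)
    parser.close()
    return parser.sections
```

The result serializes directly with `json.dumps(parse_sections(html), indent=2)`, which is one way to obtain a JSON array that other applications can consume.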
Features of the summarizer
● The summarizer takes a blob of text as input and outputs an array of
sentences that summarizes the given text.
● The summarizer also takes as input the maximum number of sentences to be
returned as output, so it's flexible in this regard.
● Calculates the importance of a sentence by comparing it with all other
sentences in the given text and assigning appropriate weights to all the
words.
● Since we’re using stop-words from NLTK (Natural Language Toolkit), it can be
extended to other natural languages for which NLTK provides stop-word lists.
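One possible reading of this scoring scheme is sketched below. It is not the project's actual algorithm: the weighting (a word counts once per sentence that contains it), the length normalisation, the naive sentence splitter, and the tiny built-in stop-word set are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list. nltk.corpus.stopwords.words("english")
# could be substituted once the NLTK stopwords corpus has been downloaded.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "by", "for", "in", "is", "it",
    "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lower-cased words of a sentence with stop-words removed."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return {w for w in words if w not in STOP_WORDS}


def summarize(text, max_sentences=5):
    """Return up to max_sentences sentences, kept in their original order."""
    sentences = split_sentences(text)
    word_sets = [content_words(s) for s in sentences]
    # Weight of a word = number of sentences that contain it.
    weights = Counter(w for words in word_sets for w in words)

    scores = []
    for i, words in enumerate(word_sets):
        score = 0.0
        for j, other in enumerate(word_sets):
            if i != j:
                # Compare with every other sentence via the shared words.
                score += sum(weights[w] for w in words & other)
        # Normalise by length so long sentences are not favoured automatically.
        scores.append(score / (len(words) or 1))

    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]
```

For the 4-5 sentence limit mentioned earlier, the call would be `summarize(section_text, max_sentences=5)`.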
Features of the PDF Generator
● It takes a JSON file as input, so it is independent in the sense that it can work with
any JSON input, the only condition being that the JSON file must follow our
standard format.
● If a particular section appears several times in the input with different
contents, it automatically combines their content into one section.
● It generates the `tex` file before converting it to PDF, so the user can download
the PDF and can also edit the `tex` file directly if they wish.
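The section-merging behaviour can be sketched as follows. The input shape (a list of objects with `title` and `points` keys) is a hypothetical stand-in for the project's standard JSON format, which these slides do not spell out, and the output is plain Beamer source written by hand rather than through Latexslides.

```python
import json


def merge_sections(sections):
    """Combine entries that share a title, keeping first-seen order."""
    merged = {}
    for section in sections:
        merged.setdefault(section["title"], []).extend(section["points"])
    return merged


def escape_latex(text):
    """Escape the characters that most often break LaTeX in plain text."""
    replacements = {
        "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
        "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    }
    return "".join(replacements.get(ch, ch) for ch in text)


def to_beamer(sections, title="Untitled"):
    """Render merged sections as a Beamer document, one frame per section."""
    lines = [
        r"\documentclass{beamer}",
        r"\title{" + escape_latex(title) + "}",
        r"\begin{document}",
        r"\frame{\titlepage}",
    ]
    for heading, points in merge_sections(sections).items():
        if not points:
            continue  # an empty itemize environment is a LaTeX error
        lines.append(r"\begin{frame}{" + escape_latex(heading) + "}")
        lines.append(r"\begin{itemize}")
        lines.extend(r"\item " + escape_latex(p) for p in points)
        lines.append(r"\end{itemize}")
        lines.append(r"\end{frame}")
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"
```

With input loaded through `json.load`, the string returned by `to_beamer` can be written to a `.tex` file, which stays available for manual editing before it is compiled.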
Web Interface
● We’ve also developed a web interface to make the tool easy to use:
http://web.iiit.ac.in/~chandan.singh/html2presentation/
Possible Use Cases
● Can be used to automatically generate presentations of one’s paper.
● Can also be tweaked easily to summarize blogs.
● Since the code is written in a modular manner, more modules can easily be
added or removed to enhance the user interface.
Thank You!
● Any feedback/suggestions are welcome.
● You can contact us via our contact page:
http://web.iiit.ac.in/~chandan.singh/html2presentation/team/
