Word to HTML Conversion
REALITY
STEPS FOR CONVERSION
1) Extract the content from InDesign or PDF.
Client input: PDF
2) Arrange the text as per the project requirements.
Word file: Doc
3) Apply styles in the Word document (e.g. Heading 1, Heading 2, Heading 3).
4) Apply auto tags in the Word document (e.g. <h1>, <h2>, <h3>, <p>, <sup>, <b>, <i>).
Word file: Formatted Doc
5) Convert the Word file to XML through the Developer tab's Schema option.
Final output: HTML file
How to Apply the Style:
• Select the content.
• Apply the style.
Apply the Auto-Tagging:
1) Find the paragraph mark (¶).
2) Replace it with </p>¶<p>, changing the style properties:
3) Apply the <h1>, <h2>, <h3>, <sup>, <b>, <i> elements by finding the individual style of each heading:
4) Replace with the <h1>, <h2>, <h3>, <sup>, <b>, <i> elements:
5) Clean up the file (remove the extra tags in the document):
• Remove unnecessary tags:
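The auto-tagging idea above can also be sketched outside Word. This is only an illustration, assuming the document has already been reduced to (style name, text) pairs; the STYLE_TO_TAG mapping and the auto_tag function are hypothetical names introduced here, not part of Word or of the original workflow.

```python
from html import escape

# Hypothetical mapping from Word style names to HTML elements (an assumption).
STYLE_TO_TAG = {
    "Heading 1": "h1",
    "Heading 2": "h2",
    "Heading 3": "h3",
}


def auto_tag(paragraphs):
    """Wrap each (style, text) pair in the tag for its style; default to <p>."""
    lines = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style, "p")
        # Escape the text so stray & or < characters do not break the markup.
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [("Heading 1", "Chapter One"), ("Normal", "First paragraph.")]
    print(auto_tag(sample))
```

Any style that is not in the mapping falls back to <p>, which mirrors the step where every paragraph mark is first wrapped in paragraph tags before the headings are handled.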
Convert Word to XML
1) Select the Developer tab from the menu bar.
2) Then select the Schema option and tick the check box shown in the snapshot below:
3) Select the Templates option and attach the correct template:
4) Apply the root element tags <html> and <body>, then select the XML option and tick the "Save data only" option.
5) Go to File > Save As > Other Formats, then change the "Save as type" from Doc to Word 2003 XML Document (*.xml).
6) Open the XML file.
Find the &lt; and &gt; entities.
Replace them with the literal < and > signs.
7) Arrange the HTML tags using regex, then validate the XML file in XML Copy Editor.
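Steps 6 and 7 can be approximated in a script. The sketch below is an assumption about how one might do it in Python, not the workflow shown on the slides: restore_tags and is_well_formed are hypothetical helper names, and the check only tests well-formedness, which is weaker than validating against a schema in an XML editor.

```python
import xml.etree.ElementTree as ET


def restore_tags(text):
    """Turn the escaped &lt; and &gt; entities back into literal angle brackets."""
    return text.replace("&lt;", "<").replace("&gt;", ">")


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    exported = "<html><body>&lt;h1&gt;Title&lt;/h1&gt;&lt;p&gt;Text&lt;/p&gt;</body></html>"
    restored = restore_tags(exported)
    print(restored)
    print(is_well_formed(restored))
```

Only &lt; and &gt; are replaced; other entities such as &amp; are left alone so that the restored file can still be parsed.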
Using Regular Expressions to Apply Tags and for the Linking Process
Finding with Regex:
1) ([0-9]) Matches a single digit; use it to find numbers only.
2) ([A-Za-z]) Matches a single letter; use it to find alphabetic characters only.
3) ([^>]*) Matches any run of characters other than >; use it to find anything between specific tags.
4) ([^*]*) Matches any run of characters other than *; use it to find all the content up to the line break of any paragraph in the file.
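The four search patterns above can be tried out in Python's re module. This is a minimal sketch with a made-up sample string; note that regex dialects differ between editors, and in Python a negated class such as [^*] also matches line breaks, so the fourth pattern is shown on a single line only.

```python
import re

sample = "<p>Note 42</p>"

# 1) ([0-9]) captures one digit per match.
digits = re.findall(r"([0-9])", sample)

# 2) ([A-Za-z]) captures one letter per match (letters in tag names included).
letters = re.findall(r"([A-Za-z])", sample)

# 3) ([^>]*) placed between an opening and a closing tag captures the content.
between = re.search(r"<p>([^>]*)</p>", sample).group(1)

# 4) ([^*]*) captures everything up to the first asterisk, if there is one.
run = re.match(r"([^*]*)", "A whole paragraph of text").group(1)

if __name__ == "__main__":
    print(digits, letters, between, run)
```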
Replacing with Regex:
1) \1 Used in the replacement field to insert the text captured by the first group of the regex.
Result of \1:
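As an illustration of reusing the first captured group in a replacement, the sketch below uses Python's re.sub, where the reference is written \1. The tag_digits function and the choice of wrapping digits in <sup> are assumptions made for the example, not taken from the slides.

```python
import re


def tag_digits(text):
    """Wrap every single digit in <sup> tags by reusing the first captured group."""
    return re.sub(r"([0-9])", r"<sup>\1</sup>", text)


if __name__ == "__main__":
    print(tag_digits("See note1 and note2"))
```

Because ([0-9]) captures one digit at a time, a two-digit number is wrapped digit by digit; a pattern such as ([0-9]+) would capture the whole number instead.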
2) \i expression: substitutes a sequence number.
Expression    Effect
\i            Replace with numbers starting from 1, incrementing by 1.
\i(10)        Replace with numbers starting from 10, incrementing by 1.
\i(0,10)      Replace with numbers starting from 0, incrementing by 10.
\i(100,-10)   Replace with numbers starting from 100, decrementing by 10.
3) We use the \i expression in our documents to link footnotes and to generate a unique number for each footnote.
Example:
Content & Element   <sup1>a</sup1>
Find                <sup1>([^>]*)</sup1>
Replace             <a id="ntf\ia" href="Ch000.html#ntf\i"><sup1>[\i]</sup1></a>
Result              <a id="ntf1a" href="Ch000.html#ntf1"><sup1>[1]</sup1></a>
Snapshots of the \i expression:
Result of the \i expression:
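The sequence-number replacement above is an editor feature; Python's re module has no built-in equivalent, so the sketch below emulates it under stated assumptions. The replace_with_sequence function and its use of a literal \i placeholder in the template are hypothetical and introduced only for this example; start and step mirror the two optional arguments in the table of expressions.

```python
import itertools
import re


def replace_with_sequence(pattern, template, text, start=1, step=1):
    """Replace each match with the template, numbering the matches in order.

    Every backslash-i placeholder in the template receives the same number
    for a given match: start for the first match, start + step for the next.
    """
    counter = itertools.count(start, step)

    def substitute(match):
        number = str(next(counter))
        return template.replace(r"\i", number)

    return re.sub(pattern, substitute, text)


if __name__ == "__main__":
    text = "One<sup1>a</sup1> two<sup1>b</sup1>"
    find = r"<sup1>([^>]*)</sup1>"
    replace = r'<a id="ntf\ia" href="Ch000.html#ntf\i"><sup1>[\i]</sup1></a>'
    print(replace_with_sequence(find, replace, text))
```

Each footnote marker gets its own number, so the generated id and href values stay unique within the file.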
Thank You
