Word to HTML Conversion
REALITY
STEPS FOR CONVERSION
1) Extract the content from InDesign or PDF.
Client input: PDF
2) Arrange the text as per the project requirements.
Word file: Doc
3) Apply styles in the Word document (e.g. Heading 1, Heading 2, Heading 3).
4) Apply auto tags in the Word document (e.g. <h1>, <h2>, <h3>, <p>, <sup>, <b>, <i>).
Word file: Formatted Doc
5) Convert the Word file to XML through Developer > Schema.
Final output: HTML file
How to Apply the Style:
• Select the content.
• Apply the style.
Apply the Auto-Tagging:
1) Find ¶ (the paragraph mark).
2) Replace it with </p>¶<p>, changing the style properties in the replacement as needed.
3) Apply the <h1>, <h2>, <h3>, <sup>, <b>, <i> elements by finding the individual heading and character styles.
4) Replace each style match with its <h1>, <h2>, <h3>, <sup>, <b>, <i> element.
5) Clean up the file (remove the extra tags in the document).
• Remove unnecessary tags.
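The auto-tagging and cleanup steps above are done by hand in Word's Find and Replace dialog. As a rough illustration only, the same idea can be sketched in Python, assuming the paragraphs have already been read out of the document as (style name, text) pairs. The names STYLE_TO_TAG, tag_paragraphs and remove_empty_tags are hypothetical and not part of the original workflow.

```python
import re
from html import escape

# Hypothetical mapping from Word style names to HTML tags (an assumption).
STYLE_TO_TAG = {"Heading 1": "h1", "Heading 2": "h2", "Heading 3": "h3"}


def tag_paragraphs(paragraphs):
    """Wrap each (style, text) pair in the tag for its style; default is <p>."""
    lines = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style, "p")
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)


def remove_empty_tags(markup):
    """Cleanup step: drop elements that contain nothing but whitespace."""
    return re.sub(r"<(h1|h2|h3|p|sup|b|i)>\s*</\1>\n?", "", markup)


tagged = tag_paragraphs([("Heading 1", "Title"), ("Normal", "Body & text"), ("Normal", "")])
cleaned = remove_empty_tags(tagged)
```

The empty trailing paragraph is tagged as <p></p> and then removed by the cleanup pass, which mirrors step 5.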
Convert Word to XML
1) Select the Developer tab from the menu bar.
2) Select the Schema option and tick the required check box.
3) Select the Templates option and attach the correct template.
4) Apply the root element tags <html> and <body>, then select the XML options and tick the "Save data only" option.
5) Go to File > Save As > Other Formats, then change the "Save as type" from Doc to Word 2003 XML Document (*.xml).
6) Open the XML file.
Find the &lt; and &gt; entities.
Replace them with the literal < and > characters.
7) Arrange the HTML tags using regex, then validate the XML file in XML Copy Editor.
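Steps 6 and 7 can also be scripted. The following is a minimal sketch, assuming the saved XML has been read into a string. The function names are hypothetical, and the check uses Python's standard XML parser to test well-formedness only, which is narrower than validating against a schema in XML Copy Editor.

```python
import xml.etree.ElementTree as ET


def restore_angle_brackets(text):
    """Turn the escaped &lt; and &gt; entities back into literal < and >."""
    return text.replace("&lt;", "<").replace("&gt;", ">")


def is_well_formed(markup):
    """Return True if the markup parses as XML (well-formedness only)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


restored = restore_angle_brackets("&lt;h1&gt;Title&lt;/h1&gt;")
```

Only the two entities named in step 6 are replaced. Other entities such as &amp; are left alone, since unescaping them would produce invalid XML.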
Using regular expressions for applying the tags and for the linking process.
Finding by using Regex:
1) ([0-9]) Matches a single digit.
2) ([A-Za-z]) Matches a single alphabetic character.
3) ([^>]*) Matches any run of characters that contains no >; placed between an opening and a closing tag, it captures the content between them.
4) ([^*]*) Matches any run of characters that contains no *; in an editor that searches line by line, this finds all the content up to the end of the paragraph.
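To see what each of the four patterns captures, here is a small sketch using Python's re module. The sample strings are made up for illustration. Note that Python is not the editor used in these slides: in Python, a negated class such as [^*] also matches newline characters, so pattern 4 runs past line breaks until it meets a * or the end of the text.

```python
import re

# 1) A single digit per match.
digits = re.findall(r"([0-9])", "Page 42")

# 2) A single alphabetic character per match.
letters = re.findall(r"([A-Za-z])", "a1b2")

# 3) Content between a specific pair of tags.
between_tags = re.search(r"<sup>([^>]*)</sup>", "x<sup>12</sup>y").group(1)

# 4) Everything up to the first asterisk; in Python this crosses newlines.
up_to_star = re.match(r"([^*]*)", "first line\nsecond*rest").group(1)
```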
Replacement by using Regex:
1) \1 Use in the replacement field to insert the text matched by the first capture group.
2) \i <expression> Substitute a sequence number.
Expression:    Effect:
\i             Replace with numbers starting from 1, incrementing by 1.
\i(10)         Replace with numbers starting from 10, incrementing by 1.
\i(0,10)       Replace with numbers starting from 0, incrementing by 10.
\i(100,-10)    Replace with numbers starting from 100, decrementing by 10.
3) We use the \i expression in our documents to link footnotes and to generate a unique number for each footnote.
Example:
Content & element:  <sup1>a</sup1>
Find:               <sup1>([^>]*)</sup1>
Replace:            <a id="ntf\ia" href="Ch000.html#ntf\i"><sup1>[\i]</sup1></a>
Result:             <a id="ntf1a" href="Ch000.html#ntf1"><sup1>[1]</sup1></a>
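Python's re module has no sequence-number token in replacement strings, so the footnote example above has to be emulated there. One way, sketched below under that assumption, is to pass re.sub a function that draws the next number from a counter. The function name link_footnotes and its start, step and page parameters are hypothetical; start and step play the role of the two arguments of the sequence expression.

```python
import re
from itertools import count


def link_footnotes(markup, start=1, step=1, page="Ch000.html"):
    """Replace each <sup1>...</sup1> marker with a numbered footnote link."""
    counter = count(start, step)  # 1, 2, 3, ... by default

    def replacement(match):
        # The captured marker text is discarded, as in the example above.
        n = next(counter)
        return f'<a id="ntf{n}a" href="{page}#ntf{n}"><sup1>[{n}]</sup1></a>'

    return re.sub(r"<sup1>([^>]*)</sup1>", replacement, markup)


linked = link_footnotes("x<sup1>a</sup1> y<sup1>b</sup1>")
```

Calling link_footnotes(text, 100, -10) would number the links 100, 90, 80 and so on, matching the last row of the expression table.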
Thank You
