[Mentor Graphics] A Perforce-based Automatic Document Generation System



MERGE 2013 THE PERFORCE CONFERENCE SAN FRANCISCO • APRIL 24–26

White Paper

A Perforce-Based Automatic Documentation Generation System

Chris Shaw, Mentor Graphics Corporation

This paper describes the automatic documentation generation system that DVT Technical Publications uses to generate and bundle the InfoHub documentation libraries for our product distribution software. The backbone of this system is our Perforce installation, which provides the document control and management portion of our system.

Introduction

Mentor Graphics® Technical Publications provides InfoHub™ documentation environments to its customers. Each InfoHub is a product-specific documentation library that lets the customer consult either the HTML or the PDF version of every manual. InfoHubs are highly interconnected; they have many inter- and intra-manual links and a powerful search capability. Authors create documents with the FrameMaker® document editor and use a cloud-based, multiprocessing utility to generate the HTML, PDF and InfoHub targets from the FrameMaker source documents. Separately, the engineering groups in Mentor Graphics' Design Verification Technology (DVT) division use the Perforce Software Version Management system for their software code.

Together, these two processes suggested an opportunity: combine them in a single documentation generation and control system. This white paper describes the automatic documentation generation system that DVT Technical Publications uses to generate and bundle the InfoHub documentation libraries for our product distribution software. The backbone of this system is our Perforce installation, which provides the document control and management portion of our system.

Mentor Graphics InfoHubs

Questa® Formal Technology is a Mentor Graphics family of design verification products that analyze various assertions about the IC design units being verified. The documentation for this product suite is packaged as an InfoHub, a JavaScript-based browser page of the documents in the products' library (see Figure 1) [1].

Figure 1: Questa Formal Technology InfoHub

Each manual is available in both HTML and PDF, and the InfoHub has a sophisticated search capability. The documents are highly interconnected with hyperlinks, and the software GUIs have links from dialog Help buttons to the corresponding command pages in the reference manuals.

DOCGEN Facility

Writers author Technical Publications' documentation in Adobe® FrameMaker (version 8). The source files for these documents are .fm files and imported graphics (typically JPEGs).

Mentor Graphics' Technical Publications Support team supplies a DOCGEN facility that generates the HTML and PDF targets from the source FrameMaker files [1]. Jobs consisting of multiple manuals can be submitted. The facility farms out the manual translations to a grid of servers, which handle them in parallel.

The DOCGEN facility is accessed either through a Linux utility (docgen) or via a web site. Publications groups who use the web site typically generate the HTML/PDFs at the end of a release cycle. This process might take about a week.

The DVT publications group uses a different approach based on the Linux docgen utility. With our documentation generation system, docgen runs automatically as documents are edited and submitted to the depot. This mechanism results in a "correct-by-construction" InfoHub image. It has the advantage of being continually ready to promote into the distribution software package. Last-minute changes can make it into the release, and the final target is ready to go within minutes.

The final step in the generic document generation process is to update the InfoHub. Mentor Graphics' Technical Publications Support team provides a Linux utility (dmerge) to do just that.

Documents in Perforce

DVT source documents are stored in the techpubs tree of the division's dvt depot (see Figure 2). Each techpubs subdirectory corresponds to a documentation library (except for bin and archive). Some document libraries are packaged as InfoHubs.
Others are not; instead, they are inserted into multiple InfoHubs.

Figure 2: Techpubs depot in Perforce

The source files for the InfoHub shown in Figure 1 are located in the //dvt/techpubs/zin10.1 depot (see Figure 3) [2]. Each subdirectory corresponds to a single manual. For example, the command_ref subdirectory contains the source files for the Questa Formal Technology Command Reference.

Figure 3: zin10.1 Contains source for InfoHub

The Technical Publications organization has strict rules on naming conventions and manual directory organization. Figure 4 shows an example.

Figure 4: command_ref Structure

All .fm and .book files are stored at the top level of the manual directory; imported graphics are stored in the graphics subdirectory. The man.book file is the "book" file for the manual. README is a control file used by the DOCGEN facility.
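
The directory rules above lend themselves to an automated check. The following is a minimal sketch, not part of pubs4d or DOCGEN: check_manual_layout is a hypothetical helper, and it tests only the three rules stated in the text (man.book and README at the top level, and no .fm or .book files below the top level). Any further naming conventions used by Technical Publications are not covered.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not part of pubs4d or DOCGEN): returns a list of
# layout problems found in one manual directory.
sub check_manual_layout {
    my ($dir) = @_;
    $dir =~ s{/+\z}{};    # normalize so the directory comparison below is exact
    my @problems;

    # The book file and the DOCGEN control file live at the top level.
    for my $required ('man.book', 'README') {
        push @problems, "missing $required" unless -f "$dir/$required";
    }

    # Every .fm and .book file must sit at the top level of the manual.
    find(sub {
        return unless -f $_ && /\.(?:fm|book)\z/;
        push @problems, "misplaced source file: $File::Find::name"
            if $File::Find::dir ne $dir;
    }, $dir);

    return @problems;
}

# Usage: perl check_layout.pl <manual_dir> [<manual_dir> ...]
for my $manual (@ARGV) {
    print "$manual: $_\n" for check_manual_layout($manual);
}
```

Run against a synced copy of a manual directory such as command_ref, the script prints one line per problem and nothing when the layout conforms.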

The documentation administrator maintains the depot, adds manuals, and sets up the InfoHub support files. Each author can check out a manual, edit and write, then check in the manual when done. That is the entire flow from the Perforce perspective. So, a writer with minimal Perforce knowledge can be productive right away.

pubs4d Utility

The pubs4d utility is a Perl program that provides the glue for the automated system. It is a daemon program that runs continually: it periodically wakes up and looks for newly submitted manuals, calls the docgen and dmerge utilities, and performs some cleanup and sanity checks. The pubs4d utility is a wrapper; DOCGEN does the heavy lifting.

The pubs4d utility displays two xterm windows showing the log and the detailed log of the session. The log shows top-level information on the progress of the translation. The detailed log shows the output from the docgen and dmerge utilities. Authors also can display the log windows with a separate utility.

The following LOG1 transcript shows a translation of two manuals.
The utility sends the job to docgen, waits, and then 5 minutes later, the HTML/PDF targets are ready.

    Thursday, Oct 18, 2012 5:39 PM: Processing docgen job with 2 books.
    0.0   Syncing FM files in /zin/pubs/docs/build directory.
          zin10.1 quickstart_autocheck_user (10.1c_1)
          zin10.1 quickstart_cdc_user (10.1c_1)
    0.1   Sending jobs to docgen.
    5.2   -->zin10.1 quickstart_cdc_user....OK.
          Fixing HTML for /zin/pubs/docs/dev/zin10.1/htmldocs.
          Copying conversion reports to dev/zin10.1/LOG.
          Checking for Warnings.
          Found 2 warnings.
          Updating master build report.
    5.3   -->zin10.1 quickstart_autocheck_user....OK.
          Fixing HTML for /zin/pubs/docs/dev/zin10.1/htmldocs.
          Copying conversion reports to dev/zin10.1/LOG.
          Checking for Warnings.
          Found 1 warning.
          Updating master build report.

The double-xterm method of showing results is quite useful. It is done by calling the following Perl subroutine twice (once for each log):

    #---- Sub: display_xterm_log <geometry>, <title>, <logfile>
    sub display_xterm_log {
        my ($geometry, $title, $logfile) = @_;
        my $pid = fork;
        # Child process: run an xterm whose command is a small Perl tail loop.
        if (defined $pid && $pid == 0) {
            system 'xterm', '-sb', '-sl', 4000,
                '-geometry', $geometry, '-title', $title,
                '-e', '/usr/bin/perl', '-e',
                # \$ keeps the inner script's variables from being interpolated
                # here; only $logfile is filled in by the calling program.
                qq^\$ptr = 0;
                   while (1) {
                       \$| = 1;
                       sleep 3;
                       next unless open LOG, "$logfile";
                       seek LOG, \$ptr, 0;       # resume where the last pass stopped
                       while (<LOG>) { print }
                       \$ptr = tell LOG;
                       close LOG
                   }^;
            exit;
        }
        return;
    }

The pubs4d utility uses p4 filelog to find the manuals submitted since the last docgen job. A build directory and doc build client are used to handle the doc builds. Here is the Perl code:

    # Because the output is piped through tee, $? holds the exit status of tee,
    # so the pattern match on the captured output is what detects a failed sync.
    $_ = `p4 -c cshaw-build sync -p $depot/$dir/... 2>&1 |tee -a $log2`;
    if ($? != 0 or /cant sync/s) {
        print LOG1 "(Error: sync failed. Skipping....)\n";
        next MANUAL;
    }

After all changed manuals are synced, the utility calls docgen and waits:

    # Log the elapsed time in minutes, then submit the job.
    printf LOG1 "%4.1f\tSending jobs to docgen.", (time - $time)/60;
    $_ = `docgen -source $docgen_driver_file 2>&1 |tee -a $log2`;
    sleep 240;    #--- give docgen a chance to get started
    print LOG1 "\n";

As each manual completes, pubs4d copies the HTML, PDF, logs and reports to a dev directory and updates a Build Reports web page. After all docs in the job are processed, the utility checks to see if any other manuals have changed since the last job and processes them. Once all pending manuals are built, the utility runs dmerge to update changed InfoHubs.

    $_ = `dmerge $devdir/$infohub_type$version/htmldocs -add_global_elements 2>&1 |tee -a $log2`;

The utility also runs a checklinks subroutine that checks all hypertext links in the InfoHub's documents and verifies that their targets exist and that the links are well formed. This subroutine creates a report for each InfoHub. Here is an excerpt (for product releases, you want to resolve all hypertext link issues):

    zin10.1 InfoHub Links Report
    Topics: 9021
    XRef: 12590; Missing Targets: 11
    GoTo: 2878; Vacuous Links: 2;
    Missing Links: 237; Bad Targets: 49

    Ambiguous GoTo target: autocheck_user ==> CASE_DEFAULT from command_ref/tcl11.html
    Missing GoTo link: -togglecountlimit in questa_sim_ref/a_commands_vsim1.html
    Missing GoTo link: -t in questa_sim_ref/a_commands_p_vcn067.html
    . . .
    Missing GoTo target: autocheck_user ==> ONE_COLD from zeroin_rh/highlights3.html
    Missing GoTo target: autocheck_user ==> ONE_HOT from zeroin_rh/highlights3.html
    . . .
    Missing Xref target: a_functions157.html#CRefID59132" from questa_sim_fli/a_catalog1.html
    Missing Xref target: a_ver_plan16.html#CRefID32138 from questa_sim_vm/a_test_track3.html
    . . .
    Vacuous GoTo link: before in questa_sim_user/a_sdf_timing21.html
    Vacuous GoTo link: before -learn <r in questa_sim_ref/a_commands_vsim1.html

Tech Pubs Build Reports

A Tech Pubs Build Reports web page (see Figure 5) is the hub for information about the results of document generation. As pubs4d processes a manual, it updates this page with information about the manual's translation. The utility also copies ancillary reports generated by docgen to locations linked to on this web page.
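
The paper does not list the checklinks subroutine, so the following is only a sketch of the general idea behind the "Missing Xref target" entries in the report excerpt: collect the anchors every page defines, then resolve every local link against them. The function name find_missing_targets and the in-memory page map are illustrative assumptions, all pages are assumed to sit in one directory, and the regular expressions are a deliberate simplification of real HTML parsing. The GoTo, vacuous-link and ambiguous-target checks from the real report are not modeled.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the pubs4d checklinks code.
# %$pages maps a page name (for example "a.html") to its HTML text.
sub find_missing_targets {
    my ($pages) = @_;

    # Pass 1: collect the anchors each page defines (id="..." or name="...").
    my %anchors;
    for my $page (keys %$pages) {
        $anchors{$page} = {};
        while ($pages->{$page} =~ /\b(?:id|name)\s*=\s*"([^"]+)"/gi) {
            $anchors{$page}{$1} = 1;
        }
    }

    # Pass 2: resolve every local href against the collected pages and anchors.
    my @missing;
    for my $page (sort keys %$pages) {
        while ($pages->{$page} =~ /\bhref\s*=\s*"([^"]*)"/gi) {
            my $href = $1;
            next if $href =~ /^[a-z][a-z0-9+.-]*:/i;    # skip http:, mailto:, ...
            my ($file, $frag) = split /#/, $href, 2;
            $file = $page if !defined $file || $file eq '';    # same-page link
            my $ok = exists $pages->{$file}
                  && (!defined $frag || $frag eq '' || $anchors{$file}{$frag});
            push @missing, "Missing Xref target: $href from $page" unless $ok;
        }
    }
    return @missing;
}
```

A caller would slurp each generated HTML file into the map and print the returned lines. The message wording imitates the report excerpt for readability only.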

Figure 5: Tech Pubs build reports web page

A red (X) next to the depot name indicates the translation was unsuccessful. It links to a log returned by docgen.

    {DG} 13:21:29 >> DocGen job started on Mon Sep 10 13:21:...
    {DG} 13:21:29 >> DocGen Last Updated: Tue Aug 21 12:54:1...
    {DG} 13:21:29 >> DocGen Job Host: sofa
    {DG} 13:21:30 >> FrameMaker File: man.book
    {DG} 13:21:30 >> check_book_md5sum output:
    {DG} 13:21:32 >> WARNING: Handle quick_ref found in this
                     file does not match handle qstatic_rn found
                     in framemaker project.
    {DG} 13:21:33 >> Check Links status: w
    {DG} 13:21:36 >> Copying output files to destination:
                     sje:/zin/pubs/docs/dev/qstatic10.2
    ERROR: PDF file not generated: /wv/techpubs/docgen/jobs...
    {DG} 13:21:37 >> Copying output files to exact destination:
                     sje:/zin/pubs/docs/dev/qstatic10.2/LOG
    ERROR: PDF file not generated: /wv/techpubs/docgen/jobs...
    {DG} 13:21:37 >> Job will be skipped because FrameMaker
                     source has not changed since last DocGen
                     conversion.
    {DG} 13:21:37 >> DocGen job finished with exit code: 100

A red (X) before the conversion timestamp links to a page of HTML translation warnings. These are typically broken references that are also caught by checklinks. A green check (✓) indicates HTML translation had no warnings. The document handle (for example, command_ref) links to the HTML Conversion Report (see Figure 6) generated by docgen.
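
A log like the docgen excerpt above can be reduced to a pass/fail summary of the kind the Build Reports page displays. The sketch below is an assumption-laden illustration, not how pubs4d does it: summarize_docgen_log is a hypothetical name, and the three patterns (WARNING:, ERROR:, and "finished with exit code:") are inferred solely from the excerpt, so a real docgen log may need other patterns. The meaning of individual exit codes is not documented in this paper and is left uninterpreted.

```perl
use strict;
use warnings;

# Hypothetical helper: count warnings and errors in docgen log text and
# pull out the final exit code, using patterns inferred from the excerpt.
sub summarize_docgen_log {
    my ($text) = @_;
    my %summary = (warnings => 0, errors => 0, exit_code => undef);

    for my $line (split /\n/, $text) {
        $summary{warnings}++ if $line =~ /\bWARNING:/;
        $summary{errors}++   if $line =~ /\bERROR:/;
        $summary{exit_code} = $1
            if $line =~ /finished with exit code:\s*(\d+)/;
    }
    return \%summary;
}

# Usage: perl summarize_log.pl <docgen_log_file>
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $text = do { local $/; <$fh> };    # slurp the whole file
    close $fh;
    my $s = summarize_docgen_log($text);
    printf "warnings=%d errors=%d exit=%s\n",
        $s->{warnings}, $s->{errors}, $s->{exit_code} // 'unknown';
}
```

A wrapper could use the error count or a nonzero exit code to decide whether to mark a manual's entry as failed.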

Figure 6: HTML conversion report for a manual

Native-drawn FrameMaker graphics must follow strict rules to ensure HTML counterparts are rendered properly. These rules are defined by the Technical Publications Support team. However, figures still are prone to mistakes and corrupt rendering. A useful docgen report shows only the generated graphics images in the corresponding manual. The graphics link in the Build Report entry for the manual displays this report (see Figure 7) [3].

Figure 7: Graphics report for a manual

The Tech Pubs Build Reports page has a side bar that links to various InfoHub-related information and pubs4d session logs. Each entry in the InfoHubs section brings up its associated InfoHub. Authors can check how their edits appear in the documentation set in (virtual) real time. The Checklinks entries bring up the checklinks reports for the associated InfoHubs. The Build Logs are the LOG1 and LOG2 outputs of the pubs4d utility.

Author Work Flow

The work flow for authors is remarkably simple. We use the Perforce system (either command-line or P4V interface, or typically both). The author checks out a manual, performs edits, and checks the manual back in. The pubs4d daemon (with the help of docgen) performs the translations and InfoHub update. Meanwhile, the author monitors translation progress from an xterm log.

When translation and InfoHub generation are complete, the author checks the Tech Pubs Build Reports web page for errors and warnings. The author also can check the graphics report, bring up the associated InfoHub to check document edits, and look at the checklinks report to find and debug bad hypertext links.

Administrator Work Flow

The documentation system administrator performs the manual tasks. Surprisingly, these are minimal and uncomplicated. Aside from maintaining the build/release utilities (pubs4d and pubs4), the administrator handles adding and removing manuals and InfoHubs.

Creating a custom InfoHub is a Tech Pubs procedure and just entails copying an existing hub and modifying several files.
Work to insert the new InfoHub into the Documentation Generation System consists of putting the InfoHub into the dev area, updating the Build Reports page manually, and updating pubs4d to recognize the depots that contain manuals to process.

Adding a manual to the system consists of adding an entry for it in the Build Reports page and adding the initial document to the Perforce techpubs depot.

Adjustments can be made by authors as well as the administrator: .fm files and the graphics files are added to, or removed from, the depot. Our experience is that administrator intervention is only needed when major changes happen, such as rolling over the software for a major release (which happens once a year).

History

The DVT Tech Pubs Documentation Generation System has been in operation for about 6 years. In that time, it has evolved considerably. It originally ran the HTML and PDF translators separately and had loads of sanity-checking code. It only ran on old Solaris boxes. In addition to being painfully slow, it ran translations sequentially. An extensive document rebuild (for example, 8 large books) might take more than 10 hours to complete.

This was OK. We could wait overnight and hope the build performed without error. Even so, the system replaced the tedious manual task of running translators directly and comparing logs.

Since then, the Technical Publications Support team created docgen, which moved FM-to-HTML/PDF translation to a Solaris grid (of old but big machines). The interface is now Linux based and the translations are performed in parallel. The docgen utility even supports entry and target points at our various sites around the world.

Now, an extensive document rebuild of more than 8 manuals might take an hour if the manuals are huge and have many graphics and the grid has a lot of traffic. But typical usage is 5 to 15 minutes for the average multiple-manual translation.

Plans for the Future

The system is scalable. In addition to new writers, we are opening the system up to engineering authors.

For example, our Verification IP group creates testbench IP that exercises tests of IC buses and interfaces. Protocols for these interfaces are meticulous and arcane. Documentation for these products is detailed and tedious and constitutes thousands of pages. We are slowly setting up engineering authors and technical experts to author directly in our FrameMaker source files. They not only author topic sections, but they also add embedded comments for their writers to resolve. Since they invariably have prior knowledge of the Perforce system, the ramp up for authoring is quick. We plan to continue to roll out this methodology to other engineers.

Other Issues

Some issues are beyond the scope of this paper, including:

• Locks
  Our organizational setup precludes collisions, because writers tend to work on separate manuals and chapters. When collisions are possible (for example, when a writer works with an engineering author), the team members agree to lock files when they are checked out. Although not currently necessary, we might consider system locks with checkouts. With FrameMaker, DIFFs are more difficult than with text files.
  However, FrameMaker does have a document compare facility, which makes resolving document collisions a simple, albeit manual, process.

• Triggers
  Using a trigger to wake up the pubs4d daemon is probably the way to go. But, we opted for a daemon that sleeps and wakes periodically, for various innocuous reasons.

• Promotion
  Generating the InfoHub targets is only part of the process of delivering documentation to the development software location, which also gets built into the distribution software. Secondary processes shape the deliverable documentation targets. These are baked into a multipurpose Perl script we call pubs4.

  This script "promotes" the dev image to the release image. Along the way, it scans the HTML and creates support files for the GUIs so Help buttons on dialogs link properly to the corresponding topics in the documentation. Generated GUI files also include text extracted from tables in documents that are displayed by hover help. This process helps keep GUI prompts totally consistent with the documentation. For example, error messages from a Messages Reference expand the terse information returned by the software.

• Other facilities
  The pubs4 utility is indeed multipurpose. In addition to promotion, the script can be used to generate a final release version of the documentation for a product family. This image is ready to import to the award-winning Mentor Graphics SupportNet® customer support web site.

  As mentioned above, the pubs4 utility also displays the dual-xterm logs that authors use to monitor the progress of document generation.

Conclusion

The DVT Technical Publications group at Mentor Graphics Corporation developed a "wrapper" for the document generation facility supplied by the Technical Publications Support group. Eventually, since our engineering teams use Perforce, we incorporated the wrapper into a Perforce-based documentation source control methodology. This system has been in operation more than 6 years.

Along the way, the system has evolved. Old Solaris-based utilities were replaced by docgen, a cloud-based multiprocessing utility. The corresponding speedup was on the order of 10X. Plus, the sanity-checking code was replaced by much more sophisticated checking internal to the docgen utility.

Additional features include hypertext link checking and GUI data extraction.

Perforce is the backbone of the system. It offers a simple mechanism for checking documents out and in. Interfacing with Perforce and the document depots through Perl is easy and efficient.
The visual Perforce application (P4V) provides an elegant interface for writers and other authors to use as a cockpit for their documentation management tasks.

Off-loading usually-manual processes to an automated under-the-hood methodology frees our authors from the tedious process of preparing documentation targets for the distribution software. Instead of performing this process at the end of a long release cycle, we create a correct-by-construction documentation image ready to go at the "drop of a hat."

This image is also integrated into nightly builds so developers can see relevant portions of the documentation in "real time" rather than waiting for some end game to finish. This documentation image is metaphorically a "living document," which evolves with the software and reflects dynamic information such as comments from development and field engineers.

Such a system has freed our authors to do what we do best: write.

References

[1] Documentation Processes at Mentor Graphics, internal document, Mentor Graphics Corp. (2012).
[2] Perforce 2013.1 P4 User's Guide, Perforce Software (2013).
[3] Chris Shaw, Questa CDC User Guide V10.2, Mentor Graphics Corp. (2013).

Mentor Graphics® and Questa® are registered trademarks of Mentor Graphics Corporation. InfoHub™ is a trademark of Mentor Graphics Corporation. Adobe® and FrameMaker® are registered trademarks of Adobe Systems Inc. Perforce™ is a trademark of Perforce Software.