    transcoding.ppt Presentation Transcript

    • Subproject 4: HTML-WML Transcoding System Jia-Shung Wang Computer Science Department National Tsing Hua University March 27, 2001
    • Outline
      • Motivation and Issues
      • Examples of Transcoding
      • System Overview and Translation Flow
      • Some HTML to WML Conversion Strategies
    • Information Appliances
      • Different design constraints based on intended use, which enhances ease of use
        • Desktop PC
        • Mobile PC
        • Desktop “Smart” Phone
        • Mobile Telephone
        • Personal Digital Assistant
        • Set-top Box
        • Digital VCR
      • Implications:
        • Shift from computer design to consumer design
        • Heterogeneous “standards,” hybrid networking
        • Interactive networking, access on demand, QoS
    • Motivation
      • Rapidly growing diversity of wireless communication devices
      • Rapid growth in the number of HTML web pages available on the Internet
      • Mobile devices with WML browsers need a way to access the existing HTML or WML pages on the Internet.
    • Issues
      • Device-enabled service for WML mobile devices with different types of screens
      • Bandwidth-driven transmission for rapid response and fast delivery speed
      • The usage of browsing behavior
      • The resizing of images/icons
      • The compression of the resulting WML data
    • Demos of Transcoding
      • Contents from
          • enYES 鉅亨網
          • USAtoday
          • CS, NTHU
          • NTHU
          • VOD
    • Discussions
      • enYES provides two versions: regular HTML and WAP to serve PC users and mobile device users separately.
      • USAtoday also provides a simplified version of its content for Palm users.
      • NTHU and CS-NTHU homepages: if we keep the original figures in order to preserve the link information, the page layout becomes odd (viewed with the HTML browser Browse-It).
      • VOD homepage, one-column text: no significant difference after transcoding.
    • Usage of Browsing Behavior
      • Automatic translation is complicated by the diversity of content posted on an HTML page.
      • A universal conversion strategy that effectively translates every HTML page into a sequence of WML decks is unlikely to exist.
      • However, it seems a good idea to first classify the HTML page to be translated according to the browsing behavior it serves.
    • Usage of Browsing Behavior (cont’d)
      • Once the page is classified, we know what the client requires. A corresponding conversion can then extract the required content step by step and translate it into small WML documents of predictable size.
      • We believe that, after classification, adequate conversions exist for certain kinds of web pages.
    • Related Works: Transcoding Proxy of IBM alphaWorks
      • Its goal is to manage different versions of content with different fidelities and modalities in order to adapt delivery to different client devices.
    • Related Works: Intel Quick Web Technology
      • New software capability that helps Internet providers and digital distribution companies increase the delivery speed of Web pages containing photos, drawings and other graphics.
      • It uses two key techniques: compression and caching.
    • Related Works: Spyglass Prism
      • Spyglass Prism dynamically adapts Web content to match various non-PC devices.
      • It functions as a proxy server, caches the converted content, and dynamically converts standard HTML to WML.
    • Related Works: Proxy Architecture for Efficient Web Browsing over Cellular Networks
      • Decreases the access time of WWW browsing in a narrow-band wireless environment.
      • It adopts persistent connections and pipelining, based on a proxy architecture, to improve the HTTP exchange between the client and the proxy server.
    • Comparisons between HTML and WML
      • Both make use of tags and attributes.
      • Similar character set, syntax and data types.
      • Two special elements of WML structure
        • Deck and Card
      • Different design goal
        • HTML: To Publish hypertext on the World Wide Web
        • WML: For narrow network bandwidth devices with small displays, limited memory and fewer computational resources.
    • Examples of HTML and WML
      • WML
          <wml>
            <deck>
              <card>
                <p>
                  <do type="accept">
                    <go href="#card2"/>
                  </do>
                  This is the first card...
                </p>
              </card>
              <card id="card2">
                <p>
                  This is the second card.
                </p>
              </card>
            </deck>
          </wml>
      • HTML
          <html>
            <head>
              <title> Example page. </title>
            </head>
            <body>
              <h1> This is a headline. </h1>
              <p> This is a paragraph. </p>
            </body>
          </html>
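    • System Overview
      • Figure: the Client (WML browser, etc.) reaches the Translation Server over WAP or HTTP; the Translation Server hosts the HTML-WML Translator (HTML Parser, WML Generator) and reaches the Web Server over HTTP; the Web Server holds HTML and WML documents, multimedia content, CGI scripts, etc.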
    • Features
      • An HTML-WML Translator on the Translation Server
      • Both HTTP and WAP requests are acceptable.
      • Java Servlet API compatible
      • Server- and platform-independent
    • Translation Server: Components and Flow
      • Figure: requests and responses flow through the Network Protocol layer and the Proxy; the remaining blocks are HTML Parser, Filter, Document Analyzer, Decks & Cards, WML Generator, and Link Builder.
    • Components
      • Gateway
        • Accept requests from clients
        • Return appropriate responses
      • Proxy Servlet
        • Get the requested remote documents
        • Decide whether to pass the document through or convert it
        • Cache the converted results
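
The pass-or-convert decision and the result cache described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the Java servlet the slides describe. It assumes the decision is made from the client's `Accept` header and the origin's `Content-Type`. The names `needs_conversion`, `ProxyCache`, and the `convert` callback are hypothetical and do not come from the slides.

```python
from typing import Callable, Dict

WML_MIME = "text/vnd.wap.wml"  # MIME type used for WML decks


def needs_conversion(accept_header: str, content_type: str) -> bool:
    """Convert only when the client accepts WML and the origin returned HTML."""
    client_wants_wml = WML_MIME in accept_header.lower()
    origin_is_html = content_type.lower().startswith("text/html")
    return client_wants_wml and origin_is_html


class ProxyCache:
    """Caches converted results keyed by URL (no expiry; illustration only)."""

    def __init__(self, convert: Callable[[str], str]) -> None:
        self._convert = convert          # hypothetical HTML-to-WML converter
        self._store: Dict[str, str] = {}
        self.conversions = 0             # counts how often convert() really ran

    def get(self, url: str, html: str) -> str:
        if url not in self._store:
            self._store[url] = self._convert(html)
            self.conversions += 1
        return self._store[url]
```

A WML page requested by a WML client, or an HTML page requested by a desktop browser, is passed through unchanged. Only the HTML-for-WML-client case reaches the converter, and repeated requests for the same URL are served from the cache.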
    • Components (cont’d)
      • HTML Parser
        • Parse the HTML document into a parse tree
      • Document Analyzer
        • Analyze the parse tree
      • Filter
        • Filter out any objects that are unnecessary or not supported by the client device
        • Image/icon resizing
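
The parse and filter steps can be illustrated with a small sketch. It uses Python's standard `html.parser` instead of the Java HTML Tidy parser named later in the slides. The `UNSUPPORTED` set is an assumed example of what a WML client cannot render, and `Node`, `TreeBuilder`, `filter_tree`, `text_of`, and `parse` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List

# Assumed examples of elements a WML client cannot render; not from the slides.
UNSUPPORTED = {"script", "style", "applet", "object", "embed"}
# Elements without a closing tag; they never become the current parent.
VOID = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    def __init__(self, tag: str, text: str = "") -> None:
        self.tag = tag            # element name, or "#text" for text nodes
        self.text = text
        self.children: List["Node"] = []


class TreeBuilder(HTMLParser):
    """Builds a simple parse tree; assumes reasonably well-nested HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.root = Node("#document")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self._stack[-1].children.append(node)
        if tag not in VOID:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, if there is one.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i].tag == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self._stack[-1].children.append(Node("#text", data.strip()))


def parse(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def filter_tree(node: Node) -> Node:
    """Drops unsupported elements together with their subtrees, in place."""
    node.children = [filter_tree(c) for c in node.children
                     if c.tag not in UNSUPPORTED]
    return node


def text_of(node: Node) -> List[str]:
    """Collects the text nodes in document order."""
    if node.tag == "#text":
        return [node.text]
    return [t for c in node.children for t in text_of(c)]
```

Filtering on the tree rather than on the raw markup removes a whole unsupported subtree, such as a script and its body, in one step.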
    • Components (cont’d)
      • Content Divider
        • Split a document into multiple, small-size documents
      • Link Maker
        • Insert extra links to make small documents reach one another
      • WML Generator
        • Produce well-formed WML documents and return them to Proxy Servlet
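
The divider, link maker, and generator can be sketched together. This is a hedged illustration only. It assumes a greedy split by character count and a simple "Next" link between consecutive cards. `divide`, `generate_wml`, and the `c1`, `c2`, ... card ids are hypothetical, and the slides do not say how the real system chooses split points.

```python
from typing import List
from xml.sax.saxutils import escape

WML_HEADER = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" '
    '"http://www.wapforum.org/DTD/wml_1.1.xml">'
)


def divide(paragraphs: List[str], max_chars: int) -> List[List[str]]:
    """Content divider: greedily pack paragraphs into cards of limited size."""
    cards: List[List[str]] = []
    current: List[str] = []
    size = 0
    for para in paragraphs:
        # Start a new card when adding this paragraph would exceed the limit.
        if current and size + len(para) > max_chars:
            cards.append(current)
            current, size = [], 0
        current.append(para)
        size += len(para)
    if current:
        cards.append(current)
    return cards


def generate_wml(cards: List[List[str]]) -> str:
    """WML generator with link maker: one <card> per chunk, chained by links."""
    out = [WML_HEADER, "<wml>"]
    for i, paras in enumerate(cards, start=1):
        out.append(f'<card id="c{i}">')
        for para in paras:
            out.append(f"<p>{escape(para)}</p>")   # escape &, <, > in the text
        if i < len(cards):                          # link to the next card
            out.append(f'<p><a href="#c{i + 1}">Next</a></p>')
        out.append("</card>")
    out.append("</wml>")
    return "\n".join(out)
```

For example, `generate_wml(divide(["aaaa", "bbbb", "cc", "dddd"], max_chars=8))` yields two cards, the first ending in a link to `#c2`.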
    • HTML to WML Conversion Tools
      • Semi-automatic:
        • Used for rich HTML documents
        • The conversion form is designated manually with the help of analysis and editing tools.
        • The resulting forms are distributed to the gateway servers.
      • Automatic:
        • Used for simple documents, such as News and BBS, …
    • HTML to WML Conversion Strategies
      • Strategy I: Tables to Lists
        • Simply remove all layout elements such as tables
        • Arrange all the content into a single column of fixed width
      • Strategy II: One Table One Deck
        • Extract each table to form a deck
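
Strategies I and II can be contrasted on a toy data model. This is a sketch under assumptions: a document is represented as a list of tables, each table as a list of content items, and each content item becomes one card. The names `tables_to_lists` and `one_table_one_deck` are illustrative.

```python
from typing import List

Table = List[str]   # content items of one HTML table, in document order
Deck = List[str]    # cards of one WML deck (assumed: one card per item)


def tables_to_lists(tables: List[Table]) -> List[str]:
    """Strategy I: drop the table layout and keep a single column of content."""
    return [item for table in tables for item in table]


def one_table_one_deck(tables: List[Table]) -> List[Deck]:
    """Strategy II: each non-empty table becomes its own deck."""
    return [list(table) for table in tables if table]
```

With `tables = [["1_1", "1_2"], ["2_1"], []]`, Strategy I gives the flat list `["1_1", "1_2", "2_1"]`, while Strategy II gives the two decks `[["1_1", "1_2"], ["2_1"]]`.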
    • HTML to WML Conversion Strategies (cont’d)
      • Strategy III: Preview First
        • a. One Table One Deck
        • b. Collect the first card of every deck as a preview card
        • c. Arrange these preview cards to form a preview deck, which is transmitted first; every preview card has a link to its corresponding deck
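
Steps b and c of Strategy III can be sketched as follows, assuming the decks from step a are already available as lists of card texts. The function name `preview_first` and the use of a deck index as the "link" are hypothetical simplifications.

```python
from typing import List, Tuple

Deck = List[str]                 # card texts of one deck
PreviewCard = Tuple[str, int]    # (text of a deck's first card, index of that deck)


def preview_first(decks: List[Deck]) -> Tuple[List[PreviewCard], List[Deck]]:
    """Builds the preview deck that is sent first, plus the decks it links to."""
    non_empty = [deck for deck in decks if deck]
    # Each preview card carries the first card's text and a link (index) to its deck.
    preview = [(deck[0], i) for i, deck in enumerate(non_empty)]
    return preview, non_empty
```

The client receives `preview` first and fetches `non_empty[i]` only when the user follows the link on the corresponding preview card.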
    • Original Document
      • Figure: a <document> tree containing three <table> elements and four sections: section 1 (content 1_1, 1_2), section 2 (content 2_1 to 2_5), section 3 (content 3_1 to 3_7), and section 4 (content 4_1).
    • Tables to Lists
      • Figure: the same content items (1_1 through 4_1) regrouped under <deck> nodes of the <document>.
    • One Table One Deck
      • Figure: the same content items (1_1 through 4_1) regrouped under <deck> nodes of the <document>.
    • Preview First
      • Figure: the same content items (1_1 through 4_1) regrouped under <deck> nodes of the <document>.
    • Strategy Evaluation
      • Assume the document has S sections and is translated into N WML cards.
      • Every deck contains at most C cards.
      • Assume that the contents of the same table are similar.
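    • Evaluation of Searching After Translation
      • Tables to Lists: user friendliness worst; average deck access time N/2
      • One Table One Deck: user friendliness best; average deck access time S/2
      • Preview First: user friendliness good; average deck access time S/2C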
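
The access-time entries can be evaluated for concrete values. The sketch below reads the table as assigning N/2 to Tables to Lists, S/2 to One Table One Deck, and S/2C to Preview First, and interprets S/2C as S/(2·C). Both readings are assumptions about the slide, and the sample numbers in the usage note are made up for illustration.

```python
def average_deck_access_time(strategy: str, S: int, N: int, C: int) -> float:
    """Average deck access time for S sections, N cards, at most C cards per deck."""
    formulas = {
        "tables_to_lists": N / 2,         # linear scan over all N cards
        "one_table_one_deck": S / 2,      # linear scan over S decks
        "preview_first": S / (2 * C),     # assumed reading of "S/2C"
    }
    return formulas[strategy]
```

With the hypothetical values S = 12, N = 60, C = 4, the three strategies give 30.0, 6.0, and 1.5 respectively.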
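    • Performance Evaluation
      • Sizes are in bytes. The "Reduction" percentages equal the WML deck size divided by the HTML page size (headers plus text source, with or without images).
      • Experiment #1: HTML page: headers 8,325, text source 21,203, images 280,727; WML decks 16,891; reduction 5.4% with images, 57.2% without images
      • Experiment #2: HTML page: headers 6,137, text source 17,937, images 126,740; WML decks 11,232; reduction 7.4% with images, 46.7% without images
      • Experiment #3: HTML page: headers 9,471, text source 24,359, images 176,361; WML decks 7,440; reduction 3.5% with images, 22.0% without images
      • Experiment #4: HTML page: headers 20,363, text source 9,568, images 17,966; WML decks 12,062; reduction 25.2% with images, 40.3% without images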
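
The percentages in this table can be reproduced from the byte counts. The sketch below assumes each percentage is the WML deck size divided by the HTML page size, which matches all four experiments. The function name `size_ratios` is hypothetical, and which of the two non-image values is "headers" and which is "text source" does not affect the result, since only their sum is used.

```python
from typing import Tuple


def size_ratios(wml: int, images: int, headers: int, text: int) -> Tuple[float, float]:
    """Returns (percent with images, percent without images), rounded to 0.1."""
    without_images = wml / (headers + text)
    with_images = wml / (images + headers + text)
    return round(100 * with_images, 1), round(100 * without_images, 1)


# Experiment #1 from the table: 16,891 bytes of WML against the HTML page.
print(size_ratios(wml=16_891, images=280_727, headers=8_325, text=21_203))
```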
    • Performance Evaluation (Experiment #1: What’s WAP)
      • Figure: deck hierarchies (Preview, Deck 1, Deck 2, Deck 3, Deck 3.1, Deck 3.2) for the pages What’s WAP and WAP Forum.
    • Performance Evaluation (Experiment #2: NTHU Web Page)
      • Figure: deck hierarchies (Preview, Deck 1, Deck 2.1, Deck 2.2, Deck 3.1, Deck 3.2) for the pages NTHU, Current Status, History, and About NTHU.
    • Performance Evaluation (Experiment #3: NTHU CS Web Page)
      • Figure: deck hierarchies (Preview, Deck 1, Deck 3.1 to Deck 3.6) for the pages NTHU CS and Faculty.
    • Performance Evaluation (Experiment #4: IETF Web Page)
      • Figure: deck hierarchies (Preview, Deck 1, Deck 2.1 to Deck 2.5) for the pages IETF, Internet-Drafts, Internet-Drafts Index, and DNSOP.
    • Implementation
      • Goal: Portability, reusability, and crash protection.
      • Translation server: runs in a Java environment using Java Servlet, Java HTML Tidy, and XML Parser for Java.
      • Servlet-enabled servers: Avenida Web Server and Nokia WAP Server
      • Microsoft Windows NT Workstation 4.0 with Service Pack 5
    • Summary
      • Designed an HTML-to-WML transcoding system with:
        • Analysis and filtering of HTML content
        • Image/icon resizing
        • WML browsing mode design and a WML conversion tool
        • Compression and decompression modules for the WML data
        • WML transmission control