Subproject 4: HTML-WML Transcoding System

1
Subproject 4 HTML-WML Transcoding System
  • Jia-Shung Wang
  • Computer Science Department
  • National Tsing Hua University
  • March 27, 2001

2
Outline
  • Motivation and Issues
  • Examples of Transcoding
  • System Overview and Translation Flow
  • Some HTML to WML Conversion Strategies

3
Information Appliances
  • Different design constraints based on intended
    use, which enhances ease of use
      • Desktop PC
      • Mobile PC
      • Desktop Smart Phone
      • Mobile Telephone
      • Personal Digital Assistant
      • Set-top Box
      • Digital VCR
  • Implications
      • Shift from computer design to consumer design
      • Heterogeneous standards, hybrid networking
      • Interactive networking, access on demand, QoS

4
Motivation
  • Rapidly growing diversity of wireless
    communication devices
  • Rapid growth in the number of HTML web pages
    available on the Internet
  • Solutions are needed for mobile devices with WML
    browsers to access the existing HTML or WML pages
    on the Internet.

5
Issues
  • Device-enabled service for WML mobile devices
    with different types of screen
  • Bandwidth-driven transmission for rapid response
    and fast delivery speed
  • The usage of browsing behavior
  • The resizing of images/icons
  • The compression of the resulting WML data

6
Demos of Transcoding
  • Contents from
      • enYES
      • USAtoday
      • CS, NTHU
      • NTHU
      • VOD

7
Discussions
  • enYES provides two versions, regular HTML and
    WAP, to serve PC users and mobile device users
    separately.
  • USAtoday also provides content (a simplified
    version) for Palm users.
  • NTHU and CS-NTHU homepages: if we keep the
    original figure to preserve the link information,
    the page layout becomes odd (using an HTML
    browser with Browse-It).
  • VOD homepage: one-column text, no significant
    difference after transcoding.

24
Usage of Browsing Behavior
  • Automatic translation is complicated by the
    diversity of content posted on an HTML page.
  • A universal conversion strategy that translates
    every HTML page into sequences of WML decks
    effectively is unlikely to exist.
  • However, it seems a good idea to first classify
    the HTML page to be translated according to
    browsing behavior.

25
Usage of Browsing Behavior (contd)
  • After classification we can tell what the client
    requires. We can then apply a corresponding
    conversion that extracts the required content
    step by step and translates it into predictable,
    small-sized WML documents.
  • We believe that, after classification, adequate
    conversions exist for some kinds of web pages.

26
Related Works: Transcoding Proxy of IBM alphaWorks
  • Its goal is to manage different versions of
    content with different fidelities and modalities
    in order to adapt delivery to different client
    devices.

28
Related Works: Intel Quick Web Technology
  • New software capability that helps Internet
    providers and digital distribution companies
    increase the delivery speed of Web pages
    containing photos, drawings and other graphics.
  • It uses two key techniques: compression and
    caching.

30
Related Works: Spyglass Prism
  • Spyglass Prism dynamically adapts Web content to
    match various non-PC devices.
  • It functions as a proxy server, caches the
    converted content, and dynamically converts
    standard HTML to WML.

31
Related Works: Proxy Architecture for Efficient
Web Browsing over Cellular Networks
  • Decreases the access time of browsing the WWW in
    a narrow-band wireless environment.
  • It adopts persistent connections and pipelining,
    based on a proxy architecture, to improve the
    HTTP exchange between the client and the proxy
    server.

33
Comparisons between HTML and WML
  • Both make use of tags and attributes.
  • Similar character set, syntax and data types.
  • Two special elements of WML structure: Deck and
    Card
  • Different design goals
      • HTML: to publish hypertext on the World Wide
        Web
      • WML: for narrow network bandwidth devices with
        small displays, limited memory and fewer
        computational resources.

34
Examples of HTML and WML
WML:
    <wml>
      <!-- The <wml> element is the deck; it holds the cards. -->
      <card>
        <p>
          <do type="accept">
            <go href="#card2"/>
          </do>
          This is the first card...
        </p>
      </card>
      <card id="card2">
        <p>
          This is the second card.
        </p>
      </card>
    </wml>
HTML:
    <html>
      <head>
        <title>Example page.</title>
      </head>
      <body>
        <h1>This is a headline.</h1>
        <p>This is a paragraph.</p>
      </body>
    </html>
35
System Overview
[System overview diagram. Components: Web Server (HTML and WML
documents, multimedia content, etc.), Translation Server (HTML
Parser, HTML-WML Translator, WML Generator), WML Browser.
Link labels: HTTP, WAP, WML.]
36
Features
  • An HTML-WML Translator on the Translation Server
  • Both HTTP and WAP requests are acceptable.
  • Java Servlet API compatible
  • Server- and platform-independent

37
Translation Server Components and Flow
[Flow diagram. Labels: Network Protocol, Proxy, Request,
Response, HTML Parser, Document Analyzer, Filter, Link Builder,
WML Generator, Decks and Cards.]
38
Components
  • Gateway
      • Accept requests from clients
      • Return appropriate responses
  • Proxy Servlet
      • Get the requested remote documents
      • Decide whether to pass or convert
      • Cache the converted results
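
The Proxy Servlet's pass-or-convert decision and its cache can be sketched as below. This is a minimal illustration, not the system's actual servlet code: the function name handle_response, the dictionary cache and the convert callback are all hypothetical, and it assumes that only text/html responses are converted while everything else (including text/vnd.wap.wml) is passed through.

```python
from typing import Callable, Dict


def handle_response(url: str, content_type: str, body: str,
                    cache: Dict[str, str],
                    convert: Callable[[str], str]) -> str:
    """Return the body to send to the client for a fetched remote document."""
    # Pass: WML (text/vnd.wap.wml) and any other non-HTML type go through as-is.
    if content_type != "text/html":
        return body
    # Convert: translate HTML once per URL, then reuse the cached result.
    if url not in cache:
        cache[url] = convert(body)
    return cache[url]
```

A real proxy would also need cache expiry and per-device cache keys; both are left out of this sketch.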

39
Components (contd)
  • HTML Parser
      • Parse the HTML document into a parse tree
  • Document Analyzer
      • Analyze the parse tree
  • Filter
      • Filter out any objects that are unnecessary or
        not supported by the client device
      • Image/icon resizing
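
As a rough illustration of the parse and filter steps, the sketch below builds a small parse tree with Python's standard html.parser module and drops subtrees the client cannot use. The Node class, the helper names and the UNSUPPORTED tag set are assumptions made for this example; the actual system uses Java HTML Tidy, and the real filter list depends on the client device.

```python
from html.parser import HTMLParser

# Assumption: tags a WML client cannot use; the real list is device-specific.
UNSUPPORTED = {"script", "style", "object", "applet"}
# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class Node:
    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []  # Node objects or text strings


class TreeBuilder(HTMLParser):
    """Builds a simple parse tree from reasonably well-formed HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(data.strip())


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def filter_tree(node):
    """Drop unsupported subtrees in place and return the node."""
    node.children = [
        filter_tree(c) if isinstance(c, Node) else c
        for c in node.children
        if not (isinstance(c, Node) and c.tag in UNSUPPORTED)
    ]
    return node


def text_of(node):
    """Collect the text of a subtree in document order."""
    parts = []
    for c in node.children:
        parts.extend(text_of(c) if isinstance(c, Node) else [c])
    return parts
```

Image/icon resizing is not covered here; it would be a separate step applied to the img nodes that survive the filter.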

40
Components (contd)
  • Content Divider
      • Split a document into multiple small
        documents
  • Link Maker
      • Insert extra links so that the small documents
        can reach one another
  • WML Generator
      • Produce well-formed WML documents and return
        them to the Proxy Servlet
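
The divide, link and generate steps can be sketched together as follows. This is an assumption-laden simplification: it splits by character count, puts the pieces into cards of a single deck rather than separate documents, and uses a plain "Next" link; the function names divide and generate_wml are hypothetical. Building the output with xml.etree.ElementTree keeps it well-formed, although the XML prolog and WML DOCTYPE are omitted.

```python
import xml.etree.ElementTree as ET


def divide(paragraphs, max_chars):
    """Greedily group paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for p in paragraphs:
        if current and size + len(p) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(p)
        size += len(p)
    if current:
        chunks.append(current)
    return chunks


def generate_wml(chunks):
    """Emit one deck with one card per chunk, linking each card to the next."""
    wml = ET.Element("wml")
    for i, chunk in enumerate(chunks):
        card = ET.SubElement(wml, "card", id=f"c{i}")
        for text in chunk:
            ET.SubElement(card, "p").text = text
        if i + 1 < len(chunks):
            # The extra link that lets the small pieces reach one another.
            p = ET.SubElement(card, "p")
            ET.SubElement(p, "a", href=f"#c{i + 1}").text = "Next"
    return ET.tostring(wml, encoding="unicode")
```

For example, generate_wml(divide(paragraphs, 500)) would keep each card near 500 characters of text; the right limit depends on the device's deck size limit, which this sketch does not model.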

41
HTML to WML Conversion Tools
  • Semi-automatic
      • Used for rich HTML documents
      • The conversion form is designated manually
        with the help of analysis and editing tools.
      • The resulting forms are distributed to the
        gateway servers.
  • Automatic
      • Used for simple documents, such as News and
        BBS.

42
HTML to WML Conversion Strategies
  • Strategy I: Tables to Lists
      • Simply remove all layout elements such as
        tables
      • Arrange all the content into a single column
        with a fixed width
  • Strategy II: One Table One Deck
      • Extract each table to form a deck
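
To make the difference between the first two strategies concrete, here is a small sketch on a toy document model. The model is an assumption made for illustration: a document is a Python list whose items are either content strings or nested lists standing for tables. How content outside any table is grouped under Strategy II is not stated on the slide; this sketch gathers it into decks of its own.

```python
def tables_to_lists(doc):
    """Strategy I: drop table structure, keep every content item in one column."""
    flat = []
    for item in doc:
        if isinstance(item, list):  # a table: recurse into its cells
            flat.extend(tables_to_lists(item))
        else:
            flat.append(item)
    return flat


def one_table_one_deck(doc):
    """Strategy II: each top-level table becomes its own deck.

    Assumption: content outside any table is gathered, in order, into
    decks of its own.
    """
    decks, loose = [], []
    for item in doc:
        if isinstance(item, list):
            if loose:
                decks.append(loose)
                loose = []
            decks.append(tables_to_lists(item))
        else:
            loose.append(item)
    if loose:
        decks.append(loose)
    return decks
```

With doc = ["1_1", ["2_1", "2_2"], "4_1"], Strategy I yields one flat list of three items, while Strategy II yields three decks, one of them holding the table's two items.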

43
HTML to WML Conversion Strategies (contd)
  • Strategy III: Preview First
      • a. Apply One Table One Deck
      • b. Collect the first card of every deck as a
        preview card
      • c. Arrange these preview cards to form a
        preview deck, which is transmitted first;
        every preview card has a link to its
        corresponding deck
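
Steps b and c can be sketched as below, assuming step a has already produced the decks. The representation is hypothetical: decks are a dict mapping a deck name to its list of cards, and a preview card is a (first card, target deck name) pair standing in for a card plus its link.

```python
def preview_first(decks):
    """Build the preview deck from the first card of every non-empty deck.

    Each entry pairs the preview card with the name of the deck it links to.
    """
    return [(cards[0], name) for name, cards in decks.items() if cards]


def transmission_order(decks):
    """The preview deck is sent first; the full decks follow on demand."""
    return ["preview"] + [name for name, cards in decks.items() if cards]
```

The client would receive only the preview deck up front and fetch a full deck when the user follows the link on its preview card.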

44
Original Document
[Tree diagram: a <document> node with <section 1> to
<section 4> and three <table> nodes. Leaf nodes: <content 1_1>,
<content 1_2>, <content 2_1> to <content 2_5>, <content 3_1> to
<content 3_7>, <content 4_1>.]
45
Tables to Lists
[Tree diagram: a <document> node with three <deck> nodes. Leaf
nodes, in the order shown: <content 1_1>, <content 1_2>,
<content 2_1> to <content 2_5>, <content 3_1> to <content 3_7>,
<content 4_1>.]
46
One Table One Deck
[Tree diagram: a <document> node with five <deck> nodes. Leaf
nodes: <content 1_1>, <content 1_2>, <content 2_1> to
<content 2_5>, <content 3_1> to <content 3_7>, <content 4_1>.]
47
Preview First
[Tree diagram: a <document> node with five <deck> nodes. Leaf
nodes: <content 1_1>, <content 1_2>, <content 2_1> to
<content 2_5>, <content 3_1> to <content 3_7>, <content 4_1>.]
48
Strategy Evaluation
  • Assuming we have S sections in a document and the
    document is translated to N WML cards.
  • Every deck contains at most C cards.
  • Assuming that the contents in the same tables are
    similar.

49
Evaluation of Searching After Translation
                            Preview First   One Table One Deck   Tables to Lists
User Friendly               Good            Best                 Worst
Average Deck Access Time    S/2C            S/2                  N/2
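
The access-time entries can be turned into a small worked example. The sketch below assumes that S/2C means S divided by 2C, and that the time unit is "decks or cards visited on average"; neither is spelled out on the slide. The strategy keys and the sample values S = 8, N = 40, C = 4 are made up for illustration.

```python
def average_deck_access_time(strategy, S, N, C):
    """Average access time per strategy, using the slide's formulas.

    S: sections in the document, N: WML cards after translation,
    C: maximum cards per deck. Assumes S/2C means S / (2 * C).
    """
    if strategy == "tables_to_lists":
        return N / 2
    if strategy == "one_table_one_deck":
        return S / 2
    if strategy == "preview_first":
        return S / (2 * C)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical document: 8 sections, 40 cards, at most 4 cards per deck.
for name in ("tables_to_lists", "one_table_one_deck", "preview_first"):
    print(name, average_deck_access_time(name, S=8, N=40, C=4))
```

Under these sample values the three strategies give 20.0, 4.0 and 1.0 respectively, which matches the ordering the slide implies.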
50
Performance Evaluation
                HTML Pages                            WML Decks   Reduction (%)
                Source (bytes)       Images (bytes)   (bytes)     With      Without
                Headers    Text                                   Images    Images
Experiment 1    9,471      24,359    176,361          7,440       3.5       22.0
Experiment 2    6,137      17,937    126,740          11,232      7.4       46.7
Experiment 3    8,325      21,203    280,727          16,891      5.4       57.2
Experiment 4    20,363     9,568     17,966           12,062      25.2      40.3
Reduction is the WML deck size as a percentage of the HTML
page size, with or without the image bytes counted.
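
The reduction figures on this slide can be reproduced with a short calculation. The sketch assumes the reading that matches the reported numbers: each row gives two source byte counts (headers and text), one image byte count and one WML deck byte count, and the reduction is the WML size as a percentage of the HTML size. The function name and the row layout are choices made for this example.

```python
def reduction_percent(wml_bytes, source_bytes, image_bytes=0):
    """WML deck size as a percentage of the HTML page size, to one decimal."""
    return round(100 * wml_bytes / (source_bytes + image_bytes), 1)


# (headers, text, images, wml) in bytes, as read from the slide.
experiments = {
    "Experiment 1": (9471, 24359, 176361, 7440),
    "Experiment 2": (6137, 17937, 126740, 11232),
    "Experiment 3": (8325, 21203, 280727, 16891),
    "Experiment 4": (20363, 9568, 17966, 12062),
}

for name, (headers, text, images, wml) in experiments.items():
    source = headers + text
    with_images = reduction_percent(wml, source, images)
    without_images = reduction_percent(wml, source)
    print(f"{name}: {with_images}% with images, {without_images}% without")
```

For Experiment 1 this gives 7,440 / (9,471 + 24,359 + 176,361) = 3.5% with images and 7,440 / (9,471 + 24,359) = 22.0% without, in agreement with the slide.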
51
Performance Evaluation (Experiment 1, What's WAP)
[Diagram of the resulting deck structure. Page labels: What's
WAP, WAP Forum. Deck labels: Preview, Deck 1, Deck 2, Deck 3,
Deck 3.1, Deck 3.2.]
52
Performance Evaluation (Experiment 2, NTHU Web Page)
[Diagram of the resulting deck structure. Page labels: NTHU,
About NTHU, History, Current Status. Deck labels: Preview,
Deck 1, Deck 2.1, Deck 2.2, Deck 3.1, Deck 3.2.]
53
Performance Evaluation (Experiment 3, NTHU CS Web Page)
[Diagram of the resulting deck structure. Page labels: NTHU CS,
Faculty. Deck labels: Preview, Deck 1, Deck 3.1 to Deck 3.6.]
54
Performance Evaluation (Experiment 4, IETF Web Page)
[Diagram of the resulting deck structure. Page labels: IETF,
Internet-Drafts, Internet-Drafts Index, DNSOP. Deck labels:
Preview, Deck 1, Deck 2.1 to Deck 2.5.]
55
Implementation
  • Goal: portability, reusability, and crash
    protection.
  • Translation server runs in a Java environment
    with Java Servlet, Java HTML Tidy, and XML Parser
    for Java.
  • Servlet-enabled servers: Avenida Web Server and
    Nokia WAP Server
  • Microsoft Windows NT Workstation 4.0 with Service
    Pack 5

56
Summary
  • Design an HTML to WML transcoding system with
      • Analyzing and filtering of HTML content
      • Image/icon resizing
      • WML browsing mode design and WML conversion
        tool
      • Compression and decompression modules for the
        WML data
      • WML transmission control