Master2-Library and Information Sciences: HTML versus XML- Dr. Jawad Makki

Saturday, April 13, 2013

HTML versus XML- Dr. Jawad Makki

HTML versus XML: Similarities

Both use tags (e.g. <h2> and </year>)
Tags may be nested (tags within tags)
Human users can read and interpret both HTML and XML representations quite easily

… But how about machines?

Problems with Automated Interpretation of HTML Documents

Consider an intelligent agent trying to retrieve the names of the authors of the book:

The authors’ names could appear immediately after the title
or immediately after the word “by”
Are there two authors?
Or just one, called “V. Marek and M. Truszczynski”?
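
The ambiguity can be sketched in a few lines of Python. This is only an illustration: the HTML snippet, the book title, and the helper names TextExtractor and guess_authors are assumptions made for the example; only the author string comes from the slide above.

```python
from html.parser import HTMLParser

# Hypothetical HTML for the book; the markup and the title are assumptions.
HTML_PAGE = """
<h2>Example Book Title</h2>
<i>by <b>V. Marek and M. Truszczynski</b></i><br>
"""


class TextExtractor(HTMLParser):
    """Collects the visible text pieces of a page and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def guess_authors(html):
    """Heuristic: the authors are whatever text follows the word 'by'."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    chunks = parser.chunks
    for i, chunk in enumerate(chunks[:-1]):
        if chunk.lower() == "by":
            return [chunks[i + 1]]
    return []


print(guess_authors(HTML_PAGE))
```

The heuristic returns a single string, so the program cannot tell whether it found one author or two. Nothing in the HTML markup says where one name ends and the next begins.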

HTML vs XML: Structural Information

HTML documents do not contain structural information, that is, information about the pieces of the document and their relationships.
XML is more easily accessible to machines because:

Every piece of information is described.
Relations are also defined through the nesting structure.
E.g., the <author> tags appear within the <book> tags, so they describe properties of the particular book.

A machine processing the XML document would be able to deduce that

the author element refers to the enclosing book element
by using the nesting structure rather than proximity considerations
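
A minimal sketch of this deduction, using Python's standard xml.etree.ElementTree module. The XML document, the element names library and title, the titles, and the third author name are assumptions made for the example; the slides only mention the book, author and year elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML data: two books, each with its own nested authors.
LIBRARY_XML = """
<library>
  <book>
    <title>First Example Title</title>
    <author>V. Marek</author>
    <author>M. Truszczynski</author>
  </book>
  <book>
    <title>Second Example Title</title>
    <author>A. N. Other</author>
  </book>
</library>
""".strip()


def authors_by_title(xml_text):
    """Map each book title to the authors nested inside that book element."""
    root = ET.fromstring(xml_text)
    result = {}
    for book in root.findall("book"):
        title = book.findtext("title", default="")
        # Only author elements enclosed by this book element are considered.
        result[title] = [a.text for a in book.findall("author")]
    return result


print(authors_by_title(LIBRARY_XML))
```

Each author is attached to a book because of where its element sits in the tree, not because of how close the name is to the title in the text. The number of authors is also explicit: one author element per author.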

XML allows the definition of constraints on values

E.g., a year must be a four-digit number
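
In XML such constraints are normally declared in a schema language (for example XML Schema) and checked by a validating parser. Python's standard library does not include a schema validator, so the sketch below hand-codes the four-digit rule instead, purely to show what the constraint means. The function name invalid_years and the sample documents are assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# The constraint from the slide: a year is exactly four digits.
FOUR_DIGITS = re.compile(r"[0-9]{4}")


def invalid_years(xml_text):
    """Return the content of every year element that breaks the rule."""
    root = ET.fromstring(xml_text)
    bad = []
    for year in root.iter("year"):
        value = year.text or ""
        if not FOUR_DIGITS.fullmatch(value):
            bad.append(value)
    return bad


print(invalid_years("<book><year>1993</year></book>"))
print(invalid_years("<book><year>93</year></book>"))
```

The first document satisfies the constraint and yields an empty list; the second is reported because 93 has only two digits.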

HTML vs XML: Formatting

1. The HTML representation provides more than the XML representation:

The formatting of the document is also described

2. The main use of an HTML document is to display information: it must define formatting

3. XML: separation of content from display

The same information can be displayed in different ways
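
A minimal sketch of this separation: one XML description of a book is rendered in two different ways. In practice this role is usually played by stylesheets; the two Python functions render_html and render_text, the XML snippet, and the chosen layouts are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML content: it says what the data is, not how it looks.
BOOK_XML = (
    "<book>"
    "<title>Example Book Title</title>"
    "<author>V. Marek</author>"
    "<author>M. Truszczynski</author>"
    "</book>"
)


def render_html(book):
    """Display 1: an HTML fragment with a heading and emphasised authors."""
    title = escape(book.findtext("title", default=""))
    authors = ", ".join(escape(a.text or "") for a in book.findall("author"))
    return f"<h2>{title}</h2>\n<i>by <b>{authors}</b></i>"


def render_text(book):
    """Display 2: plain text with an upper-case title line."""
    title = book.findtext("title", default="")
    authors = " and ".join(a.text or "" for a in book.findall("author"))
    return f"{title.upper()}\n{authors}"


book = ET.fromstring(BOOK_XML)
print(render_html(book))
print(render_text(book))
```

The XML content is written once; changing the presentation only means changing or adding a rendering function, not touching the data.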
