
XML Mini-Tutorial

Michael I. Schwartzbach
Copyright © 2000 BRICS, University of Aarhus
http://www.brics.dk/~mis/ITU/XML/

Contents:
- What is XML?
- HTML vs. XML
- A conceptual view of XML
- A concrete view of XML
- Applications of XML
- XML technologies
- Namespaces
- The recipe example
- Schema languages
- A schema for recipes
- XLink, XPointer, and XPath
- Pointing at recipes
- XML-QL
- Querying the recipes
- XSLT
- A style sheet for recipes
- Exercises

This mini-tutorial is one of the HTML, JavaScript, and XML Mini-Tutorials (http://www.brics.dk/~mis/ITU/), which are created as part of the course Internet Programming at the IT-University of Copenhagen. Each of the three tutorials (HTML, JavaScript, XML) is also available as PDF.

What is XML?

XML is a framework for defining markup languages:
- there is no fixed collection of markup tags;
- each XML language is targeted at a different application domain;
- the languages will share many features;
- there is a common set of tools for processing such languages.

XML is not a replacement for HTML:
- HTML should ideally be just another XML language;
- in fact, XHTML is just that;
- XHTML is a (very popular) XML language for hypertext markup.

XML is designed to:
- separate syntax from semantics;
- support internationalization (Unicode) and platform independence;
- be the future of structured information, including databases.

HTML vs. XML

Consider the following recipe collection published in HTML:

    Rhubarb Cobbler
    [email protected]
    Wed, 14 Jun 95

    Rhubarb Cobbler made with bananas as the main sweetener. It was delicious.
    Basically it was

    2 1/2 cups diced rhubarb (blanched with boiling water, drain)
    2 tablespoons sugar
    2 fairly ripe bananas sliced 1/4" round
    1/4 teaspoon cinnamon
    dash of nutmeg

    Combine all and use as cobbler, pie, or crisp.

    Related recipes: Garden Quiche

There are many problems with this approach:
- the semantics is encoded into text formatting tags;
- there is no means of checking that a recipe is encoded correctly;
- it is difficult to change the layout of recipes (CSS is not enough).

It would be much better to invent a special recipe markup language:

    Rhubarb Cobbler
    [email protected]
    Wed, 14 Jun 95
    Rhubarb Cobbler made with bananas as the main sweetener. It was delicious.
    ...
    Combine all and use as cobbler, pie, or crisp.
    Garden Quiche

This example illustrates:
- the markup tags are chosen purely for logical structure;
- this is just one choice of markup detail level;
- we need a kind of "grammar" for XML recipe collections;
- we need a stylesheet to define presentation semantics.

A conceptual view of XML

An XML document is a labeled tree.
- a leaf node is
    - character data (a text string): the actual data,
    - a processing instruction: annotations for various processors, typically in the document header,
    - a comment: never any semantics attached,
    - an entity declaration: simple macros.
- an internal node is an element, which is labeled with
    - a name, and
    - a set of attributes, each consisting of a name and a value.

Often, comments and entity declarations are not explicitly represented in the tree.

A concrete view of XML

An XML document is a (Unicode) text with markup tags and other meta-information.

Markup tags denote elements:

    <foo attr="val" ...>.........</foo>

    <foo attr="val" ...>     an element start tag with name foo
    attr="val"               an attribute with name attr and value val; values are enclosed by ' or "
    .........                the contents of the element
    </foo>                   a matching element end tag

There is a short-hand notation for empty elements:

    <foo attr="val" .../>

Note: XML is case sensitive!

An XML document must be well-formed:
- start and end tags must match;
- element tags must be properly nested;
- and some more subtle syntactical requirements.

Special characters can be escaped using Unicode character references:
- &#38; yields &;
- &#60; and &#x3C; both yield <.

CDATA sections, written <![CDATA[ ... ]]>, are an alternative to escaping many characters. The strange syntax is a legacy from SGML.

Applications of XML

There are already hundreds of serious applications of XML.

XHTML
    W3C's XMLization of HTML 4.0. Example XHTML document (text content only): Hello world! foobar

CML
    Chemical Markup Language. Example CML document snippet (text content only): C O H H H H -0.748 0.558 -1.293 -1.263 -0.699 0.716

WML
    Wireless Markup Language for WAP services (text content only): Hello World

There is a long list of many other XML applications.

XML technologies

Just a notation for trees is not enough:
- the real force of XML is generic languages and tools!

The XML vision offers:
- namespaces - to avoid name clashes when a document uses several "sub-languages";
- schemas - grammars to define classes of documents;
- linking between documents - a generalization of HTML anchors and links;
- addressing parts of documents - it is not enough that only the author can place anchors;
- transformation - conversion from one document class to another;
- querying - extraction of information.

The site www.xmlsoftware.com has a comprehensive list of available XML tools.
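
As one illustration of such generic tools, the well-formedness requirements from the concrete view of XML can be tried out with an ordinary XML parser. The following is a minimal sketch in Python using the standard library module xml.etree.ElementTree. The helper name is_well_formed and the sample strings are assumptions made for this illustration and are not part of the tutorial.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML, i.e. is well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, reusing the names foo, attr and val.
samples = {
    "matching tags": '<foo attr="val">contents</foo>',
    "empty element": '<foo attr="val"/>',
    "case mismatch": "<foo>contents</Foo>",      # XML is case sensitive
    "bad nesting": "<a><b>text</a></b>",         # tags not properly nested
    "unescaped ampersand": "<foo>fish & chips</foo>",
    "character reference": "<foo>fish &#38; chips</foo>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first two samples and the last one are accepted. The parser rejects the case mismatch, the improperly nested tags, and the unescaped ampersand.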

Namespaces

Consider an XML language WidgetML which uses XHTML as a sublanguage for help messages (text content only):

    Description of gadget
    Gadget
    A gadget contains a big gizmo

We have some problems here:
- the meaning of head and big depends on the context;
- this complicates things for processors and might even cause ambiguities;
- the root of the problem is: one common name-space.

The solution is to introduce explicit namespace declarations (text content only):

    Description of gadget
    Gadget
    A gadget contains a big gizmo

Do not be confused by the use of URIs for namespaces:
- they are not supposed to point to anything;
- it is simply the cheapest way of getting unique names;
- we rely on existing organizations that control domain names.

All XML technologies (are supposed to) respect namespaces.

The recipe example

Consider the following raw data describing some (Danish) recipes:
- citrontærte;
- farsbrød;
- hornfisk;
- islagkage;
- laksemousse;
- nougattoppe;
- rabarberdessert;
- smørrebrød.

We can represent this collection as an XML document.

Schema languages

The syntax of a new XML language must be formalized:
- this is similar to the formal syntax of a programming language;
- however, usual context-free grammars are not expressive enough;
- XML languages are described using schemas.

A modern schema language:
- is itself an XML language (and can be used to describe itself);
- imposes constraints on the contents of elements;
- is context-sensitive and very fine-grained;
- can be processed efficiently.

A schema processor:
- checks that an application document satisfies the schema;
- such a document is called valid.
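
The difference between a well-formed document and a valid one can be illustrated with a toy check. This is only a sketch in Python. It is not DSD or any real schema language, and it covers just one kind of constraint (which child elements are permitted). The table ALLOWED_CHILDREN, the function is_valid, and the element names collection, recipe, title and ingredient are all hypothetical, chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy "schema": element name -> names of permitted child elements.
ALLOWED_CHILDREN = {
    "collection": {"recipe"},
    "recipe": {"title", "ingredient"},
    "title": set(),
    "ingredient": set(),
}


def is_valid(element: ET.Element) -> bool:
    """Check recursively that each element is declared and has only permitted children."""
    allowed = ALLOWED_CHILDREN.get(element.tag)
    if allowed is None:
        return False  # element name is not declared in the toy schema
    return all(child.tag in allowed and is_valid(child) for child in element)


# Both documents are well-formed; only the first satisfies the toy schema.
good = ET.fromstring(
    "<collection><recipe><title>Rhubarb Cobbler</title>"
    "<ingredient>rhubarb</ingredient></recipe></collection>"
)
bad = ET.fromstring("<collection><title>Rhubarb Cobbler</title></collection>")

print(is_valid(good))  # True
print(is_valid(bad))   # False: a title may not appear directly in a collection
```

A real schema language expresses far more than this table can, for example constraints on attribute values and on the order of contents.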

A schema for recipes

The original page gives a complete schema for the recipe example, written in the DSD schema language.

XLink, XPointer, and XPath

XLink, XPointer, and XPath are three related mechanisms:
- they generalize the link mechanisms from HTML;
- XPath points from the outside to a set of nodes in an XML document;
- XPointer uses XPath to directly generalize HTML links;
- XLink uses XPointer to vastly generalize HTML links.

HTML links are just too simple:
- an anchor must be placed at every link destination (problem with read-only documents) - we want to express relative locations;
- the link definition must be at the same location as the link source - we want out-of-line links ("link databases");
- only individual nodes can be linked to - we want links to whole tree fragments;
- a link always has one source and one destination - we want links with multiple sources and destinations.

The XLink pointer model looks like this:

    [Figure: the XLink pointer model]

These technologies are not really supported by any browsers today.

Pointing at recipes

The following simple XPath expressions point to parts of the XML recipe document. Each expression is followed by its result:

    //ingrediens[@navn="radiser i små tern"]/@antal

        200

    //ingrediens[@antal="100" and @enhed="g"]/@navn

        flødeost med løg og urter blødt smør i mindre stykker Feta ost 45+ smeltet overtrækschokolade

    //titel[text()="Citrontærte"]
      /following-sibling::ingrediens[@navn="dej"]/tilberedning/text()

        Bland mel og sukker i en skål.
        Skær smørret i mindre stykker og smuldr det i melblandingen, til den ligner revet ost. Tilsæt vand og saml hurtigt dejen. Tryk den ud i en smurt springform (ca. 22 cm i diameter). Lad dejen gå halvt op ad formens side. Stil den tildækket i køleskabet i mindst 1 time. Forbag bunden midt i ovnen i 12 minutter ved 200 grader.

XPath expressions navigate step by step through the XML tree.

XML-QL

XML-QL is a query language for XML documents:
- XML documents can be seen as generalizations of database relations;
- XML-QL is a similar generalization of SQL;
- it can extract data from existing XML documents and construct new XML documents.

Relations are special, restricted cases of XML trees:

    [Figure: relations as XML trees]

XML query languages will not be released until 2001.

Querying the recipes

The following XML-QL queries extract information from the XML recipe document:

    WHERE $t IN "karoline.xml" CONSTRUCT $t