3/22/2012
XML Introduction
DTD and Validity
Attribute Declarations in a DTD
Entities and External DTD subsets
Embedding non-XML data
Namespaces
DOM and SAX Parsers
Schemas
by FaaDoOEngineers.com
WHAT IS XML ?
XML stands for eXtensible Markup Language. It is a set of rules for defining semantic tags. It is a markup language that provides a syntax for defining domain-specific, semantic, structured markup languages.
INTRODUCTION
XML
Technology for creating markup languages
Enables document authors to describe data of any type
Allows creating new tags
WHY XML ?
Lightweight document format
Supports cross-platform use
User-friendly language
Supports Unicode
Describes structure and semantics
Does not focus on formatting
Documents can be validated
Extension .xml
An Example Intro.xml
1  <?xml version = "1.0"?>
2
3  <!-- Fig.: intro.xml -->
4  <!-- Simple introduction to XML markup -->
5
6  <myMessage>
7     <message>Welcome to XML!</message>
8  </myMessage>

Line numbers are not part of the XML document; we include them for clarity.
The document begins with the <?xml version = "1.0"?> declaration, which specifies XML version 1.0.
The lines enclosed in <!-- and --> are comments.
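As an illustration, the document above can be read programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the names `INTRO_XML` and `read_message` are made up for this example and are not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The intro.xml document from the slide, without the display line numbers.
INTRO_XML = """<?xml version = "1.0"?>

<!-- Fig.: intro.xml -->
<!-- Simple introduction to XML markup -->

<myMessage>
   <message>Welcome to XML!</message>
</myMessage>
"""


def read_message(xml_text):
    """Return the root element name and the text of its <message> child."""
    root = ET.fromstring(xml_text)          # parse the whole document
    return root.tag, root.find("message").text


if __name__ == "__main__":
    print(read_message(INTRO_XML))
```

Comments are skipped by this parser by default, so only the element structure and character data are returned.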
XML documents
Attempting to create more than one root element is an error.
Tags must also be properly nested:
Incorrect: <x><y>hello</x></y>
Correct: <x><y>hello</y></x>
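A parser rejects documents that break these rules. The sketch below, assuming Python's standard `xml.etree.ElementTree` module, wraps the parse call in a hypothetical helper named `is_well_formed` to show the difference between the incorrect and correct forms.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for improper nesting, a second root element, and so on.
        return False


if __name__ == "__main__":
    print(is_well_formed("<x><y>hello</x></y>"))   # improperly nested tags
    print(is_well_formed("<x><y>hello</y></x>"))   # properly nested tags
    print(is_well_formed("<x/><y/>"))              # two root elements
```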
XML parser
Reads the XML document
Checks syntax
Reports errors (if any)
Allows programmatic access to the document's contents
Single root element
Each element has a start tag and an end tag
Tags properly nested
Attribute (discussed later) values in quotes
Proper capitalization (XML is case sensitive)
DOM-based parsers build a tree structure containing the document data in memory
SAX-based parsers generate events when tags, comments, etc. are encountered
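The two parsing styles can be compared side by side. This is a minimal sketch using Python's standard `xml.dom.minidom` (tree in memory) and `xml.sax` (event callbacks); the class `TagLogger` and the functions `dom_message` and `sax_events` are names invented for this example.

```python
import xml.sax
from xml.dom import minidom

DOC = b"<myMessage><message>Welcome to XML!</message></myMessage>"


class TagLogger(xml.sax.ContentHandler):
    """Records each event the SAX parser reports while reading the document."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Character data may arrive in several chunks; skip whitespace-only ones.
        if content.strip():
            self.events.append(("text", content))


def dom_message(doc):
    """Tree style: build the whole document in memory, then navigate it."""
    tree = minidom.parseString(doc)
    node = tree.getElementsByTagName("message")[0]
    return node.firstChild.data


def sax_events(doc):
    """Event style: collect callbacks as the parser walks the document."""
    handler = TagLogger()
    xml.sax.parseString(doc, handler)
    return handler.events


if __name__ == "__main__":
    print(dom_message(DOC))
    for event in sax_events(DOC):
        print(event)
```

The tree approach keeps the entire document available for random access, while the event approach only sees one piece at a time and so needs less memory for large documents.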
XML document
Contains data
Does not contain formatting information
Can be loaded into Internet Explorer
The document is parsed by msxml. Internet Explorer places plus (+) or minus (-) signs next to container elements.
Plus sign indicates that all child elements are hidden Clicking plus sign expands container element
Displays children
Minus sign indicates that all child elements are visible Clicking minus sign collapses container element
Hides children
CHARACTERS
Markup text
Character data
XML documents adhere to a series of rules about how they look. There are two levels of conformity to the XML standard:
Well-formedness
Validity
RELATED TECHNOLOGIES
XML doesn't operate in a vacuum. Using XML as more than a data format requires interaction with a number of related technologies. These include:
HTML
CSS
XSL
URLs and URIs
XLinks and XPointers
The Unicode Character Set
Data that conforms to an accompanying DTD is known as valid XML. Data sent without a DTD can only be checked for well-formedness and is known as well-formed XML. With both valid and well-formed XML, XML-encoded data is self-describing, since descriptive tags are intermixed with the data.
DTDs help ensure that different people and programs can read each other's files. The DTD defines exactly what is and is not allowed to appear inside a document.
An internal DTD subset consists of a left square bracket character ([), followed by a series of markup declarations, followed by a right square bracket character (]), all placed inside the document type declaration.
]>
<Simple>This is the simplest XML document I have ever seen</Simple>
DTD DECLARATIONS
Element type declarations
Attribute-list declarations
Entity declarations
Notation declarations
Processing instructions
Comments
Parameter entity references
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE GREETING [
  <!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>
Hello XML!
</GREETING>
The first step to creating a DTD appropriate for a particular document is to understand the structure of the information that will be encoded using the elements defined in the DTD.
<?xml version="1.0" standalone="yes"?>
<Root>
  <Element1>
    <Element11>
    </Element11>
  </Element1>
</Root>
ELEMENT DECLARATIONS
Each tag used in a valid XML document must be declared with an element declaration in the DTD. The declaration specifies the name and possible contents of an element. This list of contents is also called the content specification. Occurrence indicators control how often a child may appear:
* - may occur any number of times (zero or more children)
? - may or may not occur (zero or one child)
+ - must occur at least once (one or more children)
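The occurrence indicators behave like the same symbols in regular expressions. The sketch below illustrates that correspondence only; it is not a DTD validator. It assumes a hypothetical declaration `<!ELEMENT book (title, author+, note?, review*)>`, hand-translated into a Python regular expression over the sequence of child element names. The element names and the helper `children_match` are invented for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content specification: (title, author+, note?, review*)
# Each child name is followed by one space so names can be matched as tokens.
BOOK_MODEL = re.compile(r"(title )(author )+(note )?(review )*")


def children_match(element, model):
    """Check the ordered child element names against a content-model regex."""
    names = "".join(child.tag + " " for child in element)
    return model.fullmatch(names) is not None


if __name__ == "__main__":
    ok = ET.fromstring("<book><title/><author/><author/></book>")
    missing_author = ET.fromstring("<book><title/></book>")
    two_notes = ET.fromstring("<book><title/><author/><note/><note/></book>")
    print(children_match(ok, BOOK_MODEL))
    print(children_match(missing_author, BOOK_MODEL))
    print(children_match(two_notes, BOOK_MODEL))
```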
Instead of specifying an explicit default attribute value, an attribute declaration can require that a value be provided, allow the value to be omitted completely, or always use a fixed default value. These requirements are specified with the three keywords #REQUIRED, #IMPLIED, and #FIXED.
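To see how a parser reports these attribute defaults, the following sketch uses the `AttlistDeclHandler` callback of Python's standard `xml.parsers.expat` module. The NOTE element and its attribute names are hypothetical, chosen only to show one attribute for each of #REQUIRED, #IMPLIED, #FIXED, and a plain default value; `read_attlist` is a helper invented for this example.

```python
import xml.parsers.expat

# Hypothetical document with one attribute per default style.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE NOTE [
<!ELEMENT NOTE (#PCDATA)>
<!ATTLIST NOTE
    ID       CDATA #REQUIRED
    LANG     CDATA #IMPLIED
    VERSION  CDATA #FIXED "1.0"
    PRIORITY CDATA "normal">
]>
<NOTE ID="n1">Hello</NOTE>"""


def read_attlist(doc):
    """Map each declared attribute name to (default value, required flag)."""
    found = {}

    def on_attlist(elname, attname, att_type, default, required):
        found[attname] = (default, bool(required))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.Parse(doc, True)   # True marks this as the final chunk of input
    return found


if __name__ == "__main__":
    for name, (default, required) in read_attlist(DOC).items():
        print(name, default, required)
```

Note that expat only reports the declarations; it is a non-validating parser and does not reject a document that omits a #REQUIRED attribute.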
NOTATIONS
The first problem that we encounter when working with non-XML data in an XML document is identifying the format of the data and telling the XML application how to read and display the non-XML data. For example, it would be inappropriate to try to draw an MP3 sound file on the screen. Furthermore, no application understands all possible file formats. Ideally, we want documents to tell the application the format of the external entity, so that we don't have to rely on the application recognizing the file type by a magic number or a potentially unreliable file name extension.
NOTATION
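A notation declaration names a non-XML data format. The notation is identified either by a path or other system identifier using SYSTEM, or by a public identifier string using PUBLIC. An attribute of type NOTATION must take one of the declared notation names as its value.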
EXAMPLE OF NOTATION
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT IMAGES (IMAGE+) >
<!ELEMENT IMAGE (#PCDATA) >
<!NOTATION iPATH SYSTEM "C:\windows\a.bmp" >
<!ATTLIST IMAGE SRC NOTATION (iPATH) #REQUIRED>
USING NOTATION
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE IMAGES SYSTEM "C:\N.dtd">
<IMAGES>
  <IMAGE SRC="iPATH">abc</IMAGE>
</IMAGES>

Note: Because an XML processor cannot parse bmp files, we need to use an external program for displaying or editing them. When the parser encounters a use of the notation name, it simply passes the path declared in the notation on to the application.
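The way a parser hands notation information to the application can be sketched with the `NotationDeclHandler` callback of Python's standard `xml.parsers.expat` module. As an assumption made to keep the example self-contained, the declarations from the DTD are placed in an internal subset instead of the external file; `read_notations` is a helper name invented here.

```python
import xml.parsers.expat

# Same declarations as the slide, moved into an internal DTD subset.
DOC = r"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE IMAGES [
<!ELEMENT IMAGES (IMAGE+)>
<!ELEMENT IMAGE (#PCDATA)>
<!NOTATION iPATH SYSTEM "C:\windows\a.bmp">
<!ATTLIST IMAGE SRC NOTATION (iPATH) #REQUIRED>
]>
<IMAGES><IMAGE SRC="iPATH">abc</IMAGE></IMAGES>""".encode("utf-8")


def read_notations(doc):
    """Map each declared notation name to its system identifier."""
    notations = {}

    def on_notation(name, base, system_id, public_id):
        notations[name] = system_id

    parser = xml.parsers.expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.Parse(doc, True)
    return notations


if __name__ == "__main__":
    print(read_notations(DOC))
```

The parser does nothing with the bmp file itself; deciding what to do with the reported identifier is left to the application.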
CONDITIONAL SECTIONS
INCLUDE sections: include declarations
IGNORE sections: exclude declarations
Parameter entities:
Preceded by percent character (%)
Create entities specific to the DTD
Can be used only inside the DTD in which they are declared
XML NAMESPACES
CONFLICTING ISSUES
Namespaces ensure that element names do not conflict, and clarify who defined which term.
Namespaces do not give instructions on how to process the elements.
Readers still need to know what the elements mean and decide how to process them.
Namespaces simply keep the names straight.
XML NAMESPACES
Naming collisions
Two different elements have the same name:
<subject>Math</subject>
<subject>Thrombosis</subject>
Namespaces differentiate elements that have the same name:
<school:subject>Math</school:subject>
Namespaces
Namespace prefixes are prepended to element and attribute names
Each prefix is tied to a uniform resource identifier (URI)
Creating namespaces
The xmlns attributes create two namespace prefixes, text and image
urn:deitel:textInfo is the URI for prefix text
urn:deitel:imageInfo is the URI for prefix image
Default namespaces
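A short sketch of how the two prefixes keep same-named elements apart, using Python's standard `xml.etree.ElementTree` module. The two URIs come from the slide; the `directory` and `file` elements, the file names, and the helper `files_in` are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: both namespaces define an element called "file".
DOC = """<text:directory xmlns:text="urn:deitel:textInfo"
                xmlns:image="urn:deitel:imageInfo">
  <text:file filename="book.xml"/>
  <image:file filename="funny.jpg"/>
</text:directory>"""

# Prefix-to-URI map used only for lookups; the prefixes need not match the document's.
NS = {"text": "urn:deitel:textInfo", "image": "urn:deitel:imageInfo"}


def files_in(root, prefix):
    """Return the filename attribute of every <file> child in one namespace."""
    return [f.get("filename") for f in root.findall(prefix + ":file", NS)]


if __name__ == "__main__":
    root = ET.fromstring(DOC)
    print(root.tag)                  # the parser expands the prefix to the URI
    print(files_in(root, "text"))
    print(files_in(root, "image"))
```

Internally the parser identifies each element by its URI plus local name, which is why the prefix text itself carries no meaning.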
XML SCHEMAS
XML SCHEMA
Defines the structure of an XML document
Defines the list of elements and attributes that can be used in an XML document
Specifies the order in which these elements appear in the XML document and their data types
Written in the XML Schema Definition (XSD) language, which grew out of earlier schema proposals, including Microsoft's
XSD is a W3C Recommendation for creating valid XML documents
INTRODUCTION
XML Path Language (XPath)
Syntax for locating information in an XML document, e.g., attribute values
Not a structural language like XML
Used by XSLT and XPointer
LOCATING NODES
XML documents can be represented as a tree view of nodes. XPath uses a pattern expression to identify nodes in an XML document. An XPath pattern is a slash-separated list of child element names that describe a path through the XML document. The pattern "selects" elements that match the path. The following XPath expression selects all the price elements of all the cd elements of the catalog element:
/catalog/cd/price
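The same selection can be tried with the limited XPath subset in Python's standard `xml.etree.ElementTree` module. This is a sketch under stated assumptions: the catalog contents are invented sample data, and because ElementTree evaluates paths relative to an element, the expression is written as `./cd/price` from the `catalog` root instead of the absolute `/catalog/cd/price`.

```python
import xml.etree.ElementTree as ET

# Invented sample data for illustration.
CATALOG = """<catalog>
  <cd><title>First Album</title><price>10.90</price></cd>
  <cd><title>Second Album</title><price>9.90</price></cd>
</catalog>"""


def select_prices(xml_text):
    """Return the text of every price element under catalog/cd."""
    root = ET.fromstring(xml_text)            # root is the <catalog> element
    return [price.text for price in root.findall("./cd/price")]


if __name__ == "__main__":
    print(select_prices(CATALOG))
```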
WHAT IS XSLT ?
XSLT stands for XSL Transformations
XSLT is the most important part of XSL
XSLT transforms an XML document into another XML document
XSLT uses XPath to navigate in XML documents
XSLT is a W3C Recommendation
XSL
PRESENTING XML
There are two style sheet languages available for use with XML in Internet Explorer: CSS and XSL.
An important point to consider in choosing a style sheet language for a particular document is whether the structure of the XML document is suitable for display. With CSS, the structure of the XML content must be virtually identical to the structure of the presentation. Since one of the goals of XML is a complete separation of content from display, many XML documents are difficult to display as you might wish using CSS.
Mozilla Firefox
As of version 1.0.2, Firefox has support for XML and XSLT (and CSS).
Mozilla
Mozilla includes Expat for XML parsing and has support to display XML + CSS. Mozilla also has some support for Namespaces. Mozilla is available with an XSLT implementation.
Netscape
As of version 8, Netscape uses the Mozilla engine, and therefore it has the same XML / XSLT support as Mozilla.
Opera
As of version 9, Opera has support for XML and XSLT (and CSS). Version 8 supports only XML + CSS.
Internet Explorer
As of version 6, Internet Explorer supports XML, Namespaces, CSS, XSLT, and XPath. Version 5 is NOT compatible with the official W3C XSL Recommendation.
XML does not use predefined tags (we can use any tag names we like), so the meaning of these tags is not known in advance. A <table> element could mean an HTML table, a piece of furniture, or something else, and a browser does not know how to display it. XSL describes how the XML document should be displayed!