
ActiveNET ™ Always first in Race

XML (eXtensible Markup Language)


XML

Powered
By
Surya

Always first in race.


Introduction:
XML was developed by an XML Working Group formed in 1996 under the auspices of the World Wide
Web Consortium (W3C). The group was chaired by Jon Bosak of Sun Microsystems.

Standards:
SGML (ISO 8879:1986) - by definition, well-formed XML documents are conformant SGML documents
Unicode and ISO/IEC 10646 - these specifications define the encodings and meanings of the
characters
IETF RFC 1738 and RFC 1808 - these define the syntax and semantics of Uniform Resource
Locators (URLs)

XML is a subset of SGML.

HTML and XML are the two best-known languages derived from SGML.

HTML is used to present data in web browsers.

XML is used to represent data independently of how it is presented (the later sections make this
distinction concrete).

HTML is designed for one target: presentation in a web browser. It cannot present the same data
on other kinds of systems. A mobile phone may expect WML (Wireless Markup Language), a PDA
(Personal Digital Assistant) may expect PDAML, and a browser expects HTML, so a neutral language
is needed from which each of these formats can be produced.

Instead of embedding data in a fixed presentation format as HTML does, XML holds the data in a
structured form. Depending on the device in use, translators then convert the structured data
into the appropriate presentation format.

XML has further advantages, which are discussed in later sections.

The XML standards are developed by the W3C (World Wide Web Consortium), a non-profit consortium
that sets standards for markup languages. The W3C is formed by professionals and consultants
from various organizations.

XML Objects:
The different objects used in an XML document are:
Node - The general name for every type of object used in an XML document
Element - Markup enclosed in angle brackets (< >)
Attribute - A name-value parameter used on an element
Text - Data placed between the start tag and the end tag of an element
PI (Processing Instruction) - An instruction addressed to an application; the XML declaration at
the top of a document uses the same <? ?> syntax and carries the XML version, encoding etc
DOCTYPE (Document Type) - Contains the DTD path
Comment - Used to annotate the document or to comment out unused data
CDATA (unparsed Character DATA section) - Text that the XML parser passes through as it is,
without interpreting any markup inside it
Entity - General entities and parameter entities

Next, we look at how to operate on an XML document.


Although an XML document is structured, it is still a plain text document. Reading its content
reliably takes more than simple string handling; this is the job of an XML parser.

XML Parsers:
An XML parser is a tool that reads, checks and processes the content of an XML document. The
following sections give more detail.

The first question to ask is whether you only need to read the XML document or also need to
update it.

Kinds of Parsers:

SAX (Simple API for XML parsing)


-One of the earliest APIs introduced to operate on XML documents
-Functionality is limited to read operations
-Event-based parser (the parser fires events on the application as it finds each node of the
XML document during processing)
-Faster in execution
-The application has no control over the order of processing (events are fired from the
beginning of the XML document to the end)

SAX parser (diagram):
Input: XML document, DTD/Schema
SAX parser:
1. Loads XML document
2. Checks well-formedness
3. Checks validness
4. Understands XML document
5. Fires event for each node
Output: events delivered to the DocumentHandler
startDocument(){}
startElement(){}
characters(){}
endElement(){}
endDocument(){}

DOM (Document Object Model)


-Released after SAX
-Supports all types of operations on an XML document (read, insert, append, update, replace
and delete)
-Tree-based parsing (the parser builds the document as a tree of nodes, which is read node by
node in a hierarchical manner)
-Slower in execution and heavier on memory (the whole tree is built first, and instead of
receiving events the application must itself read each node of the XML document)
-The developer has full control over the XML document (the API has functions to navigate from
one node to another in any traversal order [front-back/back-front])

DOM parser (diagram):
Input: XML document, DTD/Schema
DOM parser:
1. Loads XML document
2. Checks well-formedness
3. Checks validness
4. Understands XML document
5. Returns DOM object
Output: DOM O/P
The DOM supports read, insert, update, delete, replace and append operations


Each kind of parser again comes in two types:


Non-validating parser
-Verifies only the well-formedness of the XML document
-Does not verify the document structure against a DTD (validness)

Validating parser
-Verifies both well-formedness and validness
-That means it also checks the structure used in the XML document against a structure
definition file such as a DTD/Schema.

What is well-formedness?
Well-formedness is a set of rules that every XML document must satisfy. The rules are as
follows:
-The document should begin with the XML declaration (<?xml version="1.0"?>); XML 1.0 makes it
optional, but when present it must be the very first thing in the document
-The document must have exactly one root element that encloses all other elements
(<EMPS> </EMPS>)
-Every element must have both a start tag and an end tag (<EMP> </EMP>), or be written as an
empty-element tag (<EMP/>)
-Attribute values must be placed in quotes ( ' or ") (<DESIG cadre="01">)
-Elements must be properly nested
(
<DESIG>
<NAME>Director</NAME>
<SENIORITY>1</SENIORITY>
</DESIG>
)
-Element names are case sensitive (the start tag and the end tag must use the same case)
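
The rules above can be checked mechanically. The following is a minimal sketch, assuming the JAXP parser that ships with the JDK (javax.xml.parsers) instead of the XML4J parser used later in this document. The class name WellFormedCheck and the helper isWellFormed are hypothetical names chosen for illustration. A non-validating parse succeeds only when the input is well-formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a string is well-formed XML.
class WellFormedCheck {
    static boolean isWellFormed(String xml) {
        try {
            // Default factory settings give a non-validating parser
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps them off stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Any violation of the well-formedness rules ends up here
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<EMPS><EMP>1</EMP></EMPS>")); // true
        System.out.println(isWellFormed("<EMPS><EMP>1</emp></EMPS>")); // false: case mismatch
        System.out.println(isWellFormed("<DESIG cadre=01/>"));         // false: unquoted attribute
        System.out.println(isWellFormed("<A><B></A></B>"));            // false: improper nesting
    }
}
```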

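What is Validness?
-Validness (validity) means that the structure used in the XML document matches the structure
defined in a DTD/Schema
-The path of the DTD/Schema is given at the top of the XML document, after the XML declaration

If a DTD is used, it is referenced as follows:
<!DOCTYPE root_ele SYSTEM 'dtd_uri'>
<!DOCTYPE root_ele PUBLIC 'FPI' 'dtd_uri'>
There are two identifiers: i) PUBLIC ii) SYSTEM
If the PUBLIC identifier is used, then an FPI (Formal Public Identifier) must be included
before the DTD URI
Ex:
<!DOCTYPE EMPS SYSTEM 'emps.dtd'>
<!DOCTYPE EMPS PUBLIC '-/Indian Government/Income Tax/en' 'emps.dtd'>

If a Schema is used, it is referenced from the root element through the XML Schema instance
namespace:
<EMPS xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="EMPS.xsd">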

Processing flow (diagram):
XML document + DTD/Schema -> XML Parser -> DOM O/P
DOM O/P + XSL -> XSL Transformer -> HTML / PDAML / WML


How is XML used in Java?

DTDs:
Brief look at DTDs:
-Element declaration: <!ELEMENT ele_name (sub_ele1, sub_ele2, sub_ele3)>
Ex: emps.dtd
<!ELEMENT EMPS (EMP*)>
<!ELEMENT EMP (EMPNO, ENAME, SAL, DESIG, ADDRESS, (PHONE|MOBILE), EMAIL)>
<!ELEMENT EMPNO (#PCDATA)>
<!ELEMENT ENAME (#PCDATA)>
<!ELEMENT SAL (#PCDATA)>
<!ELEMENT DESIG (#PCDATA)>
<!ELEMENT ADDRESS (#PCDATA)>
<!ELEMENT PHONE (#PCDATA)>
<!ELEMENT MOBILE (#PCDATA)>
<!ELEMENT EMAIL (#PCDATA)>

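In the document above, some characters are used as suffixes in the element declarations. They
are:
No sign - exactly one occurrence (1)
? - Zero or one occurrence (0/1)
* - Zero or more occurrences (>=0)
+ - One or more occurrences (>=1)

Within a content model, a comma (,) means the sub-elements appear in the given sequence and a
vertical bar (|) means a choice between alternatives, as in (PHONE|MOBILE).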
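
To see these declarations enforced, the sketch below runs a validating parse against a shortened version of emps.dtd, embedded as an internal DTD subset so that the example needs no files. It assumes the JAXP parser bundled with the JDK; the class DtdValidationSketch, the method validate and the optional EMAIL? are illustrative choices, not part of the original DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: validates a document body against a small inline DTD.
class DtdValidationSketch {
    // Shortened emps.dtd (assumption: EMAIL made optional to show the ? suffix)
    static final String DTD =
          "<!DOCTYPE EMPS ["
        + "<!ELEMENT EMPS (EMP*)>"
        + "<!ELEMENT EMP (EMPNO, ENAME, (PHONE|MOBILE), EMAIL?)>"
        + "<!ELEMENT EMPNO (#PCDATA)>"
        + "<!ELEMENT ENAME (#PCDATA)>"
        + "<!ELEMENT PHONE (#PCDATA)>"
        + "<!ELEMENT MOBILE (#PCDATA)>"
        + "<!ELEMENT EMAIL (#PCDATA)>"
        + "]>";

    // Returns the list of validity (or well-formedness) error messages; empty means valid
    static List<String> validate(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity errors are recoverable
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (SAXException e) {
            errors.add(e.getMessage()); // fatal (well-formedness) error
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String ok = "<EMPS><EMP><EMPNO>1</EMPNO><ENAME>ABC</ENAME><MOBILE>555</MOBILE></EMP></EMPS>";
        String bad = "<EMPS><EMP><EMPNO>1</EMPNO><MOBILE>555</MOBILE></EMP></EMPS>"; // ENAME missing
        System.out.println(validate(ok).isEmpty());
        System.out.println(validate(bad).isEmpty());
    }
}
```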

Attributes are used to associate name-value pairs with elements


-Attributes may appear only within start-tags
-Attribute-list declarations may be used:
-to define the set of attributes pertaining to a given element type
-to establish a set of type constraints on these attributes
-to provide default values for attributes


Attribute declaration:
Syntax:
<!ATTLIST ele_name attr1_name attr1_type attr1_constraint>
Some examples:
<!ATTLIST EMP id ID #REQUIRED>
<!ATTLIST EMP last_name CDATA #IMPLIED>
<!ATTLIST temperature units (Celsius|Fahrenheit) #IMPLIED>
<!ATTLIST account MIN_BAL CDATA #FIXED "1000.00">

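attr_types:
ID - The attribute value must be unique within the document; it must also be a valid XML name,
so it cannot start with a digit
Ex: <EMP id="e1">
<!ATTLIST EMP id ID #REQUIRED>
IDREF - The value must match the ID value of some element in the same document
IDREFS - A whitespace-separated list of IDREF values
CDATA - Any character data
NMTOKEN - A single name token (no whitespace)
NMTOKENS - A whitespace-separated list of NMTOKEN values

attr_constraints:
#IMPLIED - Optional
#REQUIRED - Mandatory
#FIXED - The value of this type of attribute is FIXED/CONSTANT/FINAL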

Schemas:
A schema is another way of defining the structure of an XML document, used in place of a DTD.

A schema has some advantages over a DTD, explained as follows:


-A DTD is written in its own syntax, whereas a schema is itself an XML document and follows
the ordinary XML rules.
-With a DTD, the XML parser needs extra effort to check the DTD syntax first and then to check
the XML document against the DTD, whereas with a schema the same parser can check both the
syntax of the schema document and the XML document against the schema structure definition.
-A schema is suitable for defining complex XML document structures, for example
PurchaseOrder.xml
-A schema supports open and closed content models, so that we can easily switch from
element-only content to any-element content and vice versa
-Unlike a DTD, a schema supports exact minimum and maximum occurrences of elements
-A schema supports attribute groups
-A schema supports declaring user-defined data types
-A schema supports all primitive and derived data types as well

Elements available in Schema to define document structures


<?xml version="1.0"?>
<!--
po.xml
-->
<purchaseOrder orderDate="2004-10-20">
<shipTo country="US">
<name>Alice Smith</name>
<street>123 Maple Street</street>
<city>Mill Valley</city>
<state>CA</state>
<zip>90952</zip>
</shipTo>
<billTo country="US">

<name>Robert Smith</name>
<street>8 Oak Avenue</street>
<city>Old Town</city>
<state>PA</state>
<zip>95819</zip>
</billTo>
<comment>Hurry, my lawn is going wild!</comment>
<items>
<item partNum="872-AA">
<productName>Lawnmower</productName>
<quantity>1</quantity>
<USPrice>148.95</USPrice>
<comment>Confirm this is electric</comment>
</item>
<item partNum="926-AA">
<productName>Baby Monitor</productName>
<quantity>1</quantity>
<USPrice>39.98</USPrice>
<shipDate>2004-05-21</shipDate>
</item>
</items>
</purchaseOrder>

<!--
po.xsd
-->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

<xsd:annotation>
<xsd:documentation xml:lang="en">
Purchase order schema for Example.com.
Copyright 2004 Example.com. All rights reserved.
</xsd:documentation>
</xsd:annotation>

<xsd:element name="purchaseOrder" type="PurchaseOrderType"/>

<xsd:element name="comment" type="xsd:string"/>

<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

<xsd:complexType name="USAddress">
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="street" type="xsd:string"/>
<xsd:element name="city" type="xsd:string"/>
<xsd:element name="state" type="xsd:string"/>
<xsd:element name="zip" type="xsd:decimal"/>

</xsd:sequence>
<xsd:attribute name="country" type="xsd:NMTOKEN"
fixed="US"/>
</xsd:complexType>

<xsd:complexType name="Items">
<xsd:sequence>
<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="quantity">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="USPrice" type="xsd:decimal"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="partNum" type="SKU" use="required"/>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>

<!-- Stock Keeping Unit, a code for identifying products -->


<xsd:simpleType name="SKU">
<xsd:restriction base="xsd:string">
<xsd:pattern value="\d{3}-[A-Z]{2}"/>
</xsd:restriction>
</xsd:simpleType>

</xsd:schema>

The purchase order consists of a main element, purchaseOrder, and the subelements shipTo,
billTo, comment, and items. These subelements (except comment) in turn contain other
subelements, and so on, until a subelement such as USPrice contains a number rather than any
subelements. Elements that contain subelements or carry attributes are said to have complex
types, whereas elements that contain numbers (and strings, and dates, etc.) but do not contain
any subelements are said to have simple types. Some elements have attributes; attributes always
have simple types.

The purchase order schema consists of a schema element and a variety of subelements, most
notably element, complexType, and simpleType which determine the appearance of elements
and their content in instance documents.

Each of the elements in the schema has a prefix xsd: which is associated with the XML Schema
namespace through the declaration, xmlns:xsd="http://www.w3.org/2001/XMLSchema", that
appears in the schema element. The prefix xsd: is used by convention to denote the XML
Schema namespace, although any prefix can be used.


In XML Schema, there is a basic difference between complex types which allow elements in their
content and may carry attributes, and simple types which cannot have element content and
cannot carry attributes.

Defining the USAddress Type


<xsd:complexType name="USAddress" >
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="street" type="xsd:string"/>
<xsd:element name="city" type="xsd:string"/>
<xsd:element name="state" type="xsd:string"/>
<xsd:element name="zip" type="xsd:decimal"/>
</xsd:sequence>
<xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType>

Defining PurchaseOrderType
<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

Occurrence constraints:
Occurrence constraints for Elements and Attributes
Element: minOccurs, maxOccurs, fixed, default
Attribute: use, fixed, default

Simple Types Built In to XML Schema:


string, normalizedString, token, byte, unsignedByte, base64Binary, hexBinary, integer,
positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger, int, unsignedInt, long,
unsignedLong, short, unsignedShort, decimal, float, double, boolean, time, dateTime, duration,
date, gMonth, gYear, gYearMonth, gDay, gMonthDay, Name, QName, NCName, anyURI,
language, ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS

New simple types are defined by deriving them from existing simple types. We use the
simpleType element to define and name the new simple type. We use the restriction element to
indicate the existing (base) type, and to identify the "facets" that constrain the range of values.

Suppose we wish to create a new type of integer called myInteger whose range of values is
between 10000 and 99999 (inclusive). We base our definition on the built-in simple type integer,
whose range of values also includes integers less than 10000 and greater than 99999. To define
myInteger, we restrict the range of the integer base type by employing two facets called
minInclusive and maxInclusive:

Defining myInteger, Range 10000-99999


<xsd:simpleType name="myInteger">
<xsd:restriction base="xsd:integer">
<xsd:minInclusive value="10000"/>
<xsd:maxInclusive value="99999"/>
</xsd:restriction>
</xsd:simpleType>
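
The effect of the two facets can be tried out with the javax.xml.validation API bundled with the JDK. This is a minimal sketch under stated assumptions: the element name code and the class MyIntegerFacetSketch are hypothetical, introduced only so that there is an instance element to validate against the myInteger type.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: validates <code>...</code> against the myInteger type.
class MyIntegerFacetSketch {
    // The myInteger definition from the text, plus an assumed element "code" that uses it
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:element name='code' type='myInteger'/>"
        + "<xsd:simpleType name='myInteger'>"
        + "<xsd:restriction base='xsd:integer'>"
        + "<xsd:minInclusive value='10000'/>"
        + "<xsd:maxInclusive value='99999'/>"
        + "</xsd:restriction>"
        + "</xsd:simpleType>"
        + "</xsd:schema>";

    static boolean isValid(String instanceXml) {
        Validator validator;
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            validator = schema.newValidator();
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself could not be compiled", e);
        }
        try {
            // With no error handler set, the validator throws on the first validity error
            validator.validate(new StreamSource(new StringReader(instanceXml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<code>10000</code>"));  // true: lower bound is inclusive
        System.out.println(isValid("<code>99999</code>"));  // true: upper bound is inclusive
        System.out.println(isValid("<code>9999</code>"));   // false: below minInclusive
        System.out.println(isValid("<code>100000</code>")); // false: above maxInclusive
    }
}
```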


The purchase order schema contains another, more elaborate, example of a simple type
definition. A new simple type called SKU is derived (by restriction) from the simple type string.
Furthermore, we constrain the values of SKU using a facet called pattern in conjunction with the
regular expression "\d{3}-[A-Z]{2}" that is read "three digits followed by a hyphen followed by two
upper-case ASCII letters":

Defining the Simple Type "SKU"


<xsd:simpleType name="SKU">
<xsd:restriction base="xsd:string">
<xsd:pattern value="\d{3}-[A-Z]{2}"/>
</xsd:restriction>
</xsd:simpleType>
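
For a quick feel of what the SKU pattern accepts, the same expression can be tried with java.util.regex. This is only an approximation of the schema behaviour: a schema pattern always has to match the whole value, which is what Matcher.matches() does, but \d in XML Schema also covers non-ASCII digits while Java's default \d is ASCII only. The class SkuPatternSketch and the method isSku are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper: approximates the SKU pattern facet with java.util.regex.
class SkuPatternSketch {
    // Three digits, a hyphen, two upper-case ASCII letters
    private static final Pattern SKU = Pattern.compile("\\d{3}-[A-Z]{2}");

    static boolean isSku(String value) {
        // matches() requires the entire string to match, like a schema pattern facet
        return SKU.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSku("872-AA"));  // true
        System.out.println(isSku("926-AA"));  // true
        System.out.println(isSku("87-AA"));   // false: only two digits
        System.out.println(isSku("872-aa"));  // false: lower-case letters
        System.out.println(isSku("872-AAA")); // false: three letters
    }
}
```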

XML Schema defines twelve constraining facets, which are listed in Appendix B of the W3C XML
Schema Primer. Among these, the
enumeration facet is particularly useful and it can be used to constrain the values of almost every
simple type, except the boolean type. The enumeration facet limits a simple type to a set of
distinct values. For example, we can use the enumeration facet to define a new simple type
called USState, derived from string, whose value must be one of the standard US state
abbreviations:

Using the Enumeration Facet


<xsd:simpleType name="USState">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="AK"/>
<xsd:enumeration value="AL"/>
<xsd:enumeration value="AR"/>
<!-- and so on ... -->
</xsd:restriction>
</xsd:simpleType>

USState would be a good replacement for the string type currently used in the state element
declaration. By making this replacement, the legal values of a state element, i.e. the state
subelements of billTo and shipTo, would be limited to one of AK, AL, AR, etc. Note that the
enumeration values specified for a particular type must be unique.

List Types:
Creating a List of myInteger's
<xsd:simpleType name="listOfMyIntType">
<xsd:list itemType="myInteger"/>
</xsd:simpleType>

And an element in an instance document whose content conforms to listOfMyIntType is:


<listOfMyInt>20003 15037 95977 95945</listOfMyInt>

Several facets can be applied to list types: length, minLength, maxLength, and enumeration. For
example, to define a list of exactly six US states (SixUSStates), we first define a new list type
called USStateList from USState, and then we derive SixUSStates by restricting USStateList to
only six items:

List Type for Six US States


<xsd:simpleType name="USStateList">
<xsd:list itemType="USState"/>
</xsd:simpleType>

<xsd:simpleType name="SixUSStates">

<xsd:restriction base="USStateList">
<xsd:length value="6"/>
</xsd:restriction>
</xsd:simpleType>

Elements whose type is SixUSStates must have six items, and each of the six items must be one
of the (atomic) values of the enumerated type USState, for example:

<sixStates>PA NY CA NY LA AK</sixStates>

Union Types:
Union Type for Zipcodes
<xsd:simpleType name="zipUnion">
<xsd:union memberTypes="USState listOfMyIntType"/>
</xsd:simpleType>

When we define a union type, the memberTypes attribute value is a list of all the types in the
union.

Now, assuming we have declared an element called zips of type zipUnion, valid instances of the
element are:

<zips>CA</zips>
<zips>95630 95977 95945</zips>
<zips>AK</zips>

Two facets, pattern and enumeration, can be applied to a union type.

Anonymous Type Definitions:


Two Anonymous Type Definitions
<xsd:complexType name="Items">
<xsd:sequence>
<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="quantity">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="USPrice" type="xsd:decimal"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="partNum" type="SKU" use="required"/>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>

Mixed Content:
Snippet of Customer Letter
<letterBody>

<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>


Your order of <quantity>1</quantity> <productName>Baby
Monitor</productName> shipped from our warehouse on
<shipDate>1999-05-21</shipDate>. ....
</letterBody>

Snippet of Schema for Customer Letter


<xsd:element name="letterBody">
<xsd:complexType mixed="true">
<xsd:sequence>
<xsd:element name="salutation">
<xsd:complexType mixed="true">
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element name="quantity" type="xsd:positiveInteger"/>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
<!-- etc. -->
</xsd:sequence>
</xsd:complexType>
</xsd:element>

Empty Content:
Now suppose that we want the internationalPrice element to convey both the unit of currency and
the price as attribute values rather than as separate attribute and content values. For example:

<internationalPrice currency="EUR" value="423.46"/>

An Empty Complex Type


<xsd:element name="internationalPrice">
<xsd:complexType>
<xsd:complexContent>
<xsd:restriction base="xsd:anyType">
<xsd:attribute name="currency" type="xsd:string"/>
<xsd:attribute name="value" type="xsd:decimal"/>
</xsd:restriction>
</xsd:complexContent>
</xsd:complexType>
</xsd:element>

Shorthand for an Empty Complex Type


<xsd:element name="internationalPrice">
<xsd:complexType>
<xsd:attribute name="currency" type="xsd:string"/>
<xsd:attribute name="value" type="xsd:decimal"/>
</xsd:complexType>
</xsd:element>

Annotations
XML Schema provides three elements for annotating schemas for the benefit of both human
readers and applications. In the purchase order schema, we put a basic schema description and
copyright information inside the documentation element, which is the recommended location for
human readable material. We recommend you use the xml:lang attribute with any documentation
elements to indicate the language of the information. Alternatively, you may indicate the language
of all information in a schema by placing an xml:lang attribute on the schema element.

The appinfo element, which we did not use in the purchase order schema, can be used to provide
information for tools, stylesheets and other applications.

Both documentation and appinfo appear as subelements of annotation, which may itself appear
at the beginning of most schema constructions. To illustrate, the following example shows
annotation elements appearing at the beginning of an element declaration and a complex type
definition:

Annotations in Element Declaration & Complex Type Definition


<xsd:element name="internationalPrice">
<xsd:annotation>
<xsd:documentation xml:lang="en">
element declared with anonymous type
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:annotation>
<xsd:documentation xml:lang="en">
empty anonymous type with 2 attributes
</xsd:documentation>
</xsd:annotation>
<xsd:complexContent>
<xsd:restriction base="xsd:anyType">
<xsd:attribute name="currency" type="xsd:string"/>
<xsd:attribute name="value" type="xsd:decimal"/>
</xsd:restriction>
</xsd:complexContent>
</xsd:complexType>
</xsd:element>

Building Content Models:


Nested Choice and Sequence Groups
<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:choice>
<xsd:group ref="shipAndBill"/>
<xsd:element name="singleUSAddress" type="USAddress"/>
</xsd:choice>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

<xsd:group name="shipAndBill">
<xsd:sequence>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
</xsd:sequence>
</xsd:group>

An 'All' Group
<xsd:complexType name="PurchaseOrderType">

<xsd:all>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:all>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

Illegal Example with an 'All' Group


<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:all>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
<xsd:element name="items" type="Items"/>
</xsd:all>
<xsd:sequence>
<xsd:element ref="comment" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

Attribute Groups:
Adding Attributes to the Inline Type Definition
<xsd:element name="Item" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="quantity">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="USPrice" type="xsd:decimal"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="partNum" type="SKU" use="required"/>
<!-- add weightKg and shipBy attributes -->
<xsd:attribute name="weightKg" type="xsd:decimal"/>
<xsd:attribute name="shipBy">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:enumeration value="air"/>
<xsd:enumeration value="land"/>
<xsd:enumeration value="any"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
</xsd:complexType>
</xsd:element>


Adding Attributes Using an Attribute Group


<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="quantity">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="USPrice" type="xsd:decimal"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
</xsd:sequence>

<!-- attributeGroup replaces individual declarations -->


<xsd:attributeGroup ref="ItemDelivery"/>
</xsd:complexType>
</xsd:element>

<xsd:attributeGroup name="ItemDelivery">
<xsd:attribute name="partNum" type="SKU" use="required"/>
<xsd:attribute name="weightKg" type="xsd:decimal"/>
<xsd:attribute name="shipBy">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:enumeration value="air"/>
<xsd:enumeration value="land"/>
<xsd:enumeration value="any"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
</xsd:attributeGroup>

DOM: (Document Object Model)


The DOM parser was introduced earlier in this document. The DOM
presents documents as a hierarchy of Node objects that also implement other, more specialized
interfaces. Some types of nodes may have child nodes of various types, and others are leaf
nodes that cannot have anything below them in the document structure. For XML and HTML, the
node types, and which node types they may have as children, are as follows:

Document -- Element (maximum of one), ProcessingInstruction, Comment, DocumentType (maximum of
one)
DocumentFragment -- Element, ProcessingInstruction, Comment, Text, CDATASection,
EntityReference
DocumentType -- no children
EntityReference -- Element, ProcessingInstruction, Comment, Text, CDATASection,
EntityReference
Element -- Element, Text, Comment, ProcessingInstruction, CDATASection, EntityReference
Attr -- Text, EntityReference
ProcessingInstruction -- no children
Comment -- no children
Text -- no children
CDATASection -- no children
Entity -- Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference
Notation -- no children

The DOM interface API is specified by the W3C, and implementations are provided by vendors such
as IBM, Oracle, Sun, Microsoft and Apache.

DOM interface API supplied by
W3C - org.w3c.dom

Implementations supplied by
IBM - XML4J - com.ibm.xml.parser
Oracle - XMLParserV2
Sun - JAXP
Microsoft - MSXMLDOM
Apache - Xerces

Main DOM interfaces and their methods (diagram):

Node
getChildNodes()
getNodeName()
getNodeValue()
hasChildNodes()

NodeList
getLength()
item()

Element (a kind of Node)

Document (a kind of Node)
createElement()
createAttribute()
createTextNode()
createComment()
createProcessingInstruction()

Here is an example of how to create an XML document using the DOM API:


// DocumentGenerator.java
// Uses IBM XML4J (com.ibm.xml.parser) as the DOM implementation
import com.ibm.xml.parser.*;
import org.w3c.dom.*;
import java.io.*;
public class DocumentGenerator
{
    public static void main(String args[]) throws Exception
    {
        // Create an empty document and the root element
        TXDocument doc=new TXDocument();
        Element rootEle=doc.createElement("DEPTS");

        // Build <DEPT><DEPTNO>1</DEPTNO><DNAME>Production</DNAME></DEPT>
        Element deptEle=doc.createElement("DEPT");
        Element deptNoEle=doc.createElement("DEPTNO");
        Element dnameEle=doc.createElement("DNAME");
        deptNoEle.appendChild(doc.createTextNode("1"));
        dnameEle.appendChild(doc.createTextNode("Production"));
        deptEle.appendChild(deptNoEle);
        deptEle.appendChild(dnameEle);

        // Attach the subtree to the root, and the root to the document
        rootEle.appendChild(deptEle);
        doc.appendChild(rootEle);

        // Write the tree to depts.xml; close the writer so that the output is flushed
        Writer out=new FileWriter("depts.xml");
        doc.printWithFormat(out);
        out.close();
    }// main()
}// class

The above code generates the following XML document:
<!--depts.xml-->
<DEPTS>
<DEPT>
<DEPTNO>1</DEPTNO>
<DNAME>Production</DNAME>
</DEPT>
</DEPTS>
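
The same document can also be built without the XML4J classes. The sketch below is an assumption-labelled alternative that uses only the JAXP classes bundled with the JDK: DocumentBuilderFactory stands in for TXDocument, and an identity Transformer stands in for printWithFormat. The class name JaxpDocumentGeneratorSketch and the method buildDepts are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical JAXP counterpart of DocumentGenerator: returns the XML as a string.
class JaxpDocumentGeneratorSketch {
    static String buildDepts() {
        try {
            // Create an empty DOM document through the standard factory
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("DEPTS");
            Element dept = doc.createElement("DEPT");
            Element deptNo = doc.createElement("DEPTNO");
            Element dname = doc.createElement("DNAME");
            deptNo.appendChild(doc.createTextNode("1"));
            dname.appendChild(doc.createTextNode("Production"));
            dept.appendChild(deptNo);
            dept.appendChild(dname);
            root.appendChild(dept);
            doc.appendChild(root);

            // An identity transformer serializes the DOM tree
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints <DEPTS><DEPT><DEPTNO>1</DEPTNO><DNAME>Production</DNAME></DEPT></DEPTS>
        System.out.println(buildDepts());
    }
}
```

The output is on one line because indentation is not requested; the element structure is the same as in depts.xml.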

// The below application reads XML document content using the DOM API
// DocumentReader.java
import java.io.*;
import com.ibm.xml.parser.*;
import org.w3c.dom.*;
public class DocumentReader
{
    public static void main(String args[]) throws Exception
    {
        // Parse depts.xml into a DOM tree with the XML4J parser
        Parser p=new Parser("err");
        Document doc=p.readStream(new FileInputStream("depts.xml"));
        Element rootEle=doc.getDocumentElement();

        // First level: children of <DEPTS>
        NodeList nl1=rootEle.getChildNodes();
        int len1=nl1.getLength();
        for(int i=0;i<len1;i++)
        {
            Node n1=nl1.item(i);
            // Second level: children of each <DEPT>
            NodeList nl2=n1.getChildNodes();
            int len2=nl2.getLength();
            for(int j=0;j<len2;j++)
            {
                Node n2=nl2.item(j);
                // Child is an XML4J class; the cast gives access to getText()
                System.out.println(n2.getNodeName()+":"+((Child)n2).getText());
            }// for2
            System.out.println();
        }// for1
    }// main()
}// class
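
For readers without XML4J, the same two-level walk can be sketched with the JAXP parser bundled with the JDK. The names JaxpDomReaderSketch and readDepartments are hypothetical, and the sketch differs from the listing above in two labelled ways: it reads from a string instead of depts.xml, and it skips the whitespace Text nodes that sit between elements in a formatted document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical JAXP counterpart of DocumentReader: returns "NAME:text" pairs.
class JaxpDomReaderSketch {
    static List<String> readDepartments(String xml) {
        List<String> out = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // First level: children of the root element
            NodeList depts = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < depts.getLength(); i++) {
                Node dept = depts.item(i);
                if (dept.getNodeType() != Node.ELEMENT_NODE) continue; // skip whitespace text
                // Second level: children of each department element
                NodeList fields = dept.getChildNodes();
                for (int j = 0; j < fields.getLength(); j++) {
                    Node field = fields.item(j);
                    if (field.getNodeType() != Node.ELEMENT_NODE) continue;
                    out.add(field.getNodeName() + ":" + field.getTextContent().trim());
                }
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<DEPTS>\n  <DEPT>\n    <DEPTNO>1</DEPTNO>\n    <DNAME>Production</DNAME>\n  </DEPT>\n</DEPTS>";
        System.out.println(readDepartments(xml)); // prints [DEPTNO:1, DNAME:Production]
    }
}
```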


SAX: (Simple API for XML Parsing)


The SAX parser was introduced earlier in this document.

SAX API package: org.xml.sax (SAX was developed by the XML-DEV mailing list community; it is not
a W3C specification)


i) interface DocumentHandler
-A developer who wants to use SAX parsing must write a class that implements the
DocumentHandler interface (this is the SAX 1 interface; SAX 2 replaces it with ContentHandler)
-The functions available in this interface are:
--void characters(char[], int start, int length)
This event is fired by the SAX parser on the DocumentHandler implementation when text data is
found in the XML document
--endDocument()
This event is fired only once in the lifecycle of XML parsing
--endElement(String name)
--ignorableWhitespace(char[], int start, int length)
This event is fired for whitespace in element content that the parser knows can be ignored (for
example, indentation between child elements when a DTD is available)
--processingInstruction(java.lang.String target, java.lang.String data)
This event is fired when a PI is found, for example:
<?xml-stylesheet type="text/xsl" href="emps.xsl"?>
The XML declaration (<?xml version="1.0" encoding="UTF-8" standalone="yes"?>) is not reported
through this event
--void setDocumentLocator(Locator)
Supplies a Locator object that reports where in the document (system ID, line, column) each
event occurred
--startDocument()
--startElement(String name, AttributeList)

ii) class HandlerBase


-Implements DocumentHandler with empty (do-nothing) methods; also called an adapter class
-Most SAX developers derive their handler classes from this class

// SAXExample.java
import org.xml.sax.*;
import java.io.*;
import com.ibm.xml.parser.*;

public class SAXExample extends HandlerBase
{
    public void startDocument()
    {
        System.out.println("Document starts here");
    }

    public void startElement(String name, AttributeList list)
    {
        System.out.println("\t"+name+" element starts here");
        int len1=list.getLength();
        System.out.println("\t\t Attributes");
        for(int i=0;i<len1;i++)
        {
            System.out.println("\t\t"+list.getName(i)+":"+list.getValue(i));
        }// for()
        System.out.println("\t\t Attributes ends here");
    }// startElement()

    public void characters(char ch[], int start, int len)
    {
        String text=new String(ch, start, len);
        System.out.println("\t\t"+text);
    }// characters()

    public void endElement(String name)
    {
        System.out.println(name+" element ends here");
    }

    public void endDocument()
    {
        System.out.println("The END");
    }

    public static void main(String args[]) throws Exception
    {
        // Instantiate SAXDriver (sub class of org.xml.sax.Parser interface)
        SAXDriver sax=new SAXDriver();

        // Instantiate DocumentHandler sub class
        SAXExample ex=new SAXExample();

        // DocumentHandler sub class must be registered with SAX parser (event destination)
        sax.setDocumentHandler(ex);

        // I/P xml document into SAX parser
        sax.parse(new InputSource(new FileInputStream(args[0])));
    }// main()
}// class
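
The example above depends on IBM's com.ibm.xml.parser.SAXDriver, which is not part of the JDK. As a rough sketch only, the same event flow can be reproduced with the parser bundled in the JDK, assuming the SAX2 class DefaultHandler in place of HandlerBase. The class name SaxEventsSketch, the parse() helper and the inline sample document are illustrative assumptions, not part of the original example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: records each SAX event as one line of text.
class SaxEventsSketch extends DefaultHandler {
    private final StringBuilder log = new StringBuilder();
    private Locator locator;

    @Override
    public void setDocumentLocator(Locator locator) {
        // The parser hands over a Locator that reports the position of later events.
        this.locator = locator;
    }

    @Override
    public void startDocument() {
        log.append("startDocument\n");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        log.append("startElement ").append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            log.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
        }
        if (locator != null) {
            log.append(" (line ").append(locator.getLineNumber()).append(')');
        }
        log.append('\n');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Whitespace-only text is skipped to keep the log short.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            log.append("characters ").append(text).append('\n');
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        log.append("endElement ").append(qName).append('\n');
    }

    @Override
    public void endDocument() {
        log.append("endDocument\n");
    }

    // Parses the given XML text and returns the recorded event log.
    static String parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        SaxEventsSketch handler = new SaxEventsSketch();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.log.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(parse("<EMPS><EMP id=\"1\"><ENAME>ABC</ENAME></EMP></EMPS>"));
    }
}
```

The handler only overrides the callbacks it cares about; every other event falls through to the empty implementations of the adapter class, which is the reason for extending it rather than implementing the interface directly.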

XSL: eXtensible Stylesheet Language


In simple words, XSL is used to transform the structure of an XML document into another
structure while keeping the content as it is. For example, compare the following two code snippets:
<EMPS>
<EMP>
<EMPNO>1</EMPNO>
<ENAME>ABC</ENAME>
</EMP>
</EMPS>

<TABLE>
<TR>
<TD>1</TD>
<TD>ABC</TD>
</TR>
</TABLE>

In the code above, the <EMPS> element is replaced with <TABLE>, each <EMP> element with <TR>,
and the <EMPNO> and <ENAME> elements with <TD>, while the data remains unchanged. The process
of converting XML tags into HTML or any other markup language format is called transformation.
The tool that performs the transformation is called a transformation tool or XSLT processor
(XSLT - XSL Transformations), and the document which contains the transformation rules is
called an XSL document (stylesheet).
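
The EMPS to TABLE conversion described above can be tried out with the JAXP classes that ship with the JDK. The following is a minimal sketch, assuming an XSLT 1.0 stylesheet held in a string; the class name EmpToTableSketch, the transform() helper and the exact stylesheet text are illustrative assumptions rather than the author's code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class: applies an in-memory stylesheet to an in-memory XML document.
class EmpToTableSketch {
    // Source document from the text, written on one line.
    static final String XML =
        "<EMPS><EMP><EMPNO>1</EMPNO><ENAME>ABC</ENAME></EMP></EMPS>";

    // One template rule per element that has to be renamed.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='EMPS'><TABLE><xsl:apply-templates/></TABLE></xsl:template>"
      + "<xsl:template match='EMP'><TR><xsl:apply-templates/></TR></xsl:template>"
      + "<xsl:template match='EMPNO|ENAME'><TD><xsl:value-of select='.'/></TD></xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the transformation and returns the result as a string.
    static String transform(String xml, String xsl) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XML, XSL));
    }
}
```

Only the tags change; the text 1 and ABC is carried over by xsl:value-of, which is the sense in which the data stays constant.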

A stylesheet contains a set of template rules. A template rule has two parts:
*a pattern which is matched against nodes in the source tree (XML document)
*and a template which can be instantiated to form part of the result tree (XSL document
instructions).

This allows a stylesheet to be applicable to a wide class of documents that have similar source
tree structures.

XSLT makes use of the expression language defined by [XPath] for selecting elements for
processing, for conditional processing and for generating text.
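
The three uses of XPath named above (selecting elements, testing a condition, generating text) can be explored outside a stylesheet with the javax.xml.xpath API of the JDK. This is a small sketch under assumptions: the class XPathSketch, its helper methods and the second employee (XYZ) in the sample data are hypothetical and only serve the illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class: evaluates XPath expressions against a small sample document.
class XPathSketch {
    // Sample data; the second EMP entry is made up for this sketch.
    static final String XML =
        "<EMPS>"
      + "<EMP><EMPNO>1</EMPNO><ENAME>ABC</ENAME></EMP>"
      + "<EMP><EMPNO>2</EMPNO><ENAME>XYZ</ENAME></EMP>"
      + "</EMPS>";

    // Evaluates the expression and returns its string value.
    static String evalString(String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
    }

    // Evaluates the expression as a node set and returns how many nodes it selects.
    static int countNodes(String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Selecting elements for processing.
        System.out.println(countNodes("/EMPS/EMP"));
        // Conditional processing: a boolean expression.
        System.out.println(evalString("count(/EMPS/EMP) > 1"));
        // Generating text: the string value of a selected node.
        System.out.println(evalString("/EMPS/EMP[EMPNO=2]/ENAME"));
    }
}
```

Inside a stylesheet the same expressions would appear in attributes such as select (xsl:apply-templates, xsl:value-of) and test (xsl:if).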

Brief introduction to NameSpace:


In simple words, a namespace is a name, written as a URI, that identifies the vocabulary to which
the elements used in the XSL document belong. The URI only has to be unique; a processor does
not download anything from it. A namespace name may be a URL or a URN (Uniform Resource Name).

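The XSLT 1.0 namespace has the URI http://www.w3.org/1999/XSL/Transform. The older URI
http://www.w3.org/TR/WD-xsl belongs to an earlier working draft of XSL (supported by Internet
Explorer 5) and is not recognized by XSLT 1.0 processors.
[NOTE: The 1999 in the URI indicates the year in which the URI was allocated by the W3C. It
does not indicate the version of XSLT being used, which is specified by attributes]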

In every XSL document this namespace must be declared, normally on the root element. The root
element of an XSL document is "xsl:stylesheet", where xsl is the namespace prefix and
stylesheet is the element name.

XSLT processors must use the XML namespaces mechanism [XML Names] to recognize
elements and attributes from this namespace. Elements from the XSLT namespace are
recognized only in the stylesheet not in the source document.
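
To make the point concrete that the namespace URI, not the prefix, is what identifies an XSLT element, the following sketch parses a root element with a namespace-aware DOM parser and checks its namespace. The class NamespaceSketch and its method names are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class: checks whether a document's root element is an XSLT 1.0 stylesheet element.
class NamespaceSketch {
    static final String XSLT_NS = "http://www.w3.org/1999/XSL/Transform";

    // Parses the text with namespace support switched on and returns the root element.
    static Element rootOf(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // off by default in JAXP
        return factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
    }

    // True when the root is "stylesheet" (or its synonym "transform") in the XSLT namespace.
    static boolean isXsltStylesheet(String xml) throws Exception {
        Element root = rootOf(xml);
        String local = root.getLocalName();
        return XSLT_NS.equals(root.getNamespaceURI())
            && ("stylesheet".equals(local) || "transform".equals(local));
    }

    public static void main(String[] args) throws Exception {
        // Usual prefix.
        System.out.println(isXsltStylesheet(
            "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSLT_NS + "'/>"));
        // Different prefix, same namespace URI.
        System.out.println(isXsltStylesheet(
            "<t:stylesheet version='1.0' xmlns:t='" + XSLT_NS + "'/>"));
        // Usual prefix, but bound to the old working-draft URI.
        System.out.println(isXsltStylesheet(
            "<xsl:stylesheet xmlns:xsl='http://www.w3.org/TR/WD-xsl'/>"));
    }
}
```

The prefix xsl is a convention only; any prefix bound to the XSLT namespace URI identifies the same elements.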

The xsl:stylesheet element may contain the following types of elements:

xsl:import
xsl:include
xsl:strip-space
xsl:preserve-space
xsl:output
xsl:key
xsl:decimal-format
xsl:namespace-alias
xsl:attribute-set
xsl:variable
xsl:param
xsl:template

Here is a skeleton XSL document showing all of these sub elements:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">


<xsl:import href="..."/>
<xsl:include href="..."/>
<xsl:strip-space elements="..."/>
<xsl:preserve-space elements="..."/>
<xsl:output method="..."/>
<xsl:key name="..." match="..." use="..."/>
<xsl:decimal-format name="..."/>
<xsl:namespace-alias stylesheet-prefix="..." result-prefix="..."/>
<xsl:attribute-set name="...">
...
</xsl:attribute-set>
<xsl:variable name="...">...</xsl:variable>

<xsl:param name="...">...</xsl:param>
<xsl:template match="...">
...
</xsl:template>
<xsl:template name="...">
...
</xsl:template>
</xsl:stylesheet>
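
As a hedged illustration of a few of these top-level elements working together, the sketch below uses xsl:output, xsl:param, xsl:variable and xsl:template in one small stylesheet and overrides the parameter from Java through Transformer.setParameter. The class TopLevelSketch, the parameter name greeting and the variable name sep are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class: shows xsl:output, xsl:param, xsl:variable and xsl:template together.
class TopLevelSketch {
    static final String XML =
        "<EMPS><EMP><EMPNO>1</EMPNO><ENAME>ABC</ENAME></EMP></EMPS>";

    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='greeting'>Hello</xsl:param>"
      + "<xsl:variable name='sep' select=\"', '\"/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='$greeting'/>"
      + "<xsl:value-of select='$sep'/>"
      + "<xsl:value-of select='/EMPS/EMP/ENAME'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet; a non-null greeting replaces the default value of the xsl:param.
    static String run(String greeting) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        if (greeting != null) {
            t.setParameter("greeting", greeting);
        }
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(null));      // uses the default declared in the stylesheet
        System.out.println(run("Welcome")); // value supplied by the caller
    }
}
```

A parameter differs from a variable only in that its value can be supplied from outside the stylesheet; the value written in the stylesheet is a default.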

The XSL specification describes two things:

-the XSL document (stylesheet)
-the XSLT (Transformation) tool

Classification of XML languages:


i) Structured Language (.xml)
ii) Semantic Language (Schema, .xsd - XML Schema Definition)
iii) Stylistic Language (.xsl) - contains transformation rules
iv) Linking Language (.xll - XML Linking Language, later standardized as XLink)
used to provide hyperlinking between XML documents
v) WSDL (Web Services Description Language) - .wsdl
used to describe the business interface of a web service in a way that is independent of the
application development platform

XSL is mainly used to transform content from one form into another (XML to HTML, XML to XML,
XML to WML, XML to PDAML)

XSL technology uses two elements:

i) XSL document (contains stylesheet rules / transformation rules)
ii) XSLT processor - a tool which transforms XML content into the format specified by the stylesheet

XSL document structure is as follows:


<?xml version="1.0"?>
<!--dept.xsl-->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
        <HTML>
            <BODY>
                <xsl:apply-templates/>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="DEPTS">
        <TABLE border="10" align="center">
            <xsl:apply-templates/>
        </TABLE>
    </xsl:template>

    <xsl:template match="DEPT">
        <TR>
            <xsl:apply-templates/>
        </TR>
    </xsl:template>

    <xsl:template match="DEPT/*">
        <TD><xsl:value-of select="."/></TD>
    </xsl:template>
</xsl:stylesheet>

Procedure for browser in-built transformation:


<?xml version="1.0"?>
<!--dept.xml-->
<?xml-stylesheet type="text/xsl" href="dept.xsl"?>
<DEPTS>
</DEPTS>
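
The browser finds the stylesheet through the xml-stylesheet processing instruction. How a particular browser does this internally is not covered here; the sketch below only mimics the lookup step with a SAX handler, assuming the href pseudo-attribute can be picked out of the PI data with a simple regular expression. The class StylesheetPiSketch and its method names are hypothetical.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: reads the href of the first xml-stylesheet processing instruction.
class StylesheetPiSketch extends DefaultHandler {
    // Simplified pattern for the href pseudo-attribute inside the PI data.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']*)[\"']");

    private String href;

    @Override
    public void processingInstruction(String target, String data) {
        // target is the PI name, data is everything after it.
        if ("xml-stylesheet".equals(target) && href == null && data != null) {
            Matcher m = HREF.matcher(data);
            if (m.find()) {
                href = m.group(1);
            }
        }
    }

    // Returns the stylesheet href, or null when the document has no xml-stylesheet PI.
    static String findStylesheetHref(String xml) throws Exception {
        StylesheetPiSketch handler = new StylesheetPiSketch();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.href;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
            + "<?xml-stylesheet type=\"text/xsl\" href=\"dept.xsl\"?>"
            + "<DEPTS></DEPTS>";
        System.out.println(findStylesheetHref(xml));
    }
}
```

The XML declaration on the first line is not reported to processingInstruction(), so only the xml-stylesheet instruction reaches the handler.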

Procedure for programmatic transformation:


---------------------------------------------------------
The API used here (JAXP) is from J2SDK 1.4.0

// XSLTExample.java
import java.io.File;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import org.w3c.dom.Document;

public class XSLTExample
{
    public static void main(String args[]) throws Exception
    {
        // args[0] = source XML, args[1] = XSL stylesheet, args[2] = output file
        DocumentBuilderFactory factory=DocumentBuilderFactory.newInstance();

        // XSLT elements are recognized by their namespace, so the DOM parser must be namespace aware
        factory.setNamespaceAware(true);
        DocumentBuilder docBuilder=factory.newDocumentBuilder();

        // Parse the stylesheet and the source document into DOM trees
        Document xslDoc=docBuilder.parse(new File(args[1]));
        Document xmlDoc=docBuilder.parse(new File(args[0]));

        // Build a Transformer from the stylesheet
        TransformerFactory tf=TransformerFactory.newInstance();
        Transformer t=tf.newTransformer(new DOMSource(xslDoc));

        // Apply the stylesheet to the source document and write the result file
        t.transform(new DOMSource(xmlDoc), new StreamResult(new File(args[2])));
    }// main()
}// class

java XSLTExample dept.xml dept.xsl dept.html
