
Both SAX and DOM are used to parse XML documents. Each has advantages and disadvantages, and either can be the right choice depending on the situation.

SAX                                      | DOM
Parses node by node                      | Stores the entire XML document in memory before processing
Doesn't store the XML in memory          | Occupies more memory
We can't insert or delete a node         | We can insert or delete nodes
SAX is an event-based parser             | DOM is a tree-model parser
SAX is the Simple API for XML            | Document Object Model (DOM) API
Doesn't report comments by default       | Preserves comments
Generally runs a little faster than DOM  | Generally runs a little slower than SAX
Reads the document once, top to bottom   | Traverse in any direction

Which one is better depends on the characteristics of your application (please refer to the questions below).

1. Which parser is faster, DOM or SAX?

A SAX parser is generally faster, because it does not build an in-memory tree of the whole document.

2. What's the difference between a tree-based API and an event-based API?

A tree-based API is centered around a tree structure and therefore provides interfaces for the components of a tree (here, a DOM document), such as the Document, Node, NodeList, Element and Attr interfaces. An event-based API, by contrast, provides interfaces for handlers. SAX has four handler interfaces: ContentHandler, DTDHandler, EntityResolver and ErrorHandler.

3. What is the difference between a DOM parser and a SAX parser?

DOM parsers and SAX parsers work in different ways.

- A DOM parser creates a tree structure in memory from the input document and then waits for requests from the client. A SAX parser does not create any internal structure. Instead, it treats the occurrences of components of an input document as events, and tells the client what it reads as it reads through the input document.

- A DOM parser always serves the client application with the entire document, no matter how much is actually needed by the client. A SAX parser only ever serves the client application with pieces of the document at any given time.

- With a DOM parser, method calls in the client application have to be explicit and form a kind of chain. With SAX, certain methods (usually overridden by the client) are invoked automatically (implicitly) as "callbacks" when certain events occur. These methods do not have to be called explicitly by the client, though we could call them explicitly.
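
To make the contrast concrete, here is a minimal sketch that reads the same input once with each parser. The class name ParserContrast, the sample XML and the method names are assumptions made for illustration; only the JDK's built-in javax.xml.parsers and org.xml.sax APIs are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class ParserContrast {
    // Hypothetical sample input, chosen only for this sketch.
    static final String XML = "<users><user id=\"100\"/><user id=\"101\"/></users>";

    // DOM: the whole tree is built first, then the client queries it explicitly.
    static List<String> idsWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList users = doc.getElementsByTagName("user");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < users.getLength(); i++) {
                ids.add(((Element) users.item(i)).getAttribute("id"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    // SAX: the parser calls startElement() as it reads; nothing is kept unless we keep it.
    static List<String> idsWithSax(String xml) {
        List<String> ids = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("user".equals(qName)) {
                    ids.add(atts.getValue("id"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(idsWithDom(XML));
        System.out.println(idsWithSax(XML));
    }
}
```

Both methods return the same list of ids; the difference is that the DOM version holds the whole document in memory while it works, whereas the SAX version only sees one event at a time.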

4. What are some real-world applications where a SAX parser is more advantageous than a DOM parser, and vice versa? What are the usual applications for each?

In the following cases, a SAX parser is more advantageous than a DOM parser.

- The input document is too big for available memory (in this case SAX is the only choice of the two).

- You can process the document in small contiguous chunks of input. You do not need the entire document before you can do useful work.

- You just want to use the parser to extract the information of interest, and all your computation will be based on data structures you create yourself. In most applications we create data structures of our own, which are usually not as complicated as the DOM tree. In this sense, I think, a DOM parser is needed less often than a SAX parser.

In the following cases, a DOM parser is more advantageous than a SAX parser.

- Your application needs to access widely separated parts of the document at the same time.

- Your application would probably use an internal data structure that is almost as complicated as the document itself.

- Your application has to modify the document repeatedly.

- Your application has to store the document for a significant amount of time through many method calls.

Example (Use a DOM parser or a SAX parser?):

Assume that an instructor has an XML document containing all the personal information of his students as well as the points they earned in his class, and he is now assigning final grades using an application. What he wants to produce is a list with the SSNs and the grades. We also assume that, in his application, the instructor uses no data structure such as arrays to store the students' personal information and points.
If the instructor decides to give A's to those who earned the class average or above, and B's to the others, then he had better use a DOM parser in his application. The reason is that he has no way of knowing the class average before the entire document has been processed. What he probably needs to do in his application is first look through all the students' points and compute the average, and then look through the document again and assign the final grade to each student by comparing the points the student earned to the class average.
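
A minimal sketch of this two-pass approach with DOM follows. The document layout (a student element with an ssn attribute and a points child) and the class name AverageGrader are assumptions made for illustration; the original text does not specify the instructor's file format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AverageGrader {
    // Assumed layout: <students><student ssn="..."><points>85</points></student>...</students>
    static List<String> grade(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList students = doc.getElementsByTagName("student");

            // First pass over the tree: compute the class average.
            double total = 0;
            for (int i = 0; i < students.getLength(); i++) {
                total += points((Element) students.item(i));
            }
            double average = total / students.getLength();

            // Second pass over the same tree: compare each student to the average.
            List<String> result = new ArrayList<>();
            for (int i = 0; i < students.getLength(); i++) {
                Element student = (Element) students.item(i);
                String grade = points(student) >= average ? "A" : "B";
                result.add(student.getAttribute("ssn") + " " + grade);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not grade document", e);
        }
    }

    // Reads the numeric content of the student's <points> child.
    private static int points(Element student) {
        String text = student.getElementsByTagName("points").item(0).getTextContent();
        return Integer.parseInt(text.trim());
    }

    public static void main(String[] args) {
        String xml = "<students>"
                + "<student ssn=\"111\"><points>80</points></student>"
                + "<student ssn=\"222\"><points>100</points></student>"
                + "</students>";
        System.out.println(grade(xml));
    }
}
```

The second loop is only possible because the DOM tree is still in memory after the first one; no separate array of points is needed.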
If, however, the instructor adopts a grading policy in which students who got 90 points or more are assigned A's and the others are assigned B's, then he had probably better use a SAX parser. The reason is that, to assign each student a final grade, he does not need to wait for the entire document to be processed. He can assign a grade to a student immediately, as soon as the SAX parser reads that student's points.
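
The fixed-threshold policy can be sketched with a SAX handler as below. As before, the element and attribute names (student, ssn, points) and the class name ThresholdGrader are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class ThresholdGrader {
    // Assumed layout: <students><student ssn="..."><points>85</points></student>...</students>
    static List<String> grade(String xml) {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private String ssn;                                  // SSN of the student being read
            private boolean inPoints;                            // true while inside <points>
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("student".equals(qName)) {
                    ssn = atts.getValue("ssn");
                } else if ("points".equals(qName)) {
                    inPoints = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inPoints) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("points".equals(qName)) {
                    inPoints = false;
                    int points = Integer.parseInt(text.toString().trim());
                    // The grade is decided right away; the rest of the document is not needed.
                    result.add(ssn + " " + (points >= 90 ? "A" : "B"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not grade document", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<students>"
                + "<student ssn=\"111\"><points>89</points></student>"
                + "<student ssn=\"222\"><points>90</points></student>"
                + "</students>";
        System.out.println(grade(xml));
    }
}
```

Only the current SSN and the current points text are held at any time, so memory use does not grow with the size of the document (apart from the result list).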
In the above analysis, we assumed that the instructor created no data structure of his own. What if he creates his own data structures, such as an array of strings to store the SSNs and an array of integers to store the points? In this case, I think SAX is the better choice, because it could save both memory and time and still get the job done.
One more consideration on this example: what if the instructor does not want to print a list, but to save the original document back with the grade of each student updated? In this case, a DOM parser should be the better choice no matter what grading policy he adopts. He does not need to create any data structure of his own. He only needs to modify the DOM tree (i.e., set the value of the 'grade' node) and then save the whole modified tree. If he chooses a SAX parser instead, he has to create a data structure almost as complicated as a DOM tree before he can get the job done.

DOM XML Parser in Java


A DOM parser is a tree-based API. A tree-based API is centered around a tree structure and therefore provides interfaces for the components of a tree (here, a DOM document), such as the Document, Node, NodeList, Element and Attr interfaces.

A DOM parser creates a tree structure in memory from the input document and then waits for requests from the client. A DOM parser always serves the client application with the entire document, no matter how much is actually needed by the client. With a DOM parser, method calls in the client application have to be explicit and form a kind of chain of method calls.

SAX XML Parser in Java


A SAX parser is an event-based API. An event-based API usually provides interfaces for handlers. There are four handler interfaces: ContentHandler, DTDHandler, EntityResolver and ErrorHandler.
A SAX parser does not create any internal structure. Instead, it treats the occurrences of components of an input document as events, and tells the client what it reads as it reads through the input document. A SAX parser only ever serves the client application with pieces of the document at any given time. With a SAX parser, custom methods (callback methods) are called when certain events occur while parsing the XML document. These methods do not have to be called explicitly by the client, though we could call them explicitly.

Can SAX and DOM parsers be used at the same time?
Yes, of course, because the use of a DOM parser and the use of a SAX parser are independent. For example, if your application needs to work on two XML documents and does different things with each, you could use a DOM parser on one document and a SAX parser on the other, and then combine the results or make the two processing steps cooperate.

Let's create a demo program to understand this fully.

Step 1) Prepare xml file to be parsed


This XML file contains attributes as well as elements. The element names must match the names the handler in Step 3 checks for (firstName, lastName), since XML names are case-sensitive.
<users>
    <user id="100">
        <firstName>Tom</firstName>
        <lastName>Hanks</lastName>
    </user>
    <user id="101">
        <firstName>Lokesh</firstName>
        <lastName>Gupta</lastName>
    </user>
    <user id="102">
        <firstName>HowToDo</firstName>
        <lastName>InJava</lastName>
    </user>
</users>

Step 2) Create model class


package com.howtodoinjava.xml.sax;

/**
 * Model class. Its instances will be populated using SAX parser.
 */
public class User
{
    //XML attribute id
    private int id;
    //XML element firstName
    private String firstName;
    //XML element lastName
    private String lastName;

    public int getId() {
        return id;
    }
    public void setId(int id) {
        this.id = id;
    }
    public String getFirstName() {
        return firstName;
    }
    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }
    public String getLastName() {
        return lastName;
    }
    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    @Override
    public String toString() {
        return this.id + ":" + this.firstName + ":" + this.lastName;
    }
}

Step 3) Build the handler by extending DefaultHandler

Below is the code for the parse handler. Additional information is given in the code comments.
package com.howtodoinjava.xml.sax;

import java.util.ArrayList;
import java.util.Stack;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class UserParserHandler extends DefaultHandler
{
    //This is the list which shall be populated while parsing the XML.
    private ArrayList<User> userList = new ArrayList<>();

    //As we read any XML element we will push that in this stack
    private Stack<String> elementStack = new Stack<>();

    //Holds the User being built; when its 'user' block ends, it is moved to userList
    private Stack<User> objectStack = new Stack<>();

    @Override
    public void startDocument() throws SAXException
    {
        //System.out.println("start of the document : ");
    }

    @Override
    public void endDocument() throws SAXException
    {
        //System.out.println("end of the document : ");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
    {
        //Push it in element stack
        this.elementStack.push(qName);

        //If this is start of 'user' element then prepare a new User instance and push it in object stack
        if ("user".equals(qName))
        {
            //New User instance
            User user = new User();

            //Set all required attributes in any XML element here itself
            if (attributes != null && attributes.getLength() == 1)
            {
                user.setId(Integer.parseInt(attributes.getValue(0)));
            }
            this.objectStack.push(user);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException
    {
        //Remove last added element
        this.elementStack.pop();

        //User instance has been constructed so pop it from object stack and push in userList
        if ("user".equals(qName))
        {
            User object = this.objectStack.pop();
            this.userList.add(object);
        }
    }

    /**
     * This will be called every time the parser encounters a value node
     */
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException
    {
        String value = new String(ch, start, length).trim();
        if (value.length() == 0)
        {
            return; // ignore white space
        }

        //handle the value based on to which element it belongs
        if ("firstName".equals(currentElement()))
        {
            User user = this.objectStack.peek();
            user.setFirstName(value);
        }
        else if ("lastName".equals(currentElement()))
        {
            User user = this.objectStack.peek();
            user.setLastName(value);
        }
    }

    /**
     * Utility method for getting the current element in processing
     */
    private String currentElement()
    {
        return this.elementStack.peek();
    }

    //Accessor for userList object
    public ArrayList<User> getUsers()
    {
        return userList;
    }
}

Step 4) Write actual parser for our xml file


package com.howtodoinjava.xml.sax;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class UsersXmlParser
{
    public ArrayList<User> parseXml(InputStream in)
    {
        //Create an empty list of users initially
        ArrayList<User> users = new ArrayList<>();
        try
        {
            //Create default handler instance
            UserParserHandler handler = new UserParserHandler();

            //Create parser from factory
            //(XMLReaderFactory is deprecated since Java 9; SAXParserFactory is the usual replacement)
            XMLReader parser = XMLReaderFactory.createXMLReader();

            //Register handler with parser
            parser.setContentHandler(handler);

            //Create an input source from the XML input stream
            InputSource source = new InputSource(in);

            //Parse the document
            parser.parse(source);

            //Replace the empty list created above with the parsed users
            users = handler.getUsers();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return users;
    }
}

Step 5) Test the parser


Let's write some code to test whether our handler actually works.
package com.howtodoinjava.xml.sax;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;

public class TestSaxParser
{
    public static void main(String[] args) throws FileNotFoundException
    {
        //Locate the file
        File xmlFile = new File("D:/temp/sample.xml");

        //Create the parser instance
        UsersXmlParser parser = new UsersXmlParser();

        //Parse the file
        ArrayList<User> users = parser.parseXml(new FileInputStream(xmlFile));

        //Verify the result
        System.out.println(users);
    }
}
Output:
[100:Tom:Hanks, 101:Lokesh:Gupta, 102:HowToDo:InJava]
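
One caveat about the handler in Step 3: a SAX parser is allowed to deliver the text of a single element in several characters() calls (this often happens around entity references such as &amp;amp; or at buffer boundaries), so setting the field directly inside characters() can keep only the last chunk. A common remedy, sketched below under the assumption of the same firstName/lastName element names, is to collect text in a buffer and use it in endElement(). The class name BufferingUserHandler and its parse() helper are hypothetical and not part of the tutorial's code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class BufferingUserHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();  // text of the current element
    private final List<String> fullNames = new ArrayList<>();
    private String firstName;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // a new element starts: discard text collected so far
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times for one element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if ("firstName".equals(qName)) {
            firstName = value;
        } else if ("lastName".equals(qName)) {
            fullNames.add(firstName + " " + value);
        }
        text.setLength(0);
    }

    // Convenience helper for the sketch: parse a string and return "first last" names.
    static List<String> parse(String xml) {
        BufferingUserHandler handler = new BufferingUserHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return handler.fullNames;
    }

    public static void main(String[] args) {
        String xml = "<users><user id=\"1\"><firstName>Tom &amp; Jerry</firstName>"
                + "<lastName>Hanks</lastName></user></users>";
        System.out.println(parse(xml));
    }
}
```

Because the value is only read in endElement(), the result is the same however the parser chooses to split the text.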

Steps to Using DOM Parser


Let's note down the broad steps involved in using a DOM parser to parse any XML file in Java.

DOM Parser in Action

Import XML-related packages

You will need to import the packages below in your application first.
import org.w3c.dom.*;
import javax.xml.parsers.*;
import java.io.*;

Create a DocumentBuilder
The next step is to create the DocumentBuilder object.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

Create a Document from a file

Document document = builder.parse(new File(file));

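Validate Document structure

This step is optional, but it is good to validate the document before you start processing it.
import javax.xml.XMLConstants;
import javax.xml.transform.dom.DOMSource;
import javax.xml.validation.*;
import org.xml.sax.SAXException;

try {
    String language = XMLConstants.W3C_XML_SCHEMA_NS_URI;
    SchemaFactory schemaFactory = SchemaFactory.newInstance(language);
    Schema schema = schemaFactory.newSchema(new File(name));   //name is the path of the XSD file
    Validator validator = schema.newValidator();
    validator.validate(new DOMSource(document));   //throws SAXException if the document is invalid
} catch (SAXException | IOException e) {
    e.printStackTrace();
}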
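
For a version that can be run without any files, here is a hedged sketch of the same validation idea. To stay self-contained it validates from a string (StreamSource) rather than from a parsed DOMSource, and the inline XSD, the class name SchemaCheck and the isValid() helper are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class SchemaCheck {
    // Hypothetical schema: <users> holds one or more <user> elements with a required integer id.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"users\"><xs:complexType><xs:sequence>"
          + "<xs:element name=\"user\" maxOccurs=\"unbounded\"><xs:complexType>"
          + "<xs:attribute name=\"id\" type=\"xs:int\" use=\"required\"/>"
          + "</xs:complexType></xs:element>"
          + "</xs:sequence></xs:complexType></xs:element>"
          + "</xs:schema>";

    // Returns true when the XML conforms to XSD, false when validation reports an error.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for schema violations and for XML that is not well-formed.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<users><user id=\"100\"/></users>"));
        System.out.println(isValid("<users><user id=\"abc\"/></users>"));
    }
}
```

With no ErrorHandler registered, the Validator throws on the first error, which is what lets the sketch map the outcome to a boolean.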

Extract the root element


You can get the root element of the XML document using the code below.
Element root = document.getDocumentElement();

Examine attributes
You can examine an element's attributes using the methods below.
element.getAttribute("attributeName");    //returns the value of a specific attribute ("" if absent)
element.getAttributes();                  //returns a NamedNodeMap of attribute names/values
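
Since getAttributes() returns a NamedNodeMap rather than a java.util.Map, it is walked by index. A small sketch, in which the class name AttributeDump and its helper methods are illustrative assumptions:

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class AttributeDump {
    // Parses the XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    // Copies the element's attributes into an ordinary sorted map.
    static Map<String, String> attributesOf(Element element) {
        NamedNodeMap attrs = element.getAttributes();
        Map<String, String> result = new TreeMap<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);            // each item is an Attr node
            result.put(attr.getNodeName(), attr.getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Element user = parseRoot("<user id=\"100\" role=\"admin\"/>");
        System.out.println(attributesOf(user));
        System.out.println("[" + user.getAttribute("missing") + "]");
    }
}
```

Note that getAttribute() returns an empty string, not null, for an attribute that is not present; hasAttribute() can be used to tell the two cases apart.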

Examine sub-elements
Child elements can be queried in the manner below.
element.getElementsByTagName("subElementName");   //returns a list of descendant elements with the specified name
node.getChildNodes();                             //returns a list of all child nodes
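
Keep in mind that getChildNodes() returns every child node, including the whitespace-only text nodes that come from indentation in the file. The sketch below (class and method names are assumptions for illustration) shows the raw count next to a filtered list of element children.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ChildNodesDemo {
    // Parses the XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    // Number of child nodes of any type (elements, text, comments, ...).
    static int rawChildCount(Element parent) {
        return parent.getChildNodes().getLength();
    }

    // Names of the direct children that are elements.
    static List<String> childElementNames(Element parent) {
        NodeList children = parent.getChildNodes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Element users = parseRoot("<users>\n  <user/>\n  <user/>\n</users>");
        System.out.println(rawChildCount(users));      // text and element nodes together
        System.out.println(childElementNames(users));  // element children only
    }
}
```

This is why the examples in this document check getNodeType() == Node.ELEMENT_NODE before casting a node to Element.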


Parsing known xml structure


In the example code below, I assume that the reader already knows the structure of the employees.xml file (its nodes and attributes), so the example directly starts fetching information and printing it to the console. In a real-life application, you would use this information for some real purpose rather than just printing it.
//Get Document Builder
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

//Build Document
Document document = builder.parse(new File("employees.xml"));

//Normalize the XML structure: merges adjacent text nodes and removes empty ones
document.getDocumentElement().normalize();

//Here comes the root node
Element root = document.getDocumentElement();
System.out.println(root.getNodeName());

//Get all employees
NodeList nList = document.getElementsByTagName("employee");
System.out.println("============================");

for (int temp = 0; temp < nList.getLength(); temp++)
{
    Node node = nList.item(temp);
    System.out.println("");    //Just a separator
    if (node.getNodeType() == Node.ELEMENT_NODE)
    {
        //Print each employee's detail
        Element eElement = (Element) node;
        System.out.println("Employee id : " + eElement.getAttribute("id"));
        System.out.println("First Name : " + eElement.getElementsByTagName("firstName").item(0).getTextContent());
        System.out.println("Last Name : " + eElement.getElementsByTagName("lastName").item(0).getTextContent());
        System.out.println("Location : " + eElement.getElementsByTagName("location").item(0).getTextContent());
    }
}

Output:
employees
============================
Employee id : 111
First Name : Lokesh
Last Name : Gupta
Location : India
Employee id : 222
First Name : Alex
Last Name : Gussin
Location : Russia
Employee id : 333
First Name : David
Last Name : Feezor
Location : USA

What is XML Parsing?


Parsing XML refers to going through an XML document to access or modify its data.

What is XML Parser?

An XML parser provides a way to access or modify the data present in an XML document. Java provides multiple options for parsing XML documents. The following are the parsers most commonly used.

- DOM Parser - Parses the document by loading its complete contents and creating its complete hierarchical tree in memory.

- SAX Parser - Parses the document on event-based triggers. Does not load the complete document into memory.

- JDOM Parser - Parses the document in a similar fashion to the DOM parser, but in an easier way.

- StAX Parser - Parses the document in a similar fashion to the SAX parser, but as a pull parser: the application asks for the next event instead of receiving callbacks.

- XPath Parser - Parses the XML based on expressions and is used extensively in conjunction with XSLT.

- DOM4J Parser - A Java library to parse XML, XPath and XSLT using the Java Collections Framework; provides support for DOM, SAX and JAXP.

There are also JAXB and XSLT APIs available to handle XML parsing in an object-oriented way. We'll elaborate on each parser in detail in the next chapters.
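
As a brief illustration of how a StAX parser differs from SAX, the sketch below reads user ids with the JDK's javax.xml.stream API. The application drives the loop and pulls each event itself instead of registering a handler. The class name StaxSketch and the sample element names are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StaxSketch {
    // Collects the id attribute of every <user> element in the given XML.
    static List<String> userIds(String xml) {
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();   // the application pulls the next event
                if (event == XMLStreamConstants.START_ELEMENT
                        && "user".equals(reader.getLocalName())) {
                    ids.add(reader.getAttributeValue(null, "id"));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(userIds("<users><user id=\"100\"/><user id=\"101\"/></users>"));
    }
}
```

Like SAX, this approach does not keep the document in memory; unlike SAX, the loop can stop early simply by breaking out of it.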

Java DOM Parser - Overview


The Document Object Model is an official recommendation of the World
Wide Web Consortium (W3C). It defines an interface that enables programs
to access and update the style, structure, and contents of XML documents.
XML parsers that support the DOM implement that interface.

When to use?
You should use a DOM parser when:

- You need to know a lot about the structure of a document.

- You need to move parts of the document around (you might want to sort certain elements, for example).

- You need to use the information in the document more than once.

What you get?

When you parse an XML document with a DOM parser, you get back a tree structure that contains all of the elements of your document. The DOM provides a variety of functions you can use to examine the contents and structure of the document.

Advantages
The DOM is a common interface for manipulating document structures. One
of its design goals is that Java code written for one DOM-compliant parser
should run on any other DOM-compliant parser without changes.

DOM interfaces
The DOM defines several Java interfaces. Here are the most common ones:

- Node - The base datatype of the DOM.

- Element - The vast majority of the objects you'll deal with are Elements.

- Attr - Represents an attribute of an element.

- Text - The actual content of an Element or Attr.

- Document - Represents the entire XML document. A Document object is often referred to as a DOM tree.

Common DOM methods


When you are working with the DOM, there are several methods you'll use often:

- Document.getDocumentElement() - Returns the root element of the document.

- Node.getFirstChild() - Returns the first child of a given Node.

- Node.getLastChild() - Returns the last child of a given Node.

- Node.getNextSibling() - Returns the next sibling of a given Node.

- Node.getPreviousSibling() - Returns the previous sibling of a given Node.

- Element.getAttribute(attrName) - For a given Element, returns the value of the attribute with the requested name.
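
The child and sibling methods above are enough to walk a node's children in either direction. A minimal sketch follows; the class name SiblingWalk, its helpers and the sample XML are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class SiblingWalk {
    // Parses the XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    // First child to last child, following getNextSibling().
    static List<String> forward(Element parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    // Last child to first child, following getPreviousSibling().
    static List<String> backward(Element parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getLastChild(); n != null; n = n.getPreviousSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Element root = parseRoot("<class><a/><b/><c/></class>");
        System.out.println(forward(root));
        System.out.println(backward(root));
    }
}
```

Walking backwards like this is something a SAX parser cannot offer, because it never holds more than the current event.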

Java DOM Parser - Parse XML Document

Steps to Using DOM


The following steps are used when parsing a document with a DOM parser.

- Import XML-related packages

- Create a DocumentBuilder

- Create a Document from a file or stream

- Extract the root element

- Examine attributes

- Examine sub-elements

Import XML-related packages


import org.w3c.dom.*;
import javax.xml.parsers.*;
import java.io.*;

Create a DocumentBuilder

DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

Create a Document from a file or stream


StringBuilder xmlStringBuilder = new StringBuilder();
//Quotes inside a Java string literal must be escaped
xmlStringBuilder.append("<?xml version=\"1.0\"?> <class> </class>");
ByteArrayInputStream input = new ByteArrayInputStream(
    xmlStringBuilder.toString().getBytes("UTF-8"));
Document doc = builder.parse(input);

Extract the root element


Element root = doc.getDocumentElement();

Examine attributes
//returns the value of a specific attribute
root.getAttribute("attributeName");
//returns a NamedNodeMap of the element's attributes
root.getAttributes();

Examine sub-elements
//returns a list of all descendant elements with the specified name
root.getElementsByTagName("subelementName");
//returns a list of all child nodes
root.getChildNodes();

Demo Example
Here is the input xml file we need to parse:

<?xml version="1.0"?>
<class>
<student rollno="393">
<firstname>dinkar</firstname>
<lastname>kad</lastname>
<nickname>dinkar</nickname>
<marks>85</marks>
</student>
<student rollno="493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>vinni</nickname>
<marks>95</marks>
</student>
<student rollno="593">
<firstname>jasvir</firstname>
<lastname>singn</lastname>
<nickname>jazz</nickname>
<marks>90</marks>
</student>
</class>

Demo Example:
DomParserDemo.java
package com.tutorialspoint.xml;

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

import org.w3c.dom.Node;
import org.w3c.dom.Element;

public class DomParserDemo {


public static void main(String[] args){

try {
File inputFile = new File("input.txt");
DocumentBuilderFactory dbFactory
= DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
doc.getDocumentElement().normalize();
System.out.println("Root element :"
+ doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("student");
System.out.println("----------------------------");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
System.out.println("\nCurrent Element :"
+ nNode.getNodeName());
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
System.out.println("Student roll no : "
+ eElement.getAttribute("rollno"));
System.out.println("First Name : "
+ eElement
.getElementsByTagName("firstname")
.item(0)

.getTextContent());
System.out.println("Last Name : "
+ eElement
.getElementsByTagName("lastname")
.item(0)
.getTextContent());
System.out.println("Nick Name : "
+ eElement
.getElementsByTagName("nickname")
.item(0)
.getTextContent());
System.out.println("Marks : "
+ eElement
.getElementsByTagName("marks")
.item(0)
.getTextContent());
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

This would produce the following result:

Root element :class
----------------------------

Current Element :student
Student roll no : 393
First Name : dinkar
Last Name : kad
Nick Name : dinkar
Marks : 85

Current Element :student
Student roll no : 493
First Name : Vaneet
Last Name : Gupta
Nick Name : vinni
Marks : 95

Current Element :student
Student roll no : 593
First Name : jasvir
Last Name : singn
Nick Name : jazz
Marks : 90

Java DOM Parser - Query XML Document

Demo Example
Here is the input xml file we need to query:
<?xml version="1.0"?>
<cars>
<supercars company="Ferrari">
<carname type="formula one">Ferarri 101</carname>
<carname type="sports car">Ferarri 201</carname>
<carname type="sports car">Ferarri 301</carname>
</supercars>
<supercars company="Lamborgini">

<carname>Lamborgini 001</carname>
<carname>Lamborgini 002</carname>
<carname>Lamborgini 003</carname>
</supercars>
<luxurycars company="Benteley">
<carname>Benteley 1</carname>
<carname>Benteley 2</carname>
<carname>Benteley 3</carname>
</luxurycars>
</cars>

Demo Example:
QueryXmlFileDemo.java
package com.tutorialspoint.xml;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import java.io.File;

public class QueryXmlFileDemo {

public static void main(String argv[]) {

try {
File inputFile = new File("input.txt");
DocumentBuilderFactory dbFactory =

DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
doc.getDocumentElement().normalize();
System.out.print("Root element: ");
System.out.println(doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("supercars");
System.out.println("----------------------------");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
System.out.println("\nCurrent Element :" + nNode.getNodeName());
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
System.out.print("company : ");
System.out.println(eElement.getAttribute("company"));
NodeList carNameList =
eElement.getElementsByTagName("carname");
for (int count = 0;
count < carNameList.getLength(); count++) {
Node node1 = carNameList.item(count);
if (node1.getNodeType() ==
Node.ELEMENT_NODE) {
Element car = (Element) node1;
System.out.print("car name : ");
System.out.println(car.getTextContent());
System.out.print("car type : ");
System.out.println(car.getAttribute("type"));
}

}
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

This would produce the following result:

Root element: cars
----------------------------

Current Element :supercars
company : Ferrari
car name : Ferarri 101
car type : formula one
car name : Ferarri 201
car type : sports car
car name : Ferarri 301
car type : sports car

Current Element :supercars
company : Lamborgini
car name : Lamborgini 001
car type :
car name : Lamborgini 002
car type :
car name : Lamborgini 003
car type :
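
The same kind of query can also be expressed with XPath, which the list of parsers earlier in this document mentions. The sketch below uses the javax.xml.xpath API that ships with the JDK; the class name XPathQuery, the method name and the fixed expression are assumptions for illustration, not part of the demo above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathQuery {
    // Returns the text of every <carname> under <supercars company="Ferrari">.
    static List<String> ferrariCarNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/cars/supercars[@company='Ferrari']/carname",
                    doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath query failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<cars>"
                + "<supercars company=\"Ferrari\"><carname>Ferarri 101</carname></supercars>"
                + "<supercars company=\"Lamborgini\"><carname>Lamborgini 001</carname></supercars>"
                + "</cars>";
        System.out.println(ferrariCarNames(xml));
    }
}
```

The expression replaces the two nested loops of the demo with a single declarative path, at the cost of still loading the whole document as a DOM tree first.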

Java DOM Parser - Create XML Document

Demo Example
Here is the XML we need to create:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cars><supercars company="Ferrari">
<carname type="formula one">Ferrari 101</carname>
<carname type="sports">Ferrari 202</carname>
</supercars></cars>

Demo Example:
CreateXmlFileDemo.java
package com.tutorialspoint.xml;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import java.io.File;

public class CreateXmlFileDemo {

public static void main(String argv[]) {

try {
DocumentBuilderFactory dbFactory =
DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder =
dbFactory.newDocumentBuilder();
Document doc = dBuilder.newDocument();
// root element
Element rootElement = doc.createElement("cars");
doc.appendChild(rootElement);

// supercars element
Element supercar = doc.createElement("supercars");
rootElement.appendChild(supercar);

// setting attribute to element


Attr attr = doc.createAttribute("company");
attr.setValue("Ferrari");
supercar.setAttributeNode(attr);

// carname element
Element carname = doc.createElement("carname");
Attr attrType = doc.createAttribute("type");
attrType.setValue("formula one");
carname.setAttributeNode(attrType);
carname.appendChild(
doc.createTextNode("Ferrari 101"));
supercar.appendChild(carname);

Element carname1 = doc.createElement("carname");


Attr attrType1 = doc.createAttribute("type");
attrType1.setValue("sports");
carname1.setAttributeNode(attrType1);
carname1.appendChild(
doc.createTextNode("Ferrari 202"));
supercar.appendChild(carname1);

// write the content into xml file


TransformerFactory transformerFactory =
TransformerFactory.newInstance();
Transformer transformer =
transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result =
new StreamResult(new File("C:\\cars.xml"));
transformer.transform(source, result);
// Output to console for testing
StreamResult consoleResult =
new StreamResult(System.out);
transformer.transform(source, consoleResult);
} catch (Exception e) {
e.printStackTrace();
}
}
}

This would produce the following result:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>

<cars><supercars company="Ferrari">
<carname type="formula one">Ferrari 101</carname>
<carname type="sports">Ferrari 202</carname>
</supercars></cars>
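
The demo above does not ask the Transformer to indent, so the exact line breaks in real output may differ from the listing. If line breaks between elements are wanted, the standard OutputKeys.INDENT property can be set, as in the hedged sketch below. The class name IndentedOutput is an assumption, and the amount of indentation produced varies between JDK versions.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class IndentedOutput {
    // Builds a small cars document and serializes it to a string with line breaks.
    static String carsXml() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element cars = doc.createElement("cars");
            doc.appendChild(cars);

            Element supercars = doc.createElement("supercars");
            supercars.setAttribute("company", "Ferrari");   // shortcut for createAttribute + setAttributeNode
            cars.appendChild(supercars);

            Element carname = doc.createElement("carname");
            carname.setAttribute("type", "formula one");
            carname.setTextContent("Ferrari 101");
            supercars.appendChild(carname);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(carsXml());
    }
}
```

Writing to a StringWriter instead of a File also makes the result easy to inspect or test without touching the disk.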

Java DOM Parser - Modify XML Document

Demo Example
Here is the input xml file we need to modify:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cars>
<supercars company="Ferrari">
<carname type="formula one">Ferrari 101</carname>
<carname type="sports">Ferrari 202</carname>
</supercars>
<luxurycars company="Benteley">
<carname>Benteley 1</carname>
<carname>Benteley 2</carname>
<carname>Benteley 3</carname>
</luxurycars>
</cars>

Demo Example:
ModifyXmlFileDemo.java
package com.tutorialspoint.xml;

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;

import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ModifyXmlFileDemo {

public static void main(String argv[]) {

try {
File inputFile = new File("input.xml");
DocumentBuilderFactory docFactory =
DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder =
docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(inputFile);
Node cars = doc.getFirstChild();
Node supercar = doc.getElementsByTagName("supercars").item(0);
// update supercar attribute
NamedNodeMap attr = supercar.getAttributes();
Node nodeAttr = attr.getNamedItem("company");
nodeAttr.setTextContent("Lamborigini");

// loop the supercar child node


NodeList list = supercar.getChildNodes();

for (int temp = 0; temp < list.getLength(); temp++) {


Node node = list.item(temp);
if (node.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) node;
if ("carname".equals(eElement.getNodeName())){
if("Ferrari 101".equals(eElement.getTextContent())){
eElement.setTextContent("Lamborigini 001");
}
if("Ferrari 202".equals(eElement.getTextContent()))
eElement.setTextContent("Lamborigini 002");
}
}
}
NodeList childNodes = cars.getChildNodes();
// iterate backwards: the NodeList is live, so removing a node shifts the items after it
for(int count = childNodes.getLength() - 1; count >= 0; count--){
Node node = childNodes.item(count);
if("luxurycars".equals(node.getNodeName()))
cars.removeChild(node);
}
// write the content on console
TransformerFactory transformerFactory =
TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
System.out.println("-----------Modified File-----------");
StreamResult consoleResult = new StreamResult(System.out);
transformer.transform(source, consoleResult);
} catch (Exception e) {
e.printStackTrace();

}
}
}

This would produce the following result:


-----------Modified File-----------
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cars>
<supercars company="Lamborigini">
<carname type="formula one">Lamborigini 001</carname>
<carname type="sports">Lamborigini 002</carname>
</supercars></cars>
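
Removing children while looping over `getChildNodes()` needs some care, because the returned `NodeList` is live and shrinks as nodes are removed. The following is a minimal sketch of one common way to handle this, walking the list from the end; `RemoveChildrenSketch`, `removeChildren` and `parse` are hypothetical names used only here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RemoveChildrenSketch {

    // Removes every direct child element of the root with the given name
    // and returns how many were removed.
    static int removeChildren(Document doc, String name) {
        Node root = doc.getDocumentElement();
        NodeList children = root.getChildNodes();
        int removed = 0;
        // Walk backwards: removing item i only shifts items after i,
        // and those have already been visited.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && name.equals(child.getNodeName())) {
                root.removeChild(child);
                removed++;
            }
        }
        return removed;
    }

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<cars><luxurycars/><luxurycars/><supercars/></cars>");
        System.out.println(removeChildren(doc, "luxurycars")); // prints 2
        System.out.println(doc.getDocumentElement().getChildNodes().getLength()); // prints 1
    }
}
```

A forward loop would skip the second of two adjacent `luxurycars` elements, because the list shifts left after each removal.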

Java SAX Parser - Overview


SAX (the Simple API for XML) is an event-based parser for XML
documents. Unlike a DOM parser, a SAX parser creates no parse tree. SAX is
a streaming interface for XML, which means that applications using SAX
receive event notifications about the XML document being processed one
element, with its attributes, at a time, in sequential order, starting at the
top of the document and ending with the closing of the root element.

Reads an XML document from top to bottom, recognizing the tokens that
make up a well-formed XML document

Tokens are processed in the same order that they appear in the
document

Reports to the application program the nature of the tokens that the
parser has encountered, as they occur

The application program provides an "event" handler that must be
registered with the parser

As the tokens are identified, callback methods in the handler are invoked
with the relevant information
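
The points above can be illustrated with a small sketch that registers a handler and records the order in which callbacks arrive. The class `EventOrderSketch` and its `events` helper are made up for this example; the parser and handler classes are the standard JAXP and SAX ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EventOrderSketch {

    // Parses the XML string and returns the callbacks in the order they fired.
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument");
            }

            @Override
            public void endDocument() {
                log.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        try {
            // The handler is registered by passing it to parse().
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(events("<class><student/></class>"));
        // prints [startDocument, start:class, start:student, end:student, end:class, endDocument]
    }
}
```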

When to use?
You should use a SAX parser when:

You can process the XML document in a linear fashion from the top down

The document is not deeply nested

You are processing a very large XML document whose DOM tree would
consume too much memory. As a rough rule of thumb, a typical DOM
implementation uses about ten bytes of memory to represent one byte of XML

The problem to be solved involves only part of the XML document

Data is available as soon as it is seen by the parser, so SAX works well for
an XML document that arrives over a stream
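
When only part of the document matters, a handler can stop the parse as soon as it has what it needs. The sketch below does this by throwing a marker exception from `endElement`; the names `FirstMatchSketch`, `StopParsing` and `firstText` are hypothetical, and aborting with an exception is one common idiom rather than a dedicated SAX feature.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class FirstMatchSketch {

    // Hypothetical marker exception, used only to abort the parse early.
    private static final class StopParsing extends SAXException {
        private static final long serialVersionUID = 1L;

        StopParsing() {
            super("requested element found");
        }
    }

    // Returns the text of the first element with the given name, or null if absent.
    static String firstText(String xml, String elementName) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inside;

            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes atts) {
                if (qName.equals(elementName)) {
                    inside = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inside) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName)
                    throws SAXException {
                if (inside && qName.equals(elementName)) {
                    throw new StopParsing(); // nothing after this point is needed
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing done) {
            return text.toString();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<class><student><marks>85</marks></student>"
                + "<student><marks>95</marks></student></class>";
        System.out.println(firstText(xml, "marks")); // prints 85
    }
}
```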

Disadvantages of SAX

We have no random access to an XML document since it is processed in a
forward-only manner

If you need to keep track of data the parser has seen or change the order
of items, you must write the code and store the data on your own
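
As an illustration of the second point, the handler itself has to remember where it is in the document. A minimal sketch, assuming a stack of open element names is enough context for the task; `PathTrackingSketch` and `textPaths` are invented names.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PathTrackingSketch {

    // Returns one "path = text" entry for each non-blank run of character data.
    static List<String> textPaths(String xml) {
        List<String> out = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // names of the currently open elements
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes atts) {
                open.addLast(qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.removeLast();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    out.add("/" + String.join("/", open) + " = " + text);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(
                textPaths("<class><student><marks>85</marks></student></class>"));
    }
}
```

With a DOM tree the same information would come from walking up the parent links; with SAX the stack has to be maintained by hand.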

ContentHandler Interface
This interface specifies the callback methods that the SAX parser uses to
notify an application program of the components of the XML document that
it has seen.

void startDocument() - Called at the beginning of a document.

void endDocument() - Called at the end of a document.

void startElement(String uri, String localName, String qName, Attributes atts) - Called at the beginning of an element.

void endElement(String uri, String localName, String qName) - Called at the end of an element.

void characters(char[] ch, int start, int length) - Called when character data is encountered.

void ignorableWhitespace(char[] ch, int start, int length) - Called when a DTD is present and ignorable whitespace is encountered.

void processingInstruction(String target, String data) - Called when a processing instruction is recognized.

void setDocumentLocator(Locator locator) - Provides a Locator that can be used to identify positions in the document.

void skippedEntity(String name) - Called when an unresolved entity is encountered.

void startPrefixMapping(String prefix, String uri) - Called when a new namespace mapping is defined.

void endPrefixMapping(String prefix) - Called when a namespace definition ends its scope.
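
One detail of `characters` is worth a sketch: a SAX parser may deliver the text of a single element in several calls, so a handler that wants the whole text usually buffers it until `endElement`. The example below assumes leaf elements only; `TextBufferSketch` and `leafTexts` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TextBufferSketch {

    // Returns "name=text" for every element that directly contains non-blank text.
    static List<String> leafTexts(String xml) {
        List<String> texts = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder buffer = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes atts) {
                buffer.setLength(0); // start collecting text for the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                buffer.append(ch, start, length); // may be called more than once per element
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String text = buffer.toString().trim();
                if (!text.isEmpty()) {
                    texts.add(qName + "=" + text);
                }
                buffer.setLength(0);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(leafTexts(
                "<class><firstname>Tom &amp; Jerry</firstname><marks>85</marks></class>"));
        // prints [firstname=Tom & Jerry, marks=85]
    }
}
```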

Attributes Interface
This interface specifies methods for processing the attributes connected to
an element.

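int getLength() - Returns the number of attributes.

String getQName(int index) - Returns the qualified name of the attribute at the given index.

String getValue(int index) - Returns the value of the attribute at the given index.

String getValue(String qname) - Returns the value of the attribute with the given qualified name, or null if there is no such attribute.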
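
The methods above are typically used together inside `startElement`. A minimal sketch, with the made-up names `AttributeDumpSketch` and `attributesOf`, that lists every attribute of a chosen element by index:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AttributeDumpSketch {

    // Returns "name=value" for each attribute of every element with the given name.
    static List<String> attributesOf(String xml, String elementName) {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes atts) {
                if (!qName.equals(elementName)) {
                    return;
                }
                // Index-based access: getLength() bounds the loop.
                for (int i = 0; i < atts.getLength(); i++) {
                    out.add(atts.getQName(i) + "=" + atts.getValue(i));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(attributesOf(
                "<class><student rollno=\"393\" house=\"red\"/></class>", "student"));
    }
}
```

SAX does not promise any particular attribute order, so code that needs one specific attribute is usually better served by `getValue(String qname)`.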

Demo Example
Here is the input xml file we need to parse:
<?xml version="1.0"?>
<class>
<student rollno="393">
<firstname>dinkar</firstname>
<lastname>kad</lastname>
<nickname>dinkar</nickname>
<marks>85</marks>
</student>
<student rollno="493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>vinni</nickname>
<marks>95</marks>
</student>
<student rollno="593">
<firstname>jasvir</firstname>
<lastname>singn</lastname>
<nickname>jazz</nickname>
<marks>90</marks>
</student>
</class>

UserHandler.java
package com.tutorialspoint.xml;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class UserHandler extends DefaultHandler {

boolean bFirstName = false;


boolean bLastName = false;
boolean bNickName = false;
boolean bMarks = false;

@Override
public void startElement(String uri,
String localName, String qName, Attributes attributes)
throws SAXException {
if (qName.equalsIgnoreCase("student")) {
String rollNo = attributes.getValue("rollno");
System.out.println("Roll No : " + rollNo);
} else if (qName.equalsIgnoreCase("firstname")) {
bFirstName = true;
} else if (qName.equalsIgnoreCase("lastname")) {
bLastName = true;
} else if (qName.equalsIgnoreCase("nickname")) {
bNickName = true;
}
else if (qName.equalsIgnoreCase("marks")) {
bMarks = true;
}
}

@Override
public void endElement(String uri,
String localName, String qName) throws SAXException {
if (qName.equalsIgnoreCase("student")) {
System.out.println("End Element :" + qName);
}
}

@Override
public void characters(char ch[],
int start, int length) throws SAXException {
if (bFirstName) {
System.out.println("First Name: "
+ new String(ch, start, length));
bFirstName = false;
} else if (bLastName) {
System.out.println("Last Name: "
+ new String(ch, start, length));
bLastName = false;
} else if (bNickName) {
System.out.println("Nick Name: "
+ new String(ch, start, length));
bNickName = false;
} else if (bMarks) {
System.out.println("Marks: "
+ new String(ch, start, length));
bMarks = false;

}
}
}

SAXParserDemo.java
package com.tutorialspoint.xml;

import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserDemo {


public static void main(String[] args){

try {
File inputFile = new File("input.txt");
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
UserHandler userhandler = new UserHandler();
saxParser.parse(inputFile, userhandler);
} catch (Exception e) {
e.printStackTrace();
}
}
}

class UserHandler extends DefaultHandler {

boolean bFirstName = false;


boolean bLastName = false;
boolean bNickName = false;
boolean bMarks = false;

@Override
public void startElement(String uri,
String localName, String qName, Attributes attributes)
throws SAXException {
if (qName.equalsIgnoreCase("student")) {
String rollNo = attributes.getValue("rollno");
System.out.println("Roll No : " + rollNo);
} else if (qName.equalsIgnoreCase("firstname")) {
bFirstName = true;
} else if (qName.equalsIgnoreCase("lastname")) {
bLastName = true;
} else if (qName.equalsIgnoreCase("nickname")) {
bNickName = true;
}
else if (qName.equalsIgnoreCase("marks")) {
bMarks = true;
}
}

@Override
public void endElement(String uri,
String localName, String qName) throws SAXException {
if (qName.equalsIgnoreCase("student")) {
System.out.println("End Element :" + qName);
}
}

@Override
public void characters(char ch[],
int start, int length) throws SAXException {
if (bFirstName) {
System.out.println("First Name: "
+ new String(ch, start, length));
bFirstName = false;
} else if (bLastName) {
System.out.println("Last Name: "
+ new String(ch, start, length));
bLastName = false;
} else if (bNickName) {
System.out.println("Nick Name: "
+ new String(ch, start, length));
bNickName = false;
} else if (bMarks) {
System.out.println("Marks: "
+ new String(ch, start, length));
bMarks = false;
}
}
}

This would produce the following result:

Roll No : 393
First Name: dinkar
Last Name: kad
Nick Name: dinkar
Marks: 85
End Element :student
Roll No : 493
First Name: Vaneet
Last Name: Gupta
Nick Name: vinni
Marks: 95
End Element :student
Roll No : 593
First Name: jasvir
Last Name: singn
Nick Name: jazz
Marks: 90
End Element :student

Java SAX Parser - Create XML Document


It is better to use a StAX parser than a SAX parser for creating
XML. Please refer to the Java StAX Parser section for details.
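
For orientation, here is a minimal sketch of what writing XML with the StAX `XMLStreamWriter` can look like. It reuses the cars data from the DOM examples; `StaxWriteSketch` and `write` are hypothetical names, and the exact form of the XML declaration depends on the StAX implementation in use.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StaxWriteSketch {

    // Writes a small cars document to a string, one event at a time.
    static String write() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartDocument();
            writer.writeStartElement("cars");
            writer.writeStartElement("supercars");
            writer.writeAttribute("company", "Ferrari");
            writer.writeStartElement("carname");
            writer.writeAttribute("type", "sports");
            writer.writeCharacters("Ferrari 202");
            writer.writeEndElement(); // carname
            writer.writeEndElement(); // supercars
            writer.writeEndElement(); // cars
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write the sample document", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(write());
    }
}
```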

Java SAX Parser - Modify XML Document

Demo Example
Here is the input XML file that we need to modify by appending
<Result>Pass</Result>
after each </marks> closing tag:
<?xml version="1.0"?>

<class>
<student rollno="393">
<firstname>dinkar</firstname>
<lastname>kad</lastname>
<nickname>dinkar</nickname>
<marks>85</marks>
</student>
<student rollno="493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>vinni</nickname>
<marks>95</marks>
</student>
<student rollno="593">
<firstname>jasvir</firstname>
<lastname>singn</lastname>
<nickname>jazz</nickname>
<marks>90</marks>
</student>
</class>

SAXModifyDemo.java
package com.tutorialspoint.xml;

import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.*;
import org.xml.sax.helpers.DefaultHandler;

public class SAXModifyDemo extends DefaultHandler {


static String displayText[] = new String[1000];
static int numberLines = 0;

static String indentation = "";

public static void main(String args[]) {

try {
File inputFile = new File("input.txt");
SAXParserFactory factory =
SAXParserFactory.newInstance();
SAXModifyDemo obj = new SAXModifyDemo();
obj.childLoop(inputFile);
FileWriter filewriter = new FileWriter("newfile.xml");
for(int loopIndex = 0; loopIndex < numberLines; loopIndex++){
filewriter.write(displayText[loopIndex].toCharArray());
filewriter.write('\n');
System.out.println(displayText[loopIndex].toString());
}
filewriter.close();
}
catch (Exception e) {
e.printStackTrace(System.err);
}
}

public void childLoop(File input){


DefaultHandler handler = this;
SAXParserFactory factory = SAXParserFactory.newInstance();
try {
SAXParser saxParser = factory.newSAXParser();
saxParser.parse(input, handler);

} catch (Exception e) {
// report parse failures instead of silently ignoring them
e.printStackTrace();
}
}

public void startDocument() {


displayText[numberLines] = indentation;
displayText[numberLines] += "<?xml version=\"1.0\" encoding=\""+
"UTF-8" + "\"?>";
numberLines++;
}

public void processingInstruction(String target,


String data) {
displayText[numberLines] = indentation;
displayText[numberLines] += "<?";
displayText[numberLines] += target;
if (data != null && data.length() > 0) {
displayText[numberLines] += ' ';
displayText[numberLines] += data;
}
displayText[numberLines] += "?>";
numberLines++;
}

public void startElement(String uri, String localName,
String qualifiedName, Attributes attributes) {
displayText[numberLines] = indentation;
// indent nested content by four spaces (endElement removes the same four)
indentation += "    ";

displayText[numberLines] += '<';
displayText[numberLines] += qualifiedName;
if (attributes != null) {
int numberAttributes = attributes.getLength();
for (int loopIndex = 0; loopIndex < numberAttributes;
loopIndex++){
displayText[numberLines] += ' ';
displayText[numberLines] += attributes.getQName(loopIndex);
displayText[numberLines] += "=\"";
displayText[numberLines] += attributes.getValue(loopIndex);
displayText[numberLines] += '"';
}
}
displayText[numberLines] += '>';
numberLines++;
}

public void characters(char characters[],


int start, int length) {
String characterData = (new String(characters, start, length)).trim();
if(characterData.indexOf("\n") < 0 && characterData.length() > 0) {
displayText[numberLines] = indentation;
displayText[numberLines] += characterData;
numberLines++;
}
}

public void endElement(String uri, String localName,


String qualifiedName) {

indentation = indentation.substring(0, indentation.length() - 4) ;


displayText[numberLines] = indentation;
displayText[numberLines] += "</";
displayText[numberLines] += qualifiedName;
displayText[numberLines] += '>';
numberLines++;

if (qualifiedName.equals("marks")) {
startElement("", "Result", "Result", null);
characters("Pass".toCharArray(), 0, "Pass".length());
endElement("", "Result", "Result");
}
}
}

This would produce the following result:


<?xml version="1.0" encoding="UTF-8"?>
<class>
    <student rollno="393">
        <firstname>
            dinkar
        </firstname>
        <lastname>
            kad
        </lastname>
        <nickname>
            dinkar
        </nickname>
        <marks>
            85
        </marks>
        <Result>
            Pass
        </Result>
    </student>
    <student rollno="493">
        <firstname>
            Vaneet
        </firstname>
        <lastname>
            Gupta
        </lastname>
        <nickname>
            vinni
        </nickname>
        <marks>
            95
        </marks>
        <Result>
            Pass
        </Result>
    </student>
    <student rollno="593">
        <firstname>
            jasvir
        </firstname>
        <lastname>
            singn
        </lastname>
        <nickname>
            jazz
        </nickname>
        <marks>
            90
        </marks>
        <Result>
            Pass
        </Result>
    </student>
</class>
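
The listing above rebuilds the document by hand in a string array. As an alternative sketch, not the tutorial's method, the same insertion can be done by placing an `XMLFilterImpl` between the parser and an identity `Transformer`, which then takes care of serialization. `ResultFilterSketch` and `addResults` are hypothetical names; the sketch assumes the element names carry no namespace prefix.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class ResultFilterSketch extends XMLFilterImpl {

    ResultFilterSketch(XMLReader parent) {
        super(parent);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // Pass the original end tag through unchanged.
        super.endElement(uri, localName, qName);
        if ("marks".equals(qName)) {
            // Emit three extra events so that <Result>Pass</Result> follows </marks>.
            char[] pass = "Pass".toCharArray();
            super.startElement("", "Result", "Result", new AttributesImpl());
            super.characters(pass, 0, pass.length);
            super.endElement("", "Result", "Result");
        }
    }

    // Parses the XML through the filter and returns the serialized result.
    static String addResults(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            SAXSource source = new SAXSource(
                    new ResultFilterSketch(reader),
                    new InputSource(new StringReader(xml)));
            StringWriter out = new StringWriter();
            // A transformer created without a stylesheet copies its input to its output.
            TransformerFactory.newInstance().newTransformer()
                    .transform(source, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not filter the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addResults(
                "<class><student rollno=\"393\"><marks>85</marks></student></class>"));
    }
}
```

This keeps the document streaming from end to end, so no fixed-size buffer of output lines is needed.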

DOM vs SAX Parser in Java - XML Parsing in Java


DOM vs SAX parser in Java
DOM and SAX are the two most popular parsers used in the Java programming language to
parse XML documents. Both are originally XML concepts, and Java simply provides APIs
that implement them. Although both are used for XML parsing, they work in completely
different ways. In fact, the difference between the DOM and SAX parsers is a popular
question in Java and XML interviews. Because the two parsers work so differently, it is
important for a Java programmer to understand the difference between them. Careless use
of a DOM parser may result in java.lang.OutOfMemoryError if you try to parse a huge XML
file with a small heap, while a SAX parser can be a poor fit for small and medium-sized
XML files that fit comfortably in the heap and have to be visited more than once, since
every additional pass means parsing the file again. In this article we will compare the
DOM and SAX parsers and learn the differences between them.

SAX vs DOM parser - Java


In this section we will see some behavioral differences between the DOM and SAX parsers.
These differences will help you choose DOM over SAX, or vice versa, based on the size
of the XML files and the heap memory available in the JVM.
1) The first and major difference between the DOM and SAX parsers is how they work. A
DOM parser loads the full XML file into memory and creates a tree representation of the
XML document, while SAX is an event-based XML parser and doesn't load the whole XML
document into memory.
2) For small and medium-sized XML documents, DOM can be faster than SAX when the
application needs to visit the same data more than once, because everything after the
initial parse is an in-memory operation. For a single pass over the document, SAX
generally runs a little faster.
3) DOM stands for Document Object Model, while SAX stands for Simple API for XML.

4) Another difference between DOM and SAX is where each should be used. A DOM parser is
better suited to small XML files when sufficient memory is available, while a SAX parser
is better suited to large XML files.
That's all on the DOM vs SAX parsers in XML and Java. These are the best-known options
for XML parsing in Java, and choosing between them requires a careful decision.
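
To make the comparison concrete, here is a small sketch that performs the same task, counting elements with a given name, once with each parser. `CountBothWaysSketch`, `countWithDom` and `countWithSax` are names invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CountBothWaysSketch {

    // DOM: build the whole tree first, then query it.
    static int countWithDom(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(name).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    // SAX: keep only a counter while the events stream past.
    static int countWithSax(String xml, String name) {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes atts) {
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<class><student/><student/><student/></class>";
        System.out.println(countWithDom(xml, "student")); // prints 3
        System.out.println(countWithSax(xml, "student")); // prints 3
    }
}
```

The DOM version holds every node of the document in memory until `doc` is released, while the SAX version holds a single counter regardless of document size.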
