A DOM parser creates a tree structure in memory from the input document and
then waits for requests from the client. A SAX parser does not create any internal
structure. Instead, it treats the occurrences of components of an input document as events
and tells the client what it reads as it reads through the input document.
A DOM parser always serves the client application with the entire document, no
matter how much is actually needed by the client. A SAX parser serves the client
application with only pieces of the document at any given time.
With a DOM parser, method calls in the client application have to be explicit and
form a kind of chain. With SAX, certain methods (usually overridden by the
client) are invoked automatically (implicitly), as "callbacks",
when certain events occur. These methods do not have to be called explicitly by
the client, though they can be.
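The contrast above can be sketched with the standard JAXP classes. The class name ParserContrast and the sample XML are made up for illustration; both methods count the same start tags, one by querying a finished tree and one by reacting to callbacks.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original text.
class ParserContrast {
    // DOM: build the whole tree first, then query it with explicit calls.
    static int countWithDom(String xml, String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName(tag).getLength();
    }

    // SAX: no tree is built; the parser calls startElement for every start tag.
    static int countWithSax(String xml, String tag) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<class><student/><student/><teacher/></class>";
        System.out.println(countWithDom(xml, "student"));
        System.out.println(countWithSax(xml, "student"));
    }
}
```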
2. What are some real-world applications where using a SAX parser is
more advantageous than using a DOM parser, and vice versa?
What are the usual applications for a DOM parser and for a SAX parser?
In the following cases, using a SAX parser is more advantageous than using a DOM parser.
o The input document is too big for available memory (in this case SAX, or
another streaming parser, is effectively your only choice).
o You can process the document in small contiguous chunks of input. You do not
need the entire document before you can do useful work.
o You just want to use the parser to extract the information of interest, and all
your computation will be based on data structures you create yourself.
In most of our applications, we create data structures of our own which are
usually not as complicated as the DOM tree. In this sense, I think, the chance of
using a DOM parser is less than that of using a SAX parser.
In the following cases, using a DOM parser is more advantageous than using a SAX parser.
o Your application needs to access widely separated parts of the document at the
same time.
o Your application would probably use an internal data structure that is almost as
complicated as the document itself.
grades for the students using an application. What he wants to produce is a list with the SSNs
and the grades. We also assume that in his application the instructor uses no data structure
such as arrays to store the students' personal information and points.
If the instructor decides to give A's to those who earned the class average or above, and
B's to the others, then he had better use a DOM parser in his application. The reason is that he
has no way to know the class average before the entire document has been processed.
What he probably needs to do in his application is first to look through all the students' points
and compute the average, and then look through the document again and assign the final grade
to each student by comparing the points he earned to the class average.
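A minimal sketch of this two-pass approach with DOM follows. The layout of the instructor's document is not given here, so the element name student, the attribute ssn, and the child element points are assumptions, as is the class name AverageGrader.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class; element and attribute names are assumed.
class AverageGrader {
    static List<String> grade(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList students = doc.getElementsByTagName("student");

        // First pass over the tree: compute the class average.
        double total = 0;
        for (int i = 0; i < students.getLength(); i++) {
            total += points((Element) students.item(i));
        }
        double average = total / students.getLength();

        // Second pass over the same tree: compare each student to the average.
        List<String> result = new ArrayList<>();
        for (int i = 0; i < students.getLength(); i++) {
            Element student = (Element) students.item(i);
            String grade = points(student) >= average ? "A" : "B";
            result.add(student.getAttribute("ssn") + " " + grade);
        }
        return result;
    }

    // Reads the text of the first <points> child as an integer.
    private static int points(Element student) {
        String text = student.getElementsByTagName("points").item(0).getTextContent();
        return Integer.parseInt(text.trim());
    }
}
```

The second loop is only possible because the tree is still in memory after the first one.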
If, however, the instructor adopts a grading policy under which students who got 90 points or
more are assigned A's and the others are assigned B's, then he had probably better use a SAX
parser. The reason is that, to assign each student a final grade, he does not need to wait for the entire
document to be processed. He can assign a grade to a student as soon as the SAX
parser reads the points of this student.
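The fixed-threshold policy might look like this with SAX. As before, the names student, ssn and points are assumed, and ThresholdGrader is a made-up class. The result list exists only so the output can be inspected; printing each line inside endElement would work just as well and would keep no data at all.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class; element and attribute names are assumed.
class ThresholdGrader {
    static List<String> grade(String xml) throws Exception {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private String ssn;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals("student")) {
                    ssn = attributes.getValue("ssn");
                }
                text.setLength(0); // start collecting text for this element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("points")) {
                    // The grade is decided as soon as the points are read.
                    int points = Integer.parseInt(text.toString().trim());
                    result.add(ssn + " " + (points >= 90 ? "A" : "B"));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }
}
```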
In the above analysis, we assumed that the instructor created no data structure of his own.
What if he creates his own data structures, such as an array of strings to store the SSNs and an
array of integers to store the points? In this case, I think SAX is the better choice, because it
can save both memory and time and still get the job done.
One more consideration on this example: what if the instructor wants not to
print a list, but to save the original document back with the grade of each student updated? In
this case, a DOM parser should be the better choice no matter what grading policy he is
adopting. He does not need to create any data structure of his own. What he needs to do is
first modify the DOM tree (i.e., set the value of the 'grade' node) and then save the whole modified
tree. If he chooses to use a SAX parser instead of a DOM parser, then in this case he has to
create a data structure which is almost as complicated as a DOM tree before he can get the
job done.
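A rough sketch of the modify-and-save case is shown below. It assumes each student element already has a grade child (the text mentions a 'grade' node) next to an assumed points child; GradeUpdater is a made-up name, and the result is written to a string rather than a file to keep the example self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class; the document layout is assumed.
class GradeUpdater {
    static String update(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList students = doc.getElementsByTagName("student");
        for (int i = 0; i < students.getLength(); i++) {
            Element student = (Element) students.item(i);
            int points = Integer.parseInt(student.getElementsByTagName("points")
                    .item(0).getTextContent().trim());
            // Modify the tree in place: set the value of the 'grade' node.
            student.getElementsByTagName("grade").item(0)
                    .setTextContent(points >= 90 ? "A" : "B");
        }

        // Serialize the whole modified tree back to XML text.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Everything the program did not touch (here the points elements) is written back unchanged, which is what makes DOM convenient for this task.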
A DOM parser creates a tree structure in memory from the input document
and then waits for requests from the client. A DOM parser always serves the client
application with the entire document, no matter how much is
actually needed by the client. With a DOM parser, method calls in the client
application have to be explicit and form a kind of chain of method calls.
DOM parser on one document and a SAX parser on another, and then
combine the results or make the processing cooperate with each other.
":" +this.lastName ;
import java.util.ArrayList;
import java.util.Stack;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class UserParserHandler extends DefaultHandler
{
    //This is the list which shall be populated while parsing the XML.
    private ArrayList<User> userList = new ArrayList<User>();
    //As we read any XML element we will push that in this stack
    private Stack<String> elementStack = new Stack<String>();
    //As we complete one user block in XML, we will push the User instance in userList
    private Stack<User> objectStack = new Stack<User>();
    public void startDocument() throws SAXException
    {
        //System.out.println("start of the document : ");
    }
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
    {
        //Push it in element stack
        this.elementStack.push(qName);
        //If this is start of 'user' element then prepare a new User instance and push it on the object stack
        if ("user".equals(qName))
        {
            //New User instance
            User user = new User();
            //Set all required attributes in any XML element here itself
            if (attributes != null && attributes.getLength() == 1)
            {
                user.setId(Integer.parseInt(attributes.getValue(0)));
            }
            this.objectStack.push(user);
        }
    }
    public void endElement(String uri, String localName, String qName) throws SAXException
    {
        //Remove last added element
        this.elementStack.pop();
        //User instance has been constructed so pop it from object stack and push in userList
        if ("user".equals(qName))
        {
            User object = this.objectStack.pop();
            this.userList.add(object);
        }
    }
    /**
     * This will be called every time the parser encounters a value node
     * */
    public void characters(char[] ch, int start, int length) throws SAXException
    {
        String value = new String(ch, start, length).trim();
        if (value.length() == 0)
        {
            return; // ignore white space
        }
        //handle the value based on to which element it belongs
        if ("firstName".equals(currentElement()))
        {
            User user = this.objectStack.peek();
            user.setFirstName(value);
        }
        else if ("lastName".equals(currentElement()))
        {
            User user = this.objectStack.peek();
            user.setLastName(value);
        }
    }
    /**
     * Utility method for getting the current element in processing
     * */
    private String currentElement()
    {
        return this.elementStack.peek();
    }
    //Accessor for userList object
    public ArrayList<User> getUsers()
    {
        return userList;
    }
}
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
//populate the parsed users list in above created empty list; You can return f
users = handler.getUsers();
} catch (SAXException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
}
return users;
}
}
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
Create a DocumentBuilder
The next step is to create the DocumentBuilder object.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Examine attributes
You can examine the node attributes using the methods below.
element.getAttribute("attributeName");
//returns the value of a specific attribute
element.getAttributes();
//returns a NamedNodeMap (table) of names/values
Examine sub-elements
Child elements can be queried in the manner below.
element.getElementsByTagName("subElementName")
//returns a list of sub-elements with the specified name
node.getChildNodes()
//returns a list of all child nodes
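Put together, the calls above can be tried on a small in-memory document. The class name DomInspect and the sample employee record are illustrative only; a real program would more likely parse a file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class used only to exercise the calls listed above.
class DomInspect {
    // Parses the XML text and returns its root element.
    static Element root(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element employee = root("<employee id=\"111\" dept=\"it\">"
                + "<firstName>Lokesh</firstName><lastName>Gupta</lastName>"
                + "</employee>");

        // A single attribute value, looked up by name.
        System.out.println(employee.getAttribute("id"));

        // All attributes, as a NamedNodeMap indexed from 0.
        NamedNodeMap attributes = employee.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            System.out.println(attributes.item(i).getNodeName()
                    + " = " + attributes.item(i).getNodeValue());
        }

        // Descendant elements with a given name, then every direct child node.
        NodeList firstNames = employee.getElementsByTagName("firstName");
        System.out.println(firstNames.item(0).getTextContent());
        System.out.println(employee.getChildNodes().getLength());
    }
}
```

Note that getChildNodes() also returns text nodes, so a document with line breaks between elements reports more children than it has child elements.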
}
}
+ eElement.getElementsByTagName("lastName").item(0).
System.out.println("Location : "
+ eElement.getElementsByTagName("location").item(0).
Output:
employees
============================
Employee id : 111
First Name : Lokesh
Last Name : Gupta
Location : India
Employee id : 222
First Name : Alex
Last Name : Gussin
Location : Russia
Employee id : 333
First Name : David
Last Name : Feezor
Location : USA
SAX Parser - Parses the document using event-based triggers. Does not
load the complete document into memory.
StAX Parser - Parses the document in a streaming fashion similar to the SAX parser, but
the application pulls events from the parser instead of receiving callbacks.
DOM4J Parser - A Java library for working with XML, XPath and XSLT using the Java
Collections Framework; it provides support for DOM, SAX and JAXP.
There are also JAXB and XSLT APIs available to handle XML in an object-oriented
way. We'll elaborate on each parser in detail in the next chapters.
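Since StAX is only named in the list above, here is a small sketch of its pull style using the standard javax.xml.stream API. The class name StaxText and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class showing the StAX cursor API.
class StaxText {
    // Collects the text content of every element with the given name.
    static List<String> textOf(String xml, String tag) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> values = new ArrayList<>();
        // The application asks for the next event; the parser never calls back.
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && reader.getLocalName().equals(tag)) {
                values.add(reader.getElementText());
            }
        }
        reader.close();
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees><employee><firstName>Lokesh</firstName></employee>"
                + "<employee><firstName>Alex</firstName></employee></employees>";
        System.out.println(textOf(xml, "firstName"));
    }
}
```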
When to use?
You should use a DOM parser when:
You need to move parts of the document around (you might want to sort
certain elements, for example)
You need to use the information in the document more than once
provides a variety of functions you can use to examine the contents and
structure of the document.
Advantages
The DOM is a common interface for manipulating document structures. One
of its design goals is that Java code written for one DOM-compliant parser
should run on any other DOM-compliant parser without changes.
DOM interfaces
The DOM defines several Java interfaces. Here are the most common
interfaces:
Element - The vast majority of the objects you'll deal with are Elements.
Create a DocumentBuilder
Examine attributes
Examine sub-elements
Create a DocumentBuilder
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Examine attributes
//returns specific attribute
getAttribute("attributeName");
//returns a NamedNodeMap (table) of names/values
getAttributes();
Examine sub-elements
//returns a list of subelements of specified name
getElementsByTagName("subelementName");
//returns a list of all child nodes
getChildNodes();
Demo Example
Here is the input xml file we need to parse:
<?xml version="1.0"?>
<class>
<student rollno="393">
<firstname>dinkar</firstname>
<lastname>kad</lastname>
<nickname>dinkar</nickname>
<marks>85</marks>
</student>
<student rollno="493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>vinni</nickname>
<marks>95</marks>
</student>
<student rollno="593">
<firstname>jasvir</firstname>
<lastname>singn</lastname>
<nickname>jazz</nickname>
<marks>90</marks>
</student>
</class>
Demo Example:
DomParserDemo.java
package com.tutorialspoint.xml;
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
public class DomParserDemo {
public static void main(String[] args) {
try {
File inputFile = new File("input.txt");
DocumentBuilderFactory dbFactory
= DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
doc.getDocumentElement().normalize();
System.out.println("Root element :"
+ doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("student");
System.out.println("----------------------------");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
System.out.println("\nCurrent Element :"
+ nNode.getNodeName());
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
System.out.println("Student roll no : "
+ eElement.getAttribute("rollno"));
System.out.println("First Name : "
+ eElement
.getElementsByTagName("firstname")
.item(0)
.getTextContent());
System.out.println("Last Name : "
+ eElement
.getElementsByTagName("lastname")
.item(0)
.getTextContent());
System.out.println("Nick Name : "
+ eElement
.getElementsByTagName("nickname")
.item(0)
.getTextContent());
System.out.println("Marks : "
+ eElement
.getElementsByTagName("marks")
.item(0)
.getTextContent());
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
Demo Example
Here is the input xml file we need to query:
<?xml version="1.0"?>
<cars>
<supercars company="Ferrari">
<carname type="formula one">Ferarri 101</carname>
<carname type="sports car">Ferarri 201</carname>
<carname type="sports car">Ferarri 301</carname>
</supercars>
<supercars company="Lamborgini">
<carname>Lamborgini 001</carname>
<carname>Lamborgini 002</carname>
<carname>Lamborgini 003</carname>
</supercars>
<luxurycars company="Benteley">
<carname>Benteley 1</carname>
<carname>Benteley 2</carname>
<carname>Benteley 3</carname>
</luxurycars>
</cars>
Demo Example:
QueryXmlFileDemo.java
package com.tutorialspoint.xml;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import java.io.File;
public class QueryXmlFileDemo {
public static void main(String[] args) {
try {
File inputFile = new File("input.txt");
DocumentBuilderFactory dbFactory =
DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
doc.getDocumentElement().normalize();
System.out.print("Root element: ");
System.out.println(doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("supercars");
System.out.println("----------------------------");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
System.out.print("\nCurrent Element : ");
System.out.println(nNode.getNodeName());
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
System.out.print("company : ");
System.out.println(eElement.getAttribute("company"));
NodeList carNameList =
eElement.getElementsByTagName("carname");
for (int count = 0;
count < carNameList.getLength(); count++) {
Node node1 = carNameList.item(count);
if (node1.getNodeType() ==
Node.ELEMENT_NODE) {
Element car = (Element) node1;
System.out.print("car name : ");
System.out.println(car.getTextContent());
System.out.print("car type : ");
System.out.println(car.getAttribute("type"));
}
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
Demo Example
Here is the XML we need to create:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cars><supercars company="Ferrari">
<carname type="formula one">Ferrari 101</carname>
<carname type="sports">Ferrari 202</carname>
</supercars></cars>
Demo Example:
CreateXmlFileDemo.java
package com.tutorialspoint.xml;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import java.io.File;
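public class CreateXmlFileDemo {
    public static void main(String[] args) {
        try {
            DocumentBuilderFactory dbFactory =
                DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder =
                dbFactory.newDocumentBuilder();
            Document doc = dBuilder.newDocument();
            // root element
            Element rootElement = doc.createElement("cars");
            doc.appendChild(rootElement);
            // supercars element with its company attribute
            Element supercar = doc.createElement("supercars");
            supercar.setAttribute("company", "Ferrari");
            rootElement.appendChild(supercar);
            // first carname element
            Element carname = doc.createElement("carname");
            Attr attrType = doc.createAttribute("type");
            attrType.setValue("formula one");
            carname.setAttributeNode(attrType);
            carname.appendChild(
                doc.createTextNode("Ferrari 101"));
            supercar.appendChild(carname);
            // second carname element
            Element carname1 = doc.createElement("carname");
            carname1.setAttribute("type", "sports");
            carname1.appendChild(
                doc.createTextNode("Ferrari 202"));
            supercar.appendChild(carname1);
            // write the tree to a file (the name cars.xml is an assumption)
            // and to the console
            Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
            DOMSource source = new DOMSource(doc);
            transformer.transform(source, new StreamResult(new File("cars.xml")));
            transformer.transform(source, new StreamResult(System.out));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}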
<cars><supercars company="Ferrari">
<carname type="formula one">Ferrari 101</carname>
<carname type="sports">Ferrari 202</carname>
</supercars></cars>
Demo Example
Here is the input xml file we need to modify:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cars>
<supercars company="Ferrari">
<carname type="formula one">Ferrari 101</carname>
<carname type="sports">Ferrari 202</carname>
</supercars>
<luxurycars company="Benteley">
<carname>Benteley 1</carname>
<carname>Benteley 2</carname>
<carname>Benteley 3</carname>
</luxurycars>
</cars>
Demo Example:
ModifyXmlFileDemo.java
package com.tutorialspoint.xml;
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
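public class ModifyXmlFileDemo {
    public static void main(String[] args) {
        try {
            File inputFile = new File("input.xml");
            DocumentBuilderFactory docFactory =
                DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder =
                docFactory.newDocumentBuilder();
            Document doc = docBuilder.parse(inputFile);
            Node cars = doc.getFirstChild();
            Node supercar = doc.getElementsByTagName("supercars").item(0);
            // update supercar attribute
            NamedNodeMap attr = supercar.getAttributes();
            Node nodeAttr = attr.getNamedItem("company");
            nodeAttr.setTextContent("Lamborigini");
            // write the modified tree to the console
            // (this output step is an assumption; any further changes go before it)
            Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
            transformer.transform(new DOMSource(doc), new StreamResult(System.out));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}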
Reads an XML document from top to bottom, recognizing the tokens that
make up a well-formed XML document
Tokens are processed in the same order that they appear in the
document
Reports to the application program the nature of the tokens that the parser has
encountered, as they occur
As the tokens are identified, callback methods in the handler are invoked
with the relevant information
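The ordering guarantee above can be observed by recording each callback as it fires. This is a minimal sketch; EventLog is a made-up class name and only four of the ContentHandler callbacks are logged.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class that records the order of SAX callbacks.
class EventLog {
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void endDocument() {
                log.add("endDocument");
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        // Events are reported in document order, one per line.
        for (String event : events("<a><b/><c>t</c></a>")) {
            System.out.println(event);
        }
    }
}
```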
When to use?
You should use a SAX parser when:
You can process the XML document in a linear fashion from the top down
You are processing a very large XML document whose DOM tree would
consume too much memory. A DOM tree usually takes several times as much
memory as the XML text it represents (a figure of about ten bytes of memory
per byte of XML is often quoted)
Data is available as soon as it is seen by the parser, so SAX works well for
an XML document that arrives over a stream
Disadvantages of SAX
If you need to keep track of data the parser has seen or change the order
of items, you must write the code and store the data on your own
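To illustrate this cost, the sketch below ranks students by marks with SAX. It uses the student, rollno and marks names from the sample input in this document; the class name MarksRanking and the choice of a map as storage are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: the application, not the parser, remembers the data.
class MarksRanking {
    // Returns roll numbers ordered from highest to lowest marks.
    static List<String> rollNumbersByMarks(String xml) throws Exception {
        Map<String, Integer> marksByRollNo = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private String rollNo;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals("student")) {
                    rollNo = attributes.getValue("rollno");
                }
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("marks")) {
                    marksByRollNo.put(rollNo,
                            Integer.parseInt(text.toString().trim()));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);

        // Reordering happens after parsing, on the application's own data.
        List<String> rollNumbers = new ArrayList<>(marksByRollNo.keySet());
        rollNumbers.sort((a, b) -> marksByRollNo.get(b) - marksByRollNo.get(a));
        return rollNumbers;
    }
}
```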
ContentHandler Interface
This interface specifies the callback methods that the SAX parser uses to
notify an application program of the components of the XML document that
it has seen.
void endElement(String uri, String localName, String qName) - Called at the end of an element.
Attributes Interface
This interface specifies methods for processing the attributes connected to
an element.
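As a small illustration of the Attributes interface, the sketch below walks the attributes of every element by index using getLength(), getQName(int) and getValue(int). The class name AttributeLister and the output format are made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class that lists every attribute the parser reports.
class AttributeLister {
    static List<String> list(String xml) throws Exception {
        List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Attributes are addressed by index, from 0 to getLength() - 1.
                for (int i = 0; i < attributes.getLength(); i++) {
                    found.add(qName + "@" + attributes.getQName(i)
                            + "=" + attributes.getValue(i));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }
}
```

The Attributes object is only valid during the startElement call, so values needed later have to be copied out, as done here.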
Demo Example
Here is the input xml file we need to parse:
<?xml version="1.0"?>
<class>
<student rollno="393">
<firstname>dinkar</firstname>
<lastname>kad</lastname>
<nickname>dinkar</nickname>
<marks>85</marks>
</student>
<student rollno="493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>vinni</nickname>
<marks>95</marks>
</student>
<student rollno="593">
<firstname>jasvir</firstname>
<lastname>singn</lastname>
<nickname>jazz</nickname>
<marks>90</marks>
</student>
</class>
UserHandler.java
package com.tutorialspoint.xml;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
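import org.xml.sax.helpers.DefaultHandler;
public class UserHandler extends DefaultHandler {
// flags recording which element's text is expected next
boolean bFirstName = false;
boolean bLastName = false;
boolean bNickName = false;
boolean bMarks = false;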
@Override
public void startElement(String uri,
String localName, String qName, Attributes attributes)
throws SAXException {
if (qName.equalsIgnoreCase("student")) {
String rollNo = attributes.getValue("rollno");
System.out.println("Roll No : " + rollNo);
} else if (qName.equalsIgnoreCase("firstname")) {
bFirstName = true;
} else if (qName.equalsIgnoreCase("lastname")) {
bLastName = true;
} else if (qName.equalsIgnoreCase("nickname")) {
bNickName = true;
} else if (qName.equalsIgnoreCase("marks")) {
bMarks = true;
}
}
@Override
public void endElement(String uri,
String localName, String qName) throws SAXException {
if (qName.equalsIgnoreCase("student")) {
System.out.println("End Element :" + qName);
}
}
@Override
public void characters(char ch[],
int start, int length) throws SAXException {
if (bFirstName) {
System.out.println("First Name: "
+ new String(ch, start, length));
bFirstName = false;
} else if (bLastName) {
System.out.println("Last Name: "
+ new String(ch, start, length));
bLastName = false;
} else if (bNickName) {
System.out.println("Nick Name: "
+ new String(ch, start, length));
bNickName = false;
} else if (bMarks) {
System.out.println("Marks: "
+ new String(ch, start, length));
bMarks = false;
}
}
}
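One caveat with flag-based handlers like the one above: SAX allows a parser to deliver the text of one element in several characters() calls, in which case only the first chunk would be printed. A common alternative, sketched here under the made-up name BufferingHandler, collects the text in a buffer and reads it in endElement.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant of the handler that tolerates split text.
class BufferingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private String lastFirstName;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        text.setLength(0); // forget text seen before this start tag
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called more than once
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equalsIgnoreCase("firstname")) {
            lastFirstName = text.toString();
        }
    }

    String getLastFirstName() {
        return lastFirstName;
    }

    public static void main(String[] args) {
        // Callbacks can be invoked by hand to simulate a parser
        // that splits "Vaneet" into two chunks.
        BufferingHandler handler = new BufferingHandler();
        handler.startElement("", "firstname", "firstname", null);
        handler.characters("Van".toCharArray(), 0, 3);
        handler.characters("eet".toCharArray(), 0, 3);
        handler.endElement("", "firstname", "firstname");
        System.out.println(handler.getLastFirstName());
    }
}
```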
SAXParserDemo.java
package com.tutorialspoint.xml;
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SAXParserDemo {
public static void main(String[] args) {
try {
File inputFile = new File("input.txt");
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
UserHandler userhandler = new UserHandler();
saxParser.parse(inputFile, userhandler);
} catch (Exception e) {
e.printStackTrace();
}
}
}
Roll No : 393
First Name: dinkar
Last Name: kad
Nick Name: dinkar
Marks: 85
End Element :student
Roll No : 493
First Name: Vaneet
Last Name: Gupta
Nick Name: vinni
Marks: 95
End Element :student
Roll No : 593
First Name: jasvir
Last Name: singn
Nick Name: jazz
Marks: 90
End Element :student
Demo Example
Here is the input xml file we need to modify by appending
<Result>Pass</Result>
after each </marks> tag:
<?xml version="1.0"?>
<class>
<student rollno="393">
<firstname>dinkar</firstname>
<lastname>kad</lastname>
<nickname>dinkar</nickname>
<marks>85</marks>
</student>
<student rollno="493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>vinni</nickname>
<marks>95</marks>
</student>
<student rollno="593">
<firstname>jasvir</firstname>
<lastname>singn</lastname>
<nickname>jazz</nickname>
<marks>90</marks>
</student>
</class>
SAXModifyDemo.java
package com.tutorialspoint.xml;
import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.*;
import org.xml.sax.helpers.DefaultHandler;
try {
File inputFile = new File("input.txt");
SAXParserFactory factory =
SAXParserFactory.newInstance();
SAXModifyDemo obj = new SAXModifyDemo();
obj.childLoop(inputFile);
FileWriter filewriter = new FileWriter("newfile.xml");
for(int loopIndex = 0; loopIndex < numberLines; loopIndex++){
filewriter.write(displayText[loopIndex].toCharArray());
filewriter.write('\n');
System.out.println(displayText[loopIndex].toString());
}
filewriter.close();
}
catch (Exception e) {
e.printStackTrace(System.err);
}
}
} catch (Throwable t) {}
}
indentation += "    ";
displayText[numberLines] += '<';
displayText[numberLines] += qualifiedName;
if (attributes != null) {
int numberAttributes = attributes.getLength();
for (int loopIndex = 0; loopIndex < numberAttributes;
loopIndex++){
displayText[numberLines] += ' ';
displayText[numberLines] += attributes.getQName(loopIndex);
displayText[numberLines] += "=\"";
displayText[numberLines] += attributes.getValue(loopIndex);
displayText[numberLines] += '"';
}
}
displayText[numberLines] += '>';
numberLines++;
}
if (qualifiedName.equals("marks")) {
startElement("", "Result", "Result", null);
characters("Pass".toCharArray(), 0, "Pass".length());
endElement("", "Result", "Result");
}
}
}
<Result>
Pass
</Result>
</student>
<student rollno="493">
<firstname>
Vaneet
</firstname>
<lastname>
Gupta
</lastname>
<nickname>
vinni
</nickname>
<marks>
95
</marks>
<Result>
Pass
</Result>
</student>
<student rollno="593">
<firstname>
jasvir
</firstname>
<lastname>
singn
</lastname>
<nickname>
jazz
</nickname>
<marks>
90
</marks>
<Result>
Pass
</Result>
</student>
</class>
4) Another difference between DOM and SAX lies in where each is best used. A DOM parser
is better suited to small XML files when sufficient memory is available, while a SAX parser
is better suited to large XML files.
That's all on DOM vs SAX parsers in XML and Java. These are two of the main options for XML
parsing in Java, and choosing between DOM and SAX requires careful consideration.