Mastering Ajax

Mastering Ajax, Part 1: Introduction to Ajax
Understanding Ajax, a productive approach to building Web sites, and how it works Level: Introductory Brett McLaughlin (brett@newInstance.com), Author and Editor, O'Reilly Media Inc. 06 Dec 2005 Ajax, which consists of HTML, JavaScript technology, DHTML, and DOM, is an outstanding approach that helps you transform clunky Web interfaces into interactive Ajax applications. The author, an Ajax expert, demonstrates how these technologies work together -- from an overview to a detailed look -- to make extremely efficient Web development an easy reality. He also unveils the central concepts of Ajax, including the XMLHttpRequest object. Five years ago, if you didn't know XML, you were the ugly duckling whom nobody talked to. Eighteen months ago, Ruby came into the limelight and programmers who didn't know what was going on with Ruby weren't welcome at the water cooler. Today, if you want to get into the latest technology rage, Ajax is where it's at. However, Ajax is far more than just a fad; it's a powerful approach to building Web sites and it's not nearly as hard to learn as an entire new language. Before I dig into what Ajax is, though, let's spend just a few moments understanding what Ajax does. When you write an application today, you have two basic choices: Desktop applications
Web applications
These are both familiar; desktop applications usually come on a CD (or sometimes are downloaded from a Web site) and install completely on your computer. They might use the Internet to download updates, but the code that runs these applications resides on your desktop. Web applications -- and there's no surprise here -- run on a Web server somewhere and you access the application with your Web browser. More important than where the code for these applications runs, though, is how the applications behave and how you interact with them. Desktop applications are usually pretty fast (they're running on your computer; you're not waiting on an Internet connection), have great user interfaces (usually interacting with your operating system), and are incredibly dynamic. You can click, point, type, pull up menus and sub-menus, and cruise around, with almost no waiting around. On the other hand, Web applications are usually up-to-the-second current and they provide services you could never get on your desktop (think about Amazon.com and eBay). However, with the power of the Web comes waiting -- waiting for a server to respond, waiting for a screen to refresh, waiting for a request to come back and generate a new page. Obviously this is a bit of an oversimplification, but you get the basic idea. As you might already be suspecting, Ajax attempts to bridge the gap between the functionality and interactivity of a desktop application and the always-updated Web application. You can use dynamic user interfaces and fancier controls like you'd find on a desktop application, but it's available to you on a Web application. So what are you waiting for? Start looking at Ajax and how to turn your clunky Web interfaces into responsive Ajax applications.
Old technology, new tricks

Ajax defined By the way, Ajax is shorthand for Asynchronous JavaScript and XML (and DHTML, and so on). The phrase was coined by Jesse James Garrett of Adaptive Path (see the Resources section) and is, according to Jesse, not meant to be an acronym. When it comes to Ajax, the reality is that it involves a lot of technologies -- to get beyond the basics, you need to drill down into several different technologies (which is why I'll spend the first several articles in this series breaking apart each one of them). The good news is that you might already know a decent bit about many of these technologies -- better yet, most of these individual technologies are easy to learn -- certainly not as difficult as an entire programming language like Java or Ruby. Here are the basic technologies involved in Ajax applications: HTML is used to build Web forms and identify fields for use in the rest of your application.
JavaScript code is the core code running Ajax applications and it helps facilitate communication with server applications. DHTML, or Dynamic HTML, helps you update your forms dynamically. You'll use div, span, and other dynamic HTML elements to mark up your HTML. DOM, the Document Object Model, will be used (through JavaScript code) to work with both the structure of your HTML and (in some cases) XML returned from the server.
Let's break these down and get a better idea of what each does. I'll delve into each of these more in future articles; for now focus on becoming familiar with these components and technologies. The more familiar you are with this code, the easier it will be to move from casual knowledge about these technologies to mastering each (and really blowing the doors off of your Web application development).
The XMLHttpRequest object

The first object you want to understand is probably the one that's newest to you; it's called XMLHttpRequest. This is a JavaScript object and is created as simply as shown in Listing 1. Listing 1. Create a new XMLHttpRequest object
<script language="javascript" type="text/javascript"> var xmlHttp = new XMLHttpRequest(); </script> I'll talk more about this object in the next article, but for now realize that this is the object that handles all your server communication. Before you go forward, stop and think about that -- it's the JavaScript technology through the XMLHttpRequest object that talks to the server. That's not the normal application flow and it's where Ajax gets much of its magic. In a normal Web application, users fill out form fields and click a Submit button. Then, the entire form is sent to the server, the server passes on processing to a script (usually PHP or Java or maybe a CGI process or something similar), and when the script is done, it sends back a completely new page. That page might be HTML with a new form with some data filled in or it might be a confirmation or perhaps a page with certain options selected based on data entered in the original form. Of course, while the script or program on the server is processing and returning a new form, users have to wait. Their screen will go blank and then be redrawn as data comes back from the server. This is where low interactivity comes into play -- users don't get instant feedback and they certainly don't feel like they're working on a desktop application. Ajax essentially puts JavaScript technology and the XMLHttpRequest object between your Web form and the server. When users fill out forms, that data is sent to some JavaScript code and not directly to the server. Instead, the JavaScript code grabs the form data and sends a request to the server. While this is happening, the form on the users screen doesn't flash, blink, disappear, or stall. In other words, the JavaScript code sends the request behind the scenes; the user doesn't even realize that the request is being made. Even better, the request is sent asynchronously, which means that your JavaScript code (and the user) doesn't wait around on the server to respond. So users can continue entering data, scrolling around, and using the application. Then, the server sends data back to your JavaScript code (still standing in for the Web form) which decides what to do with that data. It can update form fields on the fly, giving that immediate feeling to your application -- users are getting new data without their form being submitted or refreshed. The JavaScript code could even get the data, perform some calculations, and send another request, all without user intervention! This is the power of XMLHttpRequest. It can talk back and forth with a server all it wants, without the user ever knowing about what's really going on. The result is a dynamic, responsive, highly-interactive experience like a desktop application, but with all the power of the Internet behind it.
Adding in some JavaScript

Once you get a handle on XMLHttpRequest, the rest of your JavaScript code turns out to be pretty mundane. In fact, you'll use JavaScript code for just a few basic tasks: Get form data: JavaScript code makes it simple to pull data out of your HTML form and send it to the server.
Change values on the form: It's also simple to update a form, from setting field values to replacing images on the fly. Parse HTML and XML: You'll use JavaScript code to manipulate the DOM (see the next section) and to work with the structure of your HTML form and any XML data that the server returns.
For those first two items, you want to be very familiar with the getElementById() method as shown in Listing 2. Listing 2. Grab and set field values with JavaScript code // Get the value of the "phone" field and stuff it in a variable called phone var phone = document.getElementById("phone").value; // Set some values on a form using an array called response document.getElementById("order").value = response[0]; document.getElementById("address").value = response[1]; There's nothing particularly remarkable here and that's good! You should start to realize that there's nothing tremendously complicated about this. Once you master XMLHttpRequest, much of the rest of your Ajax application will be simple JavaScript code like that shown in Listing 2, mixed in with a bit of clever HTML. Then, every once in a while, there's a little DOM work...so let's look at that.
Finishing off with the DOM

Last but not least, there's the DOM, the Document Object Model. For some of you, hearing about the DOM is going to be a little intimidating -- it's not often used by HTML designers and is even somewhat unusual for JavaScript coders unless you're really into some high-end programming tasks. Where you will find the DOM in use a lot is in heavy-duty Java and C/C++ programs; in fact, that's probably where the DOM got a bit of its reputation for being difficult or hard to learn. Fortunately, using the DOM in JavaScript technology is easy, and is mostly intuitive. At this point, I'd normally show you how to use the DOM or at least give you a few code examples, but even that would be misleading. You see, you can get pretty far into Ajax without having to mess with the DOM and that's the path I'm going to show you. I'll come back to the DOM in a future article, but for now, just know that it's out there. When you start to send XML back and forth between your JavaScript code and the server and really change the HTML form, you'll dig back into DOM. For now, it's easy to get some effective Ajax going without it, so put this on the back-burner for now.
Getting a Request object

With a basic overview under your belt, you're ready to look at a few specifics. Since XMLHttpRequest is central to Ajax applications -- and probably new to many of you -- I'll start there. As you saw in Listing 1, it should be pretty easy to create this object and use it, right? Wait a minute. Remember those pesky browser wars from a few years back and how nothing worked the same across browsers? Well, believe it or not, those wars are still going on albeit on a much smaller scale. And, surprise: XMLHttpRequest is one of the victims of this war. So you'll need to do a few different things to get an XMLHttpRequest object going. I'll take your through it step by step.
Working with Microsoft browsers

Microsoft's browser, Internet Explorer, uses the MSXML parser for handling XML (you can find out more about MSXML in Resources). So when you write Ajax applications that need to work on Internet Explorer, you need to create the object in a particular way. However, it's not that easy. MSXML actually has two different versions floating around depending on the version of JavaScript technology installed in Internet Explorer, so you've got to write code that handles both cases. Look at Listing 3 for the code that you need to create an XMLHttpRequest on Microsoft browsers. Listing 3. Create an XMLHttpRequest object on Microsoft browsers var xmlHttp = false; try { xmlHttp = new ActiveXObject("Msxml2.XMLHTTP"); } catch (e) { try { xmlHttp = new ActiveXObject("Microsoft.XMLHTTP"); } catch (e2) { xmlHttp = false; } } All of this won't make exact sense yet, but that's OK. You'll dig into JavaScript programming, error handling, conditional compilation, and more before this series is finished. For now, you want to get two core lines into your head:
xmlHttp = new ActiveXObject("Msxml2.XMLHTTP");
and
xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");.
In a nutshell, this code tries to create the object using one version of MSXML; if that fails, it then creates the object using the other version. Nice, huh? If neither of these work, the xmlHttp variable is set to false, to tell your code know that something hasn't worked. If that's the case, you've probably got a nonMicrosoft browser and need to use different code to do the job.
Dealing with Mozilla and non-Microsoft browsers

If Internet Explorer isn't your browser of choice or you write code for non-Microsoft browsers, then you need different code. In fact, this is the really simple line of code you saw back in Listing 1: var xmlHttp = new XMLHttpRequest object;. This much simpler line creates an XMLHttpRequest object in Mozilla, Firefox, Safari, Opera, and pretty much every other non-Microsoft browser that supports Ajax in any form or fashion.
Putting it together
The key is to support all browsers. Who wants to write an application that works just on Internet Explorer or an application that works just on non-Microsoft browsers? Worse yet, do you want to write your application twice? Of course not! So your code combines support for both Internet Explorerand nonMicrosoft browsers. Listing 4 shows the code to do just that. Listing 4. Create an XMLHttpRequest object the multi-browser way /* Create a new XMLHttpRequest object to talk to the Web server */ var xmlHttp = false; /*@cc_on @*/ /*@if (@_jscript_version >= 5) try { xmlHttp = new ActiveXObject("Msxml2.XMLHTTP"); } catch (e) { try { xmlHttp = new ActiveXObject("Microsoft.XMLHTTP"); } catch (e2) { xmlHttp = false;
} } @end @*/ if (!xmlHttp && typeof XMLHttpRequest != 'undefined') { xmlHttp = new XMLHttpRequest(); } For now, ignore the commenting and weird tags like @cc_on; those are special JavaScript compiler commands that you'll explore in depth in my next article, which will focus exclusively on XMLHttpRequest. The core of this code breaks down into three steps:
1.
2.
Create a variable, xmlHttp, to reference the XMLHttpRequest object that you will create. Try and create the object in Microsoft browsers: o Try and create the object using the Msxml2.XMLHTTP object. o If that fails, try and create the object using the Microsoft.XMLHTTP object. If xmlHttp still isn't set up, create the object in a non-Microsoft way.
3.
At the end of this process, xmlHttp should reference a valid XMLHttpRequest object, no matter what browser your users run.
A word on security
What about security? Today's browsers offer users the ability to crank their security levels up, to turn off JavaScript technology, and disable any number of options in their browser. In these cases, your code probably won't work under any circumstances. For these situations, you'll have to handle problems gracefully -- that's at least one article in itself, one I will tackle later (it's going to be a long series, isn't it? Don't worry; you'll master all of this before you're through). For now, you're writing robust, but not perfect, code, which is great for getting a handle on Ajax. You'll come back to the finer details.
Request/Response in an Ajax world

So you now understand Ajax and have a basic idea about the XMLHttpRequest object and how to create it. If you've read closely, you even realize that it's the JavaScript technology that talks to any Web application on the server rather than your HTML form being submitted to that application directly. What's the missing piece? How to actually use XMLHttpRequest. Since this is critical code that you'll use in some form in every Ajax application you write, take a quick tour through what a basic request/response model with Ajax looks like.
Making a request
You have your shiny new XMLHttpRequest object; now take it for a spin. First, you need a JavaScript method that your Web page can call (like when a user types in text or selects an option from a menu). Then, you'll follow the same basic outline in almost all of your Ajax applications: 1. Get whatever data you need from the Web form. 2. Build the URL to connect to. 3. Open a connection to the server. 4. Set up a function for the server to run when it's done. 5. Send the request. Listing 5 is a sample of an Ajax method that does these very things, in this order: Listing 5. Make a request with Ajax function callServer() { // Get the city and state from the web form var city = document.getElementById("city").value; var state = document.getElementById("state").value; // Only go on if there are values for both fields if ((city == null) || (city == "")) return; if ((state == null) || (state == "")) return; // Build the URL to connect to var url = "/scripts/getZipCode.php?city=" + escape(city) + "&state=" + escape(state); // Open a connection to the server xmlHttp.open("GET", url, true); // Setup a function for the server to run when it's done
xmlHttp.onreadystatechange = updatePage; // Send the request xmlHttp.send(null); } A lot of this is self-explanatory. The first bit of the code uses basic JavaScript code to grab the values of a few form fields. Then the code sets up a PHP script as the destination to connect to. Notice how the URL of the script is specified and then the city and state (from the form) are appended to this using simple GET parameters. Next, a connection is opened; here's the first place you see XMLHttpRequest in action again. The method of connection is indicated (GET), as well as the URL to connect to. The final parameter, when set to true, requests an asynchronous connection (thus making this Ajax). If you used false, the code would wait around on the server when the request was made and not continue until a response was received. By setting this to true, your users can still use the form (and even call other JavaScript methods) while the server is processing this request in the background. The onreadystatechange property of xmlHttp (remember, that's your instance of the XMLHttpRequest object) allows you to tell the server what to do when it does finish running (which could be in five minutes or five hours). Since the code isn't going to wait around for the server, you'll need to let the server know what to do so you can respond to it. In this case, a specific method -- called updatePage() -- will be triggered when the server is finished processing your request. Finally, send() is called with a value of null. Since you've added the data to send to the server (the city and state) in the request URL, you don't need to send anything in the request. So this fires off your request and the server can do what you asked it to do. If you don't get anything else out of this, notice how straightforward and simple this is! Other than getting the asynchronous nature of Ajax into your head, this is relatively simple stuff. You'll appreciate how it frees you up to concentrate on cool applications and interfaces rather than complicated HTTP request/response code. The code in Listing 5 is about as easy as it gets. The data is simple text and can be included as part of the request URL. GET sends the request rather than the more complicated POST. There's no XML or content headers to add, no data to send in the body of the request -- this is Ajax Utopia, in other words. Have no fear; things will become more complicated as this series progresses. You'll learn how to send POST requests, how to set request headers and content types, how to encode XML in your message, how to add security to your request -- the list is pretty long! Don't worry about the hard stuff for now; get your head around the basics, and you'll soon build up a whole arsenal of Ajax tools.
Handling the response

Now you need to actually deal with the server's response. You really only need to know two things at this point: Don't do anything until the xmlHttp.readyState property is equal to 4.
The server will stuff it's response into the xmlHttp.responseText property.
The first of these -- ready states -- is going to take up the bulk of the next article; you'll learn more about the stages of an HTTP request than you ever wanted to know. For now, if you simply check for a certain value (4), things will work (and you'll have something to look forward to in the next article). The second item -- using the xmlHttp.responseText property to get the server's response -- is easy. Listing 6 shows an example of a method that the server can call based on the values sent in Listing 5. Listing 6. Handle the server's response function updatePage() { if (xmlHttp.readyState == 4) { var response = xmlHttp.responseText; document.getElementById("zipCode").value = response; } } Again, this code isn't so difficult or complicated. It waits for the server to call it with the right ready state and then uses the value that the server returns (in this case, the ZIP code for the user-entered city and state) to set the value of another form field. The result is that the zipCode field suddenly appears with the ZIP code -- but the user never had to click a button!. That's the desktop application feel I talked about earlier. Responsiveness, a dynamic feel, and more, all with a little Ajax code. Observant readers might notice that the zipCode field is a normal text field. Once the server returns the ZIP code and the updatePage() method sets the value of that field with the city/state ZIP code, users can override the value. That's intentional for two reasons: To keep things in the example simple and to show you that sometimes you want users to be able to override what a server says. Keep both in mind; they're important in good user-interface design.
Hooking in the Web form

So what's left? Actually, not much. You have a JavaScript method that grabs information that the user put into a form, sends it to the server, provides
another JavaScript method to listen for and handle a response, and even sets the value of a field when that response comes back. All that's really left is to call that first JavaScript method and start the whole process. You could obviously add a button to your HTML form, but that's pretty 2001, don't you think? Take advantage of JavaScript technology like in Listing 7. Listing 7. Kick off an Ajax process <form> <p>City: <input type="text" name="city" id="city" size="25" onChange="callServer();" /></p> <p>State: <input type="text" name="state" id="state" size="25" onChange="callServer();" /></p> <p>Zip Code: <input type="text" name="zipCode" id="zipCode" size="5" /></p> </form> If this feels like yet one more piece of fairly routine code, then you're right -- it is! When a user puts in a new value for either the city or state field, the callServer() method fires off and the Ajax fun begins. Starting to feel like you've got a handle on things? Good; that's the idea!
In conclusion
At this point, you're probably not ready to go out and write your first Ajax application -- at least, not unless you're willing to do some real digging in the Resources section. However, you can start to get the basic idea of how these applications work and a basic understanding of the XMLHttpRequest object. In the articles to come, you'll learn to master this object, how to handle JavaScript-to-server communication, how to work with HTML forms, and even get a handle on the DOM. For now, though, spend some time thinking about just how powerful Ajax applications can be. Imagine a Web form that responds to you not just when you click a button, but when you type into a field, when you select an option from a combo box...even when you drag your mouse around the screen. Think about exactly what asynchronous means; think about JavaScript code running and not waiting on the server to respond to its requests. What sorts of problems can you run into? What areas do you watch out for? And how will the design of your forms change to account for this new approach in programming? If you spend some real time with these issues, you'll be better served than just having some code you can cut-and-paste and throw into an application that you really don't understand. In the next article, you'll put these ideas into practice and I'll give you the details on the code you need to really make applications like this work. So, until then, enjoy the possibilities of Ajax.
Mastering Ajax, Part 2: Make asynchronous requests with JavaScript and Ajax
Use XMLHttpRequest for Web requests Level: Intermediate Brett McLaughlin (brett@newInstance.com), Author and Editor, O'Reilly Media Inc. 17 Jan 2006 Most Web applications use a request/response model that gets an entire HTML page from the server. The result is a back-and-forth that usually involves clicking a button, waiting for the server, clicking another button, and then waiting some more. With Ajax and the XMLHttpRequest object, you can use a request/response model that never leaves users waiting for a server to respond. In this article, Brett McLaughlin shows you how to create XMLHttpRequest instances in a cross-browser way, construct and send requests, and respond to the server. In the last article of this series (see Resources for links), you were introduced to the Ajax applications and looked at some of the basic concepts that drive Ajax applications. At the center of this was a lot of technology that you probably already know about: JavaScript, HTML and XHTML, a bit of dynamic HTML, and even some DOM (the Document Object Model). In this article, I will zoom in from that 10,000-foot view and focus on specific Ajax details. In this article, you'll begin with the most fundamental and basic of all Ajax-related objects and programming approaches: The XMLHttpRequest object. This object is really the only common thread across all Ajax applications and -- as you might expect -- you will want to understand it thoroughly to take your programming to the limits of what's possible. In fact, you'll find out that sometimes, to use XMLHttpRequest properly, you explicitly won't use XMLHttpRequest. What in the world is that all about?
Web 2.0 at a glance

First, take this last bit of overview before you dive into code -- make sure you're crystal clear on this idea of the Web 2.0. When you hear the term Web 2.0, you should first ask, "What's Web 1.0?" Although you'll rarely hear Web 1.0, it is meant to refer to the traditional Web where you have a very distinct request and response model. For example, go to Amazon.com and click a button or enter a search term. A request is made to a server and then a response comes back to your browser. That request has a lot more than just a list of books and titles, though; it's actually another complete HTML page. As a result, you probably get some flashing or flickering as your Web browser's screen is redrawn with this new HTML page. In fact, you can clearly see the request and response, delineated by each new page you see. The Web 2.0 dispenses with this very visible back-and-forth (to a large degree). As an example, visit a site like Google Maps or Flickr (links to both of these
Web 2.0, Ajax-powered sites are in Resources). On Google Maps, for example, you can drag the map around and zoom in and zoom out with very little redrawing. Of course, requests and responses do go on here, but all behind the scenes. As a user, the experience is much more pleasant and feels a lot like a desktop application. This new feel and paradigm is what you see when someone refers to Web 2.0. What you should care about then is how to make these new interactions possible. Obviously, you've still got to make requests and field responses, but it's the redrawing of the HTML for every request/response interaction that gives the perception of a slow, clunky Web interface. So clearly you need an approach that allows you to make requests and receive responses that include only the data you need, rather than an entire HTML page as well. The only time you want to get a whole new HTML page is when ... well ... when you want the user to see a new page. But most interactions add details or change body text or overlay data on the existing pages. In all of these cases, Ajax and a Web 2.0 approach make it possible to send and receive data without updating an entire HTML page. And to any frequent Web surfer, this ability will make your application feel faster, more responsive, and bring them back over and over again.
Introducing XMLHttpRequest
To make all this flash and wonder actually happen, you need to become intimately familiar with a JavaScript object called XMLHttpRequest. This little object -which has actually been around in several browsers for quite a while -- is the key to Web 2.0, Ajax, and pretty much everything else you learn about in this column for the next several months. To give you a really quick overview, these are just a few of the methods and properties you'll use on this object: open(): Sets up a new request to a server.
send(): Sends a request to a server. abort(): Bails out of the current request. readyState: Provides the current HTML ready state. responseText: The text that the server sends back to respond to a request.
Don't worry if you don't understand all of this (or any of this for that matter) -- you'll learn about each method and property in the next several articles. What you should get out of this, though, is a good idea of what to do with XMLHttpRequest. Notice that each of these methods and properties relate to sending a request and dealing with a response. In fact, if you saw every method and property of XMLHttpRequest, they would all relate to that very simple request/response model. So clearly, you won't learn about an amazing new GUI object or some sort of super-secret approach to creating user interaction; you will work with simple requests and simple responses. It might not sound exciting, but careful use of this one object can totally change your applications.
The simplicity of new

First, you need to create a new variable and assign it to an instance of the XMLHttpRequest object. That's pretty simple in JavaScript; you just use the new keyword with the object name, like you see in Listing 1. Listing 1. Create a new XMLHttpRequest object <script language="javascript" type="text/javascript"> var request = new XMLHttpRequest(); </script> That's not too hard, is it? Remember, JavaScript doesn't require typing on its variable, so you don't need anything like you see in Listing 2 (which might be how you'd create this object in Java). Listing 2. Java pseudo-code for creating XMLHttpRequest XMLHttpRequest request = new XMLHttpRequest(); So you create a variable in JavaScript with var, give it a name (like "request"), and then assign it to a new instance of XMLHttpRequest. At that point, you're ready to use the object in your functions.
Error handling
In real life, things can go wrong and this code doesn't provide any error-handling. A slightly better approach is to create this object and have it gracefully fail if something goes wrong. For example, many older browsers (believe it or not, people are still using old versions of Netscape Navigator) don't support XMLHttpRequest and you need to let those users know that something has gone wrong. Listing 3 shows how you might create this object so if something fails, it throws out a JavaScript alert. Listing 3. Create XMLHttpRequest with some error-handling abilities
<script language="javascript" type="text/javascript"> var request = false; try { request = new XMLHttpRequest(); } catch (failed) { request = false; } if (!request) alert("Error initializing XMLHttpRequest!"); </script> Make sure you understand each of these steps:
1.
2.
Create a new variable called request and assign it a false value. You'll use false as a condition that means the XMLHttpRequest object hasn't been created yet. Add in a try/catch block:
1. 2. 3. 4.
Try and create the XMLHttpRequest object. If that fails (catch (failed)), ensure that request is still set to false.
Check and see if request is still false (if things are going okay, it won't be). If there was a problem (and request is false), use a JavaScript alert to tell users there was a problem.
This is pretty simple; it takes longer to read and write about than it does to actually understand for most JavaScript and Web developers. Now you've got an error-proof piece of code that creates an XMLHttpRequest object and even lets you know if something went wrong.
Dealing with Microsoft

This all looks pretty good ... at least until you try this code in Internet Explorer. If you do, you're going to get something that looks an awful lot like Figure 1. Figure 1. Internet Explorer reporting an error
Clearly, something isn't working; Internet Explorer is hardly an out-of-date browser and about 70 percent of the world uses Internet Explorer. In other words, you won't do well in the Web world if you don't support Microsoft and Internet Explorer! So, you need a different approach to deal with Microsoft's browsers.
Microsoft playing nice? Much has been written about Ajax and Microsoft's increasing interest and presence in that space. In fact, Microsoft's newest version of Internet Explorer -- version 7.0, set to come out late in 2006 -- is supposed to move to supporting XMLHttpRequest directly, allowing you to use the new keyword instead of all the Msxml2.XMLHTTP creation code. Don't get too excited, though; you'll still need to support old browsers, so that cross-browser code isn't going away anytime soon.
It turns out that Microsoft supports Ajax, but calls its version of XMLHttpRequest something different. In fact, it calls it several different things. If you're using a newer version of Internet Explorer, you need to use an object called Msxml2.XMLHTTP; some older versions of Internet Explorer use Microsoft.XMLHTTP. You need to support these two object types (without losing the support you already have for non-Microsoft browsers). Check out Listing 4 which adds Microsoft support to the code you've already seen. Listing 4. Add support for Microsoft browsers <script language="javascript" type="text/javascript"> var request = false; try { request = new XMLHttpRequest(); } catch (trymicrosoft) { try { request = new ActiveXObject("Msxml2.XMLHTTP"); } catch (othermicrosoft) { try { request = new ActiveXObject("Microsoft.XMLHTTP"); } catch (failed) { request = false; } } } if (!request) alert("Error initializing XMLHttpRequest!"); </script> It's easy to get lost in the curly braces, so I'll walk you through this one step at a time:
1.
2.
Create a new variable called request and assign it a false value. Use false as a condition that means the XMLHttpRequest object isn't created yet. Add in a try/catch block:
1. 2.
Try and create the XMLHttpRequest object. If that fails (catch (trymicrosoft)):
1. 2. 3. 3. 4.
Try and create a Microsoft-compatible object using the newer versions of Microsoft (Msxml2.XMLHTTP). If that fails (catch (othermicrosoft)), try and create a Microsoft-compatible object using the older versions of Microsoft (Microsoft.XMLHTTP).
If that fails (catch (failed)), ensure that request is still set to false.
Check and see if request is still false (if things are okay, it won't be). If there was a problem (and request is false), use a JavaScript alert to tell users there was a problem.
Make these changes to your code and try things out in Internet Explorer again; you should see the form you created (without an error message). In my case, that results in something like Figure 2. Figure 2. Internet Explorer working normally
Static versus dynamic

Take a look back at Listings 1, 3, and 4 and notice that all of this code is nested directly within script tags. When JavaScript is coded like that and not put within a method or function body, it's called static JavaScript. This means that the code is run sometime before the page is displayed to the user. (It's not 100 percent clear from the specification precisely when this code runs and browsers do things differently; still, you're guaranteed that the code is run before users can interact with your page.) That's usually how most Ajax programmers create the XMLHttpRequest object. That said, you certainly can put this code into a method as shown in Listing 5. Listing 5. Move XMLHttpRequest creation code into a method <script language="javascript" type="text/javascript"> var request; function createRequest() { try { request = new XMLHttpRequest(); } catch (trymicrosoft) { try { request = new ActiveXObject("Msxml2.XMLHTTP"); } catch (othermicrosoft) { try { request = new ActiveXObject("Microsoft.XMLHTTP"); } catch (failed) { request = false; } } } if (!request) alert("Error initializing XMLHttpRequest!"); }
</script> With code setup like this, you'll need to call this method before you do any Ajax work. So you might have something like Listing 6. Listing 6. Use an XMLHttpRequest creation method <script language="javascript" type="text/javascript"> var request; function createRequest() { try { request = new XMLHttpRequest(); } catch (trymicrosoft) { try { request = new ActiveXObject("Msxml2.XMLHTTP"); } catch (othermicrosoft) { try { request = new ActiveXObject("Microsoft.XMLHTTP"); } catch (failed) { request = false; } } } if (!request) alert("Error initializing XMLHttpRequest!"); } function getCustomerInfo() { createRequest(); // Do something with the request variable } </script> The only concern with this code -- and the reason most Ajax programmers don't use this approach -- is that it delays error notification. Suppose you have a complex form with 10 or 15 fields, selection boxes, and the like, and you fire off some Ajax code when the user enters text in field 14 (way down the form). At that point, getCustomerInfo() runs, tries to create an XMLHttpRequest object, and (for this example) fails. Then an alert is spit out to the user, telling them (in so many words) that they can't use this application. But the user has already spent time entering data in the form! That's pretty annoying and annoyance is not something that typically entices users back to your site. In the case where you use static JavaScript, the user is going to get an error as soon as they hit your page. Is that also annoying? Perhaps; it could make users mad that your Web application won't run on their browser. However, it's certainly better than spitting out that same error after they've spent 10 minutes entering information. For that reason alone, I encourage you to set up your code statically and let users know early on about possible problems.
Sending requests with XMLHttpRequest

Once you have your request object, you can begin the request/response cycle. Remember, XMLHttpRequest's only purpose is to allow you to make requests and receive responses. Everything else -- changing the user interface, swapping out images, even interpreting the data that the server sends back -- is the job of JavaScript, CSS, or other code in your pages. With XMLHttpRequest ready for use, now you can make a request to a server.
Welcome to the sandbox

Ajax has a sandbox security model. As a result, your Ajax code (and specifically, the XMLHttpRequest object) can only make requests to the same domain on which it's running. You'll learn lots more about security and Ajax in an upcoming article, but for now realize that code running on your local machine can only make requests to server-side scripts on your local machine. If you have Ajax code running on www.breakneckpizza.com, it must make requests to scripts that run on www.breakneckpizza.com.
Setting the server URL

The first thing you need to determine is the URL of the server to connect to. This isn't specific to Ajax -- obviously you should know how to construct a URL by now -- but is still essential to making a connection. In most applications, you'll construct this URL from some set of static data combined with data from the form your users work with. For example, Listing 7 shows some JavaScript that grabs the value of the phone number field and then constructs a URL using that data.
Listing 7. Build a request URL <script language="javascript" type="text/javascript"> var request = false; try { request = new XMLHttpRequest(); } catch (trymicrosoft) { try { request = new ActiveXObject("Msxml2.XMLHTTP"); } catch (othermicrosoft) { try { request = new ActiveXObject("Microsoft.XMLHTTP"); } catch (failed) { request = false; } } } if (!request) alert("Error initializing XMLHttpRequest!"); function getCustomerInfo() { var phone = document.getElementById("phone").value; var url = "/cgi-local/lookupCustomer.php?phone=" + escape(phone); } </script> Nothing here should trip you up. First, the code creates a new variable named phone and assigns the value of the form field with an ID of "phone." Listing 8 shows the XHTML for this particular form in which you can see the phone field and its id attribute. Listing 8. The Break Neck Pizza form <body> <p><img src="breakneck-logo_4c.gif" alt="Break Neck Pizza" /></p> <form action="POST"> <p>Enter your phone number: <input type="text" size="14" name="phone" id="phone" onChange="getCustomerInfo();" /> </p> <p>Your order will be delivered to:</p> <div id="address"></div> <p>Type your order in here:</p> <p><textarea name="order" rows="6" cols="50" id="order"></textarea></p> <p><input type="submit" value="Order Pizza" id="submit" /></p> </form> </body> Also notice that when users enter their phone number or change the number, it fires off the getCustomerInfo() method shown in Listing 8. That method then grabs the number and uses it to construct a URL string stored in the url variable. Remember: Since Ajax code is sandboxed and can only connect to the same domain, you really shouldn't need a domain name in your URL. In this example, the script name is /cgi-local/lookupCustomer.php. Finally, the phone number is appended to this script as a GET parameter: "phone=" + escape(phone). If you've never seen the escape() method before, it's used to escape any characters that can't be sent as clear text correctly. For example, any spaces in the phone number are converted to %20 characters, making it possible to pass the characters along in the URL. You can add as many parameters as you need. For example, if you wanted to add another parameter, just append it onto the URL and separate parameters with the ampersand (&) character [the first parameter is separated from the script name with a question mark ( ?)].
Opening the request
With a URL to connect to, you can configure the request. You'll accomplish this using the open() method on your XMLHttpRequest object. This method takes as many as five parameters:
Does open() open? Internet developers disagree about what exactly the open() method does. What it does not do is actually open a request. If you were to monitor the network and data transfer between your XHTML/Ajax page and the script that it connects to, you wouldn't see any traffic when the open() method is called. It's unclear why the name was chosen, but it clearly wasn't a great choice.
request-type: The type of request to send. Typical values are GET or POST, but you can also send HEAD requests. url: The URL to connect to. asynch: True if you want the request to be asynchronous and false if it should be a synchronous request. This parameter is optional and defaults to true. username: If authentication is required, you can specify the username here. This is an optional parameter and has no default value. password: If authentication is required, you can specify the password here. This is an optional parameter and has no default value.
Typically, you'll use the first three of these. In fact, even when you want an asynchronous request, you should specify "true" as the third parameter. That's the default setting, but it's a nice bit of self-documentation to always indicate if the request is asynchronous or not. Put it all together and you usually end up with a line that looks a lot like Listing 9. Listing 9. Open the request function getCustomerInfo() { var phone = document.getElementById("phone").value; var url = "/cgi-local/lookupCustomer.php?phone=" + escape(phone); request.open("GET", url, true); } Once you have the URL figured out, then this is pretty trivial. For most requests, using GET is sufficient (you'll see the situations in which you might want to use POST in future articles); that, along with the URL, is all you need to use open().
A teaser on asynchronicity
In a later article in this series, I'll spend significant time on writing and using asynchronous code, but you should get an idea of why that last parameter in open() is so important. In a normal request/response model -- think Web 1.0 here -- the client (your browser or the code running on your local machine) makes a request to the server. That request is synchronous; in other words, the client waits for a response from the server. While the client is waiting, you usually get at least one of several forms of notification that you're waiting: An hourglass (especially on Windows).
A spinning beachball (usually on Mac machines). The application essentially freezes and sometimes the cursor changes.
This is what makes Web applications in particular feel clunky or slow -- the lack of real interactivity. When you push a button, your application essentially becomes unusable until the request you just triggered is responded to. If you've made a request that requires extensive server processing, that wait might be significant (at least for today's multi-processor, DSL, no-waiting world). An asynchronous request though, does not wait for the server to respond. You send a request and then your application continues on. Users can still enter data in a Web form, click other buttons, even leave the form. There's no spinning beachball or whirling hourglass and no big application freeze. The server quietly responds to the request and when it's finished, it let's the original requestor know that it's done (in ways you'll see in just a moment). The end result is an application that doesn't feel clunky or slow, but instead is responsive, interactive, and feels faster. This is just one component of Web 2.0, but it's a very important one. All the slick GUI components and Web design paradigms can't overcome a slow, synchronous request/response model.
Sending the request

Once you configure the request with open(), you're ready to send the request. Fortunately, the method for sending a request is named more properly than open(); it's simply called send(). send() takes only a single parameter, the content to send. But before you think too much on that, recall that you are already sending data through the URL itself: var url = "/cgi-local/lookupCustomer.php?phone=" + escape(phone); Although you can send data using send(), you can also send data through the URL itself. In fact, in GET requests (which will constitute as much as 80 percent of your typical Ajax usage), it's much easier to send data in the URL. When you start to send secure information or XML, then you want to look at sending content through send() (I'll discuss both secure data and XML messaging in a later article in this series). When you don't need to pass data along through send(), then just pass null as the argument to this method. So, to send a request in the example you've seen throughout this article, that's exactly what is needed (see Listing 10).
Listing 10. Send the request function getCustomerInfo() { var phone = document.getElementById("phone").value; var url = "/cgi-local/lookupCustomer.php?phone=" + escape(phone); request.open("GET", url, true); request.send(null); }
Specifying a callback method

At this point, you've done very little that feels new, revolutionary, or asynchronous. Granted, that little keyword "true" in the open() method sets up an asynchronous request. But other than that, this code resembles programming with Java servlets and JSPs, PHP, or Perl. So what's the big secret to Ajax and Web 2.0? The secret revolves around a simple property of XMLHttpRequest called onreadystatechange. First, be sure you understand the process that you created in this code (review Listing 10 if you need to). A request is set up and then made. Additionally, because this is a synchronous request, the JavaScript method ( getCustomerInfo() in the example) will not wait for the server. So the code will continue; in this case, that means that the method will exit and control will return to the form. Users can keep entering information and the application isn't going to wait on the server. This creates an interesting question, though: What happens when the server has finished processing the request? The answer, at least as the code stands right now, is nothing! Obviously, that's not good, so the server needs to have some type of instruction on what to do when it's finished processing the Referencing a function in JavaScript JavaScript is a loosely typed language and you can reference just about anything as a variable. So if you declare a function called updatePage(), JavaScript also treats that function name as a variable. In other words, you can reference the function in your code as a variable named updatePage. request sent to it by XMLHttpRequest. This is where that onreadystatechange property comes into play. This property allows you to specify a callback method. A callback allows the server to (can you guess?) call back into your Web page's code. It gives a degree of control to the server, as well; when the server finishes a request, it looks in the XMLHttpRequest object and specifically at the onreadystatechange property. Whatever method is specified by that property is then invoked. It's a callback because the server initiates calling back into the Web page -- regardless of what is going in the Web page itself. For example, it might call this method while the user is sitting in her chair, not touching the keyboard; however, it might also call the method while the user is typing, moving the mouse, scrolling, clicking a button ... it doesn't matter what the user is doing. This is actually where the asynchronicity comes into play: The user operates the form on one level while on another level, the server answers a request and then fires off the callback method indicated by the onreadystatechange property. So you need to specify that method in your code as shown in Listing 11. Listing 11. Set a callback method function getCustomerInfo() { var phone = document.getElementById("phone").value; var url = "/cgi-local/lookupCustomer.php?phone=" + escape(phone); request.open("GET", url, true); request.onreadystatechange = updatePage; request.send(null); } Pay close attention to where in the code this property is set -- it's before send() is called. You must set this property before the request is sent, so the server can look up the property when it finishes answering a request. All that's left now is to code the updatePage() which is the focus of the last section in this article.
Handling server responses

You made your request, your user is happily working in the Web form (while the server handles the request), and now the server finishes up handling the request. The server looks at the onreadystatechange property and figures out what method to call. Once that occurs, you can think of your application as any other app, asynchronous or not. In other words, you don't have to take any special action writing methods that respond to the server; just change the form,
take the user to another URL, or do whatever else you need to in response to the server. In this section, we'll focus on responding to the server and then taking a typical action -- changing on the fly part of the form the user sees.
Callbacks and Ajax

You've already seen how to let the server know what to do when it's finished: Set the onreadystatechange property of the XMLHttpRequest object to the name of the function to run. Then, when the server has processed the request, it will automatically call that function. You also don't need to worry about any parameters to that method. You'll start with a simple method like in Listing 12. Listing 12. Code the callback method <script language="javascript" type="text/javascript"> var request = false; try { request = new XMLHttpRequest(); } catch (trymicrosoft) { try { request = new ActiveXObject("Msxml2.XMLHTTP"); } catch (othermicrosoft) { try { request = new ActiveXObject("Microsoft.XMLHTTP"); } catch (failed) { request = false; } } } if (!request) alert("Error initializing XMLHttpRequest!"); function getCustomerInfo() { var phone = document.getElementById("phone").value; var url = "/cgi-local/lookupCustomer.php?phone=" + escape(phone); request.open("GET", url, true); request.onreadystatechange = updatePage; request.send(null); } function updatePage() { alert("Server is done!"); } </script> This just spits out a handy alert, to tell you when the server is done. Try this code in your own page, save the page, and then pull it up in a browser (if you want the XHTML from this example, refer back to Listing 8). When you enter in a phone number and leave the field, you should see the alert pop up (see Figure 3); but click OK and it pops up again ... and again. Figure 3. Ajax code popping up an alert
Depending on your browser, you'll get two, three, or even four alerts before the form stops popping up alerts. So what's going on? It turns out that you haven't taken into account the HTTP ready state, an important component of the request/response cycle.
HTTP ready states

Earlier, I said that the server, once finished with a request, looks up what method to call in the onreadystatechange property of XMLHttpRequest. That's true, but it's not the whole truth. In fact, it calls that method every time the HTTP ready state changes. So what does that mean? Well, you've got to understand HTTP ready states first. An HTTP ready state indicates the state or status of a request. It's used to figure out if a request has been started, if it's being answered, or if the request/response model has completed. It's also helpful in determining whether it's safe to read whatever response text or data that a server might have supplied. You need to know about five ready states in your Ajax applications: 0: The request is uninitialized (before you've called open()).
1: The request is set up, but hasn't been sent (before you've called send()). 2: The request was sent and is being processed (you can usually get content headers from the response at this point). 3: The request is being processed; often some partial data is available from the response, but the server hasn't finished with its response. 4: The response is complete; you can get the server's response and use it.
As with almost all cross-browser issues, these ready states are used somewhat inconsistently. You might expect to always see the ready state move from 0 to 1 to 2 to 3 to 4, but in practice, that's rarely the case. Some browsers never report 0 or 1 and jump straight to 2, then 3, and then 4. Other browsers report all states. Still others will report ready state 1 multiple times. As you saw in the last section, the server called updatePage() several times and each invocation resulted in an alert box popping up -- probably not what you intended! For Ajax programming, the only state you need to deal with directly is ready state 4, indicating that a server's response is complete and it's safe to check the response data and use it. To account for this, the first line in your callback method should be as shown in Listing 13. Listing 13. Check the ready state function updatePage() { if (request.readyState == 4) alert("Server is done!"); } This change checks to ensure that the server really is finished with the process. Try running this version of the Ajax code and you should only get the alert message one time, which is as it should be.
HTTP status codes
Despite the apparent success of the code in Listing 13, there's still a problem -- what if the server responds to your request and finishes processing, but reports an error? Remember, your server-side code should care if it's being called by Ajax, a JSP, a regular HTML form, or any other type of code; it only has the traditional Web-specific methods of reporting information. And in the Web world, HTTP codes can deal with the various things that might happen in a request. For example, you've certainly entered a request for a URL, typed the URL incorrectly, and received a 404 error code to indicate a page is missing. This is just one of many status codes that HTTP requests can receive as a status (see Resources for a link to the complete list of status codes). 403 and 401, both indicating secure or forbidden data being accessed, are also common. In each of these cases, these are codes that result from a completed response. In other words, the server fulfilled the request (meaning the HTTP ready state is 4), but is probably not returning the data expected by the client. In addition to the ready state then, you also need to check the HTTP status. You're looking for a status code of 200 which simply means okay. With a ready state of 4 and a status code of 200, you're ready to process the server's data and that data should be what you asked for (and not an error or other problematic piece of information). Add another status check to your callback method as shown in Listing 14. Listing 14. Check the HTTP status code function updatePage() { if (request.readyState == 4) if (request.status == 200) alert("Server is done!"); } To add more robust error handling -- with minimal complication -- you might add a check or two for other status codes; check out the modified version of updatePage() in Listing 15. Listing 15. Add some light error checking function updatePage() { if (request.readyState == 4) if (request.status == 200) alert("Server is done!"); else if (request.status == 404) alert("Request URL does not exist"); else alert("Error: status code is " + request.status); } Now change the URL in your getCustomerInfo() to a non-existent URL and see what happens. You should see an alert that tells you the URL you asked for doesn't exist -- perfect! This is hardly going to handle every error condition, but it's a simple change that covers 80 percent of the problems that can occur in a typical Web application.
Reading the response text

Now that you made sure the request was completely processed (through the ready state) and the server gave you a normal, okay response (through the status code), you can finally deal with the data sent back by the server. This is conveniently stored in the responseText property of the XMLHttpRequest object. Details about what the text in responseText looks like, in terms of format or length, is left intentionally vague. This allows the server to set this text to virtually anything. For instance, one script might return comma-separated values, another pipe-separated values (the pipe is the | character), and another may return one long string of text. It's all up to the server. In the case of the example used in this article, the server returns a customer's last order and then their address, separated by the pipe symbol. The order and address are both then used to set values of elements on the form; Listing 16 shows the code that updates the display. Listing 16. Deal with the server's response function updatePage() { if (request.readyState == 4) { if (request.status == 200) { var response = request.responseText.split("|"); document.getElementById("order").value = response[0]; document.getElementById("address").innerHTML = response[1].replace(/\n/g, " "); } else alert("status is " + request.status);
} } First, the responseText is pulled and split on the pipe symbol using the JavaScript split() method. The resulting array of values is dropped into response. The first value -- the customer's last order -- is accessed in the array as response[0] and is set as the value of the field with an ID of "order." The second value in the array, at response[1], is the customer's address and it takes a little more processing. Since the lines in the address are separated by normal line separators (the "\n" character), the code needs to replace these with XHTML-style line separators, <br />s. That's accomplished through the use of the replace() function along with a regular expression. Finally, the modified text is set as the inner HTML of a div in the HTML form. The result is that the form suddenly is updated with the customer's information, as you can see in Figure 4. Figure 4. The Break Neck form after it retrieves customer data
Before I wrap up, another important property of XMLHttpRequest is called responseXML. That property contains (can you guess?) an XML response in the event that the server chooses to respond with XML. Dealing with an XML response is quite different than dealing with plain text and involves parsing, the Document Object Model (DOM), and several other considerations. You'll learn more about XML in a future article. Still, because responseXML commonly comes up in discussions surrounding responseText, it's worth mentioning here. For many simple Ajax applications, responseText is all you need, but you'll soon learn about dealing with XML through Ajax applications as well.
In conclusion
You might be a little tired of XMLHttpRequest -- I rarely read an entire article about a single object, especially one that is this simple. However, you will use this object over and over again in each page and application that you write that uses Ajax. Truth be told, there's quite a bit still to be said about XMLHttpRequest. In coming articles, you'll learn to use POST in addition to GET in your requests, set and read content headers in your request as well as the response from the server; you'll understand how to encode your requests and even handle XML in your request/response model. Quite a bit further down the line, you'll also see some of the popular Ajax toolkits that are available. These toolkits actually abstract away most of the details discussed in this article and make Ajax programming easier. You might even wonder why you have to code all this low-level detail when toolkits are so readily available. The answer is, it's awfully hard to figure out what goes wrong in your application if you don't understand what is going on in your application. So don't ignore these details or speed through them; when your handy-dandy toolkit creates an error, you won't be stuck scratching your head and sending an email to support. With an understanding of how to use XMLHttpRequest directly, you'll find it easy to debug and fix even the strangest problems. Toolkits
are fine unless you count on them to take care of all your problems. So get comfortable with XMLHttpRequest. In fact, if you have Ajax code running that uses a toolkit, try to rewrite it using just the XMLHttpRequest object and its properties and methods. It will be a great exercise and probably help you understand what's going on a lot better. In the next article, you'll dig even deeper into this object, exploring some of its tricker properties (like responseXML), as well as how to use POST requests and send data in several different formats. So start coding and check back here in about a month.
Mastering Ajax, Part 3: Advanced requests and responses in Ajax

Gain a complete understanding of HTTP status codes, ready states, and the XMLHttpRequest object Level: Introductory Brett McLaughlin (brett@newInstance.com), Author and Editor, O'Reilly Media Inc. 14 Feb 2006 For many Web developers, making simple requests and receiving simple responses is all they'll ever need, but for developers who want to master Ajax, a complete understanding of HTTP status codes, ready states, and the XMLHttpRequest object is required. In this article, Brett McLaughlin will show you the different status codes and demonstrate how browsers handle each and he will showcase the lesser-used HTTP requests that you can make with Ajax. In the last article in this series, I provided a solid introduction to the XMLHttpRequest object, the centerpiece of an Ajax application that handles requests to a server-side application or script, and also deals with return data from that server-side component. Every Ajax application uses the XMLHttpRequest object, so you'll want to be intimately familiar with it to make your Ajax applications perform and perform well. In this article, I move beyond the basics in the last article and concentrate on more detail about three key parts of this request object: The HTTP ready state
The HTTP status code The types of requests that you can make XMLHttpRequest or XMLHttp: A rose by any other name Microsoft and Internet Explorer use an object called XMLHttp instead of the XMLHttpRequest object used by Mozilla, Opera, Safari, and most nonMicrosoft browsers. For the sake of simplicity, I refer to both of these object types simply as XMLHttpRequest. This matches the common practice you'll find all over the Web and is also in line with Microsoft's intentions of using XMLHttpRequest as the name of their request object in Internet Explorer 7.0. (For more on this, look at Part 2 again.)
Each of these is generally considered part of the plumbing of a request; as a result, little detail is recorded about these subjects. However, you will need to be fluent in ready states, status codes, and requests if you want to do more than just dabble in Ajax programming. When something goes wrong in your application -- and things always go wrong -- understanding ready states, how to make a HEAD request, or what a 400 status code means can make the difference between five minutes of debugging and five hours of frustration and confusion. I'll look at HTTP ready states first.
Digging deeper into HTTP ready states

You should remember from the last article that the XMLHttpRequest object has a property called readyState. This property ensures that a server has completed a request and typically, a callback function uses the data from the server to update a Web form or page. Listing 1 shows a simple example of this (also in the last article in this series -- see Resources). Listing 1. Deal with a server's response in a callback function function updatePage() { if (request.readyState == 4) { if (request.status == 200) { var response = request.responseText.split("|"); document.getElementById("order").value = response[0]; document.getElementById("address").innerHTML = response[1].replace(/\n/g, "<br />"); } else alert("status is " + request.status); } } This is definitely the most common (and most simple) usage of ready states. As you might guess from the number "4," though, there are several other ready states (you also saw this list in the last article -- see Resources): 0: The request is uninitialized (before you've called open()).
1: The request is set up, but not sent (before you've called send()). 2: The request was sent and is in process (you can usually get content headers from the response at this point). 3: The request is in process; often some partial data is available from the response, but the server isn't finished with its response. 4: The response is complete; you can get the server's response and use it.
If you want to go beyond the basics of Ajax programming, you need to know not only these states, but when they occur and how you can use them. First and foremost, you need to learn at what state of a request you encounter each ready state. Unfortunately, this is fairly non-intuitive and also involves a few special cases.
Ready states in hiding

The first ready state, signified by the readyState property 0 (readyState == 0), represents an uninitialized request. As soon as you call open() on your request object, this property is set to 1. Because you almost always call open() as soon as you initialize your request, it's rare to see readyState == 0. Furthermore, the uninitialized ready state is pretty useless in practical applications. Still, in the interest of being complete, check out Listing 2 which shows how to get the ready state when it's set to 0. Listing 2. Get a 0 ready state function getSalesData() { // Create a request object createRequest(); alert("Ready state is: " + request.readyState); // Setup (initialize) the request var url = "/boards/servlet/UpdateBoardSales"; request.open("GET", url, true); request.onreadystatechange = updatePage; request.send(null); } In this simple example, getSalesData() is the function that your Web page calls to start a request (like when a button is clicked). Note that you've got to check the ready state before open() is called. Figure 1 shows the result of running this application. Figure 1. A ready state of 0
Obviously, this doesn't do you much good; there are very few times when you'll need to make sure that open() hasn't been called. The only use for this ready state in almost-real-world Ajax programming is if you make multiple requests using the same XMLHttpRequest object across multiple functions. In that (rather unusual) situation, you might want to ensure that a request object is in an uninitialized state (readyState == 0) before making a new requests. This essentially ensures that another function isn't using the object at the same time.
When 0 is equal to 4 In the use case where multiple JavaScript functions use the same request object, checking for a ready state of 0 to ensure that the request object isn't in use can still turn out to be problematic. Since readyState == 4 indicates a completed request, you'll often find request objects that are not being used with their ready state still set at 4 -- the data from the server was used, but nothing has occurred since then to reset the ready state. There is a function that resets a request object called abort(), but it's not really intended for this use. If you have to use multiple functions, it might be better to create and use a request object for each function rather than to share the object across multiple functions.
Viewing an in-progress request's ready state

Aside from the 0 ready state, your request object should go through each of the other ready states in a typical request and response, and finally end up with a ready state of 4. That's when the if (request.readyState == 4) line of code you see in most callback functions comes in; it ensures the server is done and it's safe to update a Web page or take action based on data from the server. It is a trivial task to actually see this process as it takes place. Instead of only running code in your callback if the ready state is 4, just output the ready state every time that your callback is called. For an example of code that does this, check out Listing 3. Listing 3. Check the ready state function updatePage() { // Output the current ready state alert("updatePage() called with ready state of " + request.readyState); } If you're not sure how to get this running, you'll need to create a function to call from your Web page and have it send a request to a server-side component (just such a function was shown in Listing 2 and throughout the examples in both the first and second articles in this series). Make sure that when you set up your request, you set the callback function to updatePage(); to do this, set the onreadystatechange property of your request object to updatePage(). This code is a great illustration of exactly what onreadystatechange means -- every time the request's ready state changes, updatePage() is called and you see an alert. Figure 2 shows a sample of this function being called, in this case with a ready state of 1. Figure 2. A ready state of 1
Try this code yourself. Put it into your Web page and then activate your event handler (click a button, tab out of a field, or use whatever method you set up to trigger a request). Your callback function will run several times -- each time the ready state of the request changes -- and you'll see an alert for each ready state. This is the best way to follow a request through each of its stages.
Browser inconsistencies
Once you've a basic understanding of this process, try to access your Web page from several different browsers. You should notice some inconsistencies in
how these ready states are handled. For example, in Firefox 1.5, you see the following ready states: 1
2 3 4
This shouldn't be a surprise since each stage of a request is represented here. However, if you access the same application using Safari, you should see -- or rather, not see -- something interesting. Here are the states you see on Safari 2.0.1: 2
3 4
Safari actually leaves out the first ready state and there's no sensible explanation as to why; it's simply the way Safari works. It also illustrates an important point: While it's a good idea to ensure the ready state of a request is 4 before using data from the server, writing code that depends on each interim ready state is a sure way to get different results on different browsers . For example, when using Opera 8.5, things are even worse with displayed ready states: 3
Last but not least, Internet Explorer responds with the following states: 1
2 3 4
If you have trouble with a request, this is the very first place to look for problems. Add an alert to show you the request's ready state so you can ensure that things are operating normally. Better yet, test on both Internet Explorer and Firefox -- you'll get all four ready states and be able to check each stage of the request. Next I look at the response side.
Response data under the microscope

Once you understand the various ready states that occur during a request, you're ready to look at another important piece of the XMLHttpRequest object -the responseText property. Remember from the last article that this is the property used to get data from the server. Once a server has finished processing a request, it places any data that is needed to respond to the request back in the responseText of the request. Then, your callback function can use that data, as seen in Listing 1 and Listing 4. Listing 4. Use the response from the server function updatePage() { if (request.readyState == 4) { var newTotal = request.responseText; var totalSoldEl = document.getElementById("total-sold"); var netProfitEl = document.getElementById("net-profit"); replaceText(totalSoldEl, newTotal); /* Figure out the new net profit */ var boardCostEl = document.getElementById("board-cost"); var boardCost = getText(boardCostEl); var manCostEl = document.getElementById("man-cost"); var manCost = getText(manCostEl); var profitPerBoard = boardCost - manCost; var netProfit = profitPerBoard * newTotal; /* Update the net profit on the sales form */ netProfit = Math.round(netProfit * 100) / 100; replaceText(netProfitEl, netProfit); } Listing 1 is fairly simple; Listing 4 is a little more complicated, but to begin, both check the ready state and then grab the value (or values) in the responseText property.
Viewing the response text during a request

Like the ready state, the value of the responseText property changes throughout the request's life cycle. To see this in action, use code like that shown in Listing 5 to test the response text of a request, as well as its ready state. Listing 5. Test the responseText property function updatePage() { // Output the current ready state alert("updatePage() called with ready state of " + request.readyState + " and a response text of '" + request.responseText + "'"); } Now open your Web application in a browser and activate your request. To get the most out of this code, use either Firefox or Internet Explorer since those two browsers both report all possible ready states during a request. At a ready state of 2, for example, the responseText property is undefined (see Figure 3); you'd see an error if the JavaScript console was open as well. Figure 3. Response text with a ready state of 2
At ready state 3 though, the server has placed a value in the responseText property, at least in this example (see Figure 4). Figure 4. Response text with a ready state of 3
You'll find that your responses in ready state 3 vary from script to script, server to server, and browser to browser. Still, this is remains incredibly helpful in debugging your application.
Getting safe data

All of the documentation and specifications insist that it's only when the ready state is 4 that data is safe to use. Trust me, you'll rarely find a case where the data cannot be obtained from the responseText property when the ready state is 3. However, to rely on that in your application is a bad idea -- the one time you write code that depends on complete data at ready state 3 is almost guaranteed to be the time the data is incomplete. A better idea is to provide some feedback to the user that when the ready state is 3, a response is forthcoming. While using a function like alert() is obviously a bad idea -- using Ajax and then blocking the user with an alert dialog box is pretty counterintuitive -- you could update a field on your form or page as the ready state changes. For example, try to set the width of a progress indicator to 25 percent for a ready state of 1, 50 percent for a ready state of 2, 75 percent for a ready state of 3, and 100 percent (complete) when the ready state is 4. Of course, as you've seen, this approach is clever but browser-dependent. On Opera, you'll never get those first two ready states and Safari drops the first one (1). For that reason, I'll leave code like this as an exercise rather than include it in this article. Time to take a look at status codes.
A closer look at HTTP status codes

With ready states and the server's response in your bag of Ajax programming techniques, you're ready to add another level of sophistication to your Ajax applications -- working with HTTP status codes. These codes are nothing new to Ajax. They've been around on the Web for as long as there has been a Web. You've probably already seen several of these through your Web browser: 401: Unauthorized
403: Forbidden 404: Not Found
You can find more (for a complete list, see Resources). To add another layer of control and responsiveness (and particularly more robust error-handling) to your Ajax applications, then you need to check the status codes in a request and respond appropriately.
200: Everything is OK
In many Ajax applications, you'll see a callback function that checks for a ready state and then goes on to work with the data from the server response, as in Listing 6. Listing 6. Callback function that ignores the status code function updatePage() { if (request.readyState == 4) { var response = request.responseText.split("|"); document.getElementById("order").value = response[0]; document.getElementById("address").innerHTML = response[1].replace(/\n/g, "<br />"); } }
This turns out to be a short-sighted and error-prone approach to Ajax programming. If a script requires authentication and your request does not provide valid credentials, the server will return an error code like 403 or 401. However, the ready state will be set to 4 since the server answered the request (even if the answer wasn't what you wanted or expected for your request). As a result, the user is not going to get valid data and might even get a nasty error when your JavaScript tries to use non-existent server data. It takes minimal effort to ensure that the server not only finished with a request, but returned an "Everything is OK" status code. That code is "200" and is reported through the status property of the XMLHttpRequest object. To make sure that not only did the server finish with a request but that it also reported an OK status, add an additional check in your callback function as shown in Listing 7. Listing 7. Check for a valid status code function updatePage() { if (request.readyState == 4) { if (request.status == 200) { var response = request.responseText.split("|"); document.getElementById("order").value = response[0]; document.getElementById("address").innerHTML = response[1].replace(/\n/g, "<br />"); } else alert("status is " + request.status); } } With the addition of a few lines of code, you can be certain that if something does go wrong, your users will get a (questionably) helpful error message rather than seeing a page of garbled data with no explanation.
Redirection and rerouting

Before I talk in depth about errors, it's worth talking about something you probably don't have to worry about when you're using Ajax -- redirections. In HTTP status codes, this is the 300 family of status codes, including: 301: Moved permanently
302: Found (the request was redirected to another URL/URI) 305: Use Proxy (the request must use a proxy to access the resource requested)
Ajax programmers probably aren't concerned about redirections for two reasons: First, Ajax applications are almost always written for a specific server-side script, servlet, or application. For that component to disappear and move somewhere else without you, the Ajax programmer, knowing about it is pretty rare. So more often than not, you'll know that a resource has moved (because you moved it or had it moved), change the URL in your request, and never encounter this sort of result. And an even more relevant reason is: Ajax applications and requests are sandboxed. This means that the domain that serves a Web page that Edge cases and hard cases At this point, novice programmers might wonder what all this fuss is about. It's certainly true that less than 5 percent of Ajax requests require working with ready states like 2 and 3 and status codes like 403 (and in fact, it might be much closer to 1 percent or less). These cases are important and are called edge cases -- situations that occur in very unusual situations in which the oddest conditions are met. While unusual, edge cases make up about 80 percent of most users' frustrations! Typical users forget the 100 times an application worked correctly, but clearly remember the time it didn't. If you can handle the edge cases -and hard cases -- smoothly, then you'll have content users that return to your site. makes Ajax requests is the domain that must respond to those requests. So a Web page served from ebay.com cannot make an Ajax-style request to a script running on amazon.com; Ajax applications on ibm.com cannot make requests to servlets running on netbeans.org. As a result, your requests aren't able to be redirected to another server without generating a security error. In those cases, you won't get a status code at all. You'll usually just have a JavaScript error in the debug console. So, while you think about plenty of status codes, you can largely ignore the redirection codes altogether.
Errors
Once you've taken care of status code 200 and realized you can largely ignore the 300 family of status codes, the only other group of codes to worry about is the 400 family, which indicates various types of errors. Look back at Listing 7 and notice that while errors are handled, it's only a very generic error message that is output to the user. While this is a step in the right direction, it's still a pretty useless message in terms of telling the user or a programmer
working on the application what actually went wrong. First, add support for missing pages. This really shouldn't happen much in production systems, but it's not uncommon in testing for a script to move or for a programmer to enter an incorrect URL. If you can report 404 errors gracefully, you're going to provide a lot more help to confused users and programmers. For example, if a script on the server was removed and you use the code in Listing 7, you'd see a non-descript error as shown in Figure 5. Figure 5. Generic error handling
The user has no way to tell if the problem is authentication, a missing script (which is the case here), user error, or even whether something in the code caused the problem. Some simple code additions can make this error a lot more specific. Take a look at Listing 8 which handles missing scripts as well as authentication errors with a specific message. Listing 8. Check for a valid status code function updatePage() { if (request.readyState == 4) { if (request.status == 200) { var response = request.responseText.split("|"); document.getElementById("order").value = response[0]; document.getElementById("address").innerHTML = response[1].replace(/\n/g, "<br />"); } else if (request.status == 404) { alert ("Requested URL is not found."); } else if (request.status == 403) { alert("Access denied."); } else alert("status is " + request.status); } } This is still rather simple, but it does provide some additional information. Figure 6 shows the same error as in Figure 5, but this time the error-handling code gives a much better picture of what happened to the user or programmer. Figure 6. Specific error handling
In your own applications, you might consider clearing the username and password when failure occurs because of authentication and adding an error message to the screen. Similar approaches can be taken to more gracefully handle missing scripts or other 400-type errors (such as 405 for an unacceptable request method like sending a HEAD request that is not allowed, or 407 in which proxy authentication is required). Whatever choices you make, though, it begins with handling the status code returned from the server.
Additional request types

If you really want to take control of the XMLHttpRequest object, consider this one last stop -- add HEAD requests to your repertoire. In the previous two articles, I showed you how to make GET requests; in an upcoming article, you'll learn all about sending data to the server using POST requests. In the spirit of enhanced error handling and information gathering, though, you should learn how to make HEAD requests.
Making the request

Actually making a HEAD request is quite trivial; you simply call the open() method with "HEAD" instead of "GET" or "POST" as the first parameter, as shown in Listing 9. Listing 9. Make a HEAD request with Ajax function getSalesData() { createRequest(); var url = "/boards/servlet/UpdateBoardSales"; request.open("HEAD", url, true); request.onreadystatechange = updatePage; request.send(null); } When you make a HEAD request like this, the server doesn't return an actual response as it would for a GET or POST request. Instead, the server only has to return the headers of the resource which include the last time the content in the response was modified, whether the requested resource exists, and quite a few other interesting informational bits. You can use several of these to find out about a resource before the server has to process and return that resource. The easiest thing you can do with a request like this is to simply spit out all of the response headers. This gives you a feel for what's available to you through HEAD requests. Listing 10 provides a simple callback function to output all the response headers from a HEAD request. Listing 10. Print all the response headers from a HEAD request function updatePage() { if (request.readyState == 4) { alert(request.getAllResponseHeaders());
} } Check out Figure 7, which shows the response headers from a simple Ajax application that makes a HEAD request to a server. Figure 7. Response headers from a HEAD request
You can use any of these headers (from the server type to the content type) individually to provide extra information or functionality within an Ajax application.
Checking for a URL

You've already seen how to check for a 404 error when a URL doesn't exist. If this turns into a common problem -- perhaps a certain script or servlet is offline quite a bit -- you might want to check for the URL before making a full GET or POST request. To do this, make a HEAD request and then check for a 404 error in your callback function; Listing 11 shows a sample callback. Listing 11. Check to see if a URL exists function updatePage() { if (request.readyState == 4) { if (request.status == 200) { alert("URL exists"); } else if (request.status == 404) { alert("URL does not exist."); } else { alert("Status is: " + request.status); } } } To be honest, this has little value. The server has to respond to the request and figure out a response to populate the content-length response header, so you don't save any processing time. Additionally, it takes just as much time to make the request and see if the URL exists using a HEAD request as it does to make the request using GET or POST than just handling errors as shown in Listing 7. Still, sometimes it's useful to know exactly what is available; you never know when creativity will strike and you'll need the HEAD request!
Useful HEAD requests

One area where you'll find a HEAD request useful is to check the content length or even the content type. This allows you to determine if a huge amount of data will be sent back to process a request or if the server will try to return binary data instead of HTML, text, or XML (which are all three much easier to
process in JavaScript than binary data). In these cases, you just use the appropriate header name and pass it to the getResponseHeader() method on the XMLHttpRequest object. So to get the length of a response, just call request.getResponseHeader("Content-Length");. To get the content type, use request.getResponseHeader("Content-Type");. In many applications, making HEAD requests adds no functionality and might even slow down a request (by forcing a HEAD request to get data about the response and then a subsequent GET or POST request to actually get the response). However, in the event that you are unsure about a script or server-side component, a HEAD request can allow you to get some basic data without dealing with response data or needing the bandwidth to send that response.
In conclusion
For many Ajax and Web programmers, the material in this article seems fairly advanced. What is the value in making a HEAD request? What is really a case where you should handle a redirection status code explicitly in your JavaScript? These are good questions; for simple applications, the answer is that these advanced techniques are unlikely to be valuable. However, the Web is no longer a place where simple applications are tolerated; users have become more advanced, customers expect robustness and advanced error reporting, and managers are fired because an application goes down 1 percent of the time. It's your job then, to go beyond a simple application and that requires a more thorough understanding of XMLHttpRequest. If you can account for the various ready states -- and understand how they differ from browser to browser -- you'll be able to debug an application quickly. You might even come up with creative functionality based on a ready status and report on a request's status to users and customers. If you have a handle on status codes, you can set your application up to deal with a script's errors, unexpected responses, and edge cases. As a result, your application will work all of the time, rather than just in the situation where everything goes exactly right. Add to this the ability to make HEAD requests, check for the existence of a URL, and find out when a file was modified, and you can ensure that users get valid pages, are up-to-date on their information, and (most importantly) surprise them with just how robust and versatile your application is. This article isn't going to make your applications flashy, help you highlight text with fading yellow spotlights, or feel more like a desktop. While these are all strengths of Ajax (and topics we'll cover in subsequent articles), they are to some degree just icing on the cake. If you can use Ajax to build a solid foundation in which your application handles errors and problems smoothly, users will return to your site and application. Add to this the visual trickery that I'll talk about in upcoming articles and you'll have thrilled, excited, and happy customers. (Seriously, you don't want to miss the next article!)
Mastering Ajax, Part 4: Exploiting DOM for Web response

Convert HTML into an object model to make Web pages responsive and interactive Level: Introductory Brett McLaughlin (brett@newInstance.com), Author and Editor, O'Reilly Media Inc. 14 Mar 2006 The great divide between programmers (who work with back-end applications) and Web programmers (who spend their time writing HTML, CSS, and JavaScript) is long standing. However, the Document Object Model (DOM) bridges the chasm and makes working with both XML on the back end and HTML on the front end possible and an effective tool. In this article, Brett McLaughlin introduces the Document Object Model, explains its use in Web pages, and starts to explore its usage from JavaScript. Like many Web programmers, you have probably worked with HTML. HTML is how programmers start to work on a Web page; HTML is often the last thing they do as they finish up an application or site, and tweak that last bit of placement, color, or style. And, just as common as using HTML is the misconception about what exactly happens to that HTML once it goes to a browser to render to the screen. Before I dive into what you might think happens -- and why it is probably wrong -- I want you need to be clear on the process involved in designing and serving Web pages: 1. Someone (usually you!) creates HTML in a text editor or IDE. 2. You then upload the HTML to a Web server, like Apache HTTPD, and make it public on the Internet or an intranet. 3. A user requests your Web page with a browser like Firefox or Safari. 4. The user's browser makes a request for the HTML to your Web server. 5. The browser renders the page it receives from the server graphically and textually; users look at and activate the Web page. While this feels very basic, things will get interesting quickly. In fact, the tremendous amount of "stuff" that happens between steps 4 and 5 is what the focus of this article. The term "stuff" really applies too, since most programmers never really consider exactly what happens to their markup when a user's browser is asked to display it: Does the browser just read the text in the HTML and display it?
What about CSS, especially if the CSS is in an external file? And what about JavaScript -- again often in an external file? How does the browser handle these items and how does it map event handlers, functions, and styles to that textual markup?
It turns out that the answer to all these questions is the Document Object Model. So, without further ado, let's dig into the DOM.
Web programmers and markup

For most programmers, their job ends where the Web browser begins. In other words, once you drop a file of HTML into a directory on your Web server, you usually file it away as "done" and (hopefully) never really think about it again! That's a great goal when it comes to writing clean, well-organized pages, too; there's nothing wrong with wanting your markup to display what it should, across browsers, with various versions of CSS and JavaScript.
The problem is that this approach limits a programmer's understanding of what really happens in the browser. More importantly, it limits your ability to update, change, and restructure a Web page dynamically using client-side JavaScript. Get rid of that limitation, and allow even greater interaction and creativity on your Web sites.
What the programmer does

As the typical Web programmer, you probably fire up your text editor and IDE and start to enter HTML, CSS, or even JavaScript. It's easy to think about those tags and selectors and attributes only as little tasks that you do to make a site look just right. But you need to stretch your mind beyond that point -instead, realize that you are organizing your content. Don't worry; I promise this won't turn into a lecture on the beauty of markup, how you must realize the true potential of your Web page, or anything else meta-physical. What you do need to understand is exactly what your role in Web development is. When it comes to how a page looks, at best you're only able to make suggestions. When you provide a CSS stylesheet, a user can override your style choices. When you provide a font size, a user's browser can alter those sizes for the visually impaired or scale them down on massive monitors (with equally massive resolutions). Even the colors and font faces you choose are subject to the monitor of users and the fonts that users install on their systems. While it's great to do your best in styling a page, that's not the largest impact you have on a Web page. What you do absolutely control is the structure of your Web page. Your markup is unchangeable and unalterable and users can't mess with it; their browsers can only retrieve it from your Web server and display it (albeit with a style more in line with the user's tastes than your own). But the organization of that page -- whether this word is inside that paragraph or in the other div -- is solely up to you. When it comes to actually changing your page (which is what most Ajax applications focus on), it's the structure of your page that you operate on. While it's nice to change the color of a piece of text, it's much more dramatic to add text or an entire section to an existing page. No matter how the user styles that section, you work with the organization of the page itself.
What the markup does

Some additional thoughts on markup Plain text editing: Right or wrong? Plain text files are ideal for storing markup, but that doesn't hold true for editing that markup. It's perfectly acceptable to use an IDE like Macromedia DreamWeaver -or the slightly more intrusive Microsoft FrontPage -- to work on Web page markup. These environments often offer shortcuts and help in creating Web pages, especially when you're using CSS and JavaScript, each from a file external to an actual page's markup. Many folks still prefer good old Notepad or vi (I confess I'm one of those folk) and that's a great option as well. In either case, the final result is a text file full of markup. Text over the network: A good thing As already mentioned, text is a great medium for a document -- like HTML or CSS -- that is transferred over a network hundreds and thousands of times. When I say that the browser has a hard time representing text, that means specifically turning text into the visual and graphical page that users view. This has no bearing on how the browser actually retrieves a page from a Web server; in that case, text is still very much the best option. Once you realize that your markup is really about organization, you can view it differently. Rather than think that an h1 causes text to be big, black, and bold, think about an h1 as a heading. How the user sees that -- and whether they use your CSS, their own, or some combination of the two -- is a secondary consideration. Instead, realize that markup is all about providing this level of organization; a p indicates that text is in a paragraph, img denotes an image, div divides a page up into sections, and so forth. You should also be clear that style and behavior (event handlers and JavaScript) are applied to this organization after the fact. The markup has to be in place to be operated upon or styled. So, just as you might have CSS in an external file to your HTML, the organization of your markup is separate from its style, formatting, and behavior. While you can certainly change the style of an element or piece of text from JavaScript, it's more interesting to actually change the organization that your markup lays out. As long as you keep in mind that your markup only provides an organization, or framework, for your page, you're ahead of the game. And a little further on, you'll see how the browser takes all this textual organization and turns it into something much more interesting -- a set of objects, each of which can be changed, added to, or deleted.
The advantages of text markup

Before I discuss the Web browser, it's worth considering why plain text is absolutely the best choice for storing your HTML (for more on this, see Some additional thoughts on markup). Without getting into the pros and cons, simply recall that your HTML is sent across a network to a Web browser every time the page is viewed (putting caching and so forth aside for simplicity's sake). There's simply no more efficient approach to take than passing along text. Binary objects, graphical representations of the page, a re-organized chunk of markup ... all of these are harder to send across the network than plain text files. Add to that the value that a browser adds to the equation. Today's browsers allow users to change the size of text, scale images, download the CSS or JavaScript for a page (in most cases), and a lot more -- this all precludes sending any kind of graphical representation of the page to the browser. Instead, the browser needs the raw HTML so it can apply whatever processing to the page in the browser rather than trusting the server to handle that task. In the same vein, separating CSS from JavaScript and separating those from the HTML markup requires a format that is easy to, well, separate. Text files again are a great way to do just this. Last, but not least, remember that the promise of new standards like HTML 4.01 and XHTML 1.0 and 1.1 separates content (the data in your page) from
presentation and style (usually applied by CSS). For programmers to separate their HTML from their CSS, then force a browser to retrieve some representation of a page that melds these all back together, defeats much of the advantage of these standards. To keep these disparate parts separate all the way down to the browser allows the browser the most flexibility in getting the HTML from the server.
A closer look at Web browsers

For some of you, everything you've read so far may be droll review of your role in the Web development process. But when it comes to what the Web browser does, many of the most savvy Web designers and developers often don't realize what actually goes on "under the hood." I'll focus on that in this section. And don't worry, code is coming soon, but hold off on your coding impatience because really understanding exactly what a Web browser does is essential to your code working correctly.
The disadvantages of text markup

Just as text markup has terrific advantages for a designer or page creator, it also has rather significant disadvantages for a browser. Specifically, browsers have a very difficult time directly representing text markup to a user visually (for more on this, see Some additional thoughts on markup). Consider these frequent browser tasks: Apply CSS styles -- often from multiple stylesheets in external files -- to markup based on the type of element, its class, its ID, and its position in the HTML document. Apply styles and formatting based on JavaScript code -- also often in external files -- to different parts of the HTML document.
Change the value of form fields based on JavaScript code. Support visual effects like image rollovers and image swapping based on JavaScript code.
The complexity isn't in coding these tasks; it's fairly easy to do each of these things. The complexity comes from the browser actually carrying out the requested action. If markup is stored as text and, for example, you want to center the text ( text-align: center) in a p element in the center-text class, how do you accomplish that? Do you add inline styling to the text?
Do you apply the styling to the HTML text in the browser and just keep up with which content to centered or not center? Do you apply unstyled HTML and then apply format after the fact?
These very difficult questions are why few people code browsers these days. (Those who do should receive a hearty "Thanks!") Clearly, plain text isn't a great way to store HTML for the browser even though text was a good solution for getting a page's markup in the first place. Add to this the ability for JavaScript to change the structure of a page and things really get tricky. Should the browser rewrite the modified structure to disk? How can it keep up with what the current stage of the document is? Clearly, text isn't the answer. It's difficult to modify, clumsy to apply styles and behavior to, and ultimately bears little likeness to the dynamic nature of today's Web pages.
Moving to a tree view

The answer to this problem -- at least, the answer chosen by today's Web browsers -- is to use a tree structure to represent HTML. Take a look at Listing 1, a rather simple and boring HTML page represented as text markup. Listing 1. Simple HTML page in text markup <html> <head> <title>Trees, trees, everywhere</title> </head> <body> <h1>Trees, trees, everywhere</h1> <p>Welcome to a <em>really</em> boring page.</p> <div> Come again soon. <img src="come-again.gif" /> </div> </body> </html> The browser takes this and converts it into a tree-like structure, as in Figure 1. Figure 1. The tree view of Listing 1
I made a few very minor simplifications to keep this article on track. Experts in the DOM or XML will realize that whitespace can have an effect on how text in a document is represented and broken up in the Web browser's tree structure. Getting into this does little but confuse the matter, so if you know about the effect of whitespace, great; if not, read on and don't worry about it. When it becomes an issue, you'll find out all you need at that time. Other than the actual tree background, the first thing you might notice here is that everything in the tree begins with the outer-most, containing element of HTML, which is the html element. This is called the root element in keeping with the tree metaphor. So even though this is at the bottom of the tree, ialways begin here when you look at and analyze trees. If it helps, you're welcome to flip the whole thing upside down, although that does stretch the tree metaphor a bit. From the root flow out lines that show the relationships between different pieces of markup. The head and body elements are children of the html root element; title is a child of head and then the text "Trees, trees, everywhere" is a child of title. The entire tree is organized like this until the browser gets a structure similar to what you see in Figure 1.
A few additional terms

To carry on with the tree metaphor, head and body are said to be branches of html. They are branches because they in turn have children of their own. When you reach the extremities of the tree, you'll run into mostly text such as "Trees, trees, everywhere" and "really." These are often referred to leaves because they have no children of their own. You don't need to memorize all these terms and it is often easier to just visualize the tree structure when you try to figure out what a particular term means.
The value of objects

Now that you have some basic terminology in hand, it's time to focus more on those little rectangles with element names and text inside (Figure 1). Each rectangles is an object; that's where the browser resolves some of these problems with text. By using objects to represent each piece of the HTML document, it becomes very easy to change the organization, apply styles, allow JavaScript to access the document, and much more.
Object types and properties

Each possible type of markup gets its own object type. For instance, elements in your HTML are represented by an Element object type. The text in your document is represented by a Text type, attributes are represented by Attribute types, and right on down the line. So the Web browser not only gets to use an object model to represent your document -- getting away from having to deal with static text -- but can immediately tell what something is by its object type. The HTML document is parsed and turned into the set of objects like you saw in Figure 1 and then things like angle brackets and escape sequences (for example, using < for < and > for >) no longer become an issue. This makes the job of the browser, at least after it parses the input HTML, much easier. The operation to figure out whether something is an element or an attribute and then determine what to do with that type of object is simple. By using objects, the Web browser can then change those objects' properties. For example, each element object has a parent and a list of children. So adding a new child element or text is simply a matter of adding a new child to an element's list of children. These objects also have a style property, so it becomes trivial to change the style of an element or piece of text on the fly. For example, you might change the height of a div using JavaScript like this:
someDiv.style.height = "300px"; In other words, the Web browser can very easily change the appearance and structure of the tree using object properties like this. Compare this to the complicated sorts of things that the browser must do if it represented the page as text internally; every change of property or structure requires the browser to rewrite the static file, re-parse it, and re-display it on the screen. All of this becomes possible with objects. At this point, take time to pull open some of your HTML documents and sketch them out as trees. While that seems an unusual request -- especially from an article that contains very little code -- you will need to become familiar with the structure of these trees if you want to be able to manipulate them. In the process, you'll probably find some oddities. For example, consider the following situations: What happens to attributes?
What about text that is broken up with elements, like em and b? And what about HTML that isn't structured correctly (like when a closing p tag is missing)?
Once familiar with these sorts of issues, you'll understand the next few sections a lot better.
Strict is sometimes good

If you tried the exercise I just mentioned, you probably found some of the potential troubles for a tree-view of your markup (if you didn't exercise, just take my word for it!). In fact, you'll find several of these in Listing 1 and Figure 1, starting with the way the p element was broken up. If you ask the typical Web developer what the text content of the p element is, the most common answer would be, "Welcome to a really boring Web page." If you compare this with Figure 1, you'll see that this answer -- although logical -- isn't at all correct. It turns out that the p element has three different child objects and none contain all of the text "Welcome to a really boring Web page." You'll find parts of that text, like "Welcome to a " and " boring Web page", but not all of it. To understand this, remember that everything in your markup has to be turned into an object of some type. Furthermore, the order does matter! Can you imagine how users would respond to a Web browser if it showed the correct markup, but in a different ordering than you provided in your HTML? Paragraphs were sandwiched in between titles and headings, even when that's not how you organized the document yourself? Obviously, the browser must preserve the order of elements and text. In this case, the p element has three distinct parts: The text that comes before the em element
Tthe em element itself The text that comes after the em element
If you mix this ordering up, you might apply the emphasis to the wrong portion of text. To keep this all straight, the p element has three object children in the order that those things appeared in the HTML in Listing 1. Further, the emphasized text "really" isn't a child element of p; it is a child of em which is a child of p. It is very important for you to understand this concept. Even though the "really" text will probably display along with the rest of the p element's text, it still is a direct child of the em element. It can have different formatting from the rest of the p and can be moved around independently of the rest of the text. To help cement this in your mind, try to diagram the HTML in Listings 2 and 3, making sure you keep text with its correct parent (despite how that text might end up looking on screen). Listing 2. Markup with slightly tricky element nesting <html> <head> <title>This is a little tricky</title> </head> <body> <h1>Pay <u>close</u> attention, OK?</h1> <div> <p>This p really isn't <em>necessary</em>, but it makes the <span id="bold-text">structure <i>and</i> the organization</span> of the page easier to keep up with.</p> </div> </body> </html>
Listing 3. Even trickier nesting of elements <html> <head>
<title>Trickier nesting, still</title> </head> <body> <div id="main-body"> <div id="contents"> <table> <tr><th>Steps</th><th>Process</th></tr> <tr><td>1</td><td>Figure out the <em>root element</em>.</td></tr> <tr><td>2</td><td>Deal with the <span id="code">head</span> first, as it's usually easy.</td></tr> <tr><td>3</td><td>Work through the <span id="code">body</span>. Just <em>take your time</em>.</td></tr> </table> </div> <div id="closing"> This link is <em>not</em> active, but if it were, the answers to this <a href="answers.html"><img src="exercise.gif" /></a> would be there. But <em>do the exercise anyway!</em> </div> </div> </body> </html> You'll find the answers to these exercises in the GIF files tricky-solution.gif in Figure 2 and trickier-solution.gif in Figure 3 at the end of this article. Don't peek until you take the time to work these out for yourself. It will help you understand how strictly rules apply to organizing the tree and really help you in your quest to master HTML and its tree structure.
What about attributes?

Did you run across any problems as you tried to figure out what to do with attributes? As I mentioned, attributes do have their own object type, but an attribute is really not the child of the element it appears on -- nested elements and text are not at the same "level" of an attribute and you'll notice that the answers to the exercises in Listings 2 and 3 do not have attributes shown. Attributes are in fact stored in the object model that a browser uses, but they are a bit of a special case. Each element has a list of attributes available to it, separate from the list of child objects. So a div element might have a list that contained an attribute named "id" and another named "class." Keep in mind that attributes for an element must have unique names; in other words, an element cannot have two "id" or two "class" attributes. This makes the list very easy to keep up with and to access. As you'll see in the next article, you can simply call a method like getAttribute("id") to get the value of an attribute by its name. You can also add attributes and set (or reset) the value of existing attributes with similar method calls. It's also worth pointing out that the uniqueness of attribute names makes this list different than the list of child objects. A p element could have multiple em elements within it, so the list of child objects can contain duplicate items. While the list of children and the list of attributes operate similarly, one can contain duplicates (the children of an object) and one cannot (the attributes of an element object). Finally, only elements can have attributes, so text objects have no lists attached to them for storing attributes.
Sloppy HTML
Before I move on, one more topic is worth some time when it comes to how the browser converts markup to a tree representation -- how a browser deals with markup that is not well-formed. Well-formed is actually a term largely used in XML and means two basic things: Every opening tag has a matching closing tag. So every <p> is matched in the document by a </p>, every <div> by a </div>, and so forth.
The innermost opening tag is matched with the innermost closing tag, then the next innermost opening tag by the next innermost closing tag, and so forth. So <b><i>bold and italics</b></i> would be illegal because the innermost opening tag -- <i> -- is improperly matched with the innermost closing tag -- <b>. To make this well-formed, you would need to switch either the opening tag order or the closing tag order. (If you switched both, you'd still have a problem).
Study these two rules closely. They are both rules that not only increase the simple organization of a document, but they also remove ambiguity. Should bolding be applied first and then italics? Or the other way around? If it seems that this ordering and ambiguity is not a big deal, remember that CSS allows rules to override other rules so if, for example, the font for text within b elements was different than the font for within i elements, the order in which formatting was applied becomes very important. Therefore, the well-formedness of an HTML page comes into play. In cases where a browser receives a document that is not well-formed, it simply does the best it can. The resulting tree structure is at best an approximation of what the original page author intended and at worst something completely different. If you ever loaded your page in a browser and saw something completely unexpected, you might have viewed the result of a browser trying to guess what your structure should be and doing the job poorly. Of course, the fix to this is pretty simple: Make sure your documents are well-formed! If you're unclear on how to write standardized HTML like this, consult the Resources for help.
Introducing the DOM

So far you've heard that browsers turn a Web page into an object representation and maybe you've even guessed that the object representation is a DOM tree. DOM stands for Document Object Model and is a specification available from the World Wide Web Consortium (W3C) (you can check out several DOMrelated links in the Resources). More importantly though, the DOM defines the objects' types and properties that allow a browser to represent markup. (The next article in this series focuses on the specifics of using the DOM from your JavaScript and Ajax code.)
The document object

First and foremost, you will need to access the object model itself. This is remarkably easy; to use the built-in document variable in any piece of JavaScript code running on your Web page, you can write code like this: var domTree = document; Of course, this code is pretty useless in and of itself, but it does demonstrates that every Web browser makes the document object available to JavaScript code and that the object represents the complete tree of markup (Figure 1).
Everything is a node
Clearly, the document object is important, but is just the beginning. Before you can go further, you need to learn another term: node. You already know that each bit of markup is represented by an object, but it's more than just any object -- it's a specific type of object, a DOM node. The more specific types -- like text, elements, and attributes -- extend from this basic node type. So you have text nodes, element nodes, and attribute nodes. If you've programmed much in JavaScript, it should occur to you that you might already be using DOM code. If you followed this Ajax series thus far, then you've definitely used DOM code for some time now. For example, the line var number = document.getElementById("phone").value; uses the DOM to find a specific element and then to retrieve the value of that element (in this case, a form field). So even if you didn't realize it, you used the DOM every time you typed document into your JavaScript code. To refine the terms you've learned then, a DOM tree is a tree of objects, but more specifically it is a tree of node objects. In Ajax applications -- or any other JavaScript -- you can work with those nodes to create such effects as removing an element and its content, highlighting a certain piece of text, or adding a new image element. Since this all occurs on the client side (code that runs in your Web browser), these effects take place immediately without communication with the server. The end result is often an application that feels more responsive because things on the Web page change without long pauses while a request goes to a server and a response is interpreted. In most programming languages, you need to learn the actual object names for each type of node, learn the properties available, and figure out about types and casting; but none of this is necessary in JavaScript. You can simply create a variable and assign it the object you want (as you've already seen): var domTree = document; var phoneNumberElement = document.getElementById("phone"); var phoneNumber = phoneNumberElement.value; There are no types and JavaScript handles creating the variables and assigning them the correct types as needed. As a result, it becomes fairly trivial to use the DOM from JavaScript (a later article focuses on the DOM in relation to XML and things are a little trickier).
In conclusion
At this point, I'm going to leave you with a bit of a cliffhanger. Obviously, this hasn't been a completely exhaustive coverage of the DOM; in fact, this article is little more than an introduction to the DOM. There is more to DOM than I've shown you today! The next article in this series will expand upon these ideas and dive more into how you can use the DOM in your JavaScript to update Web pages, make changes to your HTML on the fly, and create a more interactive experience for your user. I'll come back to DOM once again in later articles which focus on using XML in your Ajax requests. So become familiar with the DOM; it's a major part of Ajax applications. It would be pretty simple to launch into more of the DOM at this point, detailing how to move within a DOM tree, get the values of elements and text, iterate through node lists, and more, but that would probably leave you with the impression that the DOM is about code -- it's not. Before the next article, try to think about tree structures and work through some of your own HTML to see how a Web browser would turn that HTML into a tree view of the markup. Also, think about the organization of a DOM tree and work through the special cases discussed in this article: attributes, text that has elements mixed in with it, elements that don't have text content (like the img element). If you get a firm grasp of these concepts and then learn the syntax of JavaScript and the DOM (in the next article), it will make crafting responsiveness much easier. And don't forget, here are the answers to Listings 2 and 3 -- they are also included with the sample code! Figure 2. The answer to Listing 2
Figure 3. The answer to Listing 3
Mastering Ajax, Part 5: Manipulate the DOM

Use JavaScript to update your Web pages on the fly
Level: Introductory Brett McLaughlin (brett@newInstance.com), Author and Editor, O'Reilly Media Inc. 11 Apr 2006 Last month Brett introduced the Document Object Model, whose elements work behind the scenes to define your Web pages. This month he dives even deeper into the DOM. Learn how to create, remove, and change the parts of a DOM tree, and take the next step toward updating your Web pages on the fly! If you followed my discussion in this series last month, then you got a first-hand look at what goes on when a Web browser displays one of your Web pages. As I explained then, when the HTML and CSS you've defined for your page is sent to a Web browser, it's translated from text to an object model. This is true whether the code is simple or complex, housed all in one file or in separate files. The browser then works directly with the object model, rather than the text files you supplied. The model the browser uses is called the Document Object Model. It connects objects representing the elements, attributes, and text in your documents. All the styles, values, and even most of the spaces in your HTML and CSS are incorporated into the object model. The specific model for a given Web page is called the page's DOM tree. Understanding what a DOM tree is, and even knowing how it represents your HTML and CSS, is just the first step in taking control of your Web pages. Next, you need to learn how to work with the DOM tree for a particular Web page. For instance, if you add an element to the DOM tree, that element immediately appears in a user's Web browser -- without the page reloading. Remove some text from the DOM tree, and that text vanishes from the user's screen. You can change and interact with the user interface through the DOM, which gives you tremendous programming power and flexibility. Once you learn how to Acronym pronunciation matters In many ways, the Document Object Model could just as easily have been called the Document Node Model. Of course, most people don't know what the term node means, and "DNM" isn't nearly as easy to pronounce as "DOM," so it's easy to understand why the W3C went with DOM. work with a DOM tree you've taken a huge leap toward mastering rich, interactive, dynamic Web sites. Note that the following discussion builds on last month's "Exploiting the DOM for Web response;" if you haven't read that article, you might want to do so before you proceed here.
Cross browser, cross language

The Document Object Model is a W3C standard (see Resources for links to the W3C). Because of that, all modern Web browsers support the DOM, at least to some degree. While there is some variance among browsers, if you use core DOM functionality -- and pay attention to a few special cases and exceptions -your DOM code will work on any browser in the same way. The code you write to modify a Web page in Opera will work on Apple's Safari, Firefox, Microsoft Internet Explorer, and Mozilla. The DOM is also a cross-language specification; in other words, you can use it from most of the popular programming languages. The W3C defines several language bindings for the DOM. A language binding is simply an API defined to let you use the DOM for a specific language. For example, you can find welldefined DOM language bindings for C, Java, and JavaScript. So you can use the DOM from any of these languages. Language bindings are also available for several other languages, although many of these are not defined by the W3C, but instead by third parties. In this series I'll focus on the JavaScript bindings into the DOM. That makes sense because most asynchronous application development is based on writing JavaScript code to run in a Web browser. With JavaScript and the DOM, you can modify the user interface on the fly, respond to user events and input, and more -- all using fairly standardized JavaScript. All that said, I do encourage you to check out the DOM language bindings in other languages. For instance, you can use the Java language bindings to work not only with HTML, but also XML, as I'll discuss in a later article. So the lessons you'll learn here apply to far more than HTML, in many more environments than just client-side JavaScript.
The conceptual node

A node is the most basic object type in the DOM. In fact, as you'll see in this article, almost every other object defined by the DOM extends the node object. But, before you get too far into semantics, you need to understand the concept that is represented by a node; then, to learn the actual properties and methods of a node is a piece of cake. In a DOM tree, almost everything you'll come across is a node. Every element is at its most basic level a node in the DOM tree. Every attribute is a node. Every piece of text is a node. Even comments, special characters (like ©, which represents a copyright symbol), and a DOCTYPE declaration (if you have one in your HTML or XHTML) all are nodes. So before I get into the specifics of each of these individual types, you really need to grasp what a node is.
A node is...
In simplest terms, a node is just one single thing in a DOM tree. The vagueness of "thing" is intentional, because that's about as specific as it gets. For example, it's probably not obvious that an element in your HTML, like img, and a piece of text in HTML, like "Scroll down for more details" have much in common. But that's because you're probably thinking about the function of those individual types, and focusing on how different they are. Consider, instead, that each element and piece of text in a DOM tree has a parent; that parent is either the child of another element (like when an img is nested inside a p element), or is the top-most element in the DOM tree (which is a one-time special case for each document, and is where you use the html element). Also consider that both elements and text have a type. The type for an element is obviously an element; the type for text is text. Each node also has some fairly well-defined structure to it: does it have a node (or nodes) below it, such as child elements? Does it have sibling nodes (nodes "next to" the element or text)? What document does each node belong to? Obviously, much of this sounds pretty abstract. In fact, it might even seem silly to say that the type of an element is ... well ... an element. However, you need to think a bit abstractly to realize the value of having the node as a common object type.
The common node type

The single task you'll perform more than any other in your DOM code is navigating within the DOM tree for a page. For instance, you might locate a form by its "id" attribute, and then begin to work with the elements and text nested within that form. There will be textual instructions, labels for input fields, actual input elements, and possibly other HTML elements like img elements and links (a elements). If elements and text are completely different types, then you have to write completely different pieces of code to move from one type to another. Things are different if you use a common node type. In that case you can simply move from node to node, and worry about the type of the node only when you want to do something specific with an element or text. When you just move around in the DOM tree, you'll use the same operations to move to an element's parent -- or its children -- as you would with any other type of node. You only have to work specifically with a node type, like an element or text, when you require something specific from a certain type of node, like an element's attributes. Thinking about each object in the DOM tree simply as a node allows you to operate much more simply. With that in mind, I'll look next at exactly what the DOM Node construct has to offer, starting with properties and methods.
Properties of a node
You'll want to use several properties and methods when you work with DOM nodes, so let's consider them first. The key properties of a DOM node are: nodeName reports the name of the node (see more below).
nodeValue:gives the "value" of the node (see more below). parentNode returns the node's parent. Remember, every element, attribute, and text has a parent node. childNodes is a list of a node's children. When working with HTML, this list is only useful when you're dealing with an element; text nodes and
attribute nodes don't have any children. firstChild is just a shortcut to the first node in the childNodes list.
lastChild is another shortcut, this time to the last node in the childNodes list. previousSibling returns the node before the current node. In other words, it returns the node that precedes the current one, in this node's parent's childNodes list (if that was confusing, re-read that last sentence). nextSibling is similar to the previousSibling property; it turns the next node in the parent's childNodes list. attributes is only useful on an element node; it returns a list of an element's attributes.
The few other properties really apply to more generic XML documents, and aren't of much use when you work with HTML-based Web pages.
Unusual properties
Most of the above-defined properties are pretty self-explanatory, with the exception of the nodeName and nodeValue properties. Rather than simply explain these properties, consider a couple of odd questions: What would the nodeName be for a text node? And, similarly, What would the nodeValue be for an element? If these questions stumped you, then you already understand the potential for confusion inherent in these properties. nodeName and nodeValue really don't apply to all node types (this is also true of a few of the other properties on a node). This illustrates a key concept: any of these properties can return a null value (which sometimes shows up in JavaScript as "undefined"). So, for example, the nodeName property for a text node is null (or "undefined" in some browsers, as text nodes don't have a name. nodeValue returns the text of the node, as you would probably expect. Similarly, elements have a nodeName -- the name of the element -- but the value of an element's nodeValue property is always null. Attributes have values for both the nodeName and nodeValue properties. I'll talk about these individual types a bit more in the next section, but since these properties are part of every node, they're worth mentioning here. Now take a look at Listing 1, which shows several of the node properties in action. Listing 1. Using node properties in the DOM
// These first two lines get the DOM tree for the current Web page, // and then the <html> element for that DOM tree var myDocument = document; var htmlElement = myDocument.documentElement; // What's the name of the <html> element? "html" alert("The root element of the page is " + htmlElement.nodeName); // Look for the <head> element var headElement = htmlElement.getElementsByTagName("head")[0]; if (headElement != null) { alert("We found the head element, named " + headElement.nodeName); // Print out the title of the page var titleElement = headElement.getElementsByTagName("title")[0];
Using Ajax with WebSphere Portal

Level: Intermediate Karl Bishop (kfbishop@us.ibm.com), Senior Software Engineer, IBM Doug Phillips (dougep@us.ibm.com), Advisory Software Engineer, IBM 28 Jun 2006 You have heard the buzz about Ajax and you are wondering if you can use it in your portal application. Well, you can, and this article tells you how to get started. One of the most expensive actions in a portal is refreshing pages. You can use Ajax to handle many user interaction events and then to apply the updates to portions of the page, without requiring a full page refresh. You can improve your portal's performance, create a cleaner overall portal application architecture, and, most of all, make your users happier with such a responsive portal.
Introduction
This article introduces you to the idea of integrating Ajax into your portal applications. Because there are several general Ajax articles already available (see Resources), we assume you understand the basics of Ajax; that is, you already know what Ajax means, how it got its name, the fact that it's not new, and how Google brought this technology into the mind set of every executive and technologist on the planet. Our intention is to equip you with useful information related to using Ajax in your portal applications, so when the call comes down from the CTO's office asking if your portal applications are Ajax enabled, you can stand up and say, "You bet!" So what this article describes are areas to consider if you decide to inject Ajax into your portal. While the focus is on portal applications, these tips are generally applicable to most complex applications. This article also prepares you for a future tutorial, in which we will detail the creation of an Ajax portlet application. One quick rant, before we get back to the topic at hand: much of what you see and read about Ajax is not really Ajax; it's Dynamic HTML, or DHTML. Ajax, in its proper sense, consists of a single JavaScript object called XMLHttpRequest. This class provides a background communication channel to a server and for the resulting response. Everything else, including drag-and-drop, DOM updates, styling, and all the other things that make everyone go "ohh and ahh", is DHTML.
Why Ajax and WebSphere Portal are a good fit

One of the most expensive actions in a portal environment is to refresh the page. When the user clicks a link or takes some other action on the page, the portal processes the actionPerformed()method for the target portlet and the doView() methods for each portlet on the page. Then, it aggregates the results and sends the entire HTML document down to the browser. While caching can reduce a lot of the overhead, there is still a lot going on. You could use Ajax to handle many of the user interaction events in the background, and then to update portions of the page, without requiring a full portal refresh cycle. This technique can vastly improve the end-user experience by increasing the responsiveness of individual actions, and the overall application performance. In certain circumstances, using Ajax can contribute to a cleaner overall architecture of your application. Having a secondary Ajax controller (such as a servlet or Web service) forces a stronger separation of your model code. When applying a full Ajax controller design to your application, you should let the Ajax controller handle all basic user input actions and segmented display updates. Only use the portal actionPerformed() method for page-level transitions or to process major state changes.
Why Ajax and WebSphere Portal are not a good fit

So, why would you not want to use this new fangled paradigm in rich internet applications? All the weekly technical magazines insist that this is the way to go, and besides, your boss told you to use it because it's "one of our business goals." OK, we won't tell you not to use it, but we do want you to know about some potential pitfalls: Using multiple controllers (for example a portlet, a servlet, and a Web service) adds to the complexity of the application.
Using Ajax forces a lot of logic to be processed on the client. JavaScript can be difficult to debug, especially in a cross-browser environment. Accessibility issues and mobile devices can force you to have redundant code. Because many screen readers and other assistive devices do not support JavaScript/Ajax, you need to provide alternate functionality. Your application might not require extra data updates to the browser between pages.
So with all that said, you might decide that Ajax isn't for you and you will find another article to read. Wait, that's no fun. Read on, my friend, this stuff is way too cool not to add to your applications. The bottom line is to take it slow. Find an application that could use a little kick, and add a dash of Ajax to a user form or wizard. Once you get your feet wet and understand how a little effort can produce some effective user enhancements, you will be ready to really add some magic to your portal applications.
Design considerations
When you add Ajax to a portal application, you are effectively adding multiple controllers to the classic MVC pattern. This decision has the potential benefit of forcing a cleaner separation of the model logic. The downsides are the added complexity and the unavoidable requirement to break the controller apart into these three aspects: 1. The portlet 2. The servlet or Web service 3. The JavaScript-based client The basic premise of using Ajax in a portal application is the need for a separate controller. Under normal circumstances, you use a servlet to perform the communications with the Ajax client. You can either bundle the servlet with the portlet WAR file or include it as part of a stand-alone Web application. Figure 1 shows potential Ajax server targets. If you bundle the servlet with the portlet WAR file, then you can share session data between the servlet and the portlet. The servlet, portlet, and the model code are tightly coupled. If you do not need this level of coupling and the data and logic to be processed by Ajax are not dependent on the portlet, then you can create a stand-alone servlet or Web service to promote reuse.
Figure 1. Ajax server target possibilities
Back to top
Ajax toolkits
One of the downsides to implementing Ajax is the difficulty in writing good cross-browser JavaScript. There are many JavaScript and DHTML toolkits that provide Ajax abstractions. In fact, there are too many to test to determine which one best fits your needs. As with all open source projects, there will likely be a shake-out over the next couple of years.
A few of the most promising and well-designed toolkits that we have used are: Dojo, Rico, and DWR (see Resources). DoJo is preferred because it has an advanced Aspect-like architecture. DWR, or Direct Web Rendering, provides an easy mechanism to reference host-based JavaBeans from the client Javascript. Because there are many other good ones available, you need to determine what works for you.
Adding Ajax to a portlet application

To implement Ajax in your portal application, you need to follow a few simple steps. The following discussion assumes that you are bundling your Ajax servlet with your portlet WAR file. 1. Create and define the Ajax servlet. 2. Define a JavaScript reference variable that points to the servlet. 3. Load any external JavaScript files. 4. Implement the Ajax framework.
Create and define the Ajax servlet

The process of bundling a servlet with your portlet WAR file is very straight forward; however, even seasoned portlet developers are not always sure of all the details. So, here are the complete and sordid details. 1. Define the servlet in the web.xml file, as in Listing 1 2. Include the servlet JAR file or classes.
Listing 1. Servlet mapping in the web.xml <servlet> <servlet-name>MyAjaxServlet</servlet-name> <display-name>MyAjaxServlet</display-name> <description></description> <servlet-class> com.ibm.ajax.MyAjaxServlet </servlet-class> </servlet> <servlet-mapping> <servlet-name>MyAjaxServlet</servlet-name> <url-pattern>/Ajax</url-pattern> </servlet-mapping>
Define a JavaScript reference to the servlet

You need to define the global reference (see Listing 2) in the JSP file so that you have access to the portlet request library. After the global variable is defined, any JavaScript included can safely use it to point to the servlet. Listing 2. Global reference to the servlet. <script type="text/javaScript"> var PATH = "<%= request.getContextPath() %>"; var Ajax_SERVLET = PATH + "/Ajax"; </script>
Load any external JavaScript files

As with any external resource to be added to a portlet page, you must encode the URL and set the base context, as in Listing 3. Listing 3. Script to encode the URL and set the base context. <script type="text/javascript" src="<%=renderResponse.encodeURL( renderRequest.getContextPath() + "/js/myajax.js?v1.1.2")%>" > </script>
Tip: By using a string argument on the JavaScript parameter, you force the browser to perform a cache refresh on each load. If you have JavaScript that might change frequently, this refresh forces browsers not to used old cached code. This example uses a version ID ( ?v1.1.2), but any string will work.
Implement the Ajax framework

Making Ajax perform its magic consists of a few boilerplate actions. We give you the overview here. You will see the code description in a future, follow-up tutorial. 1. Create a global XMLHttpRequest object variable. Because all communications are asynchronous, you must define a unique variable for each Ajax event.
2.
3.
Define an event to trigger the process. Typically, you use a JavaScript event in an input tag. For example: <input onChange='eventHandlerFunction()'
... >
4.
Define a function to handle the event; specifically, implement these tasks: o Instantiate the XMLHttpRequest (xhr) object variable. Details of this are browser specific, which we will cover in the future tutorial. o Set the xhr callback function. xhr.onreadystatechange() o Set the servlet, type, and parameters. xhr.open(), xhr.setRequestHandler(), and xhr.send() Define a call-back function to process the communication states and the response data. o This function handles the various communication state changes, such as when the call starts, when a connection is established, and when the response has been received. o Processing of the response typically consists of parsing the returned XML (or other contents), and using this data to update the DOM tree.
Figure 2 shows how all the pieces fit together. Figure 2. AJAX communications event model
Portal specific concerns

There are several issues that you should be aware of when implementing Ajax in a portal application.
Global JavaScript variables

In general, avoid using global variables in JavaScript within a portal application because of the fact that the portal aggregates several portlets into a single page. Namespacing of global JavaScript variables (as in Listing 4) is a good practice because you guarantee unique variable names, even if the same portlet is deployed twice on the same page. Listing 4. Namespacing JavaScript variables. // Global XMLHttpRequest variable var <portlet:namespace />xhrFieldsRequest;
If you use an Ajax toolkit, the abstraction layer will resolve any naming conflicts.
Using ID attributes
ID attributes are often used in Ajax to quickly update a portion of the page. Because ID attributes within any HTML tag are global to the DOM, you need to
make sure they are unique. If you have duplicate ID attributes, then results are unpredictable but generally not what you want, and the problem can be maddening to track down. To be safe, namespace all ID attributes, even though doing this can make your code difficult to read, as you can see in Listing 5. Listing 5. Safely namespacing an ID attribute. <h1 id="<portlet:namespace />header">Hello</h1> <script type="text/javascript"> var x = document.getElementByID ("<portlet:namespace/>header"); x.innerHtml = "GOODBYE!"; </script>
State maintenance
One pitfall that you can easily fall into is the inherent lack of state management when using Ajax calls in a portal. There is nothing to stop the user from taking an action in a portlet that can cause a page refresh. You need to make sure that any Ajax activity can be restarted without any dependency on the previous state. While it is possible to use cookies or Ajax calls to a servlet to check and store state information, avoid a dependency on the page's state. Make all Ajax calls atomic. Other state issues that can easily trip you up are the back button and bookmarked URLs. In general, avoid major state changes based on Ajax. Leave that to real portal actionPerformed() calls.
Sharing session data

When you bundle a servlet with your portal application, you can share session data between the servlet and portlet. Typically, you want to use Application scope when sharing session data. To the servlet, this is the normal Session scope. To access Portlet scope variables from the servlet requires a special namespaced name value that is based on the portlet's ID that was set when it was originally deployed into a portal. It is very difficult to extract this value during development. Although mostly academic, the syntax of the Portlet scope variables is:
javax.portlet.p.<ID>?<NAME>
Where: <ID> is the unique identification for the portlet <NAME> name used to set the object in the Portlet session
Action URLs
Action URLs can be very tricky to deal with when using Ajax. In general, you should not attempt to store Action URLs in the shared session, because they are only valid for the current doView(). Attempting to use an ActionURL that was stored in the session from a previous doView() cycle will cause unpredictable results. An example of when you would want to store Action URLs into the session is an Ajax-driven paging data table that contains Action URL links as part of the data set. When the user clicks Next, the browser generates an Ajax call to the servlet. Then, the servlet extracts the next page of data from the session, and it must have predefined Action URLs. Just be sure that anytime a doView() call is processed that any session data holding any Action URLs is regenerated.
Activity notification
Portal pages are often very busy, with a lot of aggregated information stuffed onto a single page. Because Ajax calls are performed in the background and they do not trigger the activity icon on the browser, you need to provide a consistent, visual mechanism to inform the user that something is going on. Otherwise, they can get confused and not know that the application is busy processing some action. (We surely don't want confused users.) You could implement this notification using a floating DIV section display during activity, or using a simple message on the browser's status bar (although this is considered bad form by some). You could also integrate a custom theme extension that would display a common Please Wait message for any Ajaxenabled portlet on the page.
Conclusion
In this article, we described how and why you would use Ajax in your portal applications. In a future related tutorial, we will show you how to put all the pieces together to produce an Ajax-enabled database administration tool. Stay tuned.

Mastering Ajax

Transféré par

Informations du document

Description originale:

Copyright

Formats disponibles

Partager ce document

Partager ou intégrer le document

Options de partage

Avez-vous trouvé ce document utile ?

Ce contenu est-il inapproprié ?

Droits d'auteur :

Formats disponibles

Mastering Ajax

Transféré par

Droits d'auteur :

Formats disponibles

Mastering Ajax, Part 1: Introduction to Ajax

Old technology, new tricks

The XMLHttpRequest object

Adding in some JavaScript

Finishing off with the DOM

Getting a Request object

Working with Microsoft browsers

Dealing with Mozilla and non-Microsoft browsers

Request/Response in an Ajax world

Handling the response

Hooking in the Web form

Web 2.0 at a glance

The simplicity of new

Dealing with Microsoft

Static versus dynamic

Sending requests with XMLHttpRequest

Welcome to the sandbox

Setting the server URL

Opening the request

Sending the request

Specifying a callback method

Handling server responses

Callbacks and Ajax

HTTP ready states

HTTP status codes

Reading the response text

Mastering Ajax, Part 3: Advanced requests and responses in Ajax

Digging deeper into HTTP ready states

Ready states in hiding

Viewing an in-progress request's ready state

Response data under the microscope

Viewing the response text during a request

Getting safe data

A closer look at HTTP status codes

403: Forbidden 404: Not Found

Redirection and rerouting

Additional request types

Making the request

Checking for a URL

Useful HEAD requests

Mastering Ajax, Part 4: Exploiting DOM for Web response

Web programmers and markup

What the programmer does

What the markup does

The advantages of text markup

A closer look at Web browsers

The disadvantages of text markup

Moving to a tree view

A few additional terms

The value of objects

Object types and properties

Strict is sometimes good

Listing 3. Even trickier nesting of elements <html> <head>

What about attributes?

Introducing the DOM

The document object

Figure 3. The answer to Listing 3

Mastering Ajax, Part 5: Manipulate the DOM

Cross browser, cross language

The conceptual node

The common node type

Using Ajax with WebSphere Portal

Why Ajax and WebSphere Portal are a good fit