
Taking web applications offline with Google Gears

Omar Kilani, Remember The Milk


August 2007

• Introduction
• What is Remember The Milk?
• A quick overview of Google Gears
• Architecture considerations
• Choosing a mode strategy
• The five steps to offline conversion
• Tips and caveats
• Conclusion

Introduction
Google Gears, an open source set of JavaScript APIs that lets you build offline web
applications, was announced at Google Developer Day 2007. We spent the following four
caffeine-fueled days furiously building offline support for our service, Remember The
Milk, with Gears. We learned a lot over the course of those four days—and even more
after launching the functionality—and hoped to share this knowledge with other
developers looking to work with Gears. So when the folks at Google asked us if we'd
write about our experiences, we were more than happy to oblige.

We hope that this article helps you to offline-enable your applications quickly and with a
minimum amount of pain.

What is Remember The Milk?


Remember The Milk (RTM) is a task management service available in 21 languages and
used by over 250,000 people. The main interface to the service is a rich web application,
but users can also interact with RTM via email, a mobile version, a public API, iGoogle
and Google Calendar gadgets, Twitter, and more.

Offline access was an oft-requested feature from our users and Google Gears turned out
to perfectly fit our implementation requirements, as we didn't wish to create a desktop
application or build an offline solution based on a proprietary architecture. Adding Gears
support enabled us to take the majority of RTM functionality offline and provide an
experience similar to a native client-side application for our users.

A quick overview of Google Gears


There are three key components to Google Gears: LocalServer, Database, and
WorkerPool. It's important to look at these components as low-level fundamental building
blocks, similar in utility to an object like XMLHttpRequest, and not as a complete and
easy solution to taking a web application offline. For example, Google Gears gives you
the necessary means to enable data synchronization by providing a persistent data store
(Database) and JavaScript threading support (WorkerPool) for intensive synchronization
processing, but doesn't provide the synchronization functionality itself.

With that in mind, let's look at the three components in some detail, and how RTM uses
each one.

LocalServer

The LocalServer allows the storage of resources such as images, style sheets, JavaScript,
and HTML pages, in named stores. These stores can be either ResourceStores or
ManagedResourceStores, and an application can reference multiple stores of different
types.

LocalServer intercepts all requests originating from the browser (with some exceptions,
e.g., POST requests are not intercepted) and serves resources that match these requests.
You can view the LocalServer as an infinite, non-evicting cache that allows access to
contained resources in all states of connectivity.

While the ResourceStore requires an application to manually handle fetching resources
for offline use, the ManagedResourceStore allows the developer to provide a manifest (in
JSON format) that references all the resources required offline by the application.
Additionally, the ManagedResourceStore allows versioning of offline manifests and
intelligently handles the retrieval of resources, atomically switching between manifest
versions, etc.

Remember The Milk uses a single ManagedResourceStore for its offline implementation,
which is described in detail in the sections below.

Database

Gears provides a persistent, always-accessible data store in the form of the SQLite 3
relational database management system. SQLite is a powerful and full-featured database,
providing the majority of SQL-92 functionality, including features such as transactions,
primary keys, and views. An application may access multiple SQLite databases.

The offline functionality in Remember The Milk uses a single database (databases are
namespaced by RTM username).

WorkerPool

WorkerPool brings threading to JavaScript. It does this by executing code in its own
JavaScript interpreter instance and uses message passing as a means of communication.
One important aspect of this implementation is that WorkerPool threads don't share any
state. There are some limitations on the code that can be run in a thread. For example, it
cannot reference any DOM objects (such as window or document) and currently does
not have access to objects such as XMLHttpRequest (though these are on the Gears
roadmap).

Remember The Milk doesn't currently make use of WorkerPool, but certain functionality
is slated to be converted. For example, the searching/filtering functionality of RTM is
implemented entirely on the client side and is processing-intensive. Also, database
interaction is a prime candidate for conversion to WorkerPool, as there is noticeable "lag"
while executing many INSERTs or waiting on a large SELECT to run. Another piece of
functionality which fits very well with the WorkerPool architecture is iCalendar
recurrence generation, and this is also on the RTM roadmap for implementation.

Architecture considerations
The difficulty of adding offline support to an application essentially depends on which
features are to be made available offline and how much application logic already resides
in the client.

For example, RTM was designed as a client-side application from inception. The server-
side portion of RTM is mainly used as a "dumb" data store, and the application
periodically synchronizes with the server. In this case, using Gears to provide offline
access was a natural fit, and was relatively quick to implement as we had some prior
experience with data synchronization protocols.

There were some features of the online experience that could not be carried over to the
offline mode. One of these was the Google Maps integration, in which users can
geolocate their tasks and quickly visualize where their tasks are occurring in the real
world. As Google Maps requires access to Google servers to fetch map tiles and data, and
such a data set is quite large and thus hard to cache, this functionality is disabled once the
user enters offline mode.

The undo feature of RTM is also unavailable in the offline version as this is a complex
server-side operation (due to the multi-user nature of RTM and the ability to share tasks
and lists). Instead, the user is presented with a dialog box asking for confirmation if they
execute a destructive action such as delete. Undo functionality in offline mode is on the
RTM roadmap, however.

In terms of code footprint, an offline implementation with Gears adds a fairly trivial
amount of code to your code base. The Remember The Milk web application is a
relatively large JavaScript application, weighing in at around 30,000 LOC prior to the
inclusion of offline access. Around 700 lines of code were written to replace
functionality that relied on server access (this was mainly related to the implementation
of recurring tasks). A further 1300 lines of Gears-specific code were written to interact
with the LocalServer and Database, provide functionality to detect network availability,
and display interface elements such as dialog boxes and connectivity indicators.

Choosing a mode strategy
One of the trickiest problems faced in implementing and launching offline support for
RTM was user messaging and presentation of state.

A fundamental design decision is whether to implement offline support as "modal" or
"modeless." Choosing which style to implement will, in most cases, be dictated by the
type of data the application works with and how much of that data will be available
offline. Neither style is inherently superior: it's simply much easier to implement a
modeless style for tasks (in RTM) than for feed items (in Google Reader), given the
size of the individual data items and of the total data set.

Modal implies that the user must manually switch the application to an offline mode.
During this switch, the application synchronizes state with the server—transferring the
necessary information to allow the user to view their data offline. For example, Google
Reader provides modal offline functionality. When a user first logs in to Google Reader
with Gears enabled, they see the following icon:

[Figure: Google Reader in online mode]

This icon indicates that the user is in online mode. If this icon is clicked, the icon changes
to an animated icon and Google Reader prepares for offline mode by downloading up to
2000 feed items:

[Figure: Synchronizing feed items in Google Reader]

Finally, Google Reader enters offline mode, which is indicated by the following icon:

[Figure: Offline mode in Google Reader]

The user messaging and presentation of state in Google Reader is clear and strongly
suggests that the application must be switched into an offline mode before you can "pull
the plug."

Modeless operation, on the other hand, implies that the application is always up-to-date
with the latest data on the server, and allows the user to pull the plug at any moment
without the need for a manual offline switch. Remember The Milk is an example of an
almost-modeless application: application state is always "in sync" with the server and the
application has an awareness of connectivity and an ability to seamlessly manage the
transition between online and offline states. It is "almost" modeless in the sense that it
relies on functionality, such as Google Maps, which requires online access, and thus,
certain aspects of the application are unavailable offline.

When we implemented the offline functionality in RTM, we decided to use the same
three icons used in Google Reader to indicate different application states to ensure
consistency with the only other application using Google Gears and to draw on the
strengths of Google's UI expertise.

This decision had a positive consequence: users who were familiar with the Google
Reader synchronization process instantly recognized that RTM now had similar offline
functionality. It also had a negative consequence: users assumed that RTM was also
modal and that a manual switch was necessary in order to enter offline mode.

Over several iterations, the messaging and presentation of state were improved. For
example, the first time a user launches RTM with Gears installed, the following message
is displayed:

[Figure: A message displayed in RTM on first launch while waiting for a manifest to download]

This message is only displayed once—even though the application state is already "in
sync" at this point, it's still necessary to prepare the application for offline use by
downloading a manifest and associated resources at least once through the
ManagedResourceStore. When this process is complete, the following is displayed:

[Figure: A message displayed in RTM once offline use is possible]

From this point on, the user has the ability to use the application offline without
switching to offline mode (a manual switch is still provided, however, to give the user
more control over the operation of this functionality).

If RTM detects that the user has lost connectivity and the user attempts to perform an
action which would normally require online access, a dialog message is displayed
announcing the loss of some functionality (such as Google Maps), and the following
icon is displayed:

[Figure: RTM in its offline state]

These changes in messaging and presentation helped make the operation of this
functionality clearer, and reduced user confusion surrounding the offline feature.

The five steps to offline conversion


At first, converting a large online-only web application to use Google Gears looks like a
daunting exercise. By breaking this project down into more manageable tasks, we were
able to see immediate progress, and this encouraged us to rapidly progress through the
rest of the implementation. In essence, there were five steps to implementing offline
access in RTM.

1. Ensuring resources are available offline

The first step is to generate a manifest file for the ManagedResourceStore to ensure that
your resources are available offline.

In addition to analyzing your HTML, CSS, and JavaScript for resource references to add
to the manifest, a quick and easy way to see references that you may have missed is to
view the Page Info in Firefox, and check the Links and Media tabs.

Another aspect of manifest generation to consider is how to version your manifests. This
should take into account the different combinations of resources your application relies
on when providing functionality determined by a user preference (e.g., interface
language). In RTM, for example, the manifest version is a function of the RTM username,
the application build version, and the language that the user has selected to view RTM in.

One thing to be aware of when preparing the offline manifest is that LocalServer enforces
a strict same-origin policy. References to resources on static resource servers (e.g.,
static.example.com or cdn.example.com) must be modified for Gears users to point to the
same domain as the application itself—it may be easier to always use relative URLs for
resources for Gears users.
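
As a rough illustration of the versioning and same-origin points above, a manifest could be generated along the following lines. The betaManifestVersion, version, and entries fields follow the Gears manifest format; the buildManifest helper, the colon-joined version scheme, and the example URLs are assumptions made for this sketch, not RTM's actual code.

```javascript
// Hypothetical helper: builds a Gears manifest object for one user.
// Only the manifest field names come from Gears; the rest is illustrative.
function buildManifest(username, buildVersion, language, urls) {
    var entries = [];
    for (var i = 0; i < urls.length; i++) {
        // LocalServer only serves same-origin resources, so this sketch
        // rejects absolute and protocol-relative URLs outright.
        if (/^[a-z][a-z0-9+.-]*:/i.test(urls[i]) || urls[i].indexOf('//') === 0) {
            throw new Error('Manifest URL must be relative: ' + urls[i]);
        }
        entries.push({ url: urls[i] });
    }
    return {
        betaManifestVersion: 1,
        // A change of user, build, or language yields a new version string,
        // which tells Gears that the resource set must be refreshed.
        version: [username, buildVersion, language].join(':'),
        entries: entries
    };
}

// Example values are made up.
var manifest = buildManifest('alice', 'build-120', 'en',
    ['/', '/js/app.js', '/css/app.css', '/img/logo.png']);
var manifestJson = JSON.stringify(manifest, null, 2);
```

The resulting JSON would be served from the application's own domain and assigned to the store's manifestUrl property.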

2. Decoupling the application from the network

The next step is to ensure that the application can operate without a network connection
and to handle any user interface changes needed once connectivity is lost.

Decoupling from the network may be relatively difficult if your application isn't a "single
page application." In this case, the application may have to be converted to use AJAX
functionality, or all pages on the server may need to be added to the offline manifest and
each page must then become offline aware.

If the application is a single page application and already communicates with the server
in an asynchronous fashion via XMLHttpRequest, then this is relatively easy to
accomplish. As network calls will fail when a loss of connectivity is detected, the
application must take steps to provide the requested functionality without network access,
e.g., by intercepting these network calls and responding to them in client-side code.

Remember The Milk intercepts network calls in the transaction manager—the part of the
code base responsible for issuing and monitoring XMLHttpRequests. The transaction
manager implements a form of RPC where both a method name and arguments (as a
JavaScript object/hash) are provided. Under certain circumstances, i.e., when the
application is offline or when an outstanding call fails, these RPC calls are re-routed to a
faux-server implemented in the client. This "server" then responds to these RPC calls as
if it were the "real" online server (with some slight variations).
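
The re-routing idea can be sketched roughly as follows. This is not RTM's transaction manager: the TransactionManager constructor, the sendToServer callback signature, and the tasks.setPriority method name are all hypothetical, chosen only to show a call falling back to a client-side handler.

```javascript
// Hypothetical transaction manager; all names here are illustrative.
// sendToServer: function(method, args, onOk, onFail), e.g. an XHR wrapper.
// fauxServer: object mapping RPC method names to client-side handlers.
function TransactionManager(sendToServer, fauxServer) {
    this.sendToServer = sendToServer;
    this.fauxServer = fauxServer;
    this.offline = false;
}

TransactionManager.prototype.call = function(method, args, callback) {
    var self = this;
    function handleLocally() {
        // Answer the call in client-side code, as the real server would.
        callback(self.fauxServer[method](args));
    }
    if (this.offline) {
        handleLocally();
        return;
    }
    // Online: try the network, and re-route the call if it fails.
    this.sendToServer(method, args, callback, handleLocally);
};

// Usage with a network stub that always fails.
var fauxServer = {
    'tasks.setPriority': function(args) {
        return { ok: true, handledBy: 'client', id: args.id };
    }
};
function failingNetwork(method, args, onOk, onFail) { onFail(); }

var tm = new TransactionManager(failingNetwork, fauxServer);
var lastResponse = null;
tm.call('tasks.setPriority', { id: 1234, priority: 2 }, function(response) {
    lastResponse = response;
});
```

Because the caller only ever sees the callback, the rest of the application does not need to know which side answered.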

3. Persisting data on the client

Now that the application is decoupled from the network, you can move on to persisting
data using the Database feature of Gears. In many cases, e.g., when a relational database is
used on the server side, it may be a simple matter of porting the schema to SQLite and
mirroring the data on the server in the client-side database.

Depending on the application, the Gears database may be used as a write-back or write-
through cache.

RTM uses Gears as a write-through cache, keeping all data in browser memory for
performance reasons and modifying the client-side SQLite database whenever a JS object
is modified. Each set of modifications is converted to an INSERT statement which inserts
a new version of an object into the database (UPDATE and DELETE are seldom used). A
pruning algorithm is used to keep database size to a minimum: database objects are
marked for pruning before the application starts, and are then DELETEd in batch at the
end of the startup process.

For example, instead of updating the priority of a task in memory directly:

    State.tasks_[1234].priority = 2;

the following function was written:

    State.updateTask = function(id, fields) {
        // Apply each changed field to the in-memory task object.
        for (var f in fields) {
            State.tasks_[id][f] = fields[f];
        }

        // Write the change through to the Gears database once it is ready.
        if (Offline.isReady()) {
            Offline.DB.updateTask(State.tasks_[id]);
        }
    };

and the original assignment was rewritten as:

    State.updateTask(1234, {"priority": 2});

In this case, the Offline.DB.updateTask function would convert the JS task object to
the equivalent persisted and versioned database entry.
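
The insert-a-new-version approach and the pruning pass might look something like the sketch below. The task_versions table, its columns, and the TaskStore object are assumptions for illustration, and pruning is simplified to a single DELETE rather than the mark-then-delete startup process described above. The db argument only needs an execute(sql, args) method, which is the shape of the Gears Database execute call.

```javascript
// Assumed schema (hypothetical):
//   CREATE TABLE task_versions (
//       version INTEGER PRIMARY KEY AUTOINCREMENT,
//       id      INTEGER,
//       data    TEXT)
function TaskStore(db) {
    this.db = db;
}

// Persist a new version of the object instead of updating it in place.
// The version column is assigned by SQLite, so newer rows sort higher.
TaskStore.prototype.save = function(task) {
    this.db.execute(
        'INSERT INTO task_versions (id, data) VALUES (?, ?)',
        [task.id, JSON.stringify(task)]);
};

// Delete every row that has been superseded by a newer row with the same id.
TaskStore.prototype.prune = function() {
    this.db.execute(
        'DELETE FROM task_versions WHERE version < ' +
        '(SELECT MAX(v.version) FROM task_versions v ' +
        'WHERE v.id = task_versions.id)');
};
```

Appending rows keeps each write cheap and leaves the cleanup cost to a single batch operation.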

4. Re-creating application state from persisted data

After persisting the application state in the client-side database, the next step is to
transform this data into a form which can be used to bootstrap and launch an application.

This is achieved in RTM by intercepting and re-routing the bootstrap process, retrieving
the latest version of each object by ID and injecting these objects into the application as if
they were sent from the server.
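
A minimal sketch of the "latest version of each object by ID" step follows. It assumes the persisted rows have already been read into plain objects of the form {id, version, data}, with data holding a JSON string; that row shape and the latestObjects name are hypothetical.

```javascript
// rows: array of {id, version, data} objects (a hypothetical shape).
// Returns a map of id -> the most recent deserialized object.
function latestObjects(rows) {
    var newest = {};
    for (var i = 0; i < rows.length; i++) {
        var row = rows[i];
        if (!newest.hasOwnProperty(row.id) ||
                row.version > newest[row.id].version) {
            newest[row.id] = row;
        }
    }
    var state = {};
    for (var id in newest) {
        if (newest.hasOwnProperty(id)) {
            state[id] = JSON.parse(newest[id].data);
        }
    }
    return state;
}
```

The returned map can then be handed to the same code path that normally consumes the server's bootstrap response.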

5. Developing a synchronization strategy

Now that basic offline access has been enabled, you need to figure out a way to keep the
state of data on the client and server synchronized. Data synchronization is hard—
providing a generic synchronization framework that works in the majority of use cases is
even more so. Google Gears doesn't, as yet, attempt to provide a synchronization
framework, so data synchronization is left to the application developer.

Fortunately, there are many resources and specifications available to aid in the
development of an application-specific synchronization strategy. Examples of
specifications that may be helpful include SyncML (used to synchronize PIM data such
as calendar events, contacts, tasks, and notes) and the Microsoft Simple Sharing
Extensions for OPML/RSS (which are also somewhat applicable to the JSON format).

A by-product of synchronization is conflict resolution, which must also be considered
when developing the synchronization portion of an application. Conflict resolution is yet
another application-specific part of synchronization—the properties of the problem
domain must be taken into account in order to resolve data inconsistencies in a correct
way.

Remember The Milk implements synchronization using a sorted action log that specifies
which actions were taken offline and properties affected by these actions, and a local
addition table that holds the locally assigned unique IDs (LUIDs) of any objects created
by the client.

When synchronization is initiated, the client sends the action log and any local object
additions to the server, which automatically handles conflicts. (Conflict resolution was
already implemented in RTM to handle cases where multiple users may be working with
the same task list.)

If there were client-side additions, the unique IDs assigned to those objects may differ
from the IDs chosen by the server during the synchronization process. To account for
this, the server sends the client a local-to-global ID map, which is used to remap the IDs
of client-created in-memory objects and database entries to those on the server.
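
The remapping of in-memory objects could be sketched as below. The remapLocalIds function, the "local-" prefix on locally assigned IDs, and the shape of the ID map are assumptions; the corresponding update of database entries is omitted.

```javascript
// tasks: in-memory map keyed by ID.
// idMap: {localId: globalId}, as sent by the server after synchronization.
// Assumes local IDs live in their own namespace (here, a "local-" prefix),
// so a global ID can never collide with a pending local one.
function remapLocalIds(tasks, idMap) {
    for (var luid in idMap) {
        if (!idMap.hasOwnProperty(luid) || !tasks.hasOwnProperty(luid)) {
            continue;
        }
        var globalId = idMap[luid];
        if (String(globalId) === luid) {
            continue; // nothing to do
        }
        var task = tasks[luid];
        task.id = globalId;
        tasks[globalId] = task;
        delete tasks[luid];
    }
    return tasks;
}

// Example data is made up.
var tasks = {
    'local-1': { id: 'local-1', name: 'Buy milk' },
    '42': { id: 42, name: 'Call plumber' }
};
remapLocalIds(tasks, { 'local-1': 9001 });
```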

Once synchronization is finished, the client then requests the latest versions of its data set
from the server to retrieve any server-side changes that occurred while the client was
offline.

RTM attempts to synchronize with the server on every application launch (if connectivity
is detected).

Tips and caveats: what we learned during Gears implementation
In implementing the offline functionality in RTM, we've collected some tips and tricks
that may be helpful when using Gears.

Dealing with LocalServer

Some things to be aware of when working with LocalServer:

• Detecting changes in ManagedResourceStore state currently requires polling
currentVersion and updateStatus. Notification via events is on the Gears roadmap.
• Long manifests can take a long time to download. Gears checks every resource for
changes, so even if an HTTP response of 304 (Not Modified) is returned, there is
still a time cost associated with each check.
• Keep the above in mind in user messaging and presentation. Even if an
application is modeless and always in sync, it still can't truly go offline without a
copy of the manifest and all resources.
• If your resources include many images, it may be worth implementing CSS
sprites. If you're using GWT, take a look at the new ImageBundle functionality in
1.4.
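
The polling mentioned in the first point might be sketched as follows. The store argument is expected to expose currentVersion, updateStatus, and lastErrorMessage, as a ManagedResourceStore does; the waitForOfflineReady helper and its callback arguments are hypothetical, and the timer is passed in so that the sketch does not depend on a browser.

```javascript
var UPDATE_FAILED = 3; // updateStatus value Gears reports for a failed update

// schedule: a setTimeout-like function(fn, delayMs).
function waitForOfflineReady(store, schedule, onReady, onError, intervalMs) {
    function poll() {
        if (store.currentVersion) {
            // A non-empty version means every manifest resource is captured.
            onReady(store.currentVersion);
        } else if (store.updateStatus === UPDATE_FAILED) {
            onError(store.lastErrorMessage);
        } else {
            schedule(poll, intervalMs);
        }
    }
    poll();
}
```

In a browser, schedule could be function(fn, ms) { window.setTimeout(fn, ms); }, with onReady switching the user messaging to "ready for offline use".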

When to go offline?

In the initial release of RTM's offline support, the application would switch to an offline
state whenever it detected connectivity failure. This was an annoyance to users with
intermittent connections and those who were switching between different modes of
connectivity (for example, wired to wireless) on a laptop.

The application was modified to only switch to an offline state if the user performs an
action while connectivity is unavailable.

To present a seamless experience, the action that triggered the state switch is then handled
by the offline machinery, so that a user action is never lost.
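
One way to express this "switch only when the user is affected" rule is sketched below. The ConnectivityState object and its method names are hypothetical, and the return to online mode (with its synchronization step) is deliberately left out.

```javascript
function ConnectivityState() {
    this.reachable = true; // latest result of background connectivity checks
    this.mode = 'online';  // the state presented to the user
}

// Background checks only record reachability; they never change the mode,
// so a brief drop in connectivity goes unnoticed by the user.
ConnectivityState.prototype.reportProbe = function(ok) {
    this.reachable = ok;
};

// Called for each user action; returns which side should handle it.
ConnectivityState.prototype.routeAction = function() {
    if (this.mode === 'online' && !this.reachable) {
        this.mode = 'offline'; // switch now, because the user is affected
    }
    return this.mode === 'offline' ? 'client' : 'server';
};
```

The action that triggers the switch is itself routed to the client, so it is handled rather than dropped.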

The different types of offline: ensuring consistency across application launches

There are three important scenarios to test in order to ensure that application state
remains consistent in a variety of settings:

1. Start the application online, trigger a loss of connectivity, then re-establish
connectivity.
2. Start the application offline, and establish connectivity.
3. Start the application offline, modify application state, shut down the browser, then
re-launch the application offline.

The "Work Offline" functionality found in Firefox or Internet Explorer may be helpful
when testing these scenarios.

Defensive coding

When dealing with unreliable and intermittent variables such as connectivity and
database operations that may fail, you may need to take greater care in programming with
failure in mind.

For example, there are many different ways in which browsers indicate that an
XMLHttpRequest has failed. Firefox and Internet Explorer 7 allow an onerror handler to
be set on the XMLHttpRequest instance, whereas Internet Explorer 6 does not. In some
cases, Internet Explorer will throw an exception on a call to the send method of an
XMLHttpRequest instance.

Additionally, most operations in Google Gears indicate failure by throwing exceptions.
For example, a call to the execute method of a Gears database object can throw an
exception if there is an error with the SQL statement provided (a malformed query or a
non-existent table are examples of situations where exceptions may be thrown).
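
A small wrapper can keep this exception handling in one place. The safeExecute name and its {ok, result, error} return shape are assumptions for this sketch; the db argument only needs an execute(sql, args) method that may throw.

```javascript
// Runs a statement and reports failure as a value instead of an exception.
function safeExecute(db, sql, args) {
    try {
        return { ok: true, result: db.execute(sql, args || []), error: null };
    } catch (e) {
        // Malformed SQL, a missing table, and similar errors end up here.
        return { ok: false, result: null, error: e.message || String(e) };
    }
}
```

Callers can then branch on the ok flag and decide whether to retry, fall back to in-memory state, or notify the user.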

To determine if Gears is available, you should wrap any calls to
google.gears.factory.create in a try/catch block, in addition to checking for the
existence of the google object in window and the gears object in google. For example:

    var hasGears = !!(window.google && google.gears);

    if (hasGears) {
        // The Gears plug-in is active, but has the user given permission
        // for this site to use Gears?
        try {
            var ls = google.gears.factory.create('beta.localserver', '1.0');
        } catch (e) {
            hasGears = false; // No, the user hit the "Deny" button.
        }
    }

Debugging

If you're using Firefox, check out Firebug. Firebug is an awesome extension which
provides a JavaScript console and profiler, live CSS and HTML editor, network activity
monitor, and more.

To make debugging Gears integration easier, it may help to develop a set of easily
accessible functions (which can be called from the Firebug console) to drop all databases,
disable or remove LocalServer stores, execute arbitrary SQL queries, etc. It may also be
worthwhile to log all exceptions thrown by Gears to the Firebug console while in the
development stages of the integration.
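
As one example of such a helper, the sketch below runs an arbitrary query and returns plain objects that are easy to inspect in a console. It relies on the Gears ResultSet methods isValidRow, fieldCount, fieldName, field, next, and close; the queryRows name is hypothetical.

```javascript
// db: an object with execute(sql, args) returning a Gears-style ResultSet.
// Returns an array with one {columnName: value} object per row.
function queryRows(db, sql, args) {
    var rs = db.execute(sql, args || []);
    var rows = [];
    try {
        while (rs.isValidRow()) {
            var row = {};
            for (var i = 0; i < rs.fieldCount(); i++) {
                row[rs.fieldName(i)] = rs.field(i);
            }
            rows.push(row);
            rs.next();
        }
    } finally {
        rs.close(); // always release the result set
    }
    return rows;
}
```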

Coding with upgrades in mind

Instead of viewing Gears-using applications as web applications, think of them as you'd
think about desktop applications: these applications now must handle upgrades to
application resources, database structure and, possibly, client/server protocol changes.

Fortunately, ManagedResourceStore in LocalServer makes the first easy. The second,
database schema upgrades, must be handled specifically by your application. It may be
helpful to version each schema in the database, and upon the launch of a newer version of
the application that requires a different structure, execute ALTER TABLE commands to
bring the structure in-line with application expectations. Downgrades should also be
considered, and may be handled in a similar way.
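
A schema upgrade routine along these lines might look like the following sketch. The MIGRATIONS list, the tasks table, and the migrate function are made up for illustration; where the current schema version is stored is left to the application, and downgrades are not handled.

```javascript
// Entry i upgrades the schema from version i to version i + 1.
var MIGRATIONS = [
    ['CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT)'],
    ['ALTER TABLE tasks ADD COLUMN priority INTEGER'],
    ['ALTER TABLE tasks ADD COLUMN due_date TEXT']
];

// db: an object with execute(sql). currentVersion: the schema version the
// application last recorded (0 for a fresh database).
// Returns the schema version after all pending steps have run.
function migrate(db, currentVersion, migrations) {
    var v = currentVersion;
    for (; v < migrations.length; v++) {
        db.execute('BEGIN');
        try {
            for (var i = 0; i < migrations[v].length; i++) {
                db.execute(migrations[v][i]);
            }
            db.execute('COMMIT');
        } catch (e) {
            // Leave the database at the last fully applied version.
            db.execute('ROLLBACK');
            throw e;
        }
    }
    return v;
}
```

Running each step in its own transaction means a failed upgrade leaves the database at a known version instead of a half-altered one.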

Backwards compatibility in client/server protocol versions may also be important, and is
another application-specific issue that needs to be considered.

Conclusion
By now, you should be itching to add offline support to your web application (we hope!).
If you should take anything away from this article, it's that taking your application offline
isn't as hard or complex as it may first seem, and that Gears is a joy to work with (and it'll
become even easier and more fun as the project matures and is used by more
applications).

As for us at RTM, we couldn't be happier with Gears. The speed at which we were able to
provide offline functionality (four days from reading the documentation to a launchable
implementation) is a testament to the quality, ease of use, and production-readiness of
Gears. Many thanks to the Google Gears engineers for their foresight and for making this
an open source project to which members of the Internet community can contribute.

More information and helpful resources

The Google Gears plugin can be found, with a related FAQ, at the Google Gears site.

Developer information regarding Google Gears, specifically, LocalServer (its
ResourceStore and ManagedResourceStore), Database, and WorkerPool can be located at
the Google Gears developer site. An active developer forum is also available.

Additional code libraries for working with the Gears Database are: DatabaseHelpers
(common database functions), GearsDB (a simple database abstraction layer),
GearsORM (an object-relational mapper for Gears), and Gearshift (database schema
migration).

A code library which makes using WorkerPool more pleasant is Aaron Boodman's
Worker2.

For an example Google Gears application, check out "Using Google Base and Google
Gears for a Performant, Offline Experience" by Dion Almaer and Pamela Fox.

If you're using GWT, then the Google API Library for Google Web Toolkit with support
for Google Gears may be useful. Alternatively, if the Dojo Toolkit is your framework of
choice, take a look at Dojo Offline.

If you'd like to play with some Google Gears applications, check out Google Reader and
Remember The Milk.
